The problem of storing information about the structure of binary data seems quite prevalent nowadays. As the tendency to use heterogeneous programming platforms increases, a need arises to store information about data (meta-information) separately from code, in a format that facilitates extensibility and portability. Such an architecture can be described in terms of four components: binary data, metadata, a metadata interpreter and a binary data parser.

When it comes to efficiency, the binary parser plays the key role. No matter how well one describes the data, one can’t hope to process heavy loads of it efficiently unless the parser comes as close as possible to a plain old fread(&some_struct, 1, sizeof(struct some_struct), f). I’ve recently seen a solution where the metadata interpreter did nothing more than group meta-information into models of structures; the parser then kept looking up meta-information about fields, arrays, etc. as it went through the binary stream. It worked, but it was a suboptimal solution, to say the least.

I’ve prepared a draft implementation of a binary parser in which the data model can be constructed at run time, yet conditional constructs that depend on meta-information are eliminated from the parsing itself. I’m curious whether the performance boost will be as significant as I suspect. As soon as I perform some measurements, I’ll get back to the topic.
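To make the idea concrete, here is a minimal sketch of one way to move metadata-dependent decisions out of the parsing loop. It is not the linked implementation: every name in it (field_desc, layout, compile_layout, read_record, get_u8, get_u32) is hypothetical, and it assumes fixed-size records that are tightly packed and stored in the host's byte order. The model is "compiled" once into byte offsets and a total record size, so reading a record comes down to a single fread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 16 /* arbitrary limit chosen for this sketch */

/* Hypothetical meta-information, known only at run time. */
typedef enum { F_U8, F_U16, F_U32 } field_kind;

typedef struct {
    const char *name;
    field_kind  kind;
    size_t      count; /* 1 for a scalar, >1 for an array */
} field_desc;

/* Result of "compiling" the model: plain offsets and a record size. */
typedef struct {
    size_t offset[MAX_FIELDS];
    size_t nfields;
    size_t record_size;
} layout;

static size_t kind_size(field_kind k)
{
    switch (k) {
    case F_U8:  return 1;
    case F_U16: return 2;
    case F_U32: return 4;
    }
    return 0; /* unknown kind */
}

/* All branching on meta-information happens here, once per model.
 * Returns 0 on success, -1 on an invalid model. */
static int compile_layout(const field_desc *fields, size_t n, layout *out)
{
    size_t pos = 0;
    if (n > MAX_FIELDS)
        return -1;
    for (size_t i = 0; i < n; i++) {
        size_t sz = kind_size(fields[i].kind);
        if (sz == 0 || fields[i].count == 0)
            return -1;
        out->offset[i] = pos;
        pos += sz * fields[i].count;
    }
    out->nfields = n;
    out->record_size = pos;
    return 0;
}

/* Hot path: one fread per record, no metadata lookups.
 * Returns 1 if a full record was read, 0 otherwise. */
static size_t read_record(const layout *l, void *buf, FILE *f)
{
    return fread(buf, l->record_size, 1, f);
}

/* Field access is a precomputed offset plus an optional array index. */
static uint8_t get_u8(const layout *l, const void *rec, size_t field, size_t idx)
{
    assert(field < l->nfields);
    return ((const uint8_t *)rec)[l->offset[field] + idx];
}

static uint32_t get_u32(const layout *l, const void *rec, size_t field)
{
    uint32_t v;
    assert(field < l->nfields);
    /* memcpy avoids unaligned access; the value is in host byte order. */
    memcpy(&v, (const unsigned char *)rec + l->offset[field], sizeof v);
    return v;
}
```

A caller would build an array of field_desc from the metadata, call compile_layout once, and then call read_record in a loop. Byte-order conversion and variable-length fields are left out on purpose; handling them would need more than a single offset table.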
Source code (BSD licensed):
Efficient Meta Information Based Binary Parser