Below is a description of the data format and its possible entries. The tag values (all caps) are defined as an enum in the code, and the datatypes are specified in an enum as well. If you have access to the AIO code, see the files bw-utils.[ch], br-utils.[ch], binpack-general.h, and binpack-utils.[ch] for the actual definitions and utility functions. The bpdump utility provides a clean example of reading the files and printing the contents of each element.
Each binary file consists of a series of elements. Each element consists of a 4-byte element length followed by a series of tags. The element length counts its own 4 bytes. Each tag length counts only the tag's payload. The path is the item's location within an HDF5 file. For a dataset attribute, the path field holds the path and name of the dataset. For a group attribute, it holds just the path of the group.
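Because the element length counts its own 4 bytes, a reader can step from one element to the next without decoding any tags. The following is a minimal sketch of that walk, not the AIO implementation. The function names are hypothetical, and it assumes the 4-byte fields are unsigned and stored in the byte order of the machine reading the file, which this description does not specify.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one 4-byte field. ASSUMPTION: unsigned, host byte order. */
static uint32_t bp_read_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* memcpy avoids unaligned access */
    return v;
}

/* Count the elements in buf[0..len). Returns -1 if a length field is
 * truncated, smaller than its own 4 bytes, or runs past the buffer. */
static int bp_count_elements(const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    int count = 0;

    while (pos < len) {
        uint32_t elem_len;

        if (len - pos < 4)
            return -1;                   /* truncated length field */
        elem_len = bp_read_u32(buf + pos);
        if (elem_len < 4 || elem_len > len - pos)
            return -1;                   /* corrupt or truncated element */
        pos += elem_len;                 /* length includes its own 4 bytes */
        count++;
    }
    return count;
}
```

The same loop can hand each element to a decoder instead of only counting it.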
Tag values:
NULL_TAG = 0
GRP_TAG = 1
DST_TAG = 2
SCR_TAG = 3
DIR_TAG = 4
VAL_TAG = 5
DSTVAL_TAG = 6
DSTATRN_TAG = 7
DSTATRS_TAG = 8
GRPATRN_TAG = 9
GRPATRS_TAG = 10
Datatypes:
bp_char = 0
bp_short = 1
bp_int = 2
bp_long = 3
bp_longlong = 4
bp_float = 5
bp_double = 6
bp_longdouble = 7
bp_pointer = 8
bp_string = 9
bp_complex = 10
bp_uchar = 50
bp_ushort = 51
bp_uint = 52
bp_ulong = 53
bp_ulonglong = 54
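The two listings above map directly onto C enums. The sketch below only restates the values given here. The enum type names and the two helper functions are hypothetical, and the real definitions in the AIO sources may be organized differently. The helpers rely on one pattern visible in the listing: each unsigned type code is the corresponding non-unsigned code plus 50.

```c
/* Tag values, as listed above. */
enum bp_tag {
    NULL_TAG = 0, GRP_TAG = 1, DST_TAG = 2, SCR_TAG = 3, DIR_TAG = 4,
    VAL_TAG = 5, DSTVAL_TAG = 6, DSTATRN_TAG = 7, DSTATRS_TAG = 8,
    GRPATRN_TAG = 9, GRPATRS_TAG = 10
};

/* Datatype codes, as listed above. */
enum bp_datatype {
    bp_char = 0, bp_short = 1, bp_int = 2, bp_long = 3, bp_longlong = 4,
    bp_float = 5, bp_double = 6, bp_longdouble = 7, bp_pointer = 8,
    bp_string = 9, bp_complex = 10,
    bp_uchar = 50, bp_ushort = 51, bp_uint = 52, bp_ulong = 53,
    bp_ulonglong = 54
};

/* Offset between an unsigned code and its base code in the listing. */
#define BP_UNSIGNED_OFFSET 50

/* Nonzero for bp_uchar .. bp_ulonglong. (Hypothetical helper.) */
static int bp_is_unsigned(int type)
{
    return type >= bp_uchar && type <= bp_ulonglong;
}

/* Map bp_uchar .. bp_ulonglong to bp_char .. bp_longlong; every other
 * code is returned unchanged. (Hypothetical helper.) */
static int bp_base_type(int type)
{
    return bp_is_unsigned(type) ? type - BP_UNSIGNED_OFFSET : type;
}
```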
There are 4 types of elements: scalars, datasets, dataset attributes, and group attributes.
Scalars have the following format:
4 bytes: length of element
4 bytes: tag (SCR_TAG)
4 bytes: tag length (n)
(n) bytes: name string
4 bytes: tag (DIR_TAG)
4 bytes: tag length (m)
(m) bytes: path string
4 bytes: tag (VAL_TAG)
4 bytes: tag length (o)
4 bytes: datatype
(o) bytes: data binary
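The scalar layout above can be decoded with a small cursor that checks every read against the end of the element. This is a sketch under stated assumptions, not the code from br-utils.c. The struct and function names are hypothetical. It also assumes that 4-byte fields are unsigned and in host byte order, that strings are not necessarily NUL-terminated, and that the datatype field sits outside the (o) bytes counted by the tag length, exactly as the listing shows.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SCR_TAG = 3, DIR_TAG = 4, VAL_TAG = 5 };

/* Decoded view of a scalar element; pointers refer into the input
 * buffer and the strings are not NUL-terminated. (Hypothetical.) */
struct bp_scalar {
    const unsigned char *name; uint32_t name_len;   /* n */
    const unsigned char *path; uint32_t path_len;   /* m */
    const unsigned char *data; uint32_t data_len;   /* o */
    uint32_t datatype;
};

/* Read a 4-byte field and advance. ASSUMPTION: host byte order. */
static int take_u32(const unsigned char **p, const unsigned char *end,
                    uint32_t *out)
{
    if ((size_t)(end - *p) < 4)
        return -1;
    memcpy(out, *p, 4);
    *p += 4;
    return 0;
}

/* Record the start of an n-byte payload and advance past it. */
static int take_bytes(const unsigned char **p, const unsigned char *end,
                      uint32_t n, const unsigned char **out)
{
    if ((size_t)(end - *p) < n)
        return -1;
    *out = *p;
    *p += n;
    return 0;
}

/* Read a tag, require it to equal want, then read its length. */
static int take_tag(const unsigned char **p, const unsigned char *end,
                    uint32_t want, uint32_t *len)
{
    uint32_t tag;
    if (take_u32(p, end, &tag) != 0 || tag != want)
        return -1;
    return take_u32(p, end, len);
}

/* Decode one scalar element at buf. Returns 0 on success, -1 on error. */
int bp_parse_scalar(const unsigned char *buf, size_t avail,
                    struct bp_scalar *s)
{
    const unsigned char *p, *end;
    uint32_t elem_len;

    if (avail < 4)
        return -1;
    memcpy(&elem_len, buf, 4);
    if (elem_len < 4 || elem_len > avail)
        return -1;
    end = buf + elem_len;            /* length includes its own 4 bytes */
    p = buf + 4;

    if (take_tag(&p, end, SCR_TAG, &s->name_len) != 0 ||
        take_bytes(&p, end, s->name_len, &s->name) != 0)
        return -1;
    if (take_tag(&p, end, DIR_TAG, &s->path_len) != 0 ||
        take_bytes(&p, end, s->path_len, &s->path) != 0)
        return -1;
    if (take_tag(&p, end, VAL_TAG, &s->data_len) != 0 ||
        take_u32(&p, end, &s->datatype) != 0 ||
        take_bytes(&p, end, s->data_len, &s->data) != 0)
        return -1;
    return 0;
}
```

If the listing is complete, a scalar element occupies 32 + n + m + o bytes: the 4-byte length, three 8-byte tag headers, the 4-byte datatype, and the three payloads.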
Datasets have the following format:
4 bytes: length of element
4 bytes: tag (DST_TAG)
4 bytes: tag length (n)
(n) bytes: name string
4 bytes: tag (DIR_TAG)
4 bytes: tag length (m)
(m) bytes: path string
4 bytes: tag (DSTVAL_TAG)
4 bytes: size (p)
4 bytes: rank (q) [dimension count]
(3*q*4) bytes: dimensions [three 4-byte values per dimension]
4 bytes: datatype
(p) bytes: data binary
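The value part of a dataset differs from a scalar in that the rank and the dimension block sit between the size and the datatype. The sketch below decodes only that part, starting at the DSTVAL_TAG field. It is illustrative rather than the AIO code. The struct, the function name, and the BP_MAX_RANK limit are hypothetical, and fields are assumed to be unsigned and in host byte order. This description does not say what the three values per dimension mean or how they are ordered, so the sketch exposes them as raw 4-byte values without interpreting them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DSTVAL_TAG = 6 };

#define BP_MAX_RANK 8   /* hypothetical limit chosen for this sketch */

struct bp_dstval {
    uint32_t size;                     /* p: bytes of binary data */
    uint32_t rank;                     /* q: dimension count */
    uint32_t dims[3 * BP_MAX_RANK];    /* 3*q raw values, uninterpreted */
    uint32_t datatype;
    const unsigned char *data;         /* points into the input buffer */
};

/* Decode the DSTVAL_TAG portion of a dataset element starting at p.
 * Returns the number of bytes consumed, or 0 on error. */
static size_t bp_parse_dstval(const unsigned char *p, size_t avail,
                              struct bp_dstval *d)
{
    uint32_t hdr[3];                   /* tag, size (p), rank (q) */
    size_t dims_bytes, fixed;

    if (avail < sizeof hdr)
        return 0;
    memcpy(hdr, p, sizeof hdr);        /* ASSUMPTION: host byte order */
    if (hdr[0] != DSTVAL_TAG || hdr[2] > BP_MAX_RANK)
        return 0;
    d->size = hdr[1];
    d->rank = hdr[2];

    dims_bytes = (size_t)3 * d->rank * 4;      /* (3*q*4) bytes */
    fixed = sizeof hdr + dims_bytes + 4;       /* everything before data */
    if (avail < fixed || avail - fixed < d->size)
        return 0;

    memcpy(d->dims, p + sizeof hdr, dims_bytes);
    memcpy(&d->datatype, p + sizeof hdr + dims_bytes, 4);
    d->data = p + fixed;
    return fixed + d->size;
}
```

A caller would first consume the element length, the DST_TAG name, and the DIR_TAG path in the same way as for a scalar, then pass the remaining bytes to this function.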
Dataset attributes have the following format:
4 bytes: length of element
4 bytes: tag (DSTATRN_TAG or DSTATRS_TAG) [scalar or a string]
4 bytes: tag length (n)
(n) bytes: attribute name
4 bytes: tag (DIR_TAG)
4 bytes: tag length (m)
(m) bytes: path and dataset name
4 bytes: tag (VAL_TAG)
4 bytes: tag length (o)
4 bytes: datatype
(o) bytes: value
Group attributes have the following format:
4 bytes: length of element
4 bytes: tag (GRPATRN_TAG or GRPATRS_TAG) [scalar or a string]
4 bytes: tag length (n)
(n) bytes: attribute name
4 bytes: tag (DIR_TAG)
4 bytes: tag length (m)
(m) bytes: path
4 bytes: tag (VAL_TAG)
4 bytes: tag length (o)
4 bytes: datatype
(o) bytes: value
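All four layouts start with the element length followed by a tag that identifies the kind of element, so a reader can classify an element by looking only at its second 4-byte field. The following sketch shows that dispatch. The bp_kind enum and the function name are hypothetical, fields are assumed to be in host byte order, and the reading of the ...ATRN tags as scalar attributes and the ...ATRS tags as string attributes follows the order of the bracketed notes in the listings above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    DST_TAG = 2, SCR_TAG = 3,
    DSTATRN_TAG = 7, DSTATRS_TAG = 8,
    GRPATRN_TAG = 9, GRPATRS_TAG = 10
};

/* Hypothetical classification of the four element types. */
enum bp_kind {
    BP_KIND_UNKNOWN,
    BP_KIND_SCALAR,
    BP_KIND_DATASET,
    BP_KIND_DATASET_ATTR,
    BP_KIND_GROUP_ATTR
};

/* Classify the element starting at elem by its first tag. If is_string
 * is not NULL, it is set to 1 for string attributes and 0 otherwise. */
static enum bp_kind bp_element_kind(const unsigned char *elem, size_t avail,
                                    int *is_string)
{
    uint32_t tag;

    if (is_string)
        *is_string = 0;
    if (avail < 8)
        return BP_KIND_UNKNOWN;        /* need length + first tag */
    memcpy(&tag, elem + 4, sizeof tag);    /* ASSUMPTION: host byte order */

    switch (tag) {
    case SCR_TAG:
        return BP_KIND_SCALAR;
    case DST_TAG:
        return BP_KIND_DATASET;
    case DSTATRS_TAG:
        if (is_string)
            *is_string = 1;
        return BP_KIND_DATASET_ATTR;
    case DSTATRN_TAG:
        return BP_KIND_DATASET_ATTR;
    case GRPATRS_TAG:
        if (is_string)
            *is_string = 1;
        return BP_KIND_GROUP_ATTR;
    case GRPATRN_TAG:
        return BP_KIND_GROUP_ATTR;
    default:
        return BP_KIND_UNKNOWN;
    }
}
```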
Because each attribute carries the path (and, for dataset attributes, the name) of the item it is attached to, attributes can appear at any point in the series of elements. In practice, they seem to come immediately after the element they describe. Right now, the only attributes are dataset string attributes.