
HDF5 File Format Specification Version 2.0

  1. Introduction
  2. Disk Format Level 0 - File Metadata
    1. Disk Format Level 0A - Format Signature and Superblock
    2. Disk Format Level 0B - File Driver Info
    3. Disk Format Level 0C - Superblock Extension
  3. Disk Format Level 1 - File Infrastructure
    1. Disk Format Level 1A - B-Trees
      1. Version 1 B-Trees (B-link trees)
      2. Version 2 B-Trees
    2. Disk Format Level 1B - Group Symbol Table
    3. Disk Format Level 1C - Group Symbol Table Entry
    4. Disk Format Level 1D - Local Heaps
    5. Disk Format Level 1E - Global Heap
    6. Disk Format Level 1F - Fractal Heap
    7. Disk Format Level 1G - Free-Space Manager
    8. Disk Format Level 1H - Shared Object Header Message Table
  4. Disk Format Level 2 - Data Objects
    1. Disk Format Level 2A - Data Object Headers
      1. Disk Format Level 2A1 - Data Object Header Prefix
        1. Version 1 Data Object Header Prefix
        2. Version 2 Data Object Header Prefix
      2. Disk Format Level 2A2 - Data Object Header Messages
        1. NIL
        2. Dataspace
        3. Link Info
        4. Datatype
        5. Data Storage - Fill Value (Old)
        6. Data Storage - Fill Value
        7. Link Message
        8. Data Storage - External Data Files
        9. Data Storage - Layout
        10. Bogus
        11. Group Info
        12. Data Storage - Filter Pipeline
        13. Attribute
        14. Object Comment
        15. Object Modification Time (Old)
        16. Shared Message Table
        17. Object Header Continuation
        18. Symbol Table
        19. Object Modification Time
        20. B-tree 'K' Values
        21. Driver Info
        22. Attribute Info
        23. Object Reference Count
    2. Disk Format Level 2B - Data Object Data Storage
  5. Appendix


Introduction

 
Figure 1: Relationships among the HDF5 root group, other groups, and objects

Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces
 

The format of an HDF5 file on disk encompasses several key ideas of the HDF4 and AIO file formats as well as addressing some shortcomings therein. The new format is more self-describing than the HDF4 format and is more uniformly applied to data objects in the file.

An HDF5 file appears to the user as a directed graph. The nodes of this graph are the higher-level HDF5 objects that are exposed by the HDF5 APIs:

At the lowest level, as information is actually written to the disk, an HDF5 file is made up of the following objects:

The HDF5 library uses these low-level objects to represent the higher-level objects that are then presented to the user or to applications through the APIs. For instance, a group is an object header that contains a message that points to a local heap (for storing the links to objects in the group) and to a B-tree (which indexes the links). A dataset is an object header that contains messages that describe datatype, dataspace, layout, filters, external files, fill value, etc., with the layout message pointing to either a raw data chunk or to a B-tree that points to raw data chunks.

This Document

This document describes the lower-level data objects; the higher-level objects and their properties are described in the HDF5 User's Guide.

Three levels of information comprise the file format. Level 0 contains basic information for identifying and defining information about the file. Level 1 contains the information about the pieces of a file shared by many objects in the file (such as B-trees and heaps). Level 2 is the rest of the file and contains all of the data objects, with each object partitioned into header information, also known as metadata, and data.

The sizes of various fields in the following layout tables are determined by looking at the number of columns the field spans in the table. There are three exceptions: (1) The size may be overridden by specifying a size in parentheses, (2) the size of addresses is determined by the Size of Offsets field in the superblock and is indicated in this document with a superscripted 'O', and (3) the size of length fields is determined by the Size of Lengths field in the superblock and is indicated in this document with a superscripted 'L'.

Values for all fields in this document should be treated as unsigned integers, unless otherwise noted in the description of a field. Additionally, all metadata fields are stored in little-endian byte order.

All checksums used in the format are computed with the Jenkins' lookup3 algorithm.

Various tables in this document are aligned with entries that read "This space inserted only to align table nicely". These entries exist only to make the table presentation nicer and do not represent any values or padding in the file.



Disk Format: Level 0 - File Metadata

Disk Format: Level 0A - Format Signature and Superblock

The superblock may begin at certain predefined offsets within the HDF5 file, allowing a block of unspecified content for users to place additional information at the beginning (and end) of the HDF5 file without limiting the HDF5 library's ability to manage the objects within the file itself. This feature was designed to accommodate wrapping an HDF5 file in another file format or adding descriptive information to an HDF5 file without requiring the modification of the actual file's information. The superblock is located by searching for the HDF5 format signature at byte offset 0, byte offset 512, and at successive locations in the file, each double the previous location, i.e. 0, 512, 1024, 2048, etc.
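
The search described above can be sketched as follows. This is a minimal illustration in Python rather than the HDF5 library's implementation; the names `candidate_offsets` and `find_superblock` are hypothetical, and the signature bytes are the ones listed under Format Signature below.

```python
from typing import Iterator, Optional

# The 8-byte HDF5 format signature (hex 89 48 44 46 0d 0a 1a 0a).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def candidate_offsets(file_size: int) -> Iterator[int]:
    """Yield 0, 512, 1024, 2048, ... while a full signature still fits."""
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= file_size:
        yield offset
        offset = 512 if offset == 0 else offset * 2


def find_superblock(data: bytes) -> Optional[int]:
    """Return the offset of the first signature match, or None if absent."""
    for offset in candidate_offsets(len(data)):
        if data[offset:offset + len(HDF5_SIGNATURE)] == HDF5_SIGNATURE:
            return offset
    return None
```

Only the predefined offsets are probed, so a signature that happens to appear elsewhere in a user block is not mistaken for the superblock.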

The superblock is composed of the format signature, followed by a superblock version number and information that is specific to each version of the superblock. Currently, there are three versions of the superblock format. Version 0 is the default format, while version 1 is basically the same as version 0 with additional information when a non-default B-tree 'K' value is stored. Version 2 is the latest format, with some fields eliminated or compressed and with superblock extension and checksum support.

Versions 0 and 1 of the superblock are described below:

Superblock (Versions 0 and 1)

byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Version # of File Free-space Storage | Version # of Root Group Symbol Table Entry | Reserved (zero)
Version # of Shared Header Message Format | Size of Offsets | Size of Lengths | Reserved (zero)
Group Leaf Node K (2 bytes) | Group Internal Node K (2 bytes)
File Consistency Flags (4 bytes)
Indexed Storage Internal Node K^1 (2 bytes) | Reserved (zero)^1 (2 bytes)
Base Address^O
Address of File Free-space Info^O
End of File Address^O
Driver Information Block Address^O
Root Group Symbol Table Entry

(Items marked with an 'O' in the above table are of the size specified in "Size of Offsets.")
(Items marked with a '1' in the above table are new in version 1 of the superblock.)

Field Name Description
Format Signature

This field contains a constant value and can be used to quickly identify a file as being an HDF5 file. The constant value is designed to allow easy identification of an HDF5 file and to allow certain types of data corruption to be detected. The file signature of an HDF5 file always contains the following values:

Decimal: 137 72 68 70 13 10 26 10
Hexadecimal: 89 48 44 46 0d 0a 1a 0a
ASCII C Notation: \211 H D F \r \n \032 \n

This signature both identifies the file as an HDF5 file and provides for immediate detection of common file-transfer problems. The first two bytes distinguish HDF5 files on systems that expect the first two bytes to identify the file type uniquely. The first byte is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as an HDF5 file; also, it catches bad file transfers that clear bit 7. Bytes two through four name the format. The CR-LF sequence catches bad file transfers that alter newline sequences. The control-Z character stops file display under MS-DOS. The final line feed checks for the inverse of the CR-LF translation problem. (This is a direct descendant of the PNG file signature.)

This field is present in version 0+ of the superblock.

Version Number of the Superblock

This value is used to determine the format of the information in the superblock. When the format of the information in the superblock is changed, the version number is incremented to the next integer and can be used to determine how the information in the superblock is formatted.

Values of 0, 1 and 2 are defined for this field. (The format of version 2 is described in its own table further below.)

This field is present in version 0+ of the superblock.

Version Number of the File Free-Space Information

This value is used to determine the format of the file's free-space information.

The only value currently valid in this field is '0', which indicates that the free space index is formatted as described below.

This field is present in version 0 and 1 of the superblock.

Version Number of the Root Group Symbol Table Entry

This value is used to determine the format of the information in the Root Group Symbol Table Entry. When the format of the information in that field is changed, the version number is incremented to the next integer and can be used to determine how the information in the field is formatted.

The only value currently valid in this field is '0', which indicates that the root group symbol table entry is formatted as described below.

This field is present in version 0 and 1 of the superblock.

Version Number of the Shared Header Message Format

This value is used to determine the format of the information in a shared object header message. Since the format of the shared header messages differs from the other private header messages, a version number is used to identify changes in the format.

The only value currently valid in this field is '0', which indicates that shared header messages are formatted as described below.

This field is present in version 0 and 1 of the superblock.

Size of Offsets

This value contains the number of bytes used to store addresses in the file. The values for the addresses of objects in the file are offsets relative to a base address, usually the address of the superblock signature. This allows a wrapper to be added after the file is created without invalidating the internal offset locations.

This field is present in version 0+ of the superblock.

Size of Lengths

This value contains the number of bytes used to store the size of an object.

This field is present in version 0+ of the superblock.

Group Leaf Node K

Each leaf node of a group B-tree will have at least this many entries but not more than twice this many. If a group has a single leaf node then it may have fewer entries.

This value must be greater than zero.

See the description of B-trees below.

This field is present in version 0 and 1 of the superblock.

Group Internal Node K

Each internal node of a group B-tree will have at least this many entries but not more than twice this many. If the group has only one internal node then it might have fewer entries.

This value must be greater than zero.

See the description of B-trees below.

This field is present in version 0 and 1 of the superblock.

File Consistency Flags

This value contains flags to indicate information about the consistency of the information contained within the file. Currently, the following bit flags are defined:

  • Bit 0 set indicates that the file is opened for write-access.
  • Bit 1 set indicates that the file has been verified for consistency and is guaranteed to be consistent with the format defined in this document.
  • Bits 2-31 are reserved for future use.
Bit 0 should be set as the first action when a file is opened for write access and should be cleared only as the final action when closing a file. Bit 1 should be cleared during normal access to a file and only set after the file's consistency is guaranteed by the library or a consistency utility.

This field is present in version 0+ of the superblock.

Indexed Storage Internal Node K

Each internal node of an indexed storage B-tree will have at least this many entries but not more than twice this many. If the indexed storage B-tree has only one internal node then it might have fewer entries.

This value must be greater than zero.

See the description of B-trees below.

This field is present in version 1 of the superblock.

Base Address

This is the absolute file address of the first byte of the HDF5 data within the file. The library currently constrains this value to be the absolute file address of the superblock itself when creating new files; future versions of the library may provide greater flexibility. If, when opening an existing file, this address does not match the offset of the superblock, the library assumes that the entire contents of the HDF5 file have been moved within the file and adjusts the base address and end of file address to reflect their new positions. Unless otherwise noted, all other file addresses are relative to this base address.

This field is present in version 0+ of the superblock.

Address of File Free-space Info

Free-space management is not yet defined in the HDF5 file format and is not handled by the library. Currently this field always contains the undefined address.

This field is present in version 0 and 1 of the superblock.

End of File Address

This is the absolute file address of the first byte past the end of all HDF5 data. It is used to determine whether a file has been accidentally truncated and as an address where file data allocation can occur if space from the free list is not used.

This field is present in version 0+ of the superblock.

Driver Information Block Address

This is the relative file address of the file driver information block which contains driver-specific information needed to reopen the file. If there is no driver information block then this entry should be the undefined address.

This field is present in version 0 and 1 of the superblock.

Root Group Symbol Table Entry

This is the symbol table entry of the root group, which serves as the entry point into the group graph for the file.

This field is present in version 0 and 1 of the superblock.
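
As an illustration of the layout above, the following Python sketch decodes the fields of a version 0 or 1 superblock that precede the Root Group Symbol Table Entry. The function and dictionary key names are hypothetical, and the sketch assumes the buffer starts at the format signature and that all fields are little-endian, as stated in the Introduction.

```python
import struct

SIGNATURE = b"\x89HDF\r\n\x1a\n"
ADDRESS_FIELDS = ("base_address", "free_space_address",
                  "eof_address", "driver_info_address")


def parse_superblock_v0_v1(buf: bytes) -> dict:
    """Decode the fields that precede the Root Group Symbol Table Entry."""
    if buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    # Eight one-byte fields follow the signature; two of them are reserved.
    (version, free_space_version, root_entry_version, _,
     shared_header_version, size_of_offsets, size_of_lengths, _,
     ) = struct.unpack_from("<8B", buf, 8)
    if version not in (0, 1):
        raise ValueError("not a version 0 or 1 superblock")
    # Two 2-byte K values, then the 4-byte File Consistency Flags.
    group_leaf_k, group_internal_k, flags = struct.unpack_from("<HHI", buf, 16)
    fields = {
        "version": version,
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
        "group_leaf_k": group_leaf_k,
        "group_internal_k": group_internal_k,
        "open_for_write": bool(flags & 0x1),       # bit 0
        "verified_consistent": bool(flags & 0x2),  # bit 1
    }
    pos = 24
    if version == 1:
        # Version 1 adds Indexed Storage Internal Node K and 2 reserved bytes.
        (fields["indexed_storage_internal_k"],) = struct.unpack_from("<H", buf, pos)
        pos += 4
    for name in ADDRESS_FIELDS:
        fields[name] = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        pos += size_of_offsets
    fields["root_entry_offset"] = pos  # where the symbol table entry begins
    return fields
```

The address fields are read with `int.from_bytes` rather than a fixed `struct` format because their width is only known after Size of Offsets has been decoded.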

Version 2 of the superblock is described below:

Superblock (Version 2)

byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Size of Offsets | Size of Lengths | File Consistency Flags
Base Address^O
Superblock Extension Address^O
End of File Address^O
Root Group Object Header Address^O
Superblock Checksum (4 bytes)

(Items marked with an 'O' in the above table are of the size specified in "Size of Offsets.")

Field Name Description
Format Signature

This field is the same as described for versions 0 and 1 of the superblock.

Version Number of the Superblock

This field has a value of 2 and has the same meaning as for versions 0 and 1.

Size of Offsets

This field is the same as described for versions 0 and 1 of the superblock.

Size of Lengths

This field is the same as described for versions 0 and 1 of the superblock.

File Consistency Flags

This field is the same as described for versions 0 and 1 except that it is smaller (the number of reserved bits has been reduced from 30 to 6).

Base Address

This field is the same as described for versions 0 and 1 of the superblock.

Superblock Extension Address

This field is the address of the object header for the superblock extension. If there is no extension then this entry should be the undefined address.

End of File Address

This field is the same as described for versions 0 and 1 of the superblock.

Root Group Object Header Address

This is the address of the root group object header, which serves as the entry point into the group graph for the file.

Superblock Checksum

The checksum for the superblock.
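
The version 2 layout is compact enough to decode in a few lines. The sketch below is an assumption-laden illustration, not library code: `parse_superblock_v2` and its key names are hypothetical, the buffer is assumed to start at the format signature, and the stored checksum is returned without being verified because the lookup3 algorithm is not part of the Python standard library.

```python
SIGNATURE = b"\x89HDF\r\n\x1a\n"
ADDRESS_FIELDS = ("base_address", "extension_address",
                  "eof_address", "root_group_address")


def parse_superblock_v2(buf: bytes) -> dict:
    """Decode a version 2 superblock; the checksum is read but not checked."""
    if len(buf) < 12 or buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    # Four one-byte fields follow the signature.
    version, size_of_offsets, size_of_lengths, flags = buf[8:12]
    if version != 2:
        raise ValueError("not a version 2 superblock")
    end = 12 + len(ADDRESS_FIELDS) * size_of_offsets + 4
    if len(buf) < end:
        raise ValueError("superblock is truncated")
    fields = {
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
        "flags": flags,
    }
    pos = 12
    for name in ADDRESS_FIELDS:
        # Each address occupies Size of Offsets bytes, little-endian.
        fields[name] = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        pos += size_of_offsets
    fields["checksum"] = int.from_bytes(buf[pos:pos + 4], "little")
    return fields
```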


Disk Format: Level 0B - File Driver Info

The driver information block is an optional region of the file which contains information needed by the file driver to reopen a file. The format is described below:

Driver Information Block

byte | byte | byte | byte
Version | Reserved (3 bytes)
Driver Information Size (4 bytes)
Driver Identification (8 bytes)
Driver Information (variable size)



Field Name Description
Version

The version number of the Driver Information Block. This document describes version 0.

Driver Information Size

The size in bytes of the Driver Information field.

Driver Identification

This is an eight-byte ASCII string without null termination which identifies the driver and/or version number of the Driver Information Block. The predefined driver encoded in this field by the HDF5 library is identified by the letters NCSA followed by the first four characters of the driver name. If the Driver Information block is not the original version then the last letter(s) of the identification will be replaced by a version number in ASCII, starting with 0.

Identification for user-defined drivers is also eight bytes long. It can be arbitrary but should be unique and should avoid the four-character prefix "NCSA".

Driver Information

Driver information is stored in a format defined by the file driver (see description below).
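
A reader for the block described above might look like the following sketch. It assumes the buffer begins at the first byte of the Driver Information Block and that the size field is little-endian; the function name and the returned keys are hypothetical.

```python
import struct


def parse_driver_info_block(buf: bytes) -> dict:
    """Split a Driver Information Block into its four fields."""
    # Version (1 byte), 3 reserved bytes, Driver Information Size (4 bytes).
    version, info_size = struct.unpack_from("<B3xI", buf, 0)
    # Eight ASCII characters without null termination.
    driver_id = buf[8:16].decode("ascii")
    info = buf[16:16 + info_size]
    if len(info) != info_size:
        raise ValueError("driver information is truncated")
    return {
        "version": version,
        "driver_id": driver_id,
        # Identifiers of library-predefined drivers start with "NCSA".
        "predefined": driver_id.startswith("NCSA"),
        "info": info,  # layout is defined by the driver itself
    }
```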

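The two drivers encoded in the Driver Identification field are the multi driver and the family driver. Following the naming rule above, their identifiers are "NCSAmult" and "NCSAfami" respectively.
The format of the Driver Information field for each of these two drivers is described below: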

Multi Driver Information

byte | byte | byte | byte
Member Mapping | Member Mapping | Member Mapping | Member Mapping
Member Mapping | Member Mapping | Reserved | Reserved
Address of Member File 1
End of Address for Member File 1
Address of Member File 2
End of Address for Member File 2
...
Address of Member File N
End of Address for Member File N
Name of Member File 1 (variable size)
Name of Member File 2 (variable size)
...
Name of Member File N (variable size)


Field Name Description
Member Mapping

These fields are integer values from 1 to 6 indicating how the data can be mapped to or merged with another type of data.

Member Mapping | Description
1 | The superblock data.
2 | The B-tree data.
3 | The raw data.
4 | The global heap data.
5 | The local heap data.
6 | The object header data.

For example, if the third field has the value 3 and all the rest have the value 1, it means there are two files: one for raw data, and one for superblock, B-tree, global heap, local heap, and object header.

Reserved

These fields are reserved and should always be zero.

Address of Member File N

This field specifies the virtual address at which the member file starts.

N is the number of member files.

End of Address for Member File N

This field is the end of the allocated address for the member file.

Name of Member File N

This field is the null-terminated name of the member file, and its length should be a multiple of 8 bytes. Additional bytes will be padded with NULLs. The default naming convention is %s-X.h5, where X is one of the letters s (for superblock), b (for B-tree), r (for raw data), g (for global heap), l (for local heap), and o (for object header). The name of the whole HDF5 file is substituted for the %s in the string.
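
The Member Mapping example and the naming convention can be sketched in Python as follows. This sketch assumes that the six Member Mapping fields correspond, in order, to the six data types listed in the mapping table, which is how the example in the Member Mapping description reads; all function names are hypothetical.

```python
# Data types in the order of the Member Mapping table (values 1 to 6).
USAGE_TYPES = ("superblock", "B-tree", "raw data",
               "global heap", "local heap", "object header")
# Default file-name letters: s, b, r, g, l, o.
DEFAULT_SUFFIX = dict(zip(USAGE_TYPES, "sbrglo"))


def group_by_member(mapping):
    """Group data types by the member (1 to 6) each one is mapped to."""
    members = {}
    for usage, target in zip(USAGE_TYPES, mapping):
        members.setdefault(target, []).append(usage)
    return members


def default_member_name(file_name: str, usage: str) -> str:
    """Apply the default %s-X.h5 naming convention."""
    return "%s-%s.h5" % (file_name, DEFAULT_SUFFIX[usage])


def padded_name(name: str) -> bytes:
    """Null-terminate a name and pad it with NULs to a multiple of 8 bytes."""
    raw = name.encode("ascii") + b"\0"
    return raw + b"\0" * (-len(raw) % 8)
```

With the mapping `[1, 1, 3, 1, 1, 1]`, `group_by_member` yields two groups, one holding only the raw data and one holding the other five data types, matching the two-file example in the Member Mapping description.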


Family Driver Information
byte byte byte byte

Size of member file


Field Name Description
Size of member file

This field is the size of the member file in the family of files.


Disk Format: Level 0C - Superblock Extension

The superblock extension is used to store superblock metadata which is either optional, or added after the version of the superblock was defined. Superblock extensions may only exist when version 2+ of the superblock is used. A superblock extension is an object header which may hold the following messages:



Disk Format: Level 1 - File Infrastructure

Disk Format: Level 1A - B-Trees and B-Tree Nodes

B-Trees allow flexible storage for objects which tend to grow in ways that cause the object to be stored discontiguously. B-trees are described in various algorithms books including "Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. B-trees are used in several places in the HDF5 file format, when an index is needed for another data structure.

The version 1 B-tree structure described below is the original index structure, but it is limited by some bugs in our implementation (mainly in how it handles deleting records). The version 1 B-trees are being phased out in favor of the version 2 B-trees described below, although both types of structures may be found in the same file, depending on application settings when creating the file.

Disk Format: Level 1A1 - Version 1 B-trees

Version 1 B-trees in HDF5 files are an implementation of the B-link tree, in which the sibling nodes at a particular level in the tree are stored in a doubly-linked list. The B-link tree is described in the paper "Efficient Locking for Concurrent Operations on B-trees" by Phillip Lehman and S. Bing Yao, published in the ACM Transactions on Database Systems, Vol. 6, No. 4, December 1981.

The B-link trees implemented by the file format contain one more key than the number of children. In other words, each child pointer out of a B-tree node has a left key and a right key. The pointers out of internal nodes point to sub-trees while the pointers out of leaf nodes point to symbol nodes and raw data chunks. Aside from that difference, internal nodes and leaf nodes are identical.

B-link Tree Nodes

byte | byte | byte | byte
Signature (4 bytes)
Node Type | Node Level | Entries Used (2 bytes)
Address of Left Sibling^O
Address of Right Sibling^O
Key 0 (variable size)
Address of Child 0^O
Key 1 (variable size)
Address of Child 1^O
...
Key 2K (variable size)
Address of Child 2K^O
Key 2K+1 (variable size)

(Items marked with an 'O' in the above table are of the size specified in "Size of Offsets.")

Field Name Description
Signature

The ASCII character string "TREE" is used to indicate the beginning of a B-link tree node. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

Node Type

Each B-link tree points to a particular type of data. This field indicates the type of data as well as implying the maximum degree K of the tree and the size of each Key field.

Node Type Description
0 This tree points to group nodes.
1 This tree points to raw data chunk nodes.
Node Level

The node level indicates the level at which this node appears in the tree (leaf nodes are at level zero). Not only does the level indicate whether child pointers point to sub-trees or to data, but it can also be used to help file consistency checking utilities reconstruct damaged trees.

Entries Used

This determines the number of children to which this node points. All nodes of a particular type of tree have the same maximum degree, but most nodes will point to fewer than that number of children. The valid child pointers and keys appear at the beginning of the node and the unused pointers and keys appear at the end of the node. The unused pointers and keys have undefined values.

Address of Left Sibling

This is the relative file address of the left sibling of the current node. If the current node is the left-most node at this level then this field is the undefined address.

Address of Right Sibling

This is the relative file address of the right sibling of the current node. If the current node is the right-most node at this level then this field is the undefined address.

Keys and Child Pointers

Each node has room for 2K+1 keys with 2K child pointers interleaved between the keys. The number of keys and child pointers actually containing valid values is determined by the node's Entries Used field. If that field is N then the node contains N child pointers and N+1 keys.

Key

The format and size of the key values is determined by the type of data to which this tree points. The keys are ordered and are boundaries for the contents of the child pointer; that is, the key values represented by child N fall between Key N and Key N+1. Whether the interval is open or closed on each end is determined by the type of data to which the tree points.

The format of the key depends on the node type. For nodes of node type 0 (group nodes), the key is formatted as follows:

A single field of Size of Lengths bytes: Indicates the byte offset into the local heap for the first object name in the subtree which that key describes.

For nodes of node type 1 (chunked raw data nodes), the key is formatted as follows:

Bytes 1-4: Size of chunk in bytes.
Bytes 5-8: Filter mask, a 32-bit bitfield indicating which filters have been skipped for this chunk. Each filter has an index number in the pipeline (starting at 0, with the first filter to apply) and if that filter is skipped, the bit corresponding to its index is set.
N 64-bit fields: A 64-bit index indicating the offset of the chunk within the dataset where N is the number of dimensions of the dataset. For example, if a chunk in a 3-dimensional dataset begins at the position [5,5,5], there will be three such 64-bit indices, each with the value of 5.
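
The chunk key layout just described can be decoded with a short Python sketch. It follows the text as written (a 4-byte size, a 4-byte filter mask, then one 64-bit offset per dataset dimension) and assumes little-endian storage; `parse_chunk_key` and its key names are hypothetical.

```python
import struct


def parse_chunk_key(buf: bytes, ndims: int, offset: int = 0) -> dict:
    """Decode one node type 1 key starting at `offset` in `buf`."""
    # Bytes 1-4: chunk size; bytes 5-8: filter mask.
    chunk_size, filter_mask = struct.unpack_from("<II", buf, offset)
    # One unsigned 64-bit offset per dimension of the dataset.
    chunk_offset = struct.unpack_from("<%dQ" % ndims, buf, offset + 8)
    # A set bit i means pipeline filter i was skipped for this chunk.
    skipped = [i for i in range(32) if (filter_mask >> i) & 1]
    return {
        "chunk_size": chunk_size,
        "skipped_filters": skipped,
        "chunk_offset": chunk_offset,
    }
```

For the 3-dimensional example in the text, the returned `chunk_offset` would be the tuple `(5, 5, 5)`.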

Child Pointer

The tree node contains file addresses of subtrees or data depending on the node level. Nodes at Level 0 point to data addresses, either raw data chunks or group nodes. Nodes at non-zero levels point to other nodes of the same B-tree.

For raw data chunk nodes, the child pointer is the address of a single raw data chunk. For group nodes, the child pointer points to a symbol table, which contains information for multiple symbol table entries.
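
Putting the node layout and the group key format together, a node of type 0 can be decoded as sketched below. This is an illustration under stated assumptions, not the library's reader: the function name is hypothetical, the buffer is assumed to start at the "TREE" signature, and only the entries counted by Entries Used are read because the remaining slots have undefined values.

```python
def parse_group_btree_node(buf: bytes, size_of_offsets: int,
                           size_of_lengths: int) -> dict:
    """Decode a version 1 B-tree node of type 0 (group nodes)."""
    if buf[:4] != b"TREE":
        raise ValueError("missing B-link tree node signature")
    node_type, node_level = buf[4], buf[5]
    if node_type != 0:
        raise ValueError("this sketch only handles group nodes (type 0)")
    entries_used = int.from_bytes(buf[6:8], "little")
    pos = 8

    def take(width: int) -> int:
        """Read one little-endian unsigned integer and advance."""
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + width], "little")
        pos += width
        return value

    left_sibling = take(size_of_offsets)
    right_sibling = take(size_of_offsets)
    keys, children = [], []
    for _ in range(entries_used):
        keys.append(take(size_of_lengths))      # offset into the local heap
        children.append(take(size_of_offsets))  # address of child
    keys.append(take(size_of_lengths))          # N children need N+1 keys
    return {"level": node_level, "left": left_sibling, "right": right_sibling,
            "keys": keys, "children": children}
```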

Conceptually, each B-tree node looks like this:

key[0]  child[0]  key[1]  child[1]  key[2]  ...  ...  key[N-1]  child[N-1]  key[N]

where child[i] is a pointer to a sub-tree (at a level above Level 0) or to data (at Level 0). Each key[i] describes an item stored by the B-tree (a chunk or an object of a group node). The range of values represented by child[i] is indicated by key[i] and key[i+1].

The following question must next be answered: "Is the value described by key[i] contained in child[i-1] or in child[i]?" The answer depends on the type of tree. In trees for groups (node type 0) the object described by key[i] is the greatest object contained in child[i-1] while in chunk trees (node type 1) the chunk described by key[i] is the least chunk in child[i].

That means that key[0] for group trees is sometimes unused; it points to offset zero in the heap, which is always the empty string and compares as "less-than" any valid object name.

And key[N] for chunk trees is sometimes unused; it contains a chunk offset which compares as "greater-than" any other chunk offset and has a chunk byte size of zero to indicate that it is not actually allocated.
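
The two interval conventions can be made concrete with a small search routine. The sketch below is a simplification: it assumes the keys of one node are already available as directly comparable Python values (object names for group trees, offset tuples for chunk trees), whereas in the file a group key is a heap offset that must first be resolved to a name. The function name is hypothetical.

```python
from bisect import bisect_left, bisect_right


def child_index(keys, target, node_type):
    """Return the index of the child that may hold `target`, or None.

    `keys` holds N+1 sorted keys for a node with N children.
    """
    if node_type == 0:
        # Group trees: key[i] is the greatest object in child[i-1],
        # so child[i] covers key[i] < target <= key[i+1].
        i = bisect_left(keys, target) - 1
    elif node_type == 1:
        # Chunk trees: key[i] is the least chunk in child[i],
        # so child[i] covers key[i] <= target < key[i+1].
        i = bisect_right(keys, target) - 1
    else:
        raise ValueError("unknown node type")
    return i if 0 <= i < len(keys) - 1 else None
```

Note how a target equal to a boundary key lands in the left child for group trees and in the right child for chunk trees.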

Disk Format: Level 1A2 - Version 2 B-Trees

Version 2 B-trees are "traditional" B-trees, with one major difference. Instead of just using a simple pointer (or address in the file) to a child of an internal node, the pointer to the child node contains two additional pieces of information: the number of records in the child node itself, and the total number of records in the child node and all its descendants. Storing this additional information allows fast array-like indexing to locate the nth record in the B-tree.

The entry into a version 2 B-tree is a header which contains global information about the structure of the B-tree. The root node address field in the header points to the B-tree root node, which is either an internal or leaf node, depending on the value in the header's depth field. An internal node consists of records plus pointers to further leaf or internal nodes in the tree. A leaf node consists solely of records. The format of the records depends on the B-tree type (stored in the header).
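
To show why the per-child record totals enable array-like indexing, here is a small in-memory sketch. It is not the on-disk format: the `Node` class is hypothetical, and its `total` attribute stands in for the total record count that the file format stores alongside each child pointer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    records: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf
    total: int = 0  # records in this node and all of its descendants

    def __post_init__(self):
        self.total = len(self.records) + sum(c.total for c in self.children)


def nth_record(node: Node, n: int) -> int:
    """Return the record at in-order position n (0-based)."""
    if not 0 <= n < node.total:
        raise IndexError("record index out of range")
    if not node.children:
        return node.records[n]
    for i, child in enumerate(node.children):
        if n < child.total:
            return nth_record(child, n)  # the record lies in this subtree
        n -= child.total                 # skip the whole subtree at once
        if n == 0:
            return node.records[i]       # the record separating two children
        n -= 1
    raise AssertionError("inconsistent record totals")
```

Each level of the tree is visited once, and whole subtrees are skipped using only their stored totals, so no leaf other than the target one needs to be read.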

Version 2 B-tree Header

byte | byte | byte | byte
Signature (4 bytes)
Version | Type | This space inserted only to align table nicely
Node Size (4 bytes)
Record Size (2 bytes) | Depth (2 bytes)
Split Percent | Merge Percent | This space inserted only to align table nicely
Root Node Address^O
Number of Records in Root Node (2 bytes) | This space inserted only to align table nicely
Total Number of Records in B-tree^L
Checksum (4 bytes)

Field Name Description
Signature

The ASCII character string "BTHD" is used to indicate the header of a version 2 B-tree.

Version

The version number for this B-tree header. This document describes version 0.

Type

This field indicates the type of B-tree:
Value Description
0 A "testing" B-tree. This value should not be used for storing records in actual HDF5 files.
1 This B-tree is used for indexing indirectly accessed, non-filtered 'huge' fractal heap objects.
2 This B-tree is used for indexing indirectly accessed, filtered 'huge' fractal heap objects.
3 This B-tree is used for indexing directly accessed, non-filtered 'huge' fractal heap objects.
4 This B-tree is used for indexing directly accessed, filtered 'huge' fractal heap objects.
5 This B-tree is used for indexing the 'name' field for links in indexed groups.
6 This B-tree is used for indexing the 'creation order' field for links in indexed groups.
7 This B-tree is used for indexing shared object header messages.
8 This B-tree is used for indexing the 'name' field for indexed attributes.
9 This B-tree is used for indexing the 'creation order' field for indexed attributes.

The format of records for each type is described below.

Node Size

This is the size in bytes of all B-tree nodes.

Record Size

This field is the size in bytes of the B-tree record.

Depth

This is the depth of the B-tree.

Split Percent

The percentage full that a node must exceed before it is split.

Merge Percent

The percentage full that a node must drop below before it is merged.

Root Node Address

This is the address of the root B-tree node. A B-tree with no records will have the undefined address in this field.

Number of Records in Root Node

This is the number of records in the root node.

Total Number of Records in B-tree

This is the total number of records in the entire B-tree.

Checksum

This is the checksum for the B-tree header.
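
The header fields can be read in order as sketched below. The sketch assumes the fields are packed back to back (the alignment entries in the table occupy no space in the file), that values are little-endian, and that the buffer starts at the "BTHD" signature. The function name and keys are hypothetical, and the checksum is returned without verification.

```python
import struct


def parse_btree_v2_header(buf: bytes, size_of_offsets: int,
                          size_of_lengths: int) -> dict:
    """Decode a version 2 B-tree header."""
    # Signature, Version, Type, Node Size, Record Size, Depth,
    # Split Percent, Merge Percent: 16 bytes in total.
    (signature, version, tree_type, node_size, record_size, depth,
     split_percent, merge_percent) = struct.unpack_from("<4sBBIHHBB", buf, 0)
    if signature != b"BTHD":
        raise ValueError("missing version 2 B-tree header signature")
    pos = 16
    root_address = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    (root_records,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    total_records = int.from_bytes(buf[pos:pos + size_of_lengths], "little")
    pos += size_of_lengths
    (checksum,) = struct.unpack_from("<I", buf, pos)
    return {
        "version": version, "type": tree_type, "node_size": node_size,
        "record_size": record_size, "depth": depth,
        "split_percent": split_percent, "merge_percent": merge_percent,
        "root_address": root_address, "root_records": root_records,
        "total_records": total_records, "checksum": checksum,
    }
```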


Version 2 B-tree Internal Node

byte | byte | byte | byte
Signature (4 bytes)
Version | Type | Records 0, 1, 2...N-1 (variable size)
Child Node Pointer 0^O
Number of Records N_0 for Child Node 0 (variable size)
Total Number of Records for Child Node 0 (optional, variable size)
Child Node Pointer 1^O
Number of Records N_1 for Child Node 1 (variable size)
Total Number of Records for Child Node 1 (optional, variable size)
...
Child Node Pointer N^O
Number of Records N_n for Child Node N (variable size)
Total Number of Records for Child Node N (optional, variable size)
Checksum (4 bytes)

Field Name Description
Signature

The ASCII character string "BTIN" is used to indicate the internal node of a version 2 B-tree.

Version

The version number for this B-tree internal node. This document describes version 0.

Type

This field is the type of the B-tree node. It should always be the same as the B-tree type in the header.

Records

The size of this field is determined by the number of records for this node and the record size (from the header). The format of records depends on the type of B-tree.

Child Node Pointer

This field is the address of the child node pointed to by the internal node.

Number of Records in Child Node

This is the number of records in the child node pointed to by the corresponding Node Pointer.

The number of bytes used to store this field is determined by the maximum possible number of records able to be stored in the child node.

The maximum number of records in a child node is computed in the following way: Subtract the fixed-size overhead for the child node (e.g. its signature, version, and checksum, plus one pointer triplet of information for the child node, because there is one more pointer triplet than records in each internal node) from the size of nodes for the B-tree, then divide that result by the size of a record plus the size of the pointer triplet of information stored to reach each child node from this node.

Note that leaf nodes don't encode any child pointer triplets, so the maximum number of records in a leaf node is just the node size minus the leaf node overhead, divided by the record size.

Also note that the first level of internal nodes above the leaf nodes doesn't encode the Total Number of Records in Child Node value in its child pointer triplets (since it is the same as the Number of Records in Child Node), so the maximum number of records in these nodes is computed with the equation above, but using (Child Pointer, Number of Records in Child Node) pairs instead of triplets.

The number of bytes used to encode this field is the least number of bytes required to encode the maximum number of records in a child node value for the child nodes below this level in the B-tree.

For example, if the maximum number of child records is 123, one byte will be used to encode these values in this node; if the maximum number of child records is 20000, two bytes will be used; and so on. The maximum number of bytes used to encode these values is 8 (i.e. an unsigned 64-bit integer).

Total Number of Records in Child Node

This is the total number of records for the node pointed to by the corresponding Node Pointer and all its children. This field exists only in nodes whose depth in the B-tree is greater than 1 (i.e. the "twig" internal nodes, just above leaf nodes, don't store this field in their child node pointers).

The number of bytes used to store this field is determined by the maximum possible number of records able to be stored in the child node and its descendants.

The maximum possible number of records able to be stored in a child node and its descendants is computed iteratively, in the following way: The maximum number of records in a leaf node is computed, then that value is used to compute the maximum possible number of records in the first level of internal nodes above the leaf nodes. Multiplying these two values together determines the maximum possible number of records in child node pointers for the level of nodes two levels above leaf nodes. This process is continued up to any level in the B-tree.

The number of bytes used to encode this value is computed in the same way as for the Number of Records in Child Node field.

Checksum

This is the checksum for this node.
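
The sizing rules for the Number of Records in Child Node and Total Number of Records in Child Node fields can be sketched as follows. This is only an illustration under stated assumptions: the 10-byte node overhead (4-byte signature, 1-byte version, 1-byte type, 4-byte checksum) is read off the layout above, the function names are hypothetical, and the last function follows the multiplication described in the text literally rather than reproducing any particular library's bookkeeping.

```python
# Assumed fixed per-node overhead: signature (4) + version (1) + type (1) + checksum (4).
NODE_OVERHEAD = 4 + 1 + 1 + 4


def bytes_to_encode(max_value):
    """Least number of whole bytes that can hold max_value (1 to 8)."""
    nbytes = max(1, (max_value.bit_length() + 7) // 8)
    if nbytes > 8:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    return nbytes


def max_leaf_records(node_size, record_size):
    """Leaf nodes hold only records, so no pointer overhead is subtracted."""
    return (node_size - NODE_OVERHEAD) // record_size


def max_twig_records(node_size, record_size, size_of_offsets):
    """Internal nodes just above the leaves use (pointer, count) pairs."""
    count_bytes = bytes_to_encode(max_leaf_records(node_size, record_size))
    pair = size_of_offsets + count_bytes
    # One extra pair is part of the overhead: there is one more child
    # pointer than there are records in an internal node.
    return (node_size - NODE_OVERHEAD - pair) // (record_size + pair)


def max_total_records_in_twig_subtree(node_size, record_size, size_of_offsets):
    """Literal reading of the text: leaf maximum times twig maximum."""
    return (max_leaf_records(node_size, record_size)
            * max_twig_records(node_size, record_size, size_of_offsets))
```

For instance, with a hypothetical 512-byte node, 11-byte records, and 8-byte offsets, a leaf holds at most 45 records, so one byte suffices for the Number of Records in Child Node field stored in the nodes just above the leaves.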


Version 2 B-tree Leaf Node
byte byte byte byte
Signature
Version Type Record 0, 1, 2...N-1 (variable size)
Checksum

Field Name Description
Signature

The ASCII character string "BTLF" is used to indicate the leaf node of a version 2 B-link tree.

Version

The version number for this B-tree leaf node. This document describes version 0.

Type

This field is the type of the B-tree node. It should always be the same as the B-tree type in the header.

Records

The size of this field is determined by the number of records for this node and the record size (from the header). The format of records depends on the type of B-tree.

Checksum

This is the checksum for this node.

The record layout for each stored (i.e. non-testing) B-tree type is as follows:

Version 2 B-tree, Type 1 Record Layout - Indirectly Accessed, Non-Filtered, 'Huge' Fractal Heap Objects
byte byte byte byte

Huge Object Address (O)


Huge Object Length (L)


Huge Object ID (L)


Field Name Description
Huge Object Address

The address of the huge object in the file.

Huge Object Length

The length of the huge object in the file.

Huge Object ID

The heap ID for the huge object.


Version 2 B-tree, Type 2 Record Layout - Indirectly Accessed, Filtered, 'Huge' Fractal Heap Objects
byte byte byte byte

Filtered Huge Object Address (O)


Filtered Huge Object Length (L)

Filter Mask

Filtered Huge Object Memory Size (L)


Huge Object ID (L)


Field Name Description
Filtered Huge Object Address

The address of the filtered huge object in the file.

Filtered Huge Object Length

The length of the filtered huge object in the file.

Filter Mask

A 32-bit bitfield indicating which filters have been skipped for this chunk. Each filter has an index number in the pipeline (starting at 0, with the first filter to apply) and if that filter is skipped, the bit corresponding to its index is set.

Filtered Huge Object Memory Size

The size of the de-filtered huge object in memory.

Huge Object ID

The heap ID for the huge object.
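
As a minimal sketch of how the Filter Mask can be read, the helper below lists the pipeline indices whose bits are set. The function name and the `num_filters` parameter are hypothetical; the number of filters would come from the heap's filter pipeline information.

```python
def skipped_filters(filter_mask, num_filters):
    """Return the pipeline indices (0 = first filter applied) marked as skipped."""
    if not 0 <= filter_mask < 1 << 32:
        raise ValueError("filter mask must fit in 32 bits")
    return [i for i in range(num_filters) if filter_mask & (1 << i)]
```

A mask of 0 means that every filter in the pipeline was applied.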


Version 2 B-tree, Type 3 Record Layout - Directly Accessed, Non-Filtered, 'Huge' Fractal Heap Objects
byte byte byte byte

Huge Object Address (O)


Huge Object Length (L)


Field Name Description
Huge Object Address

The address of the huge object in the file.

Huge Object Length

The length of the huge object in the file.


Version 2 B-tree, Type 4 Record Layout - Directly Accessed, Filtered, 'Huge' Fractal Heap Objects
byte byte byte byte

Filtered Huge Object Address (O)


Filtered Huge Object Length (L)

Filter Mask

Filtered Huge Object Memory Size (L)


Field Name Description
Filtered Huge Object Address

The address of the filtered huge object in the file.

Filtered Huge Object Length

The length of the filtered huge object in the file.

Filter Mask

A 32-bit bitfield indicating which filters have been skipped for this chunk. Each filter has an index number in the pipeline (starting at 0, with the first filter to apply) and if that filter is skipped, the bit corresponding to its index is set.

Filtered Huge Object Memory Size

The size of the de-filtered huge object in memory.


Version 2 B-tree, Type 5 Record Layout - Link Name for Indexed Group
byte byte byte byte
Hash of Name
ID (bytes 1-4)
ID (bytes 5-7)

Field Name Description
Hash

This field is the hash value of the name for the link. The hash value is Jenkins' lookup3 checksum algorithm applied to the link's name.

ID

This is a 7-byte sequence of bytes and is the heap ID for the link record in the group's fractal heap.
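
A Type 5 record can be split into its two fields as sketched below. The sketch assumes the 4-byte hash is stored little-endian, as for other multi-byte integers in the format, and that the 7-byte ID follows it directly; the function name is hypothetical.

```python
import struct


def parse_link_name_record(buf):
    """Split an 11-byte Type 5 record into (hash of name, 7-byte heap ID)."""
    if len(buf) < 11:
        raise ValueError("a Type 5 record needs 11 bytes")
    (name_hash,) = struct.unpack_from("<I", buf, 0)
    heap_id = bytes(buf[4:11])  # opaque fractal heap ID for the link
    return name_hash, heap_id
```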


Version 2 B-tree, Type 6 Record Layout - Creation Order for Indexed Group
byte byte byte byte

Creation Order (8 bytes)

ID (bytes 1-4)
ID (bytes 5-7)

Field Name Description
Creation Order

This field is the creation order value for the link.

ID

This is a 7-byte sequence of bytes and is the heap ID for the link record in the group's fractal heap.


Version 2 B-tree, Type 7 Record Layout - Shared Object Header Messages (Sub-Type 0 - Message in Heap)
byte byte byte byte
Message Location This space inserted only to align table nicely
Hash
Reference Count

Heap ID (8 bytes)


Field Name Description
Message Location

This field indicates the location where the message is stored:
Value Description
0 Shared message is stored in shared message index heap.
1 Shared message is stored in object header.

Hash

This field is the hash value of the shared message. The hash value is Jenkins' lookup3 checksum algorithm applied to the shared message.

Reference Count

The number of objects which reference this message.

Heap ID

This is an 8-byte sequence of bytes and is the heap ID for the shared message in the shared message index's fractal heap.


Version 2 B-tree, Type 7 Record Layout - Shared Object Header Messages (Sub-Type 1 - Message in Object Header)
byte byte byte byte
Message Location This space inserted only to align table nicely
Hash
Reserved (zero) Message Type Object Header Index

Object Header Address (O)


Field Name Description
Message Location

This field indicates the location where the message is stored:
Value Description
0 Shared message is stored in shared message index heap.
1 Shared message is stored in object header.

Hash

This field is the hash value of the shared message. The hash value is Jenkins' lookup3 checksum algorithm applied to the shared message.

Message Type

The object header message type of the shared message.

Object Header Index

This field indicates that the shared message is the n'th message of its type in the specified object header.

Object Header Address

The address of the object header containing the shared message.
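
The two Type 7 sub-types share their first two fields, and the Message Location value selects how the rest of the record is read. The sketch below assumes a 1-byte Message Location, fields packed without padding (the alignment space in the layout is not stored), little-endian integers, and a 1-byte Reserved, 1-byte Message Type, and 2-byte Object Header Index; the function and key names are hypothetical.

```python
import struct


def parse_shared_message_record(buf, size_of_offsets=8):
    """Decode a Type 7 record according to its Message Location field."""
    location = buf[0]
    (msg_hash,) = struct.unpack_from("<I", buf, 1)
    if location == 0:
        # Sub-type 0: message stored in the shared message index heap.
        (ref_count,) = struct.unpack_from("<I", buf, 5)
        return {"location": 0, "hash": msg_hash,
                "reference_count": ref_count, "heap_id": bytes(buf[9:17])}
    if location == 1:
        # Sub-type 1: message stored in an object header.
        _reserved, msg_type, index = struct.unpack_from("<BBH", buf, 5)
        address = int.from_bytes(buf[9:9 + size_of_offsets], "little")
        return {"location": 1, "hash": msg_hash, "message_type": msg_type,
                "object_header_index": index, "object_header_address": address}
    raise ValueError("unknown Message Location value")
```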


Version 2 B-tree, Type 8 Record Layout - Attribute Name for Indexed Attributes
byte byte byte byte

Heap ID (8 bytes)

Message Flags This space inserted only to align table nicely
Creation Order
Hash of Name

Field Name Description
Heap ID

This is an 8-byte sequence of bytes and is the heap ID for the attribute in the object's attribute fractal heap.

Message Flags

The object header message flags for the attribute message.

Creation Order

This field is the creation order value for the attribute.

Hash

This field is the hash value of the name for the attribute. The hash value is Jenkins' lookup3 checksum algorithm applied to the attribute's name.


Version 2 B-tree, Type 9 Record Layout - Creation Order for Indexed Attributes
byte byte byte byte

Heap ID (8 bytes)

Message Flags This space inserted only to align table nicely
Creation Order

Field Name Description
Heap ID

This is an 8-byte sequence of bytes and is the heap ID for the attribute in the object's attribute fractal heap.

Message Flags

The object header message flags for the attribute message.

Creation Order

This field is the creation order value for the attribute.

Disk Format: Level 1B - Group Symbol Table Nodes

A group is an object internal to the file that allows arbitrary nesting of objects within the file (including other groups). A group maps a set of link names in the group to a set of relative file addresses of objects in the file. Certain metadata for an object to which the group points can be cached in the group's symbol table entry in addition to being in the object's header.

An HDF5 object name space can be stored hierarchically by partitioning the name into components and storing each component as a link in a group. The link for a non-ultimate component points to the group containing the next component. The link for the last component points to the object being named.
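
The component-by-component lookup can be pictured with a toy model in which each group is a plain dictionary mapping link names to either another group or an object. This is purely illustrative: real groups are stored as symbol table nodes or indexed links, and `resolve` is a hypothetical name.

```python
def resolve(root_group, path):
    """Follow each path component through nested groups to the named object."""
    node = root_group
    for component in (part for part in path.split("/") if part):
        node = node[component]  # KeyError if a link name is missing
    return node
```

With `root = {"a": {"b": "dataset"}}`, the name `/a/b` is resolved by following the link `a` to a group and then the link `b` to the object being named.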

One implementation of a group is a collection of symbol table nodes indexed by a B-link tree. Each symbol table node contains entries for one or more links. If an attempt is made to add a link to an already full symbol table node containing 2K entries, then the node is split and one node contains K symbols and the other contains K+1 symbols.

Symbol Table Node (A Leaf of a B-link tree)
byte byte byte byte
Signature
Version Number Reserved (zero) Number of Symbols


Group Entries



Field Name Description
Signature

The ASCII character string "SNOD" is used to indicate the beginning of a symbol table node. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

Version Number

The version number for the symbol table node. This document describes version 1. (There is no version '0' of the symbol table node.)

Number of Symbols

Although all symbol table nodes have the same length, most contain fewer than the maximum possible number of link entries. This field indicates how many entries contain valid data. The valid entries are packed at the beginning of the symbol table node while the remaining entries contain undefined values.

Group Entries

Each link has an entry in the symbol table node. The format of the entry is described below. There are 2K entries in each group node, where K is the "Group Leaf Node K" value from the superblock.

Disk Format: Level 1C - Symbol Table Entry

Each symbol table entry in a symbol table node is designed to allow for very fast browsing of stored objects. Toward that design goal, the symbol table entries include space for caching certain constant metadata from the object header.

Symbol Table Entry
byte byte byte byte

Link Name Offset (O)


Object Header Address (O)

Cache Type
Reserved (zero)


Scratch-pad Space (16 bytes)


(Items marked with an 'O' in the above table are
of the size specified in "Size of Offsets.")

Field Name Description
Link Name Offset

This is the byte offset into the group's local heap for the name of the link. The name is null terminated.

Object Header Address

Every object has an object header which serves as a permanent location for the object's metadata. In addition to appearing in the object header, some of the object's metadata can be cached in the scratch-pad space.

Cache Type

The cache type is determined from the object header. It also determines the format for the scratch-pad space:
Type: Description:
0 No data is cached by the group entry. This is guaranteed to be the case when an object header has a link count greater than one.
1 Group object header metadata is cached in the scratch-pad space. This implies that the symbol table entry refers to another group.
2 The entry is a symbolic link. The first four bytes of the scratch-pad space are the offset into the local heap for the link value. The object header address will be undefined.

Reserved

These four bytes are present so that the scratch-pad space is aligned on an eight-byte boundary. They are always set to zero.

Scratch-pad Space

This space is used for different purposes, depending on the value of the Cache Type field. Any metadata about an object represented in the scratch-pad space is duplicated in the object header for that object.

Furthermore, no data is cached in the group entry scratch-pad space if the object header for the object has a link count greater than one.

Format of the Scratch-pad Space

The symbol table entry scratch-pad space is formatted according to the value in the Cache Type field.

If the Cache Type field contains the value zero (0) then no information is stored in the scratch-pad space.

If the Cache Type field contains the value one (1), then the scratch-pad space contains cached metadata for another object header in the following format:

Object Header Scratch-pad Format
byte byte byte byte

Address of B-tree (O)


Address of Name Heap (O)

(Items marked with an 'O' in the above table are
of the size specified in "Size of Offsets.")

Field Name Description
Address of B-tree

This is the file address for the root of the group's B-tree.

Address of Name Heap

This is the file address for the group's local heap, in which are stored the group's symbol names.

If the Cache Type field contains the value two (2), then the scratch-pad space contains cached metadata for a symbolic link in the following format:

Symbolic Link Scratch-pad Format
byte byte byte byte
Offset to Link Value

Field Name Description
Offset to Link Value

The value of a symbolic link (that is, the name of the thing to which it points) is stored in the local heap. This field is the 4-byte offset into the local heap for the start of the link value, which is null terminated.
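
Putting the entry layout and the scratch-pad formats together, a symbol table entry could be decoded roughly as follows. The sketch assumes little-endian integers, a 4-byte Cache Type followed by the 4 reserved bytes, and a "Size of Offsets" of at most 8 so that both addresses fit in the 16-byte scratch-pad; the function and key names are hypothetical.

```python
import struct


def parse_symbol_table_entry(buf, size_of_offsets=8):
    """Decode one symbol table entry, including its scratch-pad space."""
    o = size_of_offsets
    entry = {
        "link_name_offset": int.from_bytes(buf[0:o], "little"),
        "object_header_address": int.from_bytes(buf[o:2 * o], "little"),
    }
    cache_type, _reserved = struct.unpack_from("<II", buf, 2 * o)
    entry["cache_type"] = cache_type
    scratch = buf[2 * o + 8:2 * o + 24]  # 16-byte scratch-pad space
    if cache_type == 1:
        # Cached group metadata: B-tree address, then local heap address.
        entry["btree_address"] = int.from_bytes(scratch[0:o], "little")
        entry["name_heap_address"] = int.from_bytes(scratch[o:2 * o], "little")
    elif cache_type == 2:
        # Symbolic link: 4-byte offset of the link value in the local heap.
        (entry["link_value_offset"],) = struct.unpack_from("<I", scratch, 0)
    return entry
```

For cache type 0 the scratch-pad bytes are simply ignored.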

Disk Format: Level 1D - Local Heaps

A local heap is a collection of small pieces of data that are particular to a single object in the HDF5 file. Objects can be inserted and removed from the heap at any time. The address of a heap does not change once the heap is created. For example, a group stores addresses of objects in symbol table nodes with the names of links stored in the group's local heap.


Local Heap
byte byte byte byte
Signature
Version Reserved (zero)

Data Segment Size (L)


Offset to Head of Free-list (L)


Address of Data Segment (O)

(Items marked with an 'L' in the above table are
of the size specified in "Size of Lengths.")
(Items marked with an 'O' in the above table are
of the size specified in "Size of Offsets.")

Field Name Description
Signature

The ASCII character string "HEAP" is used to indicate the beginning of a heap. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

Version

Each local heap has its own version number so that new heaps can be added to old files. This document describes version zero (0) of the local heap.

Data Segment Size

The total amount of disk memory allocated for the heap data. This may be larger than the amount of space required by the objects stored in the heap. The extra unused space in the heap holds a linked list of free blocks.

Offset to Head of Free-list

This is the offset within the heap data segment of the first free block (or the undefined address if there is no free block). The free block contains "Size of Lengths" bytes that are the offset of the next free block (or the value '1' if this is the last free block) followed by "Size of Lengths" bytes that store the size of this free block. The size of the free block includes the space used to store the offset of the next free block and the size of the current block, making the minimum size of a free block 2 * "Size of Lengths".

Address of Data Segment

The data segment originally starts immediately after the heap header, but if the data segment must grow as a result of adding more objects, then the data segment may be relocated, in its entirety, to another part of the file.

Objects within a local heap should be aligned on an 8-byte boundary.
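
The free-list encoding described for the Offset to Head of Free-list field can be traversed as sketched below. The sketch assumes little-endian integers and that the undefined address is the all-ones value of "Size of Lengths" bytes; the function name is hypothetical, and the cycle and bounds checks are defensive additions rather than part of the format.

```python
def walk_free_list(data_segment, head_offset, size_of_lengths=8):
    """Return (offset, size) for each free block in a local heap data segment."""
    n = size_of_lengths
    undefined = (1 << (8 * n)) - 1  # assumed encoding of the undefined address
    blocks = []
    visited = set()
    offset = head_offset
    while offset != undefined:
        if offset in visited or offset + 2 * n > len(data_segment):
            raise ValueError("corrupt free list")
        visited.add(offset)
        next_offset = int.from_bytes(data_segment[offset:offset + n], "little")
        size = int.from_bytes(data_segment[offset + n:offset + 2 * n], "little")
        blocks.append((offset, size))
        if next_offset == 1:  # the value 1 marks the last free block
            break
        offset = next_offset
    return blocks
```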

Disk Format: Level 1E - Global Heap

Each HDF5 file has a global heap which stores various types of information which is typically shared between datasets. The global heap was designed to satisfy these goals:

  1. Repeated access to a heap object must be efficient without resulting in repeated file I/O requests. Since global heap objects will typically be shared among several datasets, it is probable that the object will be accessed repeatedly.
  2. Collections of related global heap objects should result in fewer and larger I/O requests. For instance, a dataset of object references will have a global heap object for each reference. Reading the entire set of object references should result in a few large I/O requests instead of one small I/O request for each reference.
  3. It should be possible to remove objects from the global heap and the resulting file hole should be eligible to be reclaimed for other uses.

The implementation of the heap makes use of the memory management already available at the file level and combines that with a new top-level object called a collection to achieve the second goal (fewer and larger I/O requests). The global heap is the set of all collections. Each global heap object belongs to exactly one collection and each collection contains one or more global heap objects. For the purposes of disk I/O and caching, a collection is treated as an atomic object.

The HDF5 library creates global heap collections as needed, so there may be multiple collections throughout the file. The set of all of them is abstractly called the "global heap", although they don't actually link to each other, and there is no global place in the file where you can discover all of the collections. The collections are found simply by finding a reference to one through another object in the file. For example, data of variable-length datatype elements is stored in the global heap and is accessed via a global heap ID. The format for global heap IDs is described at the end of this section.


A Global Heap Collection
byte byte byte byte
Signature
Version Reserved (zero)

Collection Size (L)


Global Heap Object 1


Global Heap Object 2


...


Global Heap Object N


Global Heap Object 0 (free space)


Field Name Description
Signature

The ASCII character string "GCOL" is used to indicate the beginning of a collection. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

Version

Each collection has its own version number so that new collections can be added to old files. This document describes version one (1) of the collections (there is no version zero (0)).

Collection Size

This is the size in bytes of the entire collection including this field. The default (and minimum) collection size is 4096 bytes which is a typical file system block size. This allows for 127 16-byte heap objects plus their overhead (the collection header of 16 bytes and the 16 bytes of information about each heap object).

Global Heap Object 1 through N

The objects are stored in any order with no intervening unused space.

Global Heap Object 0

Global Heap Object 0 (zero), when present, represents the free space in the collection. Free space always appears at the end of the collection. If the free space is too small to store the header for Object 0 (described below) then the header is implied and the collection contains no free space.


Global Heap Object
byte byte byte byte
Heap Object Index Reference Count
Reserved (zero)

Object Size (L)


Object Data


Field Name Description
Heap Object Index

Each object has a unique identification number within a collection. The identification numbers are chosen so that new objects have the smallest value possible with the exception that the identifier 0 always refers to the object which represents all free space within the collection.

Reference Count

All heap objects have a reference count field. An object which is referenced from some other part of the file will have a positive reference count. The reference count for Object 0 is always zero.

Reserved

Zero padding to align next field on an 8-byte boundary.

Object Size

This is the size of the object data stored for the object. The actual storage space allocated for the object data is rounded up to a multiple of eight.

Object Data

The object data is treated as a one-dimensional array of bytes to be interpreted by the caller.

(Items marked with an 'O' in the above tables are
of the size specified in "Size of Offsets.")
(Items marked with an 'L' in the above tables are
of the size specified in "Size of Lengths.")
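
Reading a whole collection in one pass, in the spirit of treating it as an atomic object, might look like the sketch below. It assumes little-endian integers, a 1-byte Version followed by 3 reserved bytes, a 2-byte Heap Object Index, a 2-byte Reference Count, and 4 reserved bytes before Object Size; the function name is hypothetical.

```python
import struct


def read_global_heap_objects(collection, size_of_lengths=8):
    """Map object index -> (reference count, data) for one GCOL collection."""
    n = size_of_lengths
    if collection[:4] != b"GCOL":
        raise ValueError("not a global heap collection")
    collection_size = int.from_bytes(collection[8:8 + n], "little")
    object_header = 8 + n  # index, reference count, reserved, object size
    pos = 8 + n            # first object follows the collection header
    objects = {}
    while pos + object_header <= collection_size:
        index, ref_count = struct.unpack_from("<HH", collection, pos)
        size = int.from_bytes(collection[pos + 8:pos + 8 + n], "little")
        if index == 0:
            break  # Object 0 is the free space at the end of the collection
        start = pos + object_header
        objects[index] = (ref_count, bytes(collection[start:start + size]))
        pos = start + (size + 7) // 8 * 8  # storage is rounded up to 8 bytes
    return objects
```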

The format for the ID used to locate an object in the global heap is described here:

Global Heap ID
byte byte byte byte

Collection Address (O)

Object Index

Field Name Description
Collection Address

This field is the address of the global heap collection where the data object is stored.

Object Index

This field is the index of the data object within the global heap collection.
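
A global heap ID can be decoded as sketched below, assuming little-endian integers and a 4-byte Object Index following the Collection Address; the function name is hypothetical.

```python
def parse_global_heap_id(buf, size_of_offsets=8):
    """Return (collection address, object index) from a global heap ID."""
    o = size_of_offsets
    if len(buf) < o + 4:
        raise ValueError("global heap ID is too short")
    address = int.from_bytes(buf[0:o], "little")
    index = int.from_bytes(buf[o:o + 4], "little")
    return address, index
```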


Disk Format: Level 1F - Fractal Heap

Each fractal heap consists of a header and zero or more direct and indirect blocks (described below). The header contains general information as well as initialization parameters for the doubling table. The Root Block Address in the header points to the first direct or indirect block in the heap.

Fractal heaps are based on a data structure called a doubling table. A doubling table provides a mechanism for quickly extending an array-like data structure that minimizes the number of empty blocks in the heap, while retaining very fast lookup of any element within the array. More information on fractal heaps and doubling tables can be found in the RFC "Private Heaps in HDF5".

The fractal heap implements the doubling table structure with indirect and direct blocks. Indirect blocks in the heap do not actually contain data for objects in the heap; their "size" is abstract, and they represent the indexing structure for locating the direct blocks in the doubling table. Direct blocks contain the actual data for objects stored in the heap.

All indirect blocks have a constant number of block entries in each row, called the width of the doubling table (stored in the heap header). The number of rows for each indirect block in the heap is determined by the size of the block that the indirect block represents in the doubling table (calculation of this is shown below) and is constant, except for the "root" indirect block, which expands and shrinks its number of rows as needed.

Blocks in the first two rows of an indirect block are Starting Block Size number of bytes in size, and the blocks in each subsequent row are twice the size of the blocks in the previous row (i.e. blocks in the third row are twice the Starting Block Size, blocks in the fourth row are four times the Starting Block Size, etc.). Entries for blocks up to the Maximum Direct Block Size point to direct blocks and entries for blocks greater than that size point to further indirect blocks (which have their own entries for direct and indirect blocks, etc).
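
The row sizing rule can be written out as a small sketch. Rows are numbered from zero here, and the function names are hypothetical.

```python
def row_block_size(row, starting_block_size):
    """Size in bytes of each block in a zero-based row of an indirect block."""
    # Rows 0 and 1 use the starting size; each later row doubles the previous one.
    return starting_block_size << max(0, row - 1)


def row_points_to_direct_blocks(row, starting_block_size, max_direct_block_size):
    """True if entries in this row point to direct blocks rather than indirect ones."""
    return row_block_size(row, starting_block_size) <= max_direct_block_size
```

With a Starting Block Size of 512 bytes, the first five rows hold blocks of 512, 512, 1024, 2048, and 4096 bytes.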

The number of rows of blocks, nrows, in an indirect block of size iblock_size is given by the following expression:

nrows = (log2(iblock_size) - log2(<Starting Block Size> * <Width>)) + 1

The maximum number of rows of direct blocks, max_dblock_rows, in any indirect block of a fractal heap is given by the following expression:

max_dblock_rows = (log2(<Max. Direct Block Size>) - log2(<Starting Block Size>)) + 2

Using the computed values for nrows and max_dblock_rows, along with the Width of the doubling table, the number of direct and indirect block entries (K and N in the indirect block description, below) in an indirect block can be computed:

K = MIN(nrows, max_dblock_rows) * Width

If nrows is less than or equal to max_dblock_rows, N is 0. Otherwise, N is simply computed:

N = (nrows - max_dblock_rows) * Width

The size of indirect blocks on disk is determined by the number of rows in the indirect block (computed above). The size of direct blocks on disk is exactly the size of the block in the doubling table.
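
The expressions for nrows, max_dblock_rows, K, and N can be evaluated together as in the sketch below. It assumes all sizes are powers of two, as the header fields require, and it computes N as the number of entries in the rows beyond max_dblock_rows, which is this sketch's reading of the intent; the function names are hypothetical.

```python
def log2_exact(value):
    """Base-2 logarithm of a power of two."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a power of two")
    return value.bit_length() - 1


def indirect_block_entries(iblock_size, starting_block_size, width,
                           max_direct_block_size):
    """Return (nrows, max_dblock_rows, K, N) for an indirect block."""
    nrows = (log2_exact(iblock_size)
             - log2_exact(starting_block_size * width)) + 1
    max_dblock_rows = (log2_exact(max_direct_block_size)
                       - log2_exact(starting_block_size)) + 2
    k = min(nrows, max_dblock_rows) * width  # direct block entries
    # Indirect block entries: only rows past the direct-block rows count.
    n = 0 if nrows <= max_dblock_rows else (nrows - max_dblock_rows) * width
    return nrows, max_dblock_rows, k, n
```

For example, with a Starting Block Size of 512, a Width of 4, and a Maximum Direct Block Size of 65536, an indirect block covering 2^20 bytes has 10 rows, of which 9 hold direct blocks, giving K = 36 and N = 4.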


Fractal Heap Header
byte byte byte byte
Signature
Version This space inserted only to align table nicely
Heap ID Length I/O Filters' Encoded Length
Flags This space inserted only to align table nicely
Maximum Size of Managed Objects

Next Huge Object ID (L)


v2 B-tree Address of Huge Objects (O)


Amount of Free Space in Managed Blocks (L)


Address of Managed Block Free Space Manager (O)


Amount of Managed Space in Heap (L)


Amount of Allocated Managed Space in Heap (L)


Offset of Direct Block Allocation Iterator in Managed Space (L)


Number of Managed Objects in Heap (L)


Size of Huge Objects in Heap (L)


Number of Huge Objects in Heap (L)


Size of Tiny Objects in Heap (L)


Number of Tiny Objects in Heap (L)

Table Width This space inserted only to align table nicely

Starting Block Size (L)


Maximum Direct Block Size (L)

Maximum Heap Size Starting # of Rows in Root Indirect Block

Address of Root Block (O)

Current # of Rows in Root Indirect Block This space inserted only to align table nicely

Size of Filtered Root Direct Block (optional) (L)

I/O Filter Mask (optional)
I/O Filter Information (optional, variable size)
Checksum

Field Name Description
Signature

The ASCII character string "FRHP" is used to indicate the beginning of a fractal heap header. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

Version

This document describes version 0.

Heap ID Length

This is the length in bytes of heap object IDs for this heap.

I/O Filters' Encoded Length

This is the size in bytes of the encoded I/O Filter Information.

Flags

This field is the heap status flag and is a bit-field indicating additional information about the fractal heap.
Bit(s) Description
0 If set, the ID value to use for huge object has wrapped around. If the value for the Next Huge Object ID has wrapped around, each new huge object inserted into the heap will require a search for an ID value.
1 If set, the direct blocks in the heap are checksummed.
2-7 Reserved

Maximum Size of Managed Objects

This is the maximum size of managed objects allowed in the heap. Objects larger than this are 'huge' objects and will be stored in the file directly, rather than in a direct block for the heap.

Next Huge Object ID

This is the next ID value to use for a huge object in the heap.

v2 B-tree Address of Huge Objects

This is the address of the v2 B-tree used to track huge objects in the heap. The type of records stored in the v2 B-tree will be determined by whether the address & length of a huge object can fit into a heap ID (if yes, it's a "directly" accessed huge object) and whether there is a filter used on objects in the heap.

Amount of Free Space in Managed Blocks

This is the total amount of free space in managed direct blocks (in bytes).

Address of Managed Block Free Space Manager

This is the address of the Free-Space Manager for managed blocks.

Amount of Managed Space in Heap

This is the total amount of managed space in the heap (in bytes), essentially the upper bound of the heap's linear address space.

Amount of Allocated Managed Space in Heap

This is the total amount of managed space (in bytes) actually allocated in the heap. This can be less than the Amount of Managed Space in Heap field, if some direct blocks in the heap's linear address space are not allocated.

Offset of Direct Block Allocation Iterator in Managed Space

This is the linear heap offset (in bytes) at which the next direct block should be allocated. This may be less than the Amount of Managed Space in Heap value because the heap's address space is increased by a "row" of direct blocks at a time, rather than by single direct block increments.

Number of Managed Objects in Heap

This is the number of managed objects in the heap.

Size of Huge Objects in Heap

This is the total size of huge objects in the heap (in bytes).

Number of Huge Objects in Heap

This is the number of huge objects in the heap.

Size of Tiny Objects in Heap

This is the total size of tiny objects that are packed in heap IDs (in bytes).

Number of Tiny Objects in Heap

This is the number of tiny objects that are packed in heap IDs.

Table Width

This is the number of columns in the doubling table for managed blocks. This value must be a power of two.

Starting Block Size

This is the starting block size to use in the doubling table for managed blocks (in bytes). This value must be a power of two.

Maximum Direct Block Size

This is the maximum size allowed for a managed direct block. Objects inserted into the heap that are larger than this value (less the # of bytes of direct block prefix/suffix) are stored as 'huge' objects. This value must be a power of two.

Maximum Heap Size

This is the maximum size of the heap's linear address space for managed objects (in bytes). The value stored is the log2 of the actual value, that is: the # of bits of the address space. 'Huge' and 'tiny' objects aren't counted in this value, since they don't store objects in the linear address space of the heap.

Starting # of Rows in Root Indirect Block

This is the starting number of rows for the root indirect block. A value of 0 indicates that the root indirect block will have the maximum number of rows needed to address the heap's Maximum Heap Size.

Address of Root Block

This is the address of the root block for the heap. It can be the undefined address if there is no data in the heap. It either points to a direct block (if the Current # of Rows in the Root Indirect Block value is 0), or an indirect block.

Current # of Rows in Root Indirect Block

This is the current number of rows in the root indirect block. A value of 0 indicates that Address of Root Block points to a direct block instead of an indirect block.

Size of Filtered Root Direct Block

This is the size of the root direct block, if filters are applied to heap objects (in bytes). This field is only stored in the header if the I/O Filters' Encoded Length is greater than 0.

I/O Filter Mask

This is the filter mask for the root direct block, if filters are applied to heap objects. This mask has the same format as that used for the filter mask in chunked raw data records in a v1 B-tree. This field is only stored in the header if the I/O Filters' Encoded Length is greater than 0.

I/O Filter Information

This is the I/O filter information encoding direct blocks and huge objects, if filters are applied to heap objects. This field is encoded as a Filter Pipeline message. The size of this field is determined by I/O Filters' Encoded Length.

Checksum

This is the checksum for the header.
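
A few of the header fields lend themselves to small helpers, sketched below with hypothetical names. The record type selection follows the Type 1 to Type 4 record layouts for huge objects; whether a huge object is directly accessed is taken as an input here rather than derived, and "filtered" is assumed to mean that I/O Filters' Encoded Length is greater than 0.

```python
def decode_heap_flags(flags):
    """Interpret the fractal heap status flags (bits 2-7 are reserved)."""
    return {
        "huge_id_wrapped": bool(flags & 0x01),
        "direct_blocks_checksummed": bool(flags & 0x02),
    }


def huge_object_record_type(directly_accessed, io_filters_encoded_length):
    """Pick the v2 B-tree record type (1-4) used to track huge objects."""
    filtered = io_filters_encoded_length > 0
    if directly_accessed:
        return 4 if filtered else 3
    return 2 if filtered else 1


def max_heap_size_in_bytes(stored_value):
    """Maximum Heap Size is stored as log2 of the address space size."""
    return 1 << stored_value
```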


Fractal Heap Direct Block
byte byte byte byte
Signature
Version This space inserted only to align t