The HDF interfaces that support compression and/or chunking are listed in the following table:
| Interface | Compression | Chunking |
| SD - Multifile Scientific Data | yes | yes |
| GR - Multifile General Raster Image | yes | yes |
| DFR8 - Single-file 8-Bit Raster Image | yes | no |
| DF24 - Single-file 24-Bit Raster Image | yes | no |
In the SD interface, compression is done with the SDsetcompress routine. Its syntax is as follows:
status = SDsetcompress(sds_id, comp_type, &c_info);
The parameter comp_type specifies the compression type. Compression information is specified by the parameter c_info. The following table summarizes the available options:
| comp_type | algorithm | c_info |
| COMP_CODE_RLE | Run-length encoding | not used |
| COMP_CODE_SKPHUFF | Adaptive Huffman | the structure skphuff in the union comp_info must be provided with the size, in bytes, of the data elements |
| COMP_CODE_DEFLATE | GZIP "deflation" (Lempel/Ziv-77 dictionary coder) | the deflate structure in the union comp_info must be provided with the information about the compression effort |
SDsetcompress writes the compressed data, in its entirety, to the data set. The data set is built in-core then written in a single write operation.
The SDsetchunk function is called to make an SDS a chunked SDS. Two restrictions apply to chunked SDSs: the maximum number of chunks in a single HDF file is 65,535, and a chunked SDS cannot contain an unlimited dimension. SDsetchunk sets the chunk size and the compression method for a data set. The syntax of SDsetchunk is as follows:
status = SDsetchunk(sds_id, c_def, c_flag);
The chunking information is provided in the parameters c_def and c_flag. The parameter c_flag specifies the type of the data set, i.e., whether the data set is chunked only or chunked and compressed. The following table summarizes the available options:
| c_flag | data set type | c_def |
| HDF_CHUNK | chunked data set | the elements of the array chunk_lengths in the union c_def (c_def.chunk_lengths[]) have to be initialized to the chunk dimension sizes |
| HDF_CHUNK \| HDF_COMP | chunked data set compressed with RLE, Skipping Huffman, or GZIP compression | the elements of the array chunk_lengths of the structure comp in the union c_def (c_def.comp.chunk_lengths[]) have to be initialized to the chunk dimension sizes |
| HDF_CHUNK \| HDF_NBIT | chunked NBIT-compressed data set | the elements of the array chunk_lengths of the structure nbit in the union c_def (c_def.nbit.chunk_lengths[]) have to be initialized to the chunk dimension sizes |
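Given the 65,535-chunk limit, it can be worth checking the chosen chunk dimensions before calling SDsetchunk. The following is a minimal sketch in plain C and is not part of the HDF library: the helper names chunk_count and fits_chunk_limit are hypothetical, and a dimension size of 0 is assumed to denote an unlimited dimension. Because the limit applies to the whole file, the count for one data set is only a lower bound on what the file will hold.

```c
#include <limits.h>
#include <stddef.h>

/* Limit quoted above: at most 65,535 chunks in a single HDF file. */
#define MAX_CHUNKS_PER_FILE 65535L

/*
 * Hypothetical helper (not an HDF routine): number of chunks needed to
 * tile a data set of the given rank.  Returns -1 for invalid input, for
 * an unlimited dimension (assumed here to have size 0), or on overflow.
 */
long chunk_count(int rank, const long dims[], const long chunk_lengths[])
{
    long total = 1;
    int i;

    if (rank <= 0 || dims == NULL || chunk_lengths == NULL)
        return -1;

    for (i = 0; i < rank; i++) {
        long per_dim;

        if (dims[i] <= 0 || chunk_lengths[i] <= 0)
            return -1;

        /* Round up: a partial chunk at the edge counts as one chunk. */
        per_dim = (dims[i] - 1) / chunk_lengths[i] + 1;

        if (total > LONG_MAX / per_dim)
            return -1;
        total *= per_dim;
    }
    return total;
}

/* Nonzero if this data set alone stays within the per-file limit. */
int fits_chunk_limit(long n_chunks)
{
    return n_chunks > 0 && n_chunks <= MAX_CHUNKS_PER_FILE;
}
```

For example, a 1000 x 500 data set with 100 x 100 chunks needs 10 x 5 = 50 chunks, whereas 1 x 1 chunks would need 500,000 and exceed the limit.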
GR Images are compressed using the routine GRsetcompress. The syntax of the routine GRsetcompress is as follows:
status = GRsetcompress(ri_id, comp_type, &c_info);
The compression method is specified by the parameter comp_type. The parameter c_info has type comp_info and contains algorithm-specific information for the library compression routines. The following table summarizes the available options
| comp_type | algorithm | c_info |
| COMP_CODE_NONE | no compression | not used |
| COMP_CODE_RLE | RLE run-length encoding | not used |
| COMP_CODE_SKPHUFF | Skipping Huffman compression | the skipping size for the Skipping Huffman algorithm is specified in the field c_info.skphuff.skp_size |
| COMP_CODE_DEFLATE | GZIP compression | the deflate level for the GZIP algorithm is specified in the field c_info.deflate.level |
| COMP_CODE_JPEG | JPEG compression | not used |
The GR interface also supports chunking in a manner similar to that of the SD interface. There is one restriction on a raster image: it must be created with MFGR_INTERLACE_PIXEL in the call to GRcreate. The function GRsetchunk makes the raster image, identified by the parameter ri_id, a chunked raster image according to the provided chunking and compression information. The syntax of GRsetchunk is as follows:
status = GRsetchunk(ri_id, c_def, c_flags);
The parameters c_def and c_flags provide the chunking and compression information and are discussed below
| c_flags | data set type | c_def |
| HDF_CHUNK | chunked and uncompressed data | the chunk dimensions must be specified in the field c_def.chunk_lengths[] |
| HDF_CHUNK \| HDF_COMP | chunked data set compressed with RLE, Skipping Huffman, or GZIP compression | the chunk dimensions must be specified in the field c_def.comp.chunk_lengths[] and the compression type in the field c_def.comp.comp_type |
Valid values of the compression type c_def.comp.comp_type are:
| comp_type | algorithm | additional c_def fields |
| COMP_CODE_NONE | uncompressed data | |
| COMP_CODE_RLE | data compressed using the RLE compression algorithm | |
| COMP_CODE_SKPHUFF | data compressed using the Skipping Huffman compression algorithm | the skipping size is specified in the field c_def.comp.cinfo.skphuff.skp_size |
| COMP_CODE_DEFLATE | data compressed using the GZIP compression algorithm | the deflate level is specified in the field c_def.comp.cinfo.deflate.level. Valid deflate level values are integers from 1 to 9 inclusive |
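Because the deflate level is valid only from 1 to 9, a caller may want to check a user-supplied value before storing it in c_def.comp.cinfo.deflate.level. A minimal sketch follows; the helper name is_valid_deflate_level is hypothetical and is not an HDF routine.

```c
/*
 * Hypothetical helper (not an HDF routine): nonzero if level is a valid
 * deflate level, i.e. an integer from 1 to 9 inclusive as stated above.
 */
int is_valid_deflate_level(int level)
{
    return level >= 1 && level <= 9;
}
```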
In the DFR8 interface, the compression type is determined by the tag passed as the fifth argument in calls to the DFR8putimage and DFR8addimage routines. DFR8setcompress sets the compression method for the next raster image written.
intn DFR8addimage(char *filename, VOIDP image, int32 width, int32 height, uint16 compress);
intn DFR8setcompress(int32 type, comp_info *cinfo);
The compress options are
| COMP_NONE | not compressed |
| COMP_JPEG | compresses images with a JPEG algorithm, which is a lossy method |
| COMP_RLE | uses lossless run-length encoding to store the image |
| COMP_IMCOMP | uses a lossy compression algorithm called IMCOMP, and is included for backward compatibility only. If IMCOMP compression is used, the image must include a palette. |
The comp_info union contains algorithm-specific information for the library routines that perform the compression. Its values are used only by the COMP_JPEG compression type. For all other compression types, a pointer to a valid comp_info union is still required, but the values in the union are not used.
To store a 24-bit raster image using compression, the calling program must contain the following function calls
intn DF24setcompress(int32 type, comp_info *cinfo);
intn DF24addimage(char *filename, VOIDP image, int32 width, int32 height);
The compress options are the same as for the DFR8 case.
Compression done with SDsetcompress (or GRsetcompress) and chunking done with SDsetchunk (or GRsetchunk) are mutually exclusive. That is, an application cannot chunk a data set and also compress it with SDsetcompress. The following table illustrates the valid and invalid options.
| API call | behaviour |
| SDsetcompress | compresses, not possible to chunk |
| SDsetchunk | chunks, not possible to compress with SDsetcompress. To compress and chunk, use the flags HDF_CHUNK \| HDF_COMP on this call |
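The choice between the two calls can be summarized as a small decision routine. This is an illustrative sketch in plain C only: the helper name sd_layout_call is hypothetical, it returns a description of the call rather than invoking the HDF library, and it leaves out the HDF_NBIT variant for brevity.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an HDF routine): names the SD call that
 * matches the table above.  Returns NULL when neither chunking nor
 * compression is wanted, because no extra call is needed.
 */
const char *sd_layout_call(int want_chunking, int want_compression)
{
    if (want_chunking && want_compression)
        return "SDsetchunk with HDF_CHUNK | HDF_COMP";
    if (want_chunking)
        return "SDsetchunk with HDF_CHUNK";
    if (want_compression)
        return "SDsetcompress";
    return NULL;
}
```

The same decision applies to the GR interface, with GRsetchunk and GRsetcompress in place of the SD routines.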
Last modified: March 19, 2007. Describes HDF compression/chunking.