How to Read non-HDF-EOS2 objects from Hybrid files

There are some HDF-EOS2 files in which some objects were not created by HDF-EOS2 APIs, and one cannot access those objects via HDF-EOS2 APIs. In other words, those objects are "invisible" to HDF-EOS2 APIs. When HDF-EOS2 APIs are used to access these files, those "invisible" objects are simply ignored. We call these files hybrid HDF-EOS2/HDF4 files, or simply hybrid files. This page explains how to read those non-HDF-EOS2 objects from hybrid files.

The HDF-EOS2 library is built on the HDF4 library. For this reason, a valid HDF-EOS2 file is a valid HDF4 file, and any part of an HDF-EOS2 file can be read by the HDF4 library.

Furthermore, a valid HDF-EOS2 file can be modified by any HDF4 application. For example, some HDF4 objects can be added to an existing HDF-EOS2 file. If these added objects are not located where the HDF-EOS2 library looks, they become invisible to HDF-EOS2 APIs. This is how a hybrid file is generated.

Attributes attached to data fields or geo-location fields

One common "invisible" object found in a hybrid file is an attribute of a data field or a geo-location field. The HDF-EOS2 library only supports adding attributes to a grid object or a swath object. However, several data products contain attributes attached to fields. Again, these attributes are invisible to HDF-EOS2 APIs. Those who are writing converters may need to handle this case; otherwise, the generated file will lose some attributes.

To access those attributes, one needs to rely on HDF4. Since HDF4 APIs are not similar to HDF-EOS2 APIs, one needs to learn how to use HDF4 APIs. Although HDF4 provides various interfaces, only a few APIs are actually necessary to read attributes. We will show an example of how to read attributes from an AE_Ocean file. You can download the file here. We will access one attribute attached to the data field named High_res_cloud in the swath named Swath1.

As we mentioned earlier, an HDF-EOS2 data field or geo-location field is implemented by one HDF4 object. To the best of our knowledge, an HDF-EOS2 field is mapped to one HDF4 Vdata if the field is a one-dimensional array. If the rank is greater than one, the field is mapped to an HDF4 SDS. Suppose that there is a two-dimensional HDF-EOS2 data field named High_res_cloud. This field is actually an HDF4 SDS named High_res_cloud.

This document covers only the case where the rank is greater than one. To access an HDF4 SDS, one needs to use the HDF4 SD interface. HDF4 has its own SD functions to open and close a file. After opening a file, one can access the HDF4 SDS that corresponds to an HDF-EOS2 data field. Then, one can access the attributes attached to that HDF4 SDS. For more information about HDF4, see the HDF4 documentation. The following simply explains how to access an HDF-EOS2 data field attribute with the HDF4 interfaces.

The first thing to do is to open the hybrid file using SDstart.

Figure 1 Opening a hybrid file using the HDF4 SD interface
int32 sdfileid;
sdfileid = SDstart("AMSR_E_L2_Ocean_B01_200206182340_A.hdf", DFACC_READ);
The first argument is the hybrid file name. The second argument specifies read-only access. This function returns an SD interface identifier. This is similar to GDopen or SWopen in that it opens a file.

The next step is to obtain the identifier of the HDF4 SDS that corresponds to the data field named High_res_cloud. As we mentioned earlier, there will be one HDF4 SDS with the same name.

Figure 2 Obtaining the identifier of the SDS corresponding to an HDF-EOS2 data field
int32 sdsindex, sdsid;
sdsindex = SDnametoindex(sdfileid, "High_res_cloud");
sdsid = SDselect(sdfileid, sdsindex);
The first parameter to SDselect is simply the return value of SDstart. The second parameter is the index of the data field, which is an integer value. If one does not know the index but does know the name of the data field, one can get the index by calling SDnametoindex. SDnametoindex returns the index of the data field whose name is given by the second parameter.

Note that multiple data fields may share the same name. In that case, SDnametoindex returns the index of only one of them. How to obtain the index of a particular data field whose name is shared with others is outside the scope of this page; see the HDF4 documentation. Fortunately, we have not found multiple data fields sharing the same name under the same grid or swath object in any HDF-EOS2 file. Our example therefore assumes that each data field has a unique name under the same grid or swath object.

Now that we have the identifier of the SDS, we can access the attributes attached to this SDS. Here, we will access the first attribute of High_res_cloud. To get information about the first attribute, SDattrinfo is used.

Figure 3 Retrieving information about the first attribute of High_res_cloud
char attrname[H4_MAX_NC_NAME + 1];
int32 attrtype;
int32 attrcount;
SDattrinfo(sdsid, 0, attrname, &attrtype, &attrcount);
The first parameter is the identifier of the SDS, which was returned by SDselect. The second parameter is the zero-based index of the attribute; 0 selects the first attribute of the SDS. To access the n-th attribute, this value should be n - 1. These two parameters are inputs, and the function fills the other three arguments as outputs: the attribute name, the attribute data type and the number of elements of the attribute.

From the number of elements and the data type, we can calculate the size of buffer to retrieve the attribute. Assuming that the data type is 32-bit floating point, one can write the following code:

Figure 4 Allocating a buffer to retrieve the attribute
float32 *attrdata;
attrdata = malloc(sizeof(float32) * attrcount);
Note that attrcount retrieved by SDattrinfo gives the number of elements, not the number of bytes. That's why the above code multiplies attrcount by the size of the data type.
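The size calculation above can be wrapped in a small helper that also rejects non-positive counts and guards against overflow in the multiplication. This is only a minimal sketch: attr_buffer_bytes and attr_buffer_alloc are hypothetical names that are not part of HDF4, and int32_t stands in for the HDF4 int32 type so that the code compiles without the HDF4 headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of HDF4): returns the number of bytes needed
 * for `count` elements of `elem_size` bytes each. Returns 0 if the count is
 * not positive, the element size is 0, or the product would overflow size_t. */
size_t attr_buffer_bytes(int32_t count, size_t elem_size)
{
    if (count <= 0 || elem_size == 0)
        return 0;
    if ((size_t)count > SIZE_MAX / elem_size)
        return 0;
    return (size_t)count * elem_size;
}

/* Hypothetical helper: allocates a zero-filled buffer large enough for the
 * attribute, or returns NULL when the size is invalid or allocation fails.
 * The caller releases the buffer with free(). */
void *attr_buffer_alloc(int32_t count, size_t elem_size)
{
    size_t nbytes = attr_buffer_bytes(count, elem_size);
    return nbytes == 0 ? NULL : calloc(1, nbytes);
}
```

With such a helper, the allocation in Figure 4 could be written as attrdata = attr_buffer_alloc(attrcount, sizeof(float32)); followed by a check for NULL before calling SDreadattr.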

SDreadattr will read actual data from the attribute and fill the buffer.

Figure 5 Reading the values of the attribute
SDreadattr(sdsid, 0, attrdata);
The first two parameters are the same as those in SDattrinfo. The third parameter specifies the buffer that this function fills. As usual, passing a buffer that is too small will result in a buffer overrun.
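The example on this page assumes a 32-bit floating point attribute. If an attribute instead holds characters (an assumption made here only for illustration), the buffer would contain attrcount characters, and as far as we know nothing guarantees a terminating NUL. The following sketch shows one way to turn such raw characters into a C string before passing them to functions like printf. The function name attr_chars_to_cstring is hypothetical and not part of HDF4.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of HDF4): copies `count` raw characters,
 * such as the contents of a character-typed attribute, into a newly
 * allocated NUL-terminated string. Returns NULL on invalid input or
 * allocation failure. The caller releases the result with free(). */
char *attr_chars_to_cstring(const char *raw, int count)
{
    char *s;

    if (raw == NULL || count < 0)
        return NULL;

    s = malloc((size_t)count + 1);   /* one extra byte for the terminator */
    if (s == NULL)
        return NULL;

    memcpy(s, raw, (size_t)count);
    s[count] = '\0';
    return s;
}
```

The extra byte is the point of the sketch: a buffer sized for exactly attrcount characters is large enough for SDreadattr but not for a terminated string.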

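The last step is to terminate access to the SD interface identifier returned by SDstart, which is done by calling SDend(sdfileid). It is good practice to release the SDS identifier first by calling SDendaccess(sdsid).

SDend is similar to GDclose or SWclose.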

Previously, the contents of the first attribute of High_res_cloud were stored in a variable named attrdata. One can perform any desired operation on this variable. Let's print every element to see its contents.

Figure 7 Using retrieved attribute
int32 i;
for (i = 0; i < attrcount; ++i) {
    printf("%s[%d] %f\n", attrname, (int)i, attrdata[i]);
}

To get the full source code, see here. How to build a C program is explained here.

HDF4 objects embedded in HDF-EOS2 files

Some HDF-EOS2 files even have HDF4 SDS and HDF4 Vdata objects that are invisible to HDF-EOS2. One file from MAC03S0 is an example. This file contains several HDF4 SDS objects and one Vdata that are invisible to HDF-EOS2.

To access these HDF4 objects, one needs to rely on HDF4, as we did for attributes attached to data fields and geo-location fields. We will not explain this case because accessing those objects has nothing to do with using HDF-EOS2 APIs. If you are writing a converter, you may need to handle this case so that no objects in the hybrid source file are lost.


Last modified: 11/11/2020
Sponsored by Subcontract number 4400528183 under Raytheon Contract number NNG15HZ39C, funded by NASA / Maintained by The HDF Group