h5py

h5py is a Python interface to the HDF5 library. It covers most HDF5 APIs.

Download

Although you can download, build, and install h5py from source, we do not recommend it. Instead, download and install Miniconda first. Installing Miniconda is straightforward if you follow these instructions.

Installation

Once Miniconda is working, run conda install h5py from the Miniconda command prompt (a terminal in which the conda command is available).
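
A quick way to confirm that the installation worked is to import h5py and print its version information. The snippet below is a minimal check, not part of the original instructions; it only assumes that h5py was installed into the Python environment you are running.

```python
import h5py

# Version of the h5py package itself.
print("h5py version:", h5py.version.version)

# Version of the HDF5 C library that h5py was built against.
print("HDF5 version:", h5py.version.hdf5_version)
```

If the import fails, the package was most likely installed into a different conda environment than the one that is active.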

If you're a Docker user, please try our Docker images of Anaconda, available through Docker Hub, which include all the Python modules (e.g., basemap) required to visualize HDF-EOS data.

Usage

h5py hides much of the complexity of the HDF5 C API behind easy-to-use high-level APIs. Yet it is powerful enough to do almost anything you can do with the HDF5 C API.
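
As a rough illustration of the high-level API, the sketch below writes a small array to a file and reads part of it back with NumPy-style slicing. The file name example.h5 and the dataset name temperature are hypothetical and chosen only for this example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, created in a temporary directory for illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write a small 2-D array; h5py accepts NumPy arrays directly.
with h5py.File(path, "w") as f:
    f.create_dataset("temperature", data=np.arange(12.0).reshape(3, 4))

# Read part of it back; a dataset supports NumPy-style slicing.
with h5py.File(path, "r") as f:
    dset = f["temperature"]
    shape = dset.shape
    first_row = dset[0, :]

print(shape)
print(first_row)
```

The equivalent C program would need explicit calls to create the file, dataspace, and dataset, and to close each handle afterwards.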

How to read and visualize NASA HDF5 products

If you have installed h5py successfully, you can read and visualize NASA HDF5 and netCDF-4 data products. First, make sure that the basemap, matplotlib, and numpy modules are installed, and import them before h5py as shown in Figure 1. If Python fails to load a module, you can install it with conda; for example, use conda install basemap for the basemap module.

Figure 1. Python code for importing h5py interface and other packages for visualization
import os
import matplotlib as mpl
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import numpy as np
import h5py

Next, open the sample NASA HDF5 file and read datasets and attributes as shown in Figure 2. Replace FILE_NAME to visualize a different file of the same OMI L3 version 2 product, or replace DATAFIELD_NAME to visualize a different dataset in the same file. You can list the top-level members of the file with f.keys().
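
Because f.keys() only returns the members of the root group, finding a deeply nested dataset path such as DATAFIELD_NAME can take several steps. The sketch below shows one possible way to list every dataset path in a file with visititems(). The helper name list_datasets and the small stand-in file are hypothetical; the stand-in only mimics the nested layout of an HDF-EOS5 file.

```python
import os
import tempfile

import h5py
import numpy as np


def list_datasets(file_name):
    """Return the full path of every dataset in an HDF5 file."""
    paths = []

    def collect(name, obj):
        # visititems() calls this for each group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            paths.append("/" + name)

    with h5py.File(file_name, "r") as f:
        f.visititems(collect)
    return paths


# Tiny stand-in file (hypothetical content) so the example runs anywhere.
demo = os.path.join(tempfile.mkdtemp(), "demo.he5")
with h5py.File(demo, "w") as f:
    f.create_dataset("HDFEOS/GRIDS/Demo/Data Fields/Field", data=np.zeros((2, 2)))

print(list_datasets(demo))
```

Calling list_datasets(FILE_NAME) on a real product would give candidate values for DATAFIELD_NAME.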

Figure 2. Python code for opening file and reading datasets
# Open file.
FILE_NAME = 'OMI-Aura_L3-OMTO3e_2005m1214_v002-2006m0929t143855.he5'
DATAFIELD_NAME = '/HDFEOS/GRIDS/OMI Column Amount O3/Data Fields/ColumnAmountO3'
with h5py.File(FILE_NAME, mode='r') as f:
    # List the top-level members of the file.
    print(list(f.keys()))

    # Read dataset.
    dset = f[DATAFIELD_NAME]
    data = dset[:]

    # Handle fill value: replace it with NaN, then mask the NaNs.
    data[data == dset.fillvalue] = np.nan
    data = np.ma.masked_where(np.isnan(data), data)

    # Get attributes needed for the plot.
    # String attributes come in as the bytes type and should
    # be decoded to str (Python 3).
    title = dset.attrs['Title'].decode()
    units = dset.attrs['Units'].decode()
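
The fill-value and attribute handling above can be sketched in a slightly more defensive form. This is an illustration under stated assumptions, not the code used for the sample product: the helper name attr_text, the dataset name o3, and the fill value -999 are hypothetical. It masks the fill value directly, which also works for integer data that cannot hold NaN, and it decodes an attribute only when it arrives as bytes.

```python
import os
import tempfile

import h5py
import numpy as np


def attr_text(dset, name):
    """Return a string attribute as str, decoding it if it is stored as bytes."""
    value = dset.attrs[name]
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


# Small stand-in file with a hypothetical fill value of -999.
path = os.path.join(tempfile.mkdtemp(), "fill.h5")
with h5py.File(path, "w") as f:
    d = f.create_dataset("o3", data=np.array([[1, -999], [3, 4]], dtype="i2"),
                         fillvalue=-999)
    d.attrs["Units"] = np.bytes_("DU")  # stored as a fixed-length byte string

with h5py.File(path, "r") as f:
    dset = f["o3"]
    # Mask every element equal to the dataset's fill value.
    data = np.ma.masked_equal(dset[:], dset.fillvalue)
    units = attr_text(dset, "Units")

print(data)
print(units)
```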

Finally, plot the data on a map using the functions in the basemap and matplotlib packages as shown in Figure 3.

Figure 3. Python code for visualizing data on map
m = Basemap(projection='cyl', resolution='l',
            llcrnrlat=-90, urcrnrlat=90,
            llcrnrlon=-180, urcrnrlon=180)
m.drawcoastlines(linewidth=0.5)
m.drawparallels(np.arange(-90., 120., 30.), labels=[1, 0, 0, 0])
m.drawmeridians(np.arange(-180., 181., 45.), labels=[0, 0, 0, 1])
# longitude and latitude are 2-D arrays matching the shape of data;
# they are computed in the complete script.
x, y = m(longitude, latitude)
m.pcolormesh(x, y, data)

The complete code, which includes the lat/lon calculation, is here. To download the code, right-click the link and select Save Link As. If you execute the code (e.g., python OMI_L3_ColumnAmountO3.py) in the directory where the sample file exists, you will get the image shown in Figure 4.
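
For readers who want a feel for the lat/lon calculation before opening the complete script, here is a minimal sketch. It assumes a global equal-angle grid covering -90 to 90 degrees latitude and -180 to 180 degrees longitude, with the first row at the southern edge and coordinates taken at cell centers. The function name grid_centers and the 720 x 1440 shape are assumptions for illustration; the complete script may compute the coordinates differently.

```python
import numpy as np


def grid_centers(nlat, nlon):
    """Return 2-D longitude and latitude arrays of cell centers for a
    global equal-angle grid with nlat rows and nlon columns."""
    dlat = 180.0 / nlat
    dlon = 360.0 / nlon
    lat = -90.0 + dlat * (np.arange(nlat) + 0.5)
    lon = -180.0 + dlon * (np.arange(nlon) + 0.5)
    # meshgrid returns arrays of shape (nlat, nlon).
    return np.meshgrid(lon, lat)


# Assumed grid shape (0.25-degree cells), used here only as an example.
longitude, latitude = grid_centers(720, 1440)
print(longitude.shape)
print(latitude[0, 0], longitude[0, 0])
```

The resulting arrays have the same shape as a 720 x 1440 data array, so they can be passed to the Basemap instance in the same way as in Figure 3.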

Generally speaking, each NASA HDF5 data product requires a different technique for reading and visualizing its datasets properly. We provide comprehensive h5py examples here to help you access NASA data easily.

How to create HDF5 data

A GUI tool called HDF Product Designer can generate h5py code for HDF5 data creation. It lets you edit HDF5 content (groups, datasets, attributes) visually and generates h5py code that matches that content (Figure 5). Please click here to learn more about HDF Product Designer.
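
To give an idea of what h5py code for data creation looks like, the sketch below creates a group, a dataset, and a few attributes by hand. It is not output from HDF Product Designer; the file name product.h5 and all group, dataset, and attribute names are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "product.h5")

with h5py.File(path, "w") as f:
    # A group acts like a folder inside the file.
    grp = f.create_group("Data Fields")

    # A dataset holds the array data; here a 2 x 3 array of 32-bit floats.
    dset = grp.create_dataset("Ozone", data=np.full((2, 3), 300.0, dtype="f4"))

    # Attributes attach small pieces of metadata to an object.
    dset.attrs["Units"] = "DU"
    dset.attrs["Title"] = "Example ozone field"

# Reopen the file to confirm what was written.
with h5py.File(path, "r") as f:
    written = f["Data Fields/Ozone"]
    written_shape = written.shape
    written_units = written.attrs["Units"]

print(written_shape, written_units)
```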


Last modified: 02/18/2021
Sponsored by Subcontract number 4400528183 under Raytheon Contract number NNG15HZ39C, funded by NASA / Maintained by The HDF Group