[Nexus] Release of Version 2.0.0 (including HDF5 support)

Ray Osborn ROsborn at anl.gov
Mon Sep 16 23:28:31 BST 2002


On 9/16/02 12:33 PM, "Brian Tieman" <tieman at aps.anl.gov> wrote:

> Are there any driving reasons to use HDF5 over HDF4?  Does it provide better
> performance?  Better compression?  Easier to handle files?  Increased
> compatibility? Etc...
> 
> I've long thought about moving to HDF5 independent of Nexus but I haven't seen
> a compelling reason to do so.  Maybe those of you who have used it can provide
> some insight?
> 
> Brian Tieman
> Argonne National Laboratory
> 

Brian,
The two main reasons are file size and performance.  Both were considered to
be barriers to the use of NeXus in the next generation of spallation neutron
sources, where the number of data elements is predicted to grow
substantially.  For example, an HDF4 file can store no more than 2GB of data
because of address size limitations, and files from existing ISIS instruments
are already approaching 1GB.  HDF5 also supports multithreading and parallel
I/O (not yet exposed in NeXus), and it avoids the overheads caused by the
more complex set of data models supported by HDF4.  When we first heard
about HDF5, we realized that it was the way of the future, but the current
urgency is that SNS would not consider formally adopting NeXus until we had
produced an HDF5 version.
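
To put numbers on the address size point, here is a minimal Python sketch.
It assumes the 2GB ceiling comes from byte offsets held in signed 32-bit
integers and that the comparison case uses 64-bit offsets; both widths are
assumptions made for illustration, and the names are hypothetical.

```python
# Assumption: the 2GB ceiling comes from signed 32-bit byte offsets.
# The comparison case assumes 64-bit offsets. Names are illustrative only.
SIGNED_32_BIT_MAX = 2**31 - 1   # largest offset a signed 32-bit integer holds
SIGNED_64_BIT_MAX = 2**63 - 1   # largest offset a signed 64-bit integer holds
GIB = 2**30                     # bytes in one binary gigabyte


def max_file_size_gib(max_offset: int) -> float:
    """Largest addressable file size, in GiB, for a given maximum byte offset."""
    return (max_offset + 1) / GIB


limit_32 = max_file_size_gib(SIGNED_32_BIT_MAX)
limit_64 = max_file_size_gib(SIGNED_64_BIT_MAX)

# Fraction of the 32-bit ceiling already used by a 1 GiB data file.
used_fraction = (1 * GIB) / (SIGNED_32_BIT_MAX + 1)

print(limit_32, limit_64, used_fraction)
```

Under these assumptions a 1GB file already takes up half of the addressable
range, which is why a doubling of the data volume is enough to hit the limit.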

Third-party support for HDF5 is still lagging well behind HDF4, but it is
coming (for example, IDL has a free HDF5 plug-in and Matlab has promised
support soon), so your choice for now will probably depend on how important
that is to you.  The HDF developers have no plans to drop support for either
version.  If they ever do, it would be trivial to convert between the two
since the NeXus data models are identical.  However, I don't think that will
be necessary.

Here is an excerpt from the HDF5 FAQ at
<http://hdf.ncsa.uiuc.edu/HDF5-FAQ.html>.
--------
HDF5 was designed to address some of the limitations of the HDF 4.x library
and to address current and anticipated requirements of modern systems and
applications. 

Some of the HDF (4) limitations are:

*    A single file cannot store more than 20,000 complex objects, and a
single file cannot be larger than 2 gigabytes.
*    The data models are less consistent than they should be. There are more
object types than necessary, and datatypes are too restricted.
*    The library source is old and overly complex, does not support parallel
I/O effectively, and is difficult to use in threaded applications.

HDF5 includes the following improvements.

*    A new file format designed to address some of the deficiencies of HDF
4.x, particularly the need to store larger files and more objects per file.
*    A simpler, more comprehensive data model that includes only two basic
structures: a multidimensional array of record structures, and a grouping
structure. 
*    A simpler, better-engineered library and API, with improved support for
parallel I/O, threads, and other requirements imposed by modern systems and
applications.
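
Outside the FAQ text, the two basic structures it mentions (an array and a
grouping structure) can be pictured with a toy model in plain Python.  This
is only a sketch of the idea, not the HDF5 or NeXus API; the class names and
the path /entry/data/counts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Dataset:
    """Toy stand-in for a multidimensional array (not a real HDF5 object)."""
    shape: Tuple[int, ...]
    dtype: str
    values: List[int]  # flat storage, row-major order


@dataclass
class Group:
    """Toy stand-in for a grouping structure holding named children."""
    children: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)


def walk(group: Group, prefix: str = "") -> List[Tuple[str, Tuple[int, ...]]]:
    """Return (path, shape) for every dataset reachable from the group."""
    found = []
    for name, node in group.children.items():
        path = f"{prefix}/{name}"
        if isinstance(node, Group):
            found.extend(walk(node, path))   # descend into nested groups
        else:
            found.append((path, node.shape))
    return found


# Hypothetical hierarchy: a root group containing entry/data/counts.
counts = Dataset(shape=(2, 3), dtype="int32", values=[0] * 6)
root = Group({"entry": Group({"data": Group({"counts": counts})})})
paths = walk(root)
print(paths)
```

Everything in the file is either a container or an array, so a single
recursive walk is enough to reach all of the data.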

-- 
Dr Ray Osborn                Tel: +1 (630) 252-9011
Materials Science Division   Fax: +1 (630) 252-7777
Argonne National Laboratory  E-mail: ROsborn at anl.gov
Argonne, IL 60439-4845




