Levels of NeXus compliance & More compression

Ray Osborn ROsborn at anl.gov
Fri Jan 21 14:59:14 GMT 2000


on 2000/1/21 9:14 AM, C.M.Moreton-Smith at rl.ac.uk wrote:

> 
> Compression ++
> ==============
> Currently a de-motivator to storing our data in NeXus is that the
> compression is not as good as we can currently get with our native format
> files.  We use two simple FORTRAN routines which compress/decompress our
> integer signal data based on the assumption that the difference between two
> adjacent data points can usually be stored as a relative offset in a single
> byte rather than as a longword integer value.
> 
> The scheme is also very fast and good compression is still possible
> subsequently with LZW.  The question is, could we add this scheme as a
> compression option to the NeXus API?  Obviously if we don't add it to the
> API and we continue to use this, our files would not be browseable with a
> NeXus browser automatically.
> 
> On the plus side, the scheme is very likely to work for most forms of
> spectra-based data and would certainly be of general benefit to NeXus (i.e.
> not just ISIS). On the minus side, we would have to implement the
> compression in the NeXus API and not in the generic HDF beneath.  This would
> mean that we need to use the NeXus API and NeXus browsers.  Are we sure
> enough of the benefits of NeXus to take a step like this and improve the
> NeXus over the underlying HDF?
> 
> In our case we can get up to 30% higher compression and, with expected data
> rates of 1-2GB a day from some instruments, this becomes very significant.
> The traditional argument that disk space etc. is cheap is not very
> convincing for data rates like this.
> 
> 
> What do others think?
> 

Chris,
It's not clear from what you wrote if you tried all the algorithms that are
available in HDF (LZW, Run-length encoding, and Skipping Huffman).  I've
never looked into the relative merits of these three for different types of
data, although Brian Tieman posted some comparisons to this mailing list.
Perhaps one of the others works better with ISIS data.

I think that it would be quite dangerous to implement a new compression
scheme, because we would make the files non-standard HDF.  This is something
we have deliberately avoided in NeXus.
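
For readers unfamiliar with the difference-based scheme described in the
quoted message, here is a minimal Python sketch of the general idea. It is
not the ISIS FORTRAN code: the escape marker, the byte order, and the
function names delta_encode and delta_decode are all assumptions made for
illustration. Each value is stored as a one-byte offset from its neighbour
when the difference fits, and as a full 32-bit integer otherwise.

```python
import struct

# Hypothetical marker byte: the next 4 bytes hold a full 32-bit value.
ESCAPE = -128


def delta_encode(values):
    """Pack signed 32-bit integers as one-byte differences where possible."""
    out = bytearray()
    previous = 0
    for value in values:
        diff = value - previous
        if -127 <= diff <= 127:
            # Small step: one signed byte is enough.
            out += struct.pack("<b", diff)
        else:
            # Large step: escape byte followed by the full value.
            out += struct.pack("<b", ESCAPE)
            out += struct.pack("<i", value)
        previous = value
    return bytes(out)


def delta_decode(data, count):
    """Recover `count` integers from the output of delta_encode."""
    values = []
    previous = 0
    pos = 0
    for _ in range(count):
        (diff,) = struct.unpack_from("<b", data, pos)
        pos += 1
        if diff == ESCAPE:
            (previous,) = struct.unpack_from("<i", data, pos)
            pos += 4
        else:
            previous += diff
        values.append(previous)
    return values


counts = [1000, 1003, 998, 1001, 5000, 5002]
packed = delta_encode(counts)
restored = delta_decode(packed, len(counts))
```

In this example the six values take 14 bytes instead of the 24 needed for
raw 32-bit integers, because only the first value and the jump to 5000 need
the escape form. The packed bytes could still be passed through a
general-purpose compressor afterwards.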

Regards,
Ray
-- 
Dr Ray Osborn                Tel: +1 (630) 252-9011
Materials Science Division   Fax: +1 (630) 252-7777
Argonne National Laboratory  E-mail: ROsborn at anl.gov
Argonne, IL 60439-4845