[NeXus-committee] HDF5 developments

nick.rees at diamond.ac.uk
Fri Apr 5 11:46:44 BST 2013


Hi,

I am writing to see if I can generate some community support for developments in HDF5. A number of our organisations have committed to the NeXus file format, and a number of others like the idea but have reservations because of limitations in HDF5, the most popular file format underpinning the NeXus standard. After a successful collaboration between Dectris, PSI, DESY and The HDF Group last year, I am trying to lead some more development over the next year, with the goal of adapting the HDF5 library to overcome some of these shortcomings, and thereby promote the use of NeXus on top of HDF5 across all domains.

Last year's developments focussed on compression performance. The Dectris/PSI-funded development allows detector developers to compress data outside of the library and pass the already-compressed chunks to the library to be written. The DESY-funded development allows the user to provide their own pluggable filters for specialist compression methods.
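
As a rough illustration of the first of these, the sketch below compresses each chunk outside the storage layer and hands over the bytes as they are. It does not use the HDF5 API: `ChunkStore`, `write_precompressed` and `detector_frames` are hypothetical names invented for this example, and zlib stands in for whatever compression a detector might actually use.

```python
import zlib


class ChunkStore:
    """Hypothetical stand-in for a chunked dataset; this is not the HDF5 API."""

    def __init__(self):
        self.chunks = {}  # chunk index -> bytes that are already compressed

    def write_precompressed(self, index, payload):
        # Store the bytes exactly as given; the store does no compression itself.
        self.chunks[index] = payload

    def read(self, index):
        # Decompression happens on read, as a filter pipeline would do.
        return zlib.decompress(self.chunks[index])


def detector_frames(n_frames, frame_size):
    """Yield (index, raw frame bytes) pairs; a made-up data source."""
    for i in range(n_frames):
        yield i, bytes([i % 256]) * frame_size


store = ChunkStore()
for index, frame in detector_frames(3, 1024):
    compressed = zlib.compress(frame)  # done by the detector software
    store.write_precompressed(index, compressed)

# A reader gets back the original frame without knowing who compressed it.
assert store.read(2) == bytes([2]) * 1024
```

The point of the design is that the expensive compression step can run wherever the detector software finds convenient, while readers still see ordinary chunked data.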

I see the way forward as being supported through two routes.

The first route is a support contract raised separately between each large facility and The HDF Group. This would guarantee support for bug fixes, small developments, testing and specific, small, synchrotron-based requirements (such as support by The HDF Group of specific compression plug-ins, so that users are guaranteed to have access to the plug-ins after a specific HDF release). It would also contribute to what The HDF Group calls General Maintenance Quality Assurance and Support (GMQS). This is the cost of ongoing support of the library, and in the current HDF Group cost model it is funded by 37.5% of every contract. By generating a separate support revenue stream I hope we can reduce the project development costs. I see this being of the order of 10k monetary units (£/€/$ take your pick) annually. I am discussing the detail of this sort of support with The HDF Group and attached is a recent draft of what we are proposing. If you are interested, please contact Don Elmore of The HDF Group directly at <d-elmore at hdfgroup.org>.

The second route is through supporting targeted project work. The cost of this will be of the order of (£/€/$)100k for a phase of development, and each phase will deliver a feature that benefits the community as a whole. One organisation would be responsible for funding one phase of development, but we could coordinate development across the community. The next two projects that have been identified are:

Single Writer Multiple Reader. This allows data analysis programs to read an HDF5 file while it is still being written. Attached is a recent draft of a set of use cases describing this functionality.

Parallel Compressed Writing. This allows multiple writers to write data in parallel to support new high-speed parallel detectors. Our current preference is for the writers to write data independently, and for the separate files to be read as a single dataset of an upper level file via an extension of the HDF5 dataset definition. At the moment a dataset can be an external link to a single dataset in another file. We are thinking of extending this to allow a dataset to link to several files. Attached is a rough draft of the current requirements and use cases for comment.
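
To make the multi-file idea a little more concrete, here is a minimal sketch of the bookkeeping an upper level file would need: a table that turns a frame number in the combined dataset into a source file and a frame number within that file. Everything in it is an assumption made for illustration. `MultiFileDataset`, `locate` and the `writerN.h5` names are hypothetical, and stacking the files end to end along the frame axis is only one possible layout; the draft requirements may settle on something different.

```python
import bisect


class MultiFileDataset:
    """Hypothetical index presenting several source files as one dataset."""

    def __init__(self, sources):
        # sources: list of (filename, number_of_frames), in dataset order.
        self.names = [name for name, _ in sources]
        self.starts = []  # first global frame number held by each file
        total = 0
        for _, count in sources:
            self.starts.append(total)
            total += count
        self.length = total

    def locate(self, frame):
        """Map a global frame number to (filename, frame within that file)."""
        if not 0 <= frame < self.length:
            raise IndexError(frame)
        # Last source whose starting frame is <= the requested frame.
        i = bisect.bisect_right(self.starts, frame) - 1
        return self.names[i], frame - self.starts[i]


# Three independent writers, each producing its own file (names are made up).
dataset = MultiFileDataset([
    ("writer0.h5", 100),
    ("writer1.h5", 100),
    ("writer2.h5", 50),
])

assert dataset.locate(0) == ("writer0.h5", 0)
assert dataset.locate(100) == ("writer1.h5", 0)
assert dataset.locate(249) == ("writer2.h5", 49)
```

Because each writer only ever touches its own file, no locking is needed between writers in this scheme; the only shared state is the small index held by the upper level file.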

Anyway, that is the idea. I am not sure who to address this to first, so I have started with the NeXus committee and a number of other people who I believe may have a strategic interest in these developments. I would be interested in hearing what you think about the idea. I have also created a web page with basically the same information as in this email, and I will keep it updated with the latest developments:

http://confluence.diamond.ac.uk/display/HDF5DEV/HDF5+Developments

I look forward to hearing from you. I would also be happy to discuss this in person if we meet up at a meeting - possibly the EPICS meeting at Diamond later this month, or ECM-28 in Warwick. If enough interest is generated we could organise a workshop to discuss future developments.

Cheers,

Nick Rees
Principal Software Engineer           Phone: +44 (0)1235-778430
Diamond Light Source                  Fax:   +44 (0)1235-446713


-------------- next part --------------
A non-text attachment was scrubbed...
Name: Premium Support Agreement data sheet_draft.pdf
Type: application/pdf
Size: 87965 bytes
Desc: Premium Support Agreement data sheet_draft.pdf
URL: <http://lists.nexusformat.org/pipermail/nexus-committee/attachments/20130405/561a3c7b/attachment.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SWMR Use Cases-2013-03-13.pdf
Type: application/pdf
Size: 254585 bytes
Desc: SWMR Use Cases-2013-03-13.pdf
URL: <http://lists.nexusformat.org/pipermail/nexus-committee/attachments/20130405/561a3c7b/attachment-0001.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TDI-CTRL-REQ-034 RFC Parallel Compressed Writer.pdf
Type: application/pdf
Size: 151201 bytes
Desc: TDI-CTRL-REQ-034 RFC Parallel Compressed Writer.pdf
URL: <http://lists.nexusformat.org/pipermail/nexus-committee/attachments/20130405/561a3c7b/attachment-0002.pdf>

