area detector data?

Brian J. Tieman tieman at aps.anl.gov
Fri Dec 3 02:08:48 GMT 1999


Ray

Ray Osborn wrote:
> 
> Brian,
> Thanks for the reply.  I certainly appreciate the difficulties of adhering
> to a standard when there are many competing viewpoints (cf HTML).  How to
> maintain a standard's integrity while allowing it to develop is an
> organisational problem that we haven't yet solved.  We will be making
> proposals on that front, I hope shortly.  In the meantime, please tell your
> colleagues that the best way of getting some attention for what they want is
> to write to this list as you have.
> 

Well, I'll certainly keep trying.  Other than metadata issues, I do have
a few other things I'd like to see at some point.  Perhaps most
important to me is thread support.  I realize that's mostly an HDF
problem at the moment, and I've only just begun to look at HDF5, but the
lack of threads is hampering us in several areas.

> > Anyway, here's a couple of sample groups:
> >
> > data_attribute--group
> >    black_field--the name of this group and what is in it differ based on the type of data stored in this file
> >       name--field containing file name
> >       description--field containing type of data--matches group name in all cases I know about
> >       data_file_index--index number for this file
> >       data_type--integer expressing data type--interestingly enough this matches NX_INT16, etc...
> >       data_dimensions--number of dimensions
> >       n_i_pixels--x axis size
> >       n_j_pixels--y axis size
> >       integration_time--length of integration time
> >    NeXus_API_version--version of napi used
> >    experiment_file_name--name of HDF file containing groups common to all images
> >
> > data_array--group
> >    image_data--the actual 2D data
> >
> > I don't have ready access to the definition of some of the other groups, but
> > suffice it to say that the document describing them is 100 pages long.  This
> > looks so much unlike what I think NeXus is about that I refuse to call the
> > files NeXus files.  They're really HDF files with a specific format for Computed
> > Micro-Tomography.
> >
> 
> Actually, this doesn't look that difficult to fit into the NeXus scheme.
> The following is a NeXus-compliant file (group classes in parentheses) :
> 
> black_field (NXentry)
>    name
>    description
>    data_file_index
>    integration_time
>    experiment_file_name
>    data (NXdata)
>       image_data
> 


> Note that the actual file name and NeXus version number are automatically
> stored as global attributes by the NeXus API.  The data type and pixel
> numbers are redundant because you can get that information by doing a call
> to NXgetinfo after opening image_data.  There is really no need to store
> such information twice, although you can add them to the black_field group
> if you want to.
> 
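For what it's worth, here's a rough sketch of what writing Ray's layout
might look like through NAPI, at least as I understand the calls.  The
file name argument, the helper function itself, and the field values are
made up for illustration, and error checking is left out:

    /* Sketch only: writes the black_field entry the way Ray laid it out.
       Values and sizes are invented; real code should check every return. */
    #include "napi.h"

    int write_black_field(const char *path, short *image, int nx, int ny)
    {
        NXhandle file;
        int dims[2] = {ny, nx};
        int one = 1;
        int len = 11;
        float t = 0.5f;                 /* hypothetical integration time, s */

        /* the file name and NAPI version become global attributes here */
        NXopen(path, NXACC_CREATE, &file);

        NXmakegroup(file, "black_field", "NXentry");
        NXopengroup(file, "black_field", "NXentry");

        NXmakedata(file, "description", NX_CHAR, 1, &len);
        NXopendata(file, "description");
        NXputdata(file, (void *)"black_field");
        NXclosedata(file);

        NXmakedata(file, "integration_time", NX_FLOAT32, 1, &one);
        NXopendata(file, "integration_time");
        NXputdata(file, &t);
        NXclosedata(file);

        /* the image itself; rank, dimensions, and type come back from
           NXgetinfo on reading, so there is no need to store them twice */
        NXmakegroup(file, "data", "NXdata");
        NXopengroup(file, "data", "NXdata");
        NXmakedata(file, "image_data", NX_INT16, 2, dims);
        NXopendata(file, "image_data");
        NXputdata(file, image);
        NXclosedata(file);
        NXclosegroup(file);

        NXclosegroup(file);
        return NXclose(&file) == NX_OK ? 0 : -1;
    }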

There are two forces at work on this project.  My job is to make sure
the data acquisition system works and saves files the analysis people can
read back.  A group of people in MCS have ported/developed code on the
supercomputer to read in these files as they are generated, perform the
complex calculations necessary to make the images make sense, and display
them all in quasi-real time.

The code the MCS people developed expects the data to be in very
specific places.  What they did, in essence, is build a precompiler which
generates a class hierarchy for all the defined groups, providing putter
and getter methods plus other utility methods for every piece of
metadata.  Each group becomes a class with a put and a get for each
field.  These put and get methods are then how the analysis code
retrieves the data once the file has been read.  It looks to me like
some of the redundant data may have been put into its own field just to
keep the preprocessor simple.

I haven't seen it in action, but the preprocessor must generate massive
amounts of code!  And that code grows proportionally with the number of
groups and fields in the file.
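To make that concrete, I picture the generated code looking something
like the class below for each group.  This is just my mental model of
the preprocessor output--not their actual code--but it shows why the
volume explodes: every field in every group gets its own member, putter,
and getter.

    #include <string>

    /* Hypothetical shape of one generated class for the black_field group. */
    class BlackField
    {
    public:
        void put_name(const std::string &v)      { name_ = v; }
        std::string get_name() const             { return name_; }

        void put_data_file_index(int v)          { data_file_index_ = v; }
        int get_data_file_index() const          { return data_file_index_; }

        void put_integration_time(float v)       { integration_time_ = v; }
        float get_integration_time() const       { return integration_time_; }

        /* ...one put/get pair per remaining field, plus the utility
           methods that shuttle the members to and from the HDF file... */

    private:
        std::string name_;
        int         data_file_index_;
        float       integration_time_;
    };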

I don't like this for many reasons and took a very different approach
in my library.  It does lock us into the files as defined, however.

> >
> > I have had more success in making other types of data more NeXus
> > like--however, most people I work for/with don't have a good grasp of what
> > NeXus is/does (I'm not even sure my grasp is as good as it should be).  So,
> > I'm content to write HDF files--for the moment at least...
> >
> >
> 
> This is a serious problem for me.  I have tried to make the NeXus web pages
> as comprehensible as possible, but I am too close to them to appreciate the
> difficulties outsiders have.  I would really appreciate feedback on what
> parts need improving.  I know that one suggestion is to make more NeXus
> files available as examples.
> 

NeXus files as examples would be a great help, although my biggest
problem was in distinguishing, for the users, between NeXus the file
specification and NeXus the API.  That is, I _think_ the goal behind NeXus
the file specification is to define locations to store metadata such
that readers can make sensible decisions.  I could store
temperature in a group called sample_temp, or I could store it along
with the data, or with the acquisition parameters, etc...  But if I put
it in the place NeXus defines, readers will have an easy time finding
that data.  The users--who are reasonably computer-savvy people--look at
the NeXus API and say, well, I just use the NX_put_field routine to save
the temperature in any old group with any old field name I wish.  I
think that was the approach taken which got us into this mess to begin
with.
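
To make the distinction concrete, here is roughly how I understand it--
the group and field names below reflect my reading of the convention
(temperature belongs in an NXsample group under the NXentry), so correct
me if I have that wrong:

    #include "napi.h"

    /* The API will happily write temperature anywhere:
     *    NXmakegroup(file, "sample_temp", "NXentry");
     *    ...put "T" or "temp" or anything else in it...
     * but no general-purpose reader will ever think to look there.
     * The defined location, assuming the NXentry is already open: */
    void save_temperature(NXhandle file, float temperature)
    {
        int one = 1;

        NXmakegroup(file, "sample", "NXsample");
        NXopengroup(file, "sample", "NXsample");
        NXmakedata(file, "temperature", NX_FLOAT32, 1, &one);
        NXopendata(file, "temperature");
        NXputdata(file, &temperature);
        NXclosedata(file);
        NXclosegroup(file);
    }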

Everybody in my group knows how to use NAPI--they just don't seem to
understand that the standard is a little more than that.  If my
assumptions here are wrong, then you can see where I'm just as much a
part of the problem as anyone...

> >>
> >> HDF 4.1r3 allows for internal data compression of datasets using a variety
> >> of algorithms.  We have not yet implemented any of them because it had not
> >> yet got to the top of the priority list.  If this is critical for any
> >> particular user, we can see if it can be moved up.  I don't think it's that
> >> difficult, unless the performance penalty is significant.
> >>
> >
> > I'd really like to see this done.  Most of our images are 1024x1024x2 bytes, or
> > ~2MB.  A complete data set may contain close to 1000 of these images.
> > Plus, the hope is to go to 2048x2048x2-byte cameras within the next year or
> > so.  That's an awful lot of data.  As long as the file compression doesn't
> > bottleneck the file saving too much (we can currently save an image at ~1 Hz),
> > it's an option I'd like to make use of.
> >
> 
> Never say we're not responsive.  Following that post, Mark Koennecke has
> already produced a version of the API which includes data compression.  We
> are testing it now and it looks as if it works very well.  I hope we can
> officially release it soon.
> 

This would be great!  I'm not yet sure how much our data would compress
in a lossless way, but I'm itching to try it!
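
I have no idea what the released interface will look like, but if it
ends up as something like a compression call on the open dataset, I
would expect to slot it into our writer roughly like this (the
NXcompress name and the NX_COMP_LZW selector are guesses on my part,
not the real API):

    #include "napi.h"

    int write_compressed_image(NXhandle file, short *image, int nx, int ny)
    {
        int dims[2] = {ny, nx};              /* 1024x1024 today, 2048x2048 soon */

        NXmakedata(file, "image_data", NX_INT16, 2, dims);
        NXopendata(file, "image_data");
        NXcompress(file, NX_COMP_LZW);       /* guessed name: lossless LZW on the open dataset */
        NXputdata(file, image);
        return NXclosedata(file) == NX_OK ? 0 : -1;
    }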

Brian
tieman at aps.anl.gov


