[NeXus-committee] questions about NeXus data structures as per Figs 1-3 of draft paper

Pete R Jemian prjemian at gmail.com
Mon Aug 18 16:33:30 BST 2014


Joachim:

Your questions are most excellent.  These are all points that members of 
the NeXus Technical Committee should understand.

On 08/18/2014 08:32 AM, Joachim Wuttke wrote:
> Dear colleagues:
>
> In the draft manuscript (v6), Figs 1 and 3 show the
> common structure of raw-data and processed-data
> files, respectively.
>
> Are these structures also described in the docs? Where?

Not as clearly as these figures.  They will become part of the manual.

In previous versions of the manual, there was a table that was used to 
describe the instrument definition hierarchy.  This was set aside as the 
instrument definitions (described with meta-DTD and documented on the 
wiki) were refactored into the NXDL we have now.  Now you see that 
documentation reappearing.  The attempt here is to describe what is 
needed to demonstrate the point without describing all the 
possibilities.  (Such possibilities are terribly distracting.  For 
example, they lead some people to think that all possibilities are 
required.)

One additional figure might "drive the point home" to describe the 
absolute minimum required structure of a NeXus data file.  That is:

-----------------------------------
| NXroot                          |
|   -------------------------------
|   | NXentry            required |
|   |   ---------------------------
|   |   | NXdata         required |
|   |   |   data:NX_NUMBER        |
|   |   |     @signal=1  required |
-----------------------------------

And, even in this simple example, the name "data" is not required since 
the attribute signal=1 labels this for NeXus as the default data to be 
visualized.

This is the absolute minimum structure (virtually no metadata) and we 
have an example that shows this:
http://download.nexusformat.org/doc/html/examples/h5py/index.html
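
As a rough illustration of the point about signal=1 (this is not the h5py example linked above), here is a minimal sketch in plain Python that models the tree with stand-in classes rather than a real HDF5 file. The names Dataset, Group, find_default_signal, and the dataset name "counts" are hypothetical, and storing the signal attribute as the integer 1 is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Dataset:
    """Stand-in for an HDF5 dataset: values plus attributes (hypothetical)."""
    values: List[float]
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Group:
    """Stand-in for an HDF5 group carrying an NX_class (hypothetical)."""
    nx_class: str
    children: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)


def find_default_signal(root: Group) -> Optional[str]:
    """Return the path of the first dataset marked signal=1 inside an
    NXdata group that sits directly under an NXentry, or None."""
    for entry_name, entry in root.children.items():
        if not (isinstance(entry, Group) and entry.nx_class == "NXentry"):
            continue
        for data_name, data in entry.children.items():
            if not (isinstance(data, Group) and data.nx_class == "NXdata"):
                continue
            for name, item in data.children.items():
                # The dataset is found by its attribute, not by its name.
                if isinstance(item, Dataset) and item.attrs.get("signal") == 1:
                    return f"/{entry_name}/{data_name}/{name}"
    return None


# The minimum structure from the diagram, with the dataset deliberately
# named something other than "data".
minimal = Group("NXroot", {
    "entry": Group("NXentry", {
        "data": Group("NXdata", {
            "counts": Dataset([1.0, 4.0, 9.0], {"signal": 1}),
        }),
    }),
})

print(find_default_signal(minimal))
```

The lookup never inspects the dataset name, which is why a name other than "data" still works in this sketch.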

However, since so much raw data is acquired with knowledge of much more 
metadata, Figure 1 of the manuscript is a great suggestion for the 
structure of a prototypical NeXus data file.


> Some groups are marked as "required". Is this specified
> in the docs? Where?

http://download.nexusformat.org/doc/html/introduction.html#important-classes

Perhaps this should be content in the Design chapter?
http://download.nexusformat.org/doc/html/design.html


> If some groups at second or third level are required,
> then the first-level group "NXroot" must also be
> required, right?

At the root of the HDF5 data file, there has been no requirement to have
an attribute @NX_class="NXroot" on the root.  It was suggested at a NIAC
meeting some years ago that we should start adding this attribute as
"good practice".  NIAC did not go so far as to require it, in order to
maintain compatibility with common use and the NAPI implementation.


> Why should we require NXdata? If scientists at a certain
> instrument have no interest in using generic default
> plotting tools, and don't like the extra complexity of
> symbolic links in their raw data files, we should allow
> them to use NeXus without NXdata.

http://download.nexusformat.org/doc/html/motivations.html#index-0


> Why is NXinstrument required for raw-data files? Is there
> an application definition without an NXinstrument group?

NXinstrument is not required.

What led you to assume that NXinstrument is required for raw-data files?
We need to adjust the manuscript so that others do not form that
impression.

BUT, pursuant to 
http://download.nexusformat.org/doc/html/motivations.html#defineddictionary, 
NXinstrument provides the place to store agreed-upon terms such as 
wavelength.

Another example use of NXinstrument is in the figure of this section:
http://download.nexusformat.org/doc/html/design.html#links


> It seems though that there are some raw-data application
> definitions without NXsample, and many without NXuser.
> Is there a rationale why some instrument types would
> require these metadata, and others not? Or do the
> different application definitions just reflect different
> personal preferences of different authors?


Your last sentence is a good description of what I think is the reason.

>
> For multi-method instruments, some entries move from
> NXentry into NXentry/NXsubentry. What if one day an
> established multi-method instrument gets embedded into
> a yet more powerful instrument: will we then have
> NXentry/NXentry/NXsubentry or NXentry/NXsubentry/NXsubentry
> or NXentry/NXsubentry/NXsubsubentry? In my humble
> opinion, NXsubentry should never have been invented;
> why not use the power of recursion and allow NXentry/NXentry?


Certainly a good topic for NIAC debate.

Can you describe (with more specifics) such an instrument or concept 
that could not already be documented with Figure 2 in the manuscript?

In my view, for a single-technique small-angle scattering I(Q) dataset,
it seems much easier to place the SAS data at
    /NXentry/NXdata/I(Q)
than at
    /NXentry/NXsubentry/NXdata/I(Q)

But, knowing that our users and scientists have boundless creativity, I 
accede to the structure of Figure 2 in the manuscript as that will cover 
conceivable future variety.
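
To make the comparison of the two layouts concrete, here is a small sketch, assuming a tree modeled as nested Python dicts rather than a real HDF5 file. The function find_nxdata_paths and the group names "entry", "sas", and "data" are hypothetical. It shows that a reader which searches by NX_class can locate NXdata in either layout.

```python
def find_nxdata_paths(node, path=""):
    """Yield the path of every NXdata group below node, at any depth."""
    for name, child in node.get("children", {}).items():
        child_path = f"{path}/{name}"
        if child.get("NX_class") == "NXdata":
            yield child_path
        else:
            # Descend through NXentry, NXsubentry, or any other group.
            yield from find_nxdata_paths(child, child_path)


# Single-technique layout: /NXentry/NXdata
single = {"NX_class": "NXroot", "children": {
    "entry": {"NX_class": "NXentry", "children": {
        "data": {"NX_class": "NXdata", "children": {}}}}}}

# Multi-method layout: /NXentry/NXsubentry/NXdata
multi = {"NX_class": "NXroot", "children": {
    "entry": {"NX_class": "NXentry", "children": {
        "sas": {"NX_class": "NXsubentry", "children": {
            "data": {"NX_class": "NXdata", "children": {}}}}}}}}

print(list(find_nxdata_paths(single)))
print(list(find_nxdata_paths(multi)))
```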

Pete


