[Nexus] About application definitions

Osborn, Raymond rosborn at anl.gov
Wed Jan 20 20:23:06 GMT 2016


Hi Armando,
Thanks for raising this. As you know, we discussed this at the telco this morning so here is, I hope, an accurate summary of what we said.

As you point out, the NXsubentry group is designed to let the same NXentry group contain data that might be used in different contexts, e.g., small-angle scattering and fluorescence spectroscopy. The data might have been collected at the same time, but different parts of the data are intended for different types of analysis. The purpose of the application definition, which is placed in each NXsubentry, is to define what type of analysis that sub-entry requires. Obviously, to conform to the standard, each sub-entry must conform to the definition, which is described by an NXDL file.

The committee has agreed that, if the community is happy with a simple application definition that doesn’t contain the hierarchy normal to most NeXus files (the sub-entry could even be completely flat), then that is fine. Just create an application definition that defines what is to be in the sub-entry. I think your example is to create an application definition for what is required for PyMCA analysis - if a few parameters are sufficient, then you are not required to put them in NXinstrument/NXmonochromator, etc., groups. You can specify them at the top level of the sub-entry. If that is sufficient for the PyMCA user community, that is fine by us. We do encourage you to publish your application definition, since a flat collection of metadata without any context will be meaningless to outsiders.
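
As a rough illustration of what "conforming" to such a flat definition could mean, here is a minimal Python sketch. It is not part of any NeXus tool: the sub-entry is modelled as a plain dictionary, the helper names `missing_fields` and `conforms` are hypothetical, and the field names I0, It and energy are borrowed from Armando's example quoted below.

```python
# Hypothetical flat application definition "XXXX". The required names come
# from the example in this thread; the dict layout is an assumption made
# purely for illustration (a real file would be an HDF5 group).
REQUIRED_FIELDS = ("definition", "I0", "It", "energy")


def missing_fields(subentry, required=REQUIRED_FIELDS):
    """Return the required names that are absent from a sub-entry mapping."""
    return [name for name in required if name not in subentry]


def conforms(subentry, definition_name):
    """True if every required field exists and the definition string matches."""
    return not missing_fields(subentry) and subentry["definition"] == definition_name


# A completely flat sub-entry: no NXinstrument or NXmonochromator groups.
flat_subentry = {
    "definition": "XXXX",
    "I0": [1.0, 1.1, 0.9],
    "It": [0.5, 0.4, 0.6],
    "energy": [7.10, 7.11, 7.12],
}

# A sub-entry that claims the same definition but lacks two fields.
incomplete = {"definition": "XXXX", "I0": [1.0]}
```

Here `conforms(flat_subentry, "XXXX")` succeeds, while `missing_fields(incomplete)` reports the names still to be supplied. The check only looks at presence, not at where the values physically live.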

If, however, you want to analyze small-angle scattering, and the application definition you are conforming to requires hierarchical groups, you obviously have to follow the definition. That is the purpose of having a definition. However, NeXus allows you to link groups or fields multiple times, even with different names, so you don’t have to duplicate data required in more than one sub-entry. Just add a link and the data is only stored once. 
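
For readers who use h5py, the linking idea might look roughly like the sketch below. The group and field names (`instrument/monochromator`, `flat_view`, `E`) and the energy values are made up for illustration, and the file is kept in memory so nothing is written to disk. It assumes h5py and NumPy are installed.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver, no backing store), so this is only a demo.
with h5py.File("linked_example.h5", "w", driver="core", backing_store=False) as f:
    entry = f.create_group("entry")
    entry.attrs["NX_class"] = "NXentry"

    # Hierarchical location: the energy values are stored once, here.
    mono = entry.create_group("instrument/monochromator")
    entry["instrument"].attrs["NX_class"] = "NXinstrument"
    mono.attrs["NX_class"] = "NXmonochromator"
    mono["energy"] = np.array([7.10, 7.11, 7.12])

    # Flat sub-entry: a hard link to the same dataset under a different name.
    sub = entry.create_group("flat_view")
    sub.attrs["NX_class"] = "NXsubentry"
    sub["E"] = mono["energy"]

    # Both paths resolve to one underlying HDF5 object.
    same_object = sub["E"] == mono["energy"]
    linked_values = sub["E"][()].tolist()
```

Assigning an existing dataset to a new name in h5py creates a hard link rather than a copy, which is why the two paths compare equal.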

There is a new concept, called ‘features’, in which a single NXentry can specify that it implements more than one type of analysis, but that is still under development.

You are correct that the purpose of NeXus is to give communities the flexibility they need to store metadata in a meaningful way. Sometimes, a simple file is all that is needed, but other times, the complexity is driven by necessity. The goal is for the meaning of data to be clear to other people. If you follow the NeXus group hierarchy, then the data is usually self-describing without any external documentation. That is our preference. It would be a shame, for example, not to put plottable data within an NXdata group since that makes it easier to identify what kind of data you are storing. However, if you want to use a flat file format, then the application definition you write will allow people to understand what’s in it.

I hope others will chime in if I got any of this wrong.

With best regards,
Ray

> On Jan 16, 2016, at 10:29 AM, V. Armando Sole <sole at esrf.fr> wrote:
> 
> Happy New Year!
> 
> At the ESRF we do not intend to use the old-fashioned way of writing the application definitions at the top level inside an NXentry, but rather inside one (or several) NXsubentry groups. That will allow us to treat the files in the same way whether they are single- or multiple-technique files.
> 
> I am revising the documentation concerning NXsubentry but I do not know if the latest documentation is the one at:
> 
> http://download.nexusformat.org/sphinx/classes/base_classes/NXsubentry.html?highlight=nxsubentry
> 
> So, I would like to get that point confirmed.
> 
> I have to say (repeat?) that the way definitions are presented to potential users looks very tedious. We *do* intend to recreate the NeXus structure as much as we can, but as definitions are currently presented they almost imply repeating the NXentry structure in the NXsubentry. For you it was easy to say "we have foreseen a single application definition at the NXentry level, we just give the option to have it at an NXsubentry level and that's all". My point is that when having it at the NXentry level, you needed some structure to keep things tidy, but when NXsubentry appeared, the structure is overkill.
> 
> From a practical point of view, a simple definition for technique XXXX, requiring datasets I0 and It and a set of energies, would require:
> 
> entry at NXentry
>    ...
>    whatever_name at NXsubentry
>        definition (NX_CHAR="XXXX", @version, @URL)
>        I0 [NX_FLOAT]
>        It [NX_FLOAT]
>        energy [NX_FLOAT]
> 
> So, it continues to look like a mistake to me to present the definitions as forcing one to have "energy" inside an NXmonochromator group, I0 inside an NXmonitor or an NXdetector and It inside an NXdetector. Please note that I am not saying it is a bad idea to have them inside those groups! What I am saying is that to validate that a file implements the above definition, one only needs to validate the presence of I0, It and energy. Whether they are datasets or links to actual datasets is irrelevant.
> 
> You may say that since we intend to implement the NeXus structure, we should not worry so much and you are correct. The point is, and I repeat myself, that some communities and/or facilities are elaborating their own "data exchange" formats based on HDF5 that are *nothing else* than the equivalent of an application definition the way I have presented it here.
> 
> I would aim to present application definitions in that way. The entrance ticket for potential users just interested in data analysis would be "less expensive". All that you would be asking, compared to community-driven definitions, would be to include strings specifying the technique they implement, the version (optional but great!) and the URL describing it (optional but also great!).
> 
> You could go as far as presenting that approach as the "official NeXus approach to incorporate community driven definitions". I think you have no choice, so better show some flexibility.
> 
> Best wishes,
> 
> Armando
> 
> _______________________________________________
> NeXus mailing list
> NeXus at nexusformat.org
> http://lists.nexusformat.org/mailman/listinfo/nexus

-- 
Ray Osborn, Senior Scientist
Materials Science Division
Argonne National Laboratory
Argonne, IL 60439, USA
Phone: +1 (630) 252-9011
Email: ROsborn at anl.gov




