[NeXus-committee] Validation: Against which application definition? Ignoring base classes?

Osborn, Raymond rosborn at anl.gov
Mon Sep 22 16:14:31 BST 2014


On Sep 19, 2014, at 9:07 AM, Joachim Wuttke <j.wuttke at fz-juelich.de> wrote:

> Dear colleagues:
> 
> NeXus validation tools are currently broken.
> Repairing them won't be just a coding task.
> Rather, a number of fundamental problems will come up.
> Here are a few of them. Your comments are welcome.
> 
> For simplicity of language, I suppose that
>  https://github.com/nexusformat/definitions/issues/298
> will be accepted: application definitions include
> contributed ones.
> 
> (1) Validation of a NeXus data file means validation
> against one application definition - right?

I have never used NXvalidate, so I don’t know what its criteria for validation are, but I think it would be helpful to have a tool that confirms the file is compliant even if an application definition is not specified. The base classes contain a list of valid fields in each group, and contain a limited number of minOccurs attributes. A tool like this would also check that the NXdata groups properly define a signal and axes. In one (unnamed) case, the application definition has an incorrect ‘errors’ field (named ‘error’), which would have been caught by such a tool when the application definition was being prepared.
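To make the idea of a definition-independent check concrete, here is a minimal sketch in Python. It is not NXvalidate or NeXpy code. The member list is a hypothetical, abbreviated stand-in for what a real tool would read from the NXDL base class files, and the function name check_group is invented for illustration. Because base class member lists are not exclusive, unknown fields only produce warnings; a near-miss of a known name (such as 'error' for 'errors') gets a more pointed message.

```python
import difflib

# Hypothetical, abbreviated member list for the NXdata base class.
# A real tool would read these names from the NXDL file instead.
BASE_CLASS_FIELDS = {
    "NXdata": {"data", "errors", "scaling_factor", "offset", "x", "y", "z"},
}


def check_group(nx_class, field_names):
    """Return warnings for fields that the base class does not list."""
    known = BASE_CLASS_FIELDS.get(nx_class, set())
    warnings = []
    for name in sorted(field_names):
        if name in known:
            continue
        # Member lists are not exclusive, so an unknown field is only a
        # warning; a near-miss of a known name is more suspicious.
        close = difflib.get_close_matches(name, sorted(known), n=1, cutoff=0.8)
        if close:
            warnings.append(
                f"{nx_class}: '{name}' is not listed; did you mean '{close[0]}'?"
            )
        else:
            warnings.append(f"{nx_class}: '{name}' is not listed in the base class")
    return warnings


if __name__ == "__main__":
    for line in check_group("NXdata", ["data", "error", "temperature"]):
        print(line)
```

Run on the example group, the sketch flags 'error' as a probable misspelling of 'errors' and reports 'temperature' as merely unlisted.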

> (2) It is desirable to have a tool that automatically
> checks whether a data file is NeXus compliant.
> Possible applications:
> - Facilities could run the validator over all newly
> acquired data, to make sure nobody is breaking NeXus.
> - One could think of a web service where people can upload
> a data file to have the format validated.

It’s not our highest priority, but NeXpy will eventually have full validation functionality. In fact, I would argue that Python is the best way of writing such a tool, since there are good XML tools and a convenient NeXus API. It’s also a good platform to test new NeXus concepts before they are presented to the NIAC. For example, I have implemented group-level attributes to define signals and axes, which lets me test whether multiple signals can still be supported (by using field-level attributes as well). In the past, we have tended to make such decisions in the abstract, but this allows us to test ideas before ratifying them.
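One way the combination of group-level and field-level attributes could work is sketched below. This is not the NeXpy implementation. The group is modelled with plain dicts, the group-level attribute name 'signal' is an assumption, and find_signals is a hypothetical helper. The group-level attribute picks the default signal, while fields that carry their own 'signal' attribute remain available as additional signals.

```python
def find_signals(group_attrs, fields):
    """Return (default_signal, other_signals) for one NXdata-like group.

    group_attrs: dict of group-level attributes
    fields: dict mapping field name -> dict of field-level attributes
    """
    # Field-level convention: a field marks itself with a `signal` attribute.
    flagged = sorted(name for name, attrs in fields.items() if "signal" in attrs)

    # Group-level convention (assumed name): the group names its default signal.
    default = group_attrs.get("signal")
    if default is None:
        # Fall back to the field flagged with signal=1, if any.
        default = next((n for n in flagged if fields[n]["signal"] == 1), None)
    elif default not in fields:
        raise ValueError(f"group attribute names missing field '{default}'")

    others = [n for n in flagged if n != default]
    return default, others


if __name__ == "__main__":
    attrs = {"signal": "counts", "axes": ["x"]}
    fields = {"counts": {}, "monitor": {"signal": 2}, "x": {}}
    print(find_signals(attrs, fields))
```

With the group-level attribute present, 'counts' is the default signal and 'monitor' is kept as a secondary one. Without it, the function falls back to whichever field carries signal=1.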

> (3) If (1) and (2) are agreed, then the problem arises
> how to know from a given file which application definition
> it claims to follow. I think there is only one clean solution:
> the name of the application definition must be stored in each
> data file. E.g. in the form of an NXroot attribute @application_definition.

I believe that is what the ‘definition’ field in the NXentry group is for - it identifies the application definition schema. I think we have always agreed that the application definition should be specified at the NXentry level, to allow the results from multiple instruments to be stored in a single file if required. It could also be in an NXsubentry for cases where two different application definitions may be required for different parts of the same data set. I have not seen this used in practice, but perhaps people have examples to show.
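A validator could use this to decide which schema applies to which part of a file. The sketch below assumes the file has already been read into nested plain dicts with 'class', 'fields', and 'groups' keys. That layout and the name find_definitions are illustrative only and not part of any NeXus API. It reports the 'definition' value for every NXentry and NXsubentry, with None where no definition is declared.

```python
def find_definitions(node, path=""):
    """Yield (path, definition) for every NXentry or NXsubentry in a tree.

    `node` is a plain dict: {"class": str, "fields": {...}, "groups": {...}}.
    """
    if node.get("class") in ("NXentry", "NXsubentry"):
        # None means the entry does not declare an application definition.
        yield path or "/", node.get("fields", {}).get("definition")
    for name, child in node.get("groups", {}).items():
        yield from find_definitions(child, f"{path}/{name}")


if __name__ == "__main__":
    root = {
        "class": "NXroot",
        "groups": {
            "entry": {
                "class": "NXentry",
                "fields": {"definition": "NXsas"},
                "groups": {
                    "sub": {
                        "class": "NXsubentry",
                        "fields": {"definition": "NXtomo"},
                        "groups": {},
                    },
                },
            },
            "entry2": {"class": "NXentry", "fields": {}, "groups": {}},
        },
    }
    for path, definition in find_definitions(root):
        print(path, definition)
```

Entries that come back with None are the ones a tool could only check against the base classes.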

> (4) A few weeks ago we clarified that base class member
> lists can be extended (by application definitions, and by
> any single NeXus application, if I understood correctly).
> With member lists being neither exclusive nor obligatory,
> base classes turn out to be no more than recommendations
> to application definition writers. Data file validation
> tools will have to ignore the base class specs altogether.

This is covered by my first comment.

> 
> Best regards - Joachim
> 
> _______________________________________________
> NeXus-committee mailing list
> NeXus-committee at nexusformat.org
> http://lists.nexusformat.org/mailman/listinfo/nexus-committee

It would be good to get this clarified if I’m not correct in all the above, so thanks for raising the issue.

Ray
-- 
Ray Osborn, Senior Scientist
Materials Science Division
Argonne National Laboratory
Argonne, IL 60439, USA
Phone: +1 (630) 252-9011
Email: ROsborn at anl.gov




