[Nexus] NeXus - a solution to what is not the real problem ?

Andy Gotz andy.gotz at esrf.fr
Tue Mar 9 19:03:48 GMT 2010


Hi Gerd + Joachim,

I can hear lots of frustration in Joachim's email so I thought I would 
share some of my thoughts :

I fully agree with Gerd's conclusion that defining a new format is not 
an option for me. Nexus is an example of how painful it gets when 
defining new data formats. I don't want to go down this road again and 
alone. One of the problems is acceptance by the community. If we stick 
to Nexus it will eventually be adopted I am sure - perseverance is a 
powerful ally. Until it is adopted widely I understand Brian's 
frustration - imagine 10 years of trying to get a format accepted and 
still no success !

I think that some of the reasons why Nexus has not been adopted are : 
(1) there has been too much emphasis on storing of raw data, it would 
have been much more successful IMHO if Nexus had concentrated on 
analysed data. This is what the user sees and wants. It would have 
proved the added value of Nexus much quicker. (2) lack of manpower 
working on Nexus (maybe this will change with the Pandata networking 
activity getting some funding ...). (3) lack of reactivity of the Nexus 
developers, partly related to (2) I think but also (1). Recently at the 
Hyperspectral workshop at the ESRF we requested that data dimensions be 
added as attributes to the Nexus data definition. This way a program can 
easily identify the spectra, images and 3D volumes in a Nexus file. We 
have not had any feedback from the Nexus community on how long it will 
take to get something as simple and fundamental as this accepted.
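
As a rough illustration of the idea (a sketch only, not an existing Nexus 
feature and not the Nexus API): the file is modelled here as nested Python 
dicts, and the attribute name `data_rank` and the helper `classify` are 
hypothetical names invented for this example.

```python
# Hypothetical illustration: a Nexus-like file modelled as nested dicts.
# "data_rank" is an assumed attribute name, not part of the NeXus definition.
KIND_BY_RANK = {1: "spectrum", 2: "image", 3: "volume"}


def classify(group, path=""):
    """Yield (path, kind) for every node carrying a data_rank attribute."""
    for name, node in group.items():
        # Skip the attribute dict itself and any non-group leaf values.
        if name == "attrs" or not isinstance(node, dict):
            continue
        node_path = f"{path}/{name}"
        attrs = node.get("attrs", {})
        if "data_rank" in attrs:
            # The attribute alone tells us what kind of data this is.
            yield node_path, KIND_BY_RANK.get(attrs["data_rank"], "unknown")
        else:
            # No dimension attribute here: treat it as a group and descend.
            yield from classify(node, node_path)


example_file = {
    "entry": {
        "mca": {"attrs": {"data_rank": 1}},
        "detector": {"attrs": {"data_rank": 2}},
        "tomogram": {"attrs": {"data_rank": 3}},
    }
}

found = dict(classify(example_file))
```

The point is that a generic program only has to read one attribute per 
dataset to decide how to treat it, without knowing anything else about 
the instrument that wrote the file.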

Of course it is easy to criticise because I was not there at the 
beginning of Nexus. But I think we need some changes in the way the 
Nexus committee works today to make it more reactive. I have just been 
replaced on the NIAC committee so again it is easy to make such a 
suggestion !

In Joachim's example I find it strange to be discussing his file 
format, which is essentially how to store 2 columns of data. I agree 
that in this case Nexus can be seen as overkill. But if Nexus were used, 
one of the standard Nexus tools could extract the data to ASCII to give 
the immediate feedback. It also points a finger at the lack of Nexus 
viewers. I don't know of any which will refresh the contents of the 
displayed file automatically. But the case of 2 columns of ASCII data is 
not what most of us are dealing with. Most of us have to deal with tens 
of thousands to millions of images and data volumes (cf. Brian's 100 
TB). ASCII is NOT an option. We are solving a new problem in fact. 
When Nexus was started this was not such a critical issue. Today it is. 
The hyperspectral workshop was proof of this. Again it is mainly 
interesting to agree on how to store large volumes of data for analysis 
programs. Would we be better off adopting some other 3D data format ? 
But what about 2D and 1D and the experimental / data analysis context ?

I would say if you want to use YAML for your raw data that is your 
choice. But why not join the larger community to help define how to 
store analysed data ? Then we can use your data analysis routines and 
vice versa.

I think the time has never been riper for Nexus to be adopted and to 
become a real standard, thanks to the new institutes standardising on 
Nexus and some of the old ones trying to do the same. But there is no 
guarantee that Nexus will succeed. Joachim, your email is proof of this.

The bottom line is I think we need to improve the current situation 
together and not simply go our own way. The diversity of the scientific 
techniques of the communities we are serving makes this a non-trivial 
problem.

Andy

Gerd Wellenreuther wrote:
> Hi Joachim,
>
> Wuttke, Joachim schrieb:
>> What I am attacking is not the serious work you are doing for really
>> complicated data set, but the idea, popular at management level, that
>> NeXus should ultimately be used at _all_ neutron and X-ray
>> instruments. All I want to say is: for certain types of instruments,
>> migrating to NeXus would be considerable effort, would rather increase
>> than reduce the diversity of data formats, would not improve the messy
>> state of software seen by the users.
> There are a lot of issues which can be tackled by the use of a common 
> data format, not only simple data exchange, but also software 
> development, data archiving etc. For this reason, the aim to define 
> and spread the use of such a common data format is good, IMHO. But 
> when it comes to *implementing* this data format at individual 
> beamlines / instruments, the corresponding work should *not* be done 
> by single beamline scientists, or even single institutes - this would 
> again result in different flavours of what was supposed to be the 
> common data format (as can be seen by the struggle of facilities like 
> Soleil and Diamond, both writing NeXus-files but not being able to use 
> the software(s) developed at the other side of the channel). So, if we 
> do it right, the single beamline scientist should not be required to 
> bear the biggest part of the workload - the opposite should be the 
> case. In the long run, we want to save time, right?
>
> So the goal has to be to define a common data format *ALONG* with a 
> high-level API (much higher than the present NeXus-API in my opinion) 
> which assists IT-guys and scientists from different facilities to 
> exploit the capabilities of NeXus, *AND* further tools for the 
> scientists to do their job. Of course, if the management thinks this 
> is a good idea, they have to fund people to implement it :).
>
> Sure, NeXus is not the perfect candidate for such a common data 
> format. Fact is also: There is no other candidate. (At this point you 
> might notice: I am not considering starting from scratch as an option 
> :) .)
>
> Cheers, Gerd
> _______________________________________________
> NeXus mailing list
> NeXus at nexusformat.org
> http://lists.nexusformat.org/mailman/listinfo/nexus