[Nexus] NeXus - a solution to what is not the real problem ?
Andy Gotz
andy.gotz at esrf.fr
Tue Mar 9 19:03:48 GMT 2010
Hi Gerd + Joachim,
I can hear lots of frustration in Joachim's email so I thought I would
share some of my thoughts :
I fully agree with Gerd's conclusion that defining a new format is not
an option for me. Nexus is an example of how painful it gets when
defining new data formats. I don't want to go down this road again, and
alone. One of the problems is acceptance by the community. If we stick
to Nexus it will eventually be adopted I am sure - perseverance is a
powerful ally. Until it is adopted widely I understand Brian's
frustration - imagine 10 years of trying to get a format accepted and
still no success !
I think that some of the reasons why Nexus has not been adopted are:
(1) there has been too much emphasis on storing of raw data, it would
have been much more successful IMHO if Nexus had concentrated on
analysed data. This is what the user sees and wants. It would have
proved the added value of Nexus much quicker. (2) lack of manpower
working on Nexus (maybe this will change with the Pandata networking
activity getting some funding ...). (3) lack of reactivity of the Nexus
developers, partly related to (2) I think but also (1). Recently at the
Hyperspectral workshop at the ESRF we requested that data dimensions be
added as attributes to the Nexus data definition. This way a program can
easily identify the spectra, images and 3d volumes in a Nexus file. We
have not had any feedback from the Nexus community on how long it will take
to get something as simple and fundamental as this accepted.
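
To make the request concrete, here is a minimal sketch of how a program could tell spectra, images and 3d volumes apart once every dataset carries its dimensions as an attribute. Everything in it is an assumption for illustration: the attribute name "dims", the dataset names and the plain dictionaries standing in for a file are hypothetical and are not part of the NeXus definition or the NeXus API.

```python
# Hypothetical sketch: "dims" is an assumed attribute name, not a NeXus one.
# Each dataset is modelled as a plain dict of attributes, not a real file.
KIND_BY_RANK = {1: "spectrum", 2: "image", 3: "volume"}


def classify(datasets):
    """Map each dataset name to a kind, based on the length of its 'dims'."""
    kinds = {}
    for name, attrs in datasets.items():
        dims = attrs.get("dims")
        if dims is None:
            # No dimension attribute: the program has nothing to go on.
            kinds[name] = "unknown"
        else:
            kinds[name] = KIND_BY_RANK.get(len(dims), "unknown")
    return kinds


# Made-up example content, only to exercise the function.
example = {
    "counts": {"dims": (1024,)},
    "detector": {"dims": (512, 512)},
    "tomogram": {"dims": (64, 64, 64)},
    "notes": {},
}

if __name__ == "__main__":
    for name, kind in classify(example).items():
        print(name, kind)
```

The point is only that the decision needs nothing beyond the attribute itself; a viewer or analysis program would not have to open and inspect the data to know what it is looking at.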
Of course it is easy to criticise because I was not there at the
beginning of Nexus. But I think we need some changes in the way the
Nexus committee works today to make it more reactive. I have just been
replaced on the NIAC committee so again it is easy to make such a
suggestion !
In Joachim's example I find it strange to be discussing his file
format, which is essentially how to store 2 columns of data. I agree that
in this case Nexus can be seen as overkill. But if Nexus were used, one
could also use one of the standard Nexus tools to extract the data to ASCII
and get the immediate feedback. It also points to the lack of Nexus
viewers. I don't know of any which will refresh the contents of the
displayed file automatically. But the case of 2 columns of ascii data is
not what most of us are dealing with. Most of us have to deal with tens
of thousands or millions of images and data volumes (cf. Brian's 100
TB). ASCII is NOT an option. We are in fact solving a new problem.
When Nexus was started this was not such a critical issue. Today it is.
The hyperspectral workshop was proof of this. Again, what matters most
is agreeing on how to store large volumes of data for analysis
programs. Would we be better off adopting some other 3D data format ?
But what about 2D and 1D and the experimental / data analysis context ?
I would say if you want to use YAML for your raw data that is your
choice. But why not join the larger community to help define how
to store analysed data? Then we can use your data analysis routines and
vice versa.
I think the time has never been riper for Nexus to be adopted and to
become a real standard, thanks to the new institutes standardising on
Nexus and some of the old ones trying to do the same. But there is no
guarantee that Nexus will succeed. Joachim, your email is proof of this.
The bottom line is I think we need to improve the current situation
together and not simply go our own way. The diversity of the scientific
techniques of the communities we are serving makes this a non-trivial
problem.
Andy
Gerd Wellenreuther wrote:
> Hi Joachim,
>
> Wuttke, Joachim schrieb:
>> What I am attacking is not the serious work you are doing for really
>> complicated data sets, but the idea, popular at management level, that
>> NeXus should ultimately be used at _all_ neutron and X-ray instruments.
>> All I want to say is: for certain types of instruments, migrating to
>> NeXus would be considerable effort, would rather increase than reduce
>> the diversity of data formats, and would not improve the messy state of
>> software seen by the users.
> There are a lot of issues which can be tackled by the use of a common
> data format, not only simple data exchange, but also software
> development, data archiving etc. For this reason, the aim to define
> and spread the use of such a common data format is good, IMHO. But
> when it comes to *implementing* this data format at individual
> beamlines / instruments, the corresponding work should *not* be done
> by single beamline scientists, or even single institutes - this would
> again result in different flavours of what was supposed to be the
> common data format (as can be seen by the struggle of facilities like
> Soleil and Diamond, both writing NeXus-files but not being able to use
> the software(s) developed at the other side of the channel). So, if we
> do it right, the single beamline scientist should not be required to
> bear the biggest part of the workload - the opposite should be the
> case. In the long run, we want to save time, right?
>
> So the goal has to be to define a common data format *ALONG* with a
> high-level API (much higher than the present NeXus-API in my opinion)
> which assists IT-guys and scientists from different facilities to
> exploit the capabilities of NeXus, *AND* further tools for the
> scientists to do their job. Of course, if the management thinks this
> is a good idea, they have to fund people to implement it :).
>
> Sure, NeXus is not the perfect candidate for such a common data
> format. Fact is also: There is no other candidate. (At this point you
> might notice: I am not considering starting from scratch as an option
> :) .)
>
> Cheers, Gerd
> _______________________________________________
> NeXus mailing list
> NeXus at nexusformat.org
> http://lists.nexusformat.org/mailman/listinfo/nexus