[Nexus] NeXus - a solution to what is not the real problem ?

Mark Koennecke Mark.Koennecke at psi.ch
Tue Mar 9 14:58:54 GMT 2010


Dear Joachim Wuttke,

Wuttke, Joachim wrote:
> Dear colleagues,
>
> I am currently preparing a deliberately provocative memo with
> working title »Why don't we have better data processing software
> for quasielastic neutron scattering ?«. One section in this paper
> will deal with data storage, and in its present form, it is quite an
> attack on NeXus. To play fair, I post it here, looking forward to
> your comments. Maybe you will convince me that I am mistaken.
>
> Looking forward to a sound discussion - Joachim
>
>
> Though all raw data produced by QENS instruments have basically the
> same structure, many different storage formats are in use.
> Therefore, porting data processing software from one instrument
> to another is generally not possible without
> adapting at least a read-in routine or providing a raw-data conversion tool.
> This is a severe nuisance for users,
> and an obstacle for code sharing and collaborative software development.
> For these reasons,
> it is a popular idea that efforts to improve the software environment
> should start with the adoption of a \textsl{common raw data format} ---
> I shall call this strategy \textsl{data format first}.
>
> The common raw data format of our time will be NeXus, if any.
> Under development for more than 15 years,
> NeXus~\cite{qda3} addresses neutron as well as X-ray scattering.
> It enjoys strong political backing,
> as evidenced by an International Advisory Committee
> with delegates from all major facilities.
> A growing number of new spectrometers actually use NeXus,
> be it by choice or forced by site policy;
> on the other hand, so far only a few existing instruments have migrated.
>
> When writing the instrument software for SPHERES,
> I consciously opted against NeXus,
> in favor of a less rigid self-defined format
> that is easier to read by a human,
> thereby facilitating the debugging of data acquisition and
> raw data processing software.
> Maybe my wishes could have been accommodated within NeXus,
> had I communicated more intensely with the project team.
> However, I have more fundamental objections ---
> not against NeXus itself,
> but against unrealistic promises,
> against overestimating data formats,
> against the flawed strategy \textsl{data format first}.
>
> Unifying data formats reminds me of church history:
> attempts to (re)unify $n$ different denominations regularly
> result in $n+1$ denominations being around:
> the new, unified church, plus all the groups that split off
> to preserve the good old faith of their own.
> When migrating an existing spectrometer towards NeXus,
> the instrument scientist needs either to support for a long time
> read-in routines for both the old and the new data format,
> or to provide routines that achieve lossless conversion from the old
> into the new format.
> Choosing NeXus as raw data format is not sufficient to guarantee
> that data from different instruments can be read by the same software.
> For instance, at SPHERES,
> energy calibration is done at acquisition time,
> and energy transfers $\hbar\omega$ are part of the raw data set.
> At the ILL backscattering spectrometers,
> only a few hardware parameters are stored from which
> the downstream software must construct the energy scale.
> Translating the current output format into something looking like NeXus
> would not make the raw data files mutually legible.
> Unifying raw data formats is not possible without unifying
> data acquisition programs ---
> which will be rarely feasible
> because in most cases the hardware is too different.
>
> Some time ago,
> NeXus may have been attractive for developers
> because its rich application programming interface (API)
> relieved them from implementing write-out and read-in routines.
> However, this advantage has vanished because
> modern generic data formats like YAML \cite{qda5}
> allow one to store and retrieve
> complex data, composed of scalars, hashes, arrays
> in arbitrary tree-like structures,
> at zero cost through a much simpler API.
>
> Most fundamentally,
> I think that efforts to unify the raw data format
> are addressing the wrong interface:
> most users do not want to see raw data at all.
> What users want is a calibrated, normalized, reasonably binned
> scattering law $S(q,\omega)$.
> What should be standardized is the procedure to obtain such $S(q,\omega)$.
> While most of this procedure can be implemented in quite a generic way,
> it will remain the instrument scientist's responsibility
> to plug in a low-level routine that reads in and calibrates the
> raw data from his instrument.
> Only he has the technical knowledge required to do it correctly,
> and hardly anybody else needs to care about the raw data and their format.
>
>   
My 2c worth of comments:

First a minor point: with hdfview, nxbrowse and the XML output option
around, I never found it too hard to debug NeXus file writing.

A major point is that you are talking about two different levels of
sharing software or data:
1) If you have a common raw data format, then you may share the data
   reduction program which produces the S(Q,omega). The attempt to do
   this is worthwhile. You already admitted that this can be done in a
   fairly generic way.
2) The processed S(Q,omega) data which the user analyses.

NeXus actually tries to address both levels. An application definition
for S(Q,omega) is on the NeXus TODO list and can come into existence
quickly if some expert gives more details on the minimum content
required for such a file.
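
To make the two levels concrete, here is a rough Python sketch. It is
not taken from NeXus or from any existing reduction package; the names
RawData, SQw, reduce_to_sqw, process and fake_reader are all made up
for illustration, and the momentum transfer Q is left out to keep it
short. The point is only that the reduction step can be written once
against a common raw-data structure, while the reader stays
instrument specific.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RawData:
    """Hypothetical common raw-data container (level 1)."""
    energy_transfer: List[float]  # energy transfer per channel
    counts: List[float]           # detector counts per channel
    monitor: List[float]          # monitor counts per channel


@dataclass
class SQw:
    """Hypothetical processed result handed to the user (level 2)."""
    energy_transfer: List[float]
    intensity: List[float]


def reduce_to_sqw(raw: RawData) -> SQw:
    """Generic step: normalize counts to monitor, channel by channel.

    Channels without monitor counts are set to 0.0 (an arbitrary choice
    made for this sketch).
    """
    intensity = [c / m if m > 0 else 0.0
                 for c, m in zip(raw.counts, raw.monitor)]
    return SQw(list(raw.energy_transfer), intensity)


def process(path: str, reader: Callable[[str], RawData]) -> SQw:
    """Shared pipeline: only `reader` knows the instrument's file format."""
    return reduce_to_sqw(reader(path))


def fake_reader(path: str) -> RawData:
    """Stand-in for an instrument-specific reader; returns invented numbers."""
    return RawData(energy_transfer=[-1.0, 0.0, 1.0],
                   counts=[10.0, 200.0, 12.0],
                   monitor=[100.0, 100.0, 0.0])


result = process("run0001.dat", fake_reader)
```

With a common raw data format, fake_reader would be replaced by one
shared reader; without it, each instrument supplies its own and the
rest of the pipeline is still shared.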

It is largely a matter of philosophy whether the data reduction to
energy should happen directly in the DAQ software. There is always the
risk that this has been implemented wrongly. So the wise instrument
scientist keeps the raw data too, and users (at least those I have
experience with) will ask for it. From contact with my TOF scientists I
know that even such a seemingly simple thing as the conversion to
energy can be cause for heated arguments.
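
As an illustration of keeping both, here is a minimal Python sketch of
the textbook time-of-flight conversion E = (1/2) m_n (L/t)^2, with the
raw times stored next to the derived energies. The flight path, the
time values and the record layout are invented for the example and do
not describe any particular instrument or the NeXus definitions; a real
conversion also has to deal with delays, path corrections and so on,
which is where the arguments start.

```python
NEUTRON_MASS_KG = 1.67492749804e-27  # neutron mass (CODATA 2018)
JOULE_PER_MEV = 1.602176634e-22      # 1 meV expressed in joule


def tof_to_energy_mev(flight_path_m: float, tof_s: float) -> float:
    """Kinetic energy in meV of a neutron covering flight_path_m in tof_s."""
    velocity = flight_path_m / tof_s  # m/s
    return 0.5 * NEUTRON_MASS_KG * velocity ** 2 / JOULE_PER_MEV


# Invented example values: 2.2 m flight path, three arrival times.
flight_path_m = 2.2
raw_tof_s = [1.0e-3, 2.0e-3, 4.0e-3]

energies = [tof_to_energy_mev(flight_path_m, t) for t in raw_tof_s]

# Store the raw times and the parameters used alongside the derived
# energies, so the conversion can be redone if it turns out to be wrong.
record = {
    "raw_tof_s": raw_tof_s,
    "flight_path_m": flight_path_m,
    "energy_mev": energies,
}
```

The first channel corresponds to 2200 m/s, which gives about 25.3 meV;
doubling the time of flight reduces the energy by a factor of four.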

Best Regards,

    Mark Koennecke







