[Nexus] NeXus - a solution to what is not the real problem ?

freddie.akeroyd at stfc.ac.uk
Tue Mar 9 16:48:00 GMT 2010


Joachim,

There is a common misconception that NeXus is only intended for raw data.
Its original intention was actually as an "exchange format" to reduce
the "N*N converters" issue of moving data between N facilities.
Whatever purpose it is used for, community agreement on the file contents
gives the most gain, but there are still benefits without it: at least
you can read the file contents correctly (no need to worry about endian
and floating point formats) even if you cannot analyse them automatically
at that time.
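
To illustrate the byte-order point, here is a minimal Python sketch
(standard library only; it does not use the NeXus API or HDF5). It packs
a 32-bit float as big-endian and shows that a reader assuming the wrong
byte order gets a different number, whereas a record that stores its own
byte order, as self-describing formats such as HDF5 do, decodes correctly
on any machine. The write_record/read_record helpers and the one-byte
order tag are hypothetical, invented purely for this illustration.

```python
import struct


def write_record(value, byte_order):
    """Pack a float32 behind a one-byte tag recording its byte order.

    The tag ('>' big-endian, '<' little-endian) is a made-up convention
    for this sketch, not part of any real file format.
    """
    if byte_order not in (">", "<"):
        raise ValueError("byte_order must be '>' or '<'")
    return byte_order.encode("ascii") + struct.pack(byte_order + "f", value)


def read_record(blob):
    """Decode a record using the byte order stored in the record itself."""
    byte_order = blob[:1].decode("ascii")
    (value,) = struct.unpack(byte_order + "f", blob[1:5])
    return value


raw = struct.pack(">f", 1.5)                  # big-endian bytes, no tag
wrong = struct.unpack("<f", raw)[0]           # reader guesses little-endian
right = read_record(write_record(1.5, ">"))   # reader uses the stored tag

print(wrong == 1.5, right == 1.5)
```

The untagged bytes are only meaningful if writer and reader happen to
agree on a convention out of band; the tagged record carries that
agreement with it.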

Many of the people involved at the start had an interest in data
acquisition, and NeXus provides a good choice there. An important point
for facilities is data archiving: you want your files to be readable
far into the future, and HDF5 provides a well-established and solid
foundation. NeXus also provides XML as an alternative underlying format
for smaller, human-readable files.

While NeXus raw data files at two facilities will not be directly
interchangeable unless they both follow the same layout, there are still
benefits to using the NeXus API. Generic tools are available for
combining and rearranging NeXus file layouts, making plots, extracting
parts, etc.; these can be used if both facilities are NeXus-based.
Packages like Matlab and IDL understand HDF5 and hence can read NeXus
files without additional software.

NeXus can handle processed/reduced data files too. As you mention, there
is much more scope for sharing these between facilities and analysis
programs once interested parties agree on what is required for
a particular application. While this is probably the data that would
most interest facility users, it comes originally from raw data, and if
both are NeXus-based the data reduction chain at the facility will
probably be easier to manage. The reduced/application-specific data file
area is where most work is concentrated at the moment.

NeXus is thus not about a "common raw data format" but instead about
defining a set of common file layouts/content to enable the better
sharing of data at any stage of analysis. There can be as few or as many
of these "application definitions" as the community sees appropriate.
NeXus tries to standardise where it can to avoid ambiguities and
misunderstandings (e.g. how units, axes, distances etc. should be
measured or represented), but is ultimately as flexible or restrictive
as the community wishes.
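
As a rough illustration of what agreeing on an "application definition"
buys, the following Python sketch checks a file's contents, modelled here
as a plain nested dictionary, against a table of required fields and
units. It is not the NeXus API and does not use the NeXus definition
language; the field names, the units and the REQUIRED table are all
hypothetical.

```python
# Hypothetical "application definition": field path -> required units.
REQUIRED = {
    ("sample", "temperature"): "K",
    ("data", "energy_transfer"): "meV",
    ("data", "counts"): "counts",
}


def lookup(tree, path):
    """Follow a tuple of keys into a nested dict; return None if absent."""
    node = tree
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check(tree, required=REQUIRED):
    """Return a list of problems; an empty list means the tree conforms."""
    problems = []
    for path, units in sorted(required.items()):
        name = "/".join(path)
        field = lookup(tree, path)
        if field is None:
            problems.append(f"missing field {name}")
            continue
        found = field.get("units") if isinstance(field, dict) else None
        if found != units:
            problems.append(f"{name}: expected units {units}, got {found}")
    return problems


example = {
    "sample": {"temperature": {"value": 300.0, "units": "K"}},
    "data": {
        "energy_transfer": {"value": [-1.0, 0.0, 1.0], "units": "ueV"},
    },
}

for problem in check(example):
    print(problem)
```

Any program that knows the agreed table can decide whether it is able to
process a given file before trying to analyse it; how strict that table
is remains a community choice.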

Regards,

Freddie

> -----Original Message-----
> From: nexus-bounces at nexusformat.org
> [mailto:nexus-bounces at nexusformat.org] On Behalf Of Wuttke, Joachim
> Sent: 09 March 2010 13:37
> To: nexus at nexusformat.org
> Subject: [Nexus] NeXus - a solution to what is not the real problem ?
> 
> Dear colleagues,
> 
> I am currently preparing a deliberately provocative memo with
> working title >Why don't we have better data processing software
> for quasielastic neutron scattering ?<. One section in this paper
> will deal with data storage, and in its present form, it is quite an
> attack on NeXus. To play fair, I post it here, looking forward to
> your comments. Maybe you will convince me that I am mistaken.
> 
> Looking forward to a sound discussion - Joachim
> 
> 
> Though all raw data produced by QENS instruments have basically the
> same structure, many different storage formats are in use.
> Therefore, porting data processing software from one instrument
> to another is generally not possible without
> adapting at least a read-in routine or providing a raw-data conversion
> tool.
> This is a severe nuisance for users,
> and an obstacle for code sharing and collaborative software
> development.
> For these reasons,
> it is a popular idea that efforts to improve the software environment
> should start with the adoption of a \textsl{common raw data format} ---
> I shall call this strategy \textsl{data format first}.
> 
> The common raw data format of our time will be NeXus, if any.
> Under development for more than 15 years,
> NeXus~\cite{qda3} addresses neutron as well as X-ray scattering.
> It enjoys strong political backing,
> as evidenced by an International Advisory Committee
> with delegates from all major facilities.
> A growing number of new spectrometers actually use NeXus,
> be it by choice or forced by site policy;
> on the other hand, so far only a few existing instruments have migrated.
> 
> When writing the instrument software for SPHERES,
> I consciously opted against NeXus,
> in favor of a less rigid self-defined format
> that is easier to read by a human,
> thereby facilitating the debugging of data acquisition and
> raw data processing software.
> Maybe my wishes could have been accommodated within NeXus,
> had I communicated more intensely with the project team.
> However, I have more fundamental objections ---
> not against NeXus itself,
> but against unrealistic promises,
> against overestimating data formats,
> against the flawed strategy \textsl{data format first}.
> 
> Unifying data formats reminds me of church history:
> attempts to (re)unify $n$ different denominations regularly
> result in $n+1$ denominations being around:
> the new, unified church, plus all the groups that split off
> to preserve the good old faith of their own.
> When migrating an existing spectrometer towards NeXus,
> the instrument scientist needs either to support for a long time
> read-in routines for both the old and the new data format,
> or to provide routines that achieve lossless conversion from the old
> into the new format.
> Choosing NeXus as raw data format is not sufficient to guarantee
> that data from different instruments can be read by the same software.
> For instance, at SPHERES,
> energy calibration is done at acquisition time,
> and energy transfers $\hbar\omega$ are part of the raw data set.
> At the ILL backscattering spectrometers,
> only a few hardware parameters are stored from which
> the downstream software must construct the energy scale.
> Translating the current output format into something looking like NeXus
> would not make the raw data files mutually legible.
> Unifying raw data formats is not possible without unifying
> data acquisition programs ---
> which will be rarely feasible
> because in most cases the hardware is too different.
> 
> Some time ago,
> NeXus may have been attractive for developers
> because its rich application programming interface (API)
> relieved them from implementing write-out and read-in routines.
> However, this advantage has vanished because
> modern generic data formats like YAML \cite{qda5}
> allow one to store and retrieve
> complex data, composed of scalars, hashes, arrays
> in arbitrary tree-like structures,
> at zero cost through a much simpler API.
> 
> Most fundamentally,
> I think that efforts to unify the raw data format
> are addressing the wrong interface:
> most users do not want to see raw data at all.
> What users want is a calibrated, normalized, reasonably binned
> scattering law $S(q,\omega)$.
> What should be standardized is the procedure to obtain such
> $S(q,\omega)$.
> While most of this procedure can be implemented in quite a generic way,
> it will remain the instrument scientist's responsibility
> to plug in a low-level routine that reads in and calibrates the
> raw data from his instrument.
> Only he has the technical knowledge required to do it correctly,
> and hardly anybody else needs to care about the raw data and their
> format.
> 