Alternative axis scheme

Ray Osborn ROsborn at anl.gov
Wed Jan 12 14:54:26 GMT 2000


Eric Boucher raised an important limitation in the NeXus scheme for linking
multidimensional data sets to the axis data sets.  The issue is not whether
a data set with attribute "axis" = 1 should be the x-axis, as we've recently
been debating, but whether a data set can be used as different axes for
different multidimensional data sets, i.e. whether the same one-dimensional
array (SDS) can be used to label the x-axis of one data set and the y-axis
of another.

This is not possible if we identify the type of axis by an attribute in the
axis data set itself.  It can only be achieved if we identify the axes using
attributes in the multi-dimensional data to be plotted.  Then each
multidimensional data set is free to identify whatever one-dimensional data
sets it wants to act as its axes (assuming the dimension sizes match).

I would like to propose a possible solution to this.  I suggest that we
define an attribute of the multidimensional data to store an int32 array of
the reference numbers of the one-dimensional data sets it wants to use for
its axes.  I suggest we call it "axis_ID" so that it's not mistaken for
anything physical.
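
As a rough illustration of the idea, here is a minimal Python sketch that
models data sets as plain dictionaries.  It is not NeXus API code: the
names, shapes and reference numbers are all invented for the example, and
the only element taken from the proposal is the "axis_ID" attribute
holding the reference numbers of the chosen axes.

```python
# Hypothetical reference numbers; in a real file HDF would assign these.
datasets = {
    2: {"name": "energy", "shape": (50,)},
    3: {"name": "polar_angle", "shape": (20,)},
    4: {"name": "time", "shape": (10,)},
    5: {"name": "data_a", "shape": (50, 20), "axis_ID": [2, 3]},
    6: {"name": "data_b", "shape": (10, 50), "axis_ID": [4, 2]},
}


def axis_names(ref):
    """Return the names of the axes chosen by the data set with this reference."""
    return [datasets[axis_ref]["name"] for axis_ref in datasets[ref]["axis_ID"]]


def sizes_match(ref):
    """Check that each chosen axis has the length of the dimension it labels."""
    data = datasets[ref]
    return all(
        datasets[axis_ref]["shape"] == (size,)
        for axis_ref, size in zip(data["axis_ID"], data["shape"])
    )
```

Here "energy" labels the first dimension of data_a and the second
dimension of data_b, which an attribute stored on "energy" itself could
not express.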

HDF identifies every data object in a file by a unique tag/reference pair,
which can be retrieved as an NXlink struct using NXgetdataID.  This struct
stores the tag and reference numbers as two int32 values.  Actually, we
don't have to worry about the tag, because it is the same for all SDSs
(defined as DFTAG_NDG).  Therefore, the reference number alone uniquely
identifies a particular SDS.

When plotting data, a program will open the NXdata group, find which data
set has the "signal" attribute, read its attributes to get the "axis_ID"
array, search for those reference numbers within the NXdata group (using
NXgetdataID), and open each axis data set in turn.
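
The lookup steps above can be sketched as follows.  This is a Python
mock-up under stated assumptions, not the NeXus API: the DataSet class and
find_plot_axes function are hypothetical, the NXdata group is modelled as
a simple list, and the reference number is stored as a plain field where a
real program would obtain it through NXgetdataID.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """Stand-in for an SDS: a name, an HDF-style reference number, attributes."""
    name: str
    ref: int
    attrs: dict = field(default_factory=dict)


def find_plot_axes(group):
    """Return (signal, axes) for a list of DataSet objects modelling an NXdata group."""
    # Step 1: find the data set carrying the "signal" attribute.
    # (Raises StopIteration if the group has no such data set.)
    signal = next(ds for ds in group if "signal" in ds.attrs)
    # Step 2: read its "axis_ID" array of reference numbers.
    axis_refs = signal.attrs["axis_ID"]
    # Step 3: index the group by reference number and pick out each axis in order.
    by_ref = {ds.ref: ds for ds in group}
    return signal, [by_ref[ref] for ref in axis_refs]
```

For example, a group holding "two_theta" (reference 7), "time_of_flight"
(reference 9) and a signal data set with axis_ID = [9, 7] would yield the
axes in the order time_of_flight, two_theta.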

There are a number of advantages to this scheme:

1) It solves Eric's problem.  The plottable data chooses its own axes in
whatever order it likes.

2) It solves the problem over axis=1, 2 etc.  The reference array can be
reversed before it is passed back to the Fortran API, so axis_ID[0] refers
to the first array index in C, and axis_ID(1) refers to the first array
index in Fortran.  This array is not intended to be read by humans so the
actual storage order doesn't matter.

3) The process of identifying default axes can be completely automatic (we
don't have to worry about whether the primary attribute has been set before
deciding which axis to choose).

4) It's backwardly compatible.  If the "axis_ID" attributes aren't set, we
can still use the old "axis" attributes attached to the axis data sets
themselves.  In fact, both schemes could coexist.  One reason for allowing
this is that the new scheme provides no way of defining subsidiary
plotting axes.  You may want to plot against two_theta or against
detector_index.  Both could have the "axis" attribute set to 1, but one
would be the default axis by being specified in the "axis_ID" array.
Perhaps we should deprecate the use of the "primary" attribute.
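
Points 2 and 4 can be sketched together in Python.  The function names and
the tuple layout are hypothetical, and two assumptions are made that the
proposal does not spell out: that handing the array to Fortran means
reversing its order, and that in the old scheme the axes are simply sorted
by the value of their "axis" attribute (the "primary" attribute is
ignored, so the fallback expects one data set per axis number).

```python
def resolve_axes(signal_attrs, candidates):
    """Return the names of the default axes for the signal data.

    signal_attrs: attributes of the "signal" data set.
    candidates:   list of (name, ref, attrs) tuples for the other data sets.
    """
    if "axis_ID" in signal_attrs:
        # New scheme: the signal data names its axes by reference number.
        names = {ref: name for name, ref, _ in candidates}
        return [names[ref] for ref in signal_attrs["axis_ID"]]
    # Old scheme: each axis data set carries its own "axis" attribute (1, 2, ...).
    tagged = [(attrs["axis"], name) for name, _, attrs in candidates if "axis" in attrs]
    return [name for _, name in sorted(tagged)]


def for_fortran(axes_in_c_order):
    """Reverse the order so the first element refers to the first Fortran index."""
    return list(reversed(axes_in_c_order))
```

With two_theta and detector_index both carrying axis = 1, a signal whose
"axis_ID" lists the reference of two_theta gets two_theta as its default
axis, while detector_index remains available as a subsidiary axis through
its "axis" attribute.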

The main disadvantage is that it is more difficult for the casual user to
open a NeXus data file in a generic browser, and identify the axis data.  At
the moment, s/he can read the "axis" attribute.  In the new scheme, it will
not be obvious how to link the "axis_ID" attribute with a particular data
set (although it can be done - most browsers will offer a way of reading the
tag/ref pair which can then be matched with the "axis_ID" values).

The other disadvantage is that it's getting quite late to introduce major
new design features in the format.  However, as I suggested, we can make it
backwardly compatible, and even allow users to choose which scheme they
adopt.  This would be facilitated if we provided enhanced routines to
identify "signal" and "axis" data.  These exist to some extent in the F90
utility routines (see
<http://www.neutron.anl.gov/NeXus/NeXus_API.html#Utils>) but need to be
transferred to the C API.

Please let me (and the list) know whether you have any comments on this
proposal.  It would not be difficult to implement fairly quickly, so perhaps
we should encourage some people to test it out.

Ray
-- 
Dr Ray Osborn                Tel: +1 (630) 252-9011
Materials Science Division   Fax: +1 (630) 252-7777
Argonne National Laboratory  E-mail: ROsborn at anl.gov
Argonne, IL 60439-4845