[Nexus] Suggestions for NeXus from CIF

Tue Apr 6 09:02:53 BST 2010

Suggestions for NeXus following the study of imageCIF
----------------------------------------------------

In the recent workshop HDF-5 for hyperspectral dataformats a mapping from imageCIF 
to NeXus was discussed. While many aspects of the mapping were straightforward or 
just need the addition of fields to base classes, some areas were markedly different.

One confusing (to me) aspect of imageCIf is that the relationship between components 
of the instrument and descriptions in the file is maintained through ID's. Which can 
come in text and binary forms. In NeXus this cohesion is achieved through storing 
data componentwise and in a hierarchy. 

In other areas, most notably data and axis descriptions, imageCIF has superior 
concepts. Here NeXus can learn.

This document contains some suggestion and ought to be a basis for discussion. 
Moreover, the material presented is not intended to replace current NeXus mechanisms 
but to document them in another way and to expand on them. 

Please direct all responses to this mailing to the NeXus mailing list, 
nexus at nexusformat.org. We like to have this stuff all in the same place. 

Data Storage
-------------
One area where imageCIG differs is that imageCIF allows to store data in a form 
which requires corrections before it can be used. Tomography people, for example, 
like to store their reconstructed volumes as 16 bit integers together with a scale 
factor to get to the real value. Thereby saving a factor of 2 in storage space.      

For NeXus I make the following suggestion:

NeXus strongly advises to store detector or other data in a form which is 
directly readable and in C storage order. If for some hopefully good reason 
this is not possible, the data transformation necessary has to be specified 
by annotating the dataset with appropriate attributes. In the formulas below 
Vtrue denotes the true value of the data item, Vraw the one which is stored 
in the data element on file. The attributes are:

- linearity: This is the indicator that a transformation of the Vraw data is 
             necessary. Linearity can have one the following values:
             * offset: Vtrue = Vraw + offset
             * scaling: Vtrue = Vraw * scaling
             * scaling_offset: both an offset and scaling is applied. Vtrue = 
               Vraw*scaling + offset
             * sqrt_scaled: Vtrue = (Vraw/scaling)*(Vraw/scaling)
             * logarithmic_scaled: Vtrue = (Vraw/scaling)**10   
- offset:  The offset to apply
- scaling: The scale factor to apply
- direction: a komma separated list of length ndim which specifies for each dimension 
             if it is increasing or decreasing. If this attribute is missing, increasing 
             is implied. 
- precedence: a komma separated list of length ndim which gives the rank order in which 
              array indexes change with respect to other indexes. A precedence of 1 denotes 
              the fastest changing index. If this attribute is missing, C storage order 
              is implied. 

Image Dimensions
----------------

Much of synchrotron radiation work is image processing. Thus the synchrotron community 
asks for a special attribute to be added to multi dimensional datasets. This attribute ought 
to have the name imagedim and have as a value the indexes of the dimensions which make up the 
image part of the data.   For example, consider a scan on a 2D detector giving rise to:
data[NP,xsize,ysize] then imagedim  would be 1,2. 

Axis
----

The CIF way of specifying axis is far more accurate then what we do with NeXus. Thus the 
suggestion is to align NeXus with the well thought out CIF scheme. This section consists 
first of a discussion of the CIF axis system and then of suggestions how to use this 
within NeXus. 

CIF uses a coordinate system which is similar to the McStas coordinate system which NeXus 
uses at its bottom. Just the orientation of the Z-axis differs. The description of any given 
axis in CIF consists of three elements:
- The type of the axis. This can be translation or rotation
- The axis vector. This is the direction of a translation or the vector around which the 
  axis rotates. 
- The axis offset. The offset to the base of the rotation or translation. If this is not given 
  0,0,0 is assumed. 

CIF also describes in which order transformations have to be applied to get a component into its 
final position from its zero position. In CIF this is done by chaining axis through the depends 
attribute. 

This scheme is a generalisation of the methods used commonly in crystallography. There 
a crystal is brought into scattering position by applying a series of rotations. Please 
note that order is important!

Axis Suggestions for NeXus   
---------------------------
1) NeXus stays with the McStas coordinate system.

2) NeXus uses the vector and offset scheme to document existing NeXus axis. The base of 
all operations is always the component, if not specified by an offset vector. 
Rotations are in degree, translations in milimetre. 

- rotation_angle has a vector 0 1 0, rotation around Y
- azimuthal_angle is a rotation around Z, vector = 0 0 1
- polar_angle is also a rotation around Y, vector 0 1 0, but as the rotation axis is 
  with the previous component upstream, we have an offset of 0 0 -distance

In NXsample we additionally have:
- chi is a rotation around Z, vector 0 0 1
- phi is a rotation around Y, vector 0 1 0
- kappa, for kappa the vector attribute has to be given as there are kappa goniometres 
 with different values of kappa.

3) Each NeXus component can have an additional field with the name transform. This 
contains a komma separated list of the operations required to place the component
at its position in the instrument. The formula is:

      Xcurrent = op1*op2....*opn * X0

with transform becoming: op1,op2,....,opn Names of operations are the names of the axis to 
apply. Unqualified names relate to axis in the same group. In order to refer axis outside 
the current group, full path names must be given. Storing this separatly in a transform field 
gives direct access whereas the CIF depends system requires a lot of searches to reconstruct 
the sequence of transforms. This is why I like transform better. 

In this description, our NeXus polar coordinate system has the transform:

          azimuthal_angle, polar_angle

This is also the default if the transform field is missing.  

4) NeXus strongly prefers to use the NeXus simple coordinate system with polar_angle 
   and azimuthal_angle as describe above. This description has the advantage that 
   polar_angle is always two theta. 

5)  With the vector/offset scheme arbitrary axis can be stored in NeXus. The rule then 
    is  that type, vector and offset have to be specified as attributes.
    Type is NX_CHAR, vector and offset  are of dim 3 and type NX_FLOAT. We need these 
    attributes anyway as there are angles such as kappa, which differ in their 
    rotation axis between instruments. 

6) NeXus is missing a rotation around the X axis. As we already bought into quite 
lyrical names for rotation axis I suggest aequatorial_angle as a name for this. 

7) Consequently, as NeXus does not have fields for describing translations, except 
in Nxgeometry, I suggest to add x_translation, y_translation and z_translation fields 
to each component. I choose to suggest separate fields for the translations as they 
frequently map to dedicated motors. Please note that all angles have to be 0 if you 
were to determine the operation of any given translation motor.  

8) The orientation field in NXgeometry receives the same meaning as vector in axis 
descriptions. With vector being aligned with the main axis of the component. 

9) NXgeometry stays as is as a means to describe shapes, engineering coordinates of 
   orientations of components.

Best Regards,

   Mark Koennecke