Hi,

as most of you know we had two workshops concerning dataformats and 
synchrotrons in the last few months. Namely the workshop on HDF5 as 
hyperspectral dataformat at ESRF and the NeXus for Synchrotrons Workshop 
at PSI. These workshops resulted in several suggestions for extensions to 
Nexus which are now up for vote. In short these are four suggestions. Please 
use this list for votes, and the rest of the e-mail for explanations. 

1) Introduce NXsubentry
2) Introduce scaled data
3) Extend NeXus axis definitions to be more precise
4) NXmeasurement


NXsubentry
----------
Add to NXentry a new class named NXsubentry which has the same structure 
as NXentry. Each NXsubentry is to hold the data or links thereto of a single 
application definition in a  multi method instrument. 

=== The Reasoning===
Synchrotron beamlines often utilise several different detectors and detector 
types in order to combine multiple techniques in simultaneous measurements. 
NeXus currently asks for separate NXentry groups to be written for each 
technique. This is good if one measurement is written to a file. However, 
there is a second requirement that multiple scans, multiple measurements, 
possibly a whole log of an experimental session is written to one NeXus file. 
Then having different techniques in different NXentries will make the files 
difficult to understand as the relationshipbetween different measurements 
is lost.  Thus, in order to keep the data from these multiple techniques 
together, it is desirable to have the ability to write it all into a single 
NXentry in a NeXus. The current NeXus application definitions refer to the 
same names and paths and so there are many name collisions when trying to 
satisfy two application definitions in one NXentry in a file. The ability 
to combine application definitions could be enabled by modifying the 
application definitions to refer to new and separate groups inside the main 
NXentry of the NeXus file that refer to the particular application/technique 
name and which contains all of the data (or links to it) that is relevant 
to that application/technique. For an example experiment that involves 
a combination of SAS and Fluorescence, the proposed NeXus structure could 
look like:


entry:NXentry/
  definition = "NXSas, NXFluo"
  user:NXuser/
  sample:NXsamle/
  instrument:NXinstument/
    SASdet:NXdetector/
    fancyname:NXdetector/
    fancyname2:NXdetector/
    ...
  SAS:NXsubentry/
    definition = "NXSas"
    instrument:NXinstrument/
      detector:link to SASdet
    data:NXdata/
  Fluo:NXsubentry/
    definition = "NXFluo"
    instrument:NXinstrument/
      detector:link to fancyname
      detector2:link to fancyname2
    data:NXdata/

In the above NeXus tree, the entire beamline state could be stored in 
entry/instrument and then any subset of this that is relevant to the SAS or 
Fluorescence techniques would then be linked within the entry/SAS/instrument 
and the entry/Fluo/instrument groups as defined by the current application 
definitions with a minor change in the hierarchy. The advantages of this 
approach are:
* Only minor changes from current practice.
* The only name collisions to worry about are the names of the 
 applications/techniques themselves.
* Application definitions need not be concerned with the names and paths that 
 other application definitions proscribe.
* The paths for each application remains well defined and an analysis program 
 for either technique can find the relevant data without having to understand 
 the other techniques present in the file. Further, the same analysis programs 
 can read the multi-technique files in the same way (i.e. with the same code) 
 exactly the same as they read single-technique files.
* A user inspecting the data manually can find all the relevant information for
  a particular analysis in the one group and so doesn't need to understand the 
  entire beamline.
One drawback of this approach is that the beamline staff would have to define 
many links when configuring the data acquisition software. However, this is 
necessary work regardless of how the data is saved since the user must be 
informed of how the different instrument components and detectors relate to 
the various analyses anyway. In fact, NeXus and the above proposal simplifies 
this task by clearly documenting in a formal manner where the relevant 
information can be read.

Another use of NXsubentry is the retrofitting of existing non compliant NeXus 
files with NXsubentries complying to an application definition. 

Scaled Data
------------
NeXus STRONGLY suggests to store data as arrays of physical values in C 
storage order. However, for cases where this is not possible or would cause 
an efficiency concern when writing allow to store raw data. Such data must 
be annotated with additional attributes as described below in order to allow 
reading software to reconstruct the true physical value.


==The Reasoning==
The data rates possible at synchrotron facilities and the new pixel detectors 
test current computing technology to their limits. There may not be enough 
time to scale or convert data on the fly before writing to disk. In some 
occasions significant space savings can be obtained by storing data as short 
integers and scaling them to the desired floating point values. 
  

In the formulas below 
Vtrue denotes the true value of the data item, Vraw the one which is stored 
in the data element on file. The attributes are:

* transform: This is the indicator that a transformation of the Vraw data is 
   necessary. Transform can have one the following values:
** offset: Vtrue = Vraw + offset
** scaling: Vtrue = Vraw * scaling
** scaling_offset: both an offset and scaling is applied. 
   Vtrue = Vraw*scaling + offset
** sqrt_scaled: Vtrue = (Vraw/scaling)*(Vraw/scaling)
** logarithmic_scaled: Vtrue = (Vraw/scaling)**10   
** polynomial: Vtrue = p1 + p2*Vraw + p3*Vraw*Vraw + p4*Vraw*Vraw*Vraw ....
* offset:  The offset to apply
* scaling: The scale factor to apply
* direction: a komma separated list of length ndim which specifies for each 
  dimension if it is increasing or decreasing. If this attribute is missing, 
  increasing is implied. 
* precedence: a komma separated list of length ndim which gives the rank 
  order in which array indexes change with respect to other indexes. A 
  precedence of 1 denotes the fastest changing index. If this attribute is 
  missing, C storage order is implied.
* coefficients, a komma separated list of the polynomial coefficients to use 
  for a polynomial transform

Coordinate System
------------------
This suggestion results from comparing imageCIF with NeXus. Ideally we should 
be able to make a mapping from CIF to NeXus. Unfortunately, NeXus had some 
weaknesses in coordinate systems (addressed by this proposal) and scaled data. 
Please note, that this proposal extends in what we already do in NeXus and 
does not invalidate earlier efforts. 


The CIF way of specifying axis is far more accurate then what we do with NeXus.
Thus the suggestion is to align NeXus with the well thought out CIF scheme. 
This section consists first of a discussion of the CIF axis system and then 
of suggestions how to use this within NeXus. 



CIF uses a coordinate system which is similar to the McStas coordinate system 
which NeXus uses at its bottom. Just the orientation of the Z-axis differs. 
The description of any given axis in CIF consists of three elements:
* The type of the axis. This can be translation or rotation
* The axis vector. This is the direction of a translation or the vector around 
  which the axis rotates. 
* The axis offset. The offset to the base of the rotation or translation. If 
  this is not given 0,0,0 is assumed. 


CIF also describes in which order transformations have to be applied to get 
a component into its final position from its zero position. In CIF this is 
done by chaining axis through the depends attribute. 



This scheme is a generalisation of the methods used commonly in 
crystallography. There a crystal is brought into scattering position by 
applying a series of rotations. Please note that order is important!


===Axis Suggestions for NeXus===   

1) NeXus stays with the McStas coordinate system.


2) NeXus uses the vector and offset scheme to document existing NeXus axis. 
  The base of all operations is always the component, if not specified by an 
  offset vector. Rotations are in degree, translations in milimetre. 

Some examples:
* rotation_angle has a vector 0 1 0, rotation around Y
* azimuthal_angle is a rotation around Z, vector = 0 0 1
* polar_angle is also a rotation around Y, vector 0 1 0, but as the rotation 
  axis is with the previous component upstream, we have an offset of 
  0 0 -distance

In NXsample we additionally have:
* chi is a rotation around Z, vector 0 0 1
* phi is a rotation around Y, vector 0 1 0
* kappa, for kappa the vector attribute has to be given as there are 
  kappa goniometres with different values of kappa.



3) Each NeXus component can have an additional field with the name transform. 
This contains a komma separated list of the operations required to place 
the componentat its position in the instrument. The formula is:

      Xcurrent = op1*op2....*opn * X0

with transform becoming: op1,op2,....,opn Names of operations are the names of 
the axis to apply. Unqualified names relate to axis in the same group. In 
order to refer axis outside the current group, full path names must be given. 
Storing this separatly in a transform field gives direct access whereas 
the CIF depends system requires a lot of searches to reconstruct the sequence 
of transforms. 



In this description, our NeXus polar coordinate system has the transform:

          azimuthal_angle, polar_angle

This is also the default if the transform field is missing.  


4) NeXus strongly prefers to use the NeXus simple coordinate system with 
polar_angle and azimuthal_angle as describe above. This description has the 
advantage that polar_angle is always two theta. 



5)  With the vector/offset scheme arbitrary axis can be stored in NeXus. 
The rule then is  that type, vector and offset have to be specified as 
attributes.Type is NX_CHAR, vector and offset  are of dim 3 and type 
NX_FLOAT. We need these attributes anyway as there are angles such as kappa, 
which differ in their rotation axis between instruments. 



6) NeXus is missing a rotation around the X axis. As we already bought into 
quite lyrical names for rotation axis I suggest aequatorial_angle as a name 
for this. 

7) Consequently, as NeXus does not have fields for describing translations, 
except in Nxgeometry, I suggest to add x_translation, y_translation and 
z_translation fields to each component. I choose to suggest separate fields 
for the translations as they frequently map to dedicated motors. Please note 
that all angles have to be 0 if you were to determine the operation of any 
given translation motor.  


8) The orientation field in NXgeometry receives the same meaning as vector 
in axis descriptions. With vector being aligned with the main axis of the 
component. 



9) NXgeometry stays as is as a means to describe shapes, engineering 
coordinates of orientations of components.


NXmeasurement
----------------
In order to satisfy the requirements of the beamline scientist an additional, 
simplified NeXus hierarchy is proposed:


entry:NXentry
   measurement:NXmeasurement
      positions:NXpositioners
      scalars:NXscalar
      images:NXimagedata
  
Please note that this is an example how a NXmeasurement group may look like. 
The general feeling was to allow much freedom in NXmeasurement and standardize 
later on if a common pattern emerges. The meaning is that the NXpositioners 
groups contains a list of all constants and motor positions, NXscalar arrays 
of all parameters varied during the scan and NXimageData the images and other 
detector data which has been captured during a scan or measurement. This 
structure is for the expert, the instrument scientist, who knows his 
instrument by heart and wishes to be able to plot anything against anything 
in his instrument. NXmeasurement is not meant to stand alone but is to be 
augmented with further NXsubentries containing the data in proper NeXus 
notation and hierarchy.