[NeXus-committee] NeXus XML format and reduced data files

Akeroyd, FA (Freddie) F.A.Akeroyd at rl.ac.uk
Sun Oct 21 14:55:06 BST 2007


> I think that this is rather like a lot of work.  Especially  if you wish
> to port similar structures to HDF-5 and HDF-4.
> I also think that there is another question folded into this:
>   - Do we want a table data type? Because that is, IMHO, what this 
> proposal is.

Hi Mark,

I wasn't going quite as far as suggesting a table/tuple data type within
NeXus - rather an alternative way of writing out the arrays in XML so
that the output more resembled a table. If you had an NXdata containing
arrays x, y, e we would currently write out all the x values, followed
by all the y values and finally all the e values; the proposal was to
instead write out the sequence x[0], y[0], e[0], x[1], y[1], e[1] etc.
formatted into columns. If the NXdata contained only equal length
arrays, this would visually look like a table; if not, the mechanism
would still work but later rows would just be shorter.
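
To make the ordering concrete, here is a minimal sketch in Python (chosen
only for brevity; it is not part of the NeXus API). The names
interleave_rows and format_table are hypothetical, and the sketch covers
only the row layout, not the surrounding XML.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking "this array has no value at this index"


def interleave_rows(arrays):
    """Return rows [a[0], b[0], ...], [a[1], b[1], ...] for the given arrays.

    Arrays that have run out of values are skipped, so later rows are
    simply shorter when the arrays differ in length.
    """
    rows = []
    for values in zip_longest(*arrays, fillvalue=_MISSING):
        rows.append([v for v in values if v is not _MISSING])
    return rows


def format_table(arrays):
    """Format the interleaved rows as whitespace-separated text lines."""
    return "\n".join(
        " ".join(str(v) for v in row) for row in interleave_rows(arrays)
    )


x = [1, 2, 3]
y = [10, 20, 30]
e = [0.1, 0.2]
print(format_table([x, y, e]))
```

Note that if an array other than the last one is the short one, a short
row no longer says which column is missing, so a reader would also need
the array lengths to reverse the layout.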

> Though I myself do not fancy excel for data analysis I know that many 
> people do. I wonder about a couple of things:

> - Can this SAS be converted to NeXus and back using an XSLT transform?

I don't know XSLT well enough to be sure ... it would need to loop
through the data within one node and spread that data out over several
other nodes. I can see how this could be programmed using the mxml API
so one idea I had was to add a layer beneath the NeXus XML API that
would transform the data on loads and saves from files, thus allowing
the NeXus API and applications to be used directly without the need for
additional tools.
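
As a rough illustration of the "loop through one node and spread the
data over several other nodes" step, here is a sketch using Python's
xml.etree.ElementTree rather than mxml. The <table> element, its
"columns" attribute and the function name spread_table are all
hypothetical; this is not the actual NeXus XML layout, only one way such
a load-time transform might look.

```python
import xml.etree.ElementTree as ET


def spread_table(group):
    """Replace a hypothetical <table columns="x y e"> child of `group`
    with one child element per column, each holding that column's values.
    """
    table = group.find("table")
    if table is None:
        return group  # nothing to transform
    names = table.get("columns", "").split()
    columns = {name: [] for name in names}
    for line in (table.text or "").strip().splitlines():
        # zip stops at the shorter sequence, so a short row only fills
        # the leading columns (assumes the shorter arrays come last).
        for name, value in zip(names, line.split()):
            columns[name].append(value)
    group.remove(table)
    for name in names:
        child = ET.SubElement(group, name)
        child.text = " ".join(columns[name])
    return group


group = ET.fromstring(
    '<NXdata name="data"><table columns="x y e">'
    "1 10 0.1\n2 20 0.2\n3 30"
    "</table></NXdata>"
)
spread_table(group)
print(ET.tostring(group, encoding="unicode"))
```

The save direction would be the inverse: gather the per-array nodes and
write them back out row by row.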

> - May be we rather suggest a NXU function which takes as input a path to
> the group and, given a suitable structure
>  of the group, emits its content as a CSV file. CSV is understood by
> many database and spreadsheet products, including excel.

I think the NXextract tool Stephane Poirier demonstrated at the NIAC
would already be able to do this for us, but this way requires the user
to first run a separate program on a file rather than being able to use
the file directly. Also, you would need to extract each NXdata into a
separate file to load into Excel, whereas with an XML based
representation you can select individual nodes and also browse the other
metadata, which would not be present in the CSV file.
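
For comparison, the CSV route for a single group could be sketched as
below. This uses only Python's standard csv module; it is not how
NXextract works, and group_to_csv is a hypothetical name. It assumes the
arrays of one NXdata have already been read into a dict of lists.

```python
import csv
import io
from itertools import zip_longest


def group_to_csv(columns):
    """Return CSV text with one column per array, padding short arrays
    with empty cells. `columns` maps array name -> list of values.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns.keys())  # header row with the array names
    for row in zip_longest(*columns.values(), fillvalue=""):
        writer.writerow(row)
    return buf.getvalue()


print(group_to_csv({"x": [1, 2], "y": [10, 20], "e": [0.1]}))
```

Each NXdata becomes its own CSV text, and none of the surrounding
metadata is carried along, which is the limitation described above.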

While extracting to CSV is useful to be able to do, I think transforming
to and from an alternative XML representation is the best option in this
case. Though it would be most useful if this transform could be handled
automatically by the NeXus API, it could be a separate (XSLT or other)
application. What is most important is that it is possible to transform
between these two representations in some way.

Regards,

Freddie


