[Nexus] NeXus - a solution to what is not the real problem ?

Mon Mar 15 08:57:55 GMT 2010

Dear List, Sebastian,

Sebastian Busch wrote:
> Dear Joachim, dear list,
>
> please find attached a draft of a NeXus--yaml file with the SPHERES data
> Joachim sent recently. It is neither really NeXus compliant nor a real
> yaml file (as I don't speak either) but it should go in the right
> direction.
>
> I hope it can show that there is in fact not much difference between the
> current state of SPHERES files and this version.
>
> Mark Könnecke has kindly provided me an XML file of a FOCUS simulation
> as a starting point, you'll find therefore elements of FOCUS in the file
> (Be filter etc).
>
> When writing the file, I had some questions which you can find at the
> end of the mail if you are interested... But now answering you, Joachim:
>
> Wuttke, Joachim wrote:
>   
>> ... organize the data flow in a way that
>> users can perform approximately correct standard analyses ...
>> ... would NeXus, by facilitating collaborative software development,
>> help me to reach that goal ?
>>     
>
> I think 'yes'! At the time-of-flight spectrometer TOFTOF, many users
> asked if they could treat the data with LAMP or DAVE because they were
> used to these programs. This was for a long time not possible because
> their developers did not implement a read-in routine for TOFTOF.
>
> Now, TOFTOF data are always converted to a pseudo-IN6 format which can
> then be read -- at least by LAMP. So there is a de-facto standard (IN6)
> which is neither elegant nor readable nor well documented. Your users
> will sooner or later start to convert your data sets to HFBS format if
> they want to use DAVE. Or they will never look at the data again. Why
> not use a standard format from the beginning?
>
> There will always be many data evaluation programs and most of them will
> never have the resources to write a new routine for another
> spectrometer. I am sure that it would reduce the workload on you and
> increase the 'return on beamtime' if the users can use the software they
> always use -- which in turn can read your data [or S(Q, omega)] because
> it is organized in a standard -- NeXus -- way.
>
>   
>> ... Converting software and existing data to NeXus would
>> cost considerable time; ...
>>     
>
> Although my attached example may have a completely wrong syntax, I think
> it shows that there would actually not be a drastic change in your
> files. Some parameters would change their names to NeXus standard,
> others would have to be added, but I would be surprised if you need more
> than one afternoon to implement these changes.
>
>   
>> it would make the data and software architecture more
>> complicated; 
>>     
>
> It would certainly increase the size of your data files because more
> information would be stored. However, (1) that's talking about some
> kilobytes and (2) it is good to have this information in the file.
>
> The data would not be more complicated (more complete!) and the software
> would not be more complicated either -- you can still use your yaml
> parsing as others just use their preferred HDF reader!
>
>   
>> ... I would still refuse for esthetic reasons.
>>     
>
> I thought so ;) Do you still regard the yaml version as unesthetic?
> Which parts? The organisation? The information about the spectrometer
> (that's the only part which you don't have already in your raw data
> files as far as I can see)?
>
>   
>> For my taste, the NeXus documentation looks hopelessly bloated.
>>     
>
> After trying to write the attached file, I agree: the documentation
> could use some care. But I guess that nobody would say "no" if there
> were volunteers who have time for that..... ;)
>
>   
>> ... for my purpose, NeXus would be overkill.
>>     
>
> In this point, I disagree. HDF would be overkill -- but applying the
> principles of NeXus is not. You add some more information about your
> spectrometer: that's not overkill. You rename some variables: that's not
> overkill. And you use the structure of some NeXus application definition
> -- or you work out a new application definition if you have other
> requirements: that's not overkill either.
>
>   
Thanks for the support!
>
> Turning to the questions I had when writing the file:
>
> 1) NeXus-related
> * Is the NAPItype needed? I am always using a duck typing programming
> language for which this information is useless (and I don't know much
> about C and colleagues). Can I leave it out?
>   
Nope. NAPItype is necessary. The thing is that when implementing the
NeXus-XML API we strove to make it as general as HDF and reasonably
efficient for medium-sized datasets. This is why we store arrays as a
large bunch of numbers in C storage order. And we need the NAPItype to
figure out the dimensions of the dataset.
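
To illustrate why the dimensions are needed: a minimal sketch in Python,
assuming the type string has the form NAME[dim1,dim2,...] as in the
"NX_CHAR[8]" example from the draft file. The function names
parse_napitype and reshape_c_order are made up for this illustration and
are not part of the NeXus API.

```python
import re


def parse_napitype(napitype):
    """Split a string like 'NX_INT32[2,3]' into ('NX_INT32', (2, 3))."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:\[([\d,\s]*)\])?\s*", napitype)
    if match is None:
        raise ValueError("unrecognised NAPItype: %r" % napitype)
    name, dims = match.group(1), match.group(2)
    shape = tuple(int(d) for d in dims.split(",") if d.strip()) if dims else ()
    return name, shape


def reshape_c_order(flat, shape):
    """Rebuild nested lists from a flat list stored in C (row-major) order,
    i.e. the last index varies fastest."""
    flat = list(flat)
    size = 1
    for n in shape:
        size *= n
    if len(flat) != size:
        raise ValueError("expected %d values, got %d" % (size, len(flat)))
    if not shape:
        return flat[0]          # no dimensions given: a single value
    if len(shape) == 1:
        return flat
    step = size // shape[0] if shape[0] else 0
    return [reshape_c_order(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


# The flat "bunch of numbers" as it might appear in the file body.
name, shape = parse_napitype("NX_INT32[2,3]")
values = [int(tok) for tok in "1 2 3 4 5 6".split()]
print(name, reshape_c_order(values, shape))
```

Without the [2,3] part the six numbers could just as well be a 3x2 or a
1D array, which is the reason the attribute cannot be dropped.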

> * Storing the counts in NXdetector as two-dimensional array is probably
> efficient but not well human-readable. The way I introduced the
> detectors must be wrong (always linking to the same name) -- which is
> the correct way?
> * Is it OK to add the monitors like so in the NXdetector?
>   
I still do not fully understand how SPHERES is operated. From the
instrument design on the WWW I come to the assumption (which can be
terribly wrong) that you make scans on energy transfer by varying
Doppler speed, chopper speed, velocity selector speed or a combination
thereof. With this assumption the detector data would be 2D, with
energy_transfer as one axis and two_theta (polar_angle) of the
detectors as the other.

Again, I do not know where the monitors sit at SPHERES. But normally
monitors go into NXmonitor groups on NXentry level. Again using the
assumption above, the monitor values would be arrays holding the
monitor value for each scan point.
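
Under that assumption (a scan over energy transfer), the layout could be
sketched as nested Python dictionaries. This is only an illustration:
the group names "instrument", "detector" and "monitor1", the choice of
energy as the first array index, and all numbers are placeholders. Only
the NXentry, NXdetector and NXmonitor classes and the energy_transfer
and polar_angle axes come from the discussion; NXinstrument is added
because NXdetector groups normally live inside one.

```python
n_energy, n_angle = 4, 3   # placeholder sizes, not SPHERES values

entry = {
    "NX_class": "NXentry",
    "instrument": {
        "NX_class": "NXinstrument",
        "detector": {
            "NX_class": "NXdetector",
            "energy_transfer": [-1.0, -0.5, 0.5, 1.0],   # one per scan point
            "polar_angle": [30.0, 60.0, 90.0],           # one per detector
            # 2D counts: data[i][j] = scan point i, detector j
            "data": [[0] * n_angle for _ in range(n_energy)],
        },
    },
    # Monitor sits next to the instrument, directly under the entry.
    "monitor1": {
        "NX_class": "NXmonitor",
        "data": [1000, 1002, 998, 1001],                 # one per scan point
    },
}


def check_shapes(entry):
    """Return True if detector and monitor arrays match the scan axes."""
    det = entry["instrument"]["detector"]
    mon = entry["monitor1"]
    n_e = len(det["energy_transfer"])
    n_a = len(det["polar_angle"])
    data_ok = len(det["data"]) == n_e and all(len(r) == n_a for r in det["data"])
    return data_ok and len(mon["data"]) == n_e


print(check_shapes(entry))
```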

> * IIRC, the NXdata field was used to facilitate automatic plotting.
> Typically, I would like to plot (a) the energy-dependent sum over all
> detectors and (b) the angle-dependent sum over all (or a subrange of)
> energies. How do I do that? Does the spectrometer have to store a
> virtual extra detector which contains the corresponding sums?
>
>   
This seems to be derived data. This ought to go into separate fields or
groups. In the FOCUS file I gave you as an example I have a virtual
NXdetector group (called merged) which is the data from the upper,
lower and middle detector bank, well, merged.
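
Whether such sums are stored in a separate group or recomputed when
needed, both plots asked about are projections of the 2D counts array.
A minimal sketch, assuming counts[i][j] holds the counts for energy
point i and detector j; the axis order and the function names are
assumptions made for this illustration.

```python
def sum_over_detectors(counts):
    """Energy-dependent sum: one value per energy point."""
    return [sum(row) for row in counts]


def sum_over_energies(counts, first=0, last=None):
    """Angle-dependent sum over the energy points first..last-1
    (all energy points by default): one value per detector."""
    rows = counts[first:last]
    if not rows:
        return []
    return [sum(column) for column in zip(*rows)]


counts = [[1, 2, 3],
          [4, 5, 6]]          # 2 energy points, 3 detectors (made-up numbers)

print(sum_over_detectors(counts))        # energy-dependent sum
print(sum_over_energies(counts))         # angle-dependent sum, all energies
print(sum_over_energies(counts, 1, 2))   # angle-dependent sum, subrange
```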

Best Regards,

    Mark Koennecke

> 2) yaml-related
> * I got confused with sequences and maps (lists and dictionaries). Now,
> the first entry of an element is the corresponding value. (e.g.
> title:
> 	- "UNKNOWN"
> 	- NAPItype: "NX_CHAR[8]"
> ). Would it be better/correct to give this value the key "value"? i.e.
> title:
> 	- value: "UNKNOWN"
> 	- NAPItype: "NX_CHAR[8]"
> * Are strings not surrounded by quotation marks?
> * How does the map in the Histogram section work? Is the energy transfer
> the key of the entry?
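
On the sequence/map question, it may help to see what a YAML parser
hands back for each spelling. The sketch below writes out the resulting
Python structures by hand (no YAML library is used); variant_c, a plain
map without the leading dashes, is a third possibility that was not
proposed in the draft, and the reader function names are hypothetical.

```python
# title:
#   - "UNKNOWN"
#   - NAPItype: "NX_CHAR[8]"
variant_a = {"title": ["UNKNOWN", {"NAPItype": "NX_CHAR[8]"}]}

# title:
#   - value: "UNKNOWN"
#   - NAPItype: "NX_CHAR[8]"
variant_b = {"title": [{"value": "UNKNOWN"}, {"NAPItype": "NX_CHAR[8]"}]}

# title:
#   value: "UNKNOWN"
#   NAPItype: "NX_CHAR[8]"
variant_c = {"title": {"value": "UNKNOWN", "NAPItype": "NX_CHAR[8]"}}


def read_a(node):
    """Value is found by position, type by key."""
    return node[0], node[1]["NAPItype"]


def read_b(node):
    """A list of one-key maps has to be merged before lookup."""
    merged = {}
    for item in node:
        merged.update(item)
    return merged["value"], merged["NAPItype"]


def read_c(node):
    """A plain map gives both by key."""
    return node["value"], node["NAPItype"]


print(read_a(variant_a["title"]))
print(read_b(variant_b["title"]))
print(read_c(variant_c["title"]))
```

All three carry the same information; they differ only in how much work
the reading side has to do.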
>
>   
>
>
> I hope this translation helps to pinpoint the problems or criticisms. If
> you, Joachim, say what you like/dislike (a bit more specific than
> "unesthetic" ;) ) and you, the NeXus guys, say what is
> allowed/forbidden, we could perhaps do some refinement and find a
> solution which makes everybody happy.
>
>
> Best wishes,
> Sebastian.
>   