[Nexus] proposed additions to NXdata for non-linear scaling - to aid cbf interoperability

Wed Feb 19 17:02:44 GMT 2014

Dear Colleagues,

   I think having an NXformula class in NeXus would be an excellent step
towards achieving compatibility with CIF dREL, but for the case in hand,
it would be sufficient to extend the options for specifying linearity.

Here is what CBF does:

_save__array_intensities.linearity
     _item_description.description
;              The intensity linearity scaling method used to convert
                from the raw intensity to the stored element value:

                'linear' is linear.

                'offset'  means that the value defined by
                _array_intensities.offset should be added to each
                 element value.

                'scaling' means that the value defined by
                _array_intensities.scaling should be multiplied with each
                element value.

                'scaling_offset' is the combination of the two previous cases,
                with the scale factor applied before the offset value.

                'sqrt_scaled' means that the square root of raw
                intensities multiplied by _array_intensities.scaling is
                calculated and stored, perhaps rounded to the nearest
                integer. Thus, linearization involves dividing the stored
                values by _array_intensities.scaling and squaring the
                result.

                'logarithmic_scaled' means that the logarithm base 10 of
                raw intensities multiplied by _array_intensities.scaling
                is calculated and stored, perhaps rounded to the nearest
                integer. Thus, linearization involves dividing the stored
                values by _array_intensities.scaling and calculating 10
                to the power of this number.

                'raw' means that the data are a set of raw values straight
                from the detector.
;
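
To make the mapping concrete, here is a minimal sketch of how a reader
might linearize a stored value under each of these terms. It follows my
reading of the description text above; the function name, its parameters
and the pass-through treatment of 'raw' are assumptions for illustration,
not part of CBFlib or NeXus.

```python
def linearize(stored, linearity, scaling=1.0, offset=0.0):
    """Convert one stored element value to a linear intensity.

    Hypothetical helper: the name, arguments and defaults are
    illustrative only.
    """
    if linearity in ("linear", "raw"):
        # Assumption: 'raw' values are passed through unchanged.
        return float(stored)
    if linearity == "offset":
        # The offset is added to each element value.
        return stored + offset
    if linearity == "scaling":
        # Each element value is multiplied by the scale factor.
        return stored * scaling
    if linearity == "scaling_offset":
        # Scale factor first, then the offset.
        return stored * scaling + offset
    if linearity == "sqrt_scaled":
        # Divide by the scale factor, then square.
        return (stored / scaling) ** 2
    if linearity == "logarithmic_scaled":
        # Divide by the scale factor, then raise 10 to that power.
        return 10.0 ** (stored / scaling)
    raise ValueError("unknown linearity: %r" % (linearity,))


# Example: three stored values written with 'sqrt_scaled' and scaling 2.
frame = [linearize(v, "sqrt_scaled", scaling=2.0) for v in (0, 2, 4)]
```

A per-array attribute carrying the linearity term, plus the scaling and
offset values, would give a viewer everything this function needs.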

Those are the terms for which we are looking for a mapping into NeXus.
What I suggested in the concordance was an attribute rather than a field,
to resolve easily the complex issue of being able to plot a bunch of
different data arrays in NXdata, as in the Eiger specification, where we
have data_0, data_1, etc.  There are several such attributes that would
be useful in being able to visualize frames quickly and accurately, and
using fields makes it difficult to associate particular attributes with
particular data arrays, because you cannot put a field under a field,
but you can put an attribute under a field.

Regards,
     Herbert

On 2/19/14 11:15 AM, jonathan.sloan at diamond.ac.uk wrote:
> Hi,
>
> Thanks for your replies.
>
>
>    
>> Hi All,
>>      Perhaps let's take a quick look at what would be required to
>> implement and support an NXformula class. The 2010 NIAC minutes mention
>> "muParser"
>> (http://muparser.beltoforion.de/mup_features.html#idFeatureOverview) for
>> the syntax and it could be used to cover high-performance needs in C++,
>> C and C#. I expect we should also support implementations in Python and
>> Java (at least), which should be doable for the short list of muParser
>> built-in functions and telling people needing high performance to use
>> muParser instead.
>>
>> I would propose that NXformula be implemented as a group containing:
>> * A field named "formula" containing a string representing the formula.
>> * Links with names corresponding to the parameters of the formula and
>> pointing to the appropriate datasets.
>>
>> E.g.
>> Scaling:NXformula
>>     formula="A*sin(B)+C/3.5"
>>     A:NXlink ->  /entry1/counter0/data
>>     B:NXlink ->  /entry1/sample/rotation_angle
>>     C:NXlink ->  /entry1/counter1/data
>>
>>
>> Does this sound like a "good" way to do it?
>>
>> Cheers,
>> Ben
>>      
> That does sound good, I was actually going to suggest something similar in response to your earlier comments.
>
> Would it be possible to extend this to cover tensors of data? It appears to cover only scalar values right now, and you should be able to gain much higher performance (and more usability) by pushing iteration down into lower-level code. I have a vague idea of how to go about doing so, but I suspect that it would introduce a lot of complexity which might not really be necessary to implement tensors nicely.
>
>
>    
>> I was thinking that the list of provided functions in muParser was
>> short enough that it would be better to write a new parser in
>> Python/Java that implements the same set of functions, rather than
>> providing bindings to muParser itself.
>>      
> Personally, I would assume it's simpler to expose the C interface to Python, since it should be something that would only need to be done once. It's something that CBF already does, and would avoid the complexity of writing and testing two separate parsers. I've never actually done this though, so I might be wrong.
>
> Thanks.
>
>
>
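
For reference, the NXformula group proposed in the quoted message (a
"formula" string plus named links to datasets) could be evaluated along
the following lines. This is only a rough sketch under stated
assumptions: it parses the formula with Python's own expression grammar
rather than muParser's (so, for example, power is ** and not ^), it
handles scalars only, and the function whitelist and the name
evaluate_formula are hypothetical. The resolved links are passed in as
a plain dictionary.

```python
import ast
import math
import operator

# Hypothetical whitelist; muParser's built-in list is longer.
FUNCTIONS = {
    "sin": math.sin,
    "cos": math.cos,
    "sqrt": math.sqrt,
    "log10": math.log10,
    "exp": math.exp,
}
BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(formula, parameters):
    """Evaluate a formula string against a dict of parameter values.

    Only numbers, parameter names, + - * / **, unary signs and the
    whitelisted functions are accepted; anything else is rejected.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            # A name stands for a link target already read from the file.
            return parameters[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY:
            return UNARY[type(node.op)](walk(node.operand))
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in FUNCTIONS
                and not node.keywords):
            return FUNCTIONS[node.func.id](*[walk(arg) for arg in node.args])
        raise ValueError("unsupported syntax in formula: " + ast.dump(node))

    return walk(ast.parse(formula, mode="eval"))


# The formula from the example group, with made-up parameter values.
value = evaluate_formula("A*sin(B)+C/3.5", {"A": 2.0, "B": 0.0, "C": 7.0})
```

Walking a whitelisted syntax tree avoids calling eval() on text read
from a data file; extending the same walk to whole arrays is the part
that would need more thought.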