[Nexus] proposed additions to NXdata for non-linear scaling - to aid cbf interoperability

Wed Feb 19 16:49:04 GMT 2014

Jonathan's suggestion is good.  When this topic was brought up before at 
the NIAC, it expanded from axis scaling into a discussion of general 
mathematical computations.  Because these computations would affect 
every library that reads NeXus data files, and that impact would be 
substantial, as Mark indicated, the matter was tabled.

I am in favor of creating a base class to describe the mathematical 
formulae or operations to transform (or generate) data content and store 
that base class in the NeXus data file.
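
As a rough illustration of how little a reader that opts in would need, 
here is a minimal Python sketch.  It assumes the formula is written in 
Python expression syntax (as in Eugen's "offset + data**c" below), that 
the linked datasets have already been read into a plain dict, and that 
only plain math is allowed.  The name evaluate_formula and the function 
whitelist are hypothetical; nothing here is part of NeXus or the NAPI.

```python
import ast
import math
import operator

# Hypothetical whitelist: plain arithmetic plus a few math functions.
# No loops, no branching, no attribute access, no arbitrary calls.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {
    "sin": math.sin,
    "cos": math.cos,
    "exp": math.exp,
    "log": math.log,
    "sqrt": math.sqrt,
}


def evaluate_formula(formula, namespace):
    """Evaluate a plain-math formula string against named values.

    `namespace` maps parameter names (the link names in the proposed
    group) to values that the caller has already read from the file.
    """

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            # Accept numeric literals only (bool is excluded on purpose).
            if type(node.value) in (int, float):
                return node.value
            raise ValueError("only numeric constants are allowed")
        if isinstance(node, ast.Name):
            if node.id not in namespace:
                raise KeyError("no value linked for %r" % node.id)
            return namespace[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            left, right = _eval(node.left), _eval(node.right)
            return _BINARY_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            args = [_eval(arg) for arg in node.args]
            return _FUNCTIONS[node.func.id](*args)
        raise ValueError("unsupported syntax: %s" % type(node).__name__)

    # mode="eval" accepts a single expression, so statements such as
    # loops or assignments fail at parse time with SyntaxError.
    return _eval(ast.parse(formula, mode="eval"))
```

With Ben's example formula and scalar stand-ins for the linked datasets, 
evaluate_formula("A*sin(B)+C/3.5", {"A": 2.0, "B": 0.0, "C": 7.0}) 
gives 2.0.  Anything beyond plain math (a conditional, an unlisted 
function, attribute access) raises an exception instead of being 
executed.  Array data, units and performance are deliberately left out 
of this sketch.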

I am opposed to imposing that work (that is, the work of calculating 
various data based on mathematical formulae written in the NeXus data 
file) onto all libraries that read the NeXus data file.  This is out of 
scope for NeXus as a standard data file format.  One must consider 
carefully the mission of NeXus.  As Mark stated, we have no funded 
effort to implement features and this feature would produce many more 
requests for additional related components.

Pete

On 02/19/2014 09:56 AM, Benjamin Watts wrote:
> Hi All,
>     I was thinking that the list of provided functions in muParser was
> short enough that it would be better to write a new parser in
> Python/Java that implements the same set of functions, rather than
> providing bindings to muParser itself. Something like:
> http://pyparsing.wikispaces.com/file/view/fourFn.py/30154950/fourFn.py
>
> The Python/Java versions wouldn't need to be very fast - just enough to
> get by. People needing performance would be told to use muParser. The
> code would be written by whoever decides they need it enough, which
> might be the CBF crew. I am continuing this conversation in the hope
> that we can lay out the set of requirements clearly enough that somebody
> could decide it is worth the effort.
>
> Cheers,
> Ben
>
>
> On 19/02/14 16:31, Wintersberger, Eugen wrote:
>> Hi all,
>>
>> On Wed, 2014-02-19 at 15:50 +0100, Benjamin Watts wrote:
>>> Hi All,
>>>      Perhaps let's take a quick look at what would be required to
>>> implement and support an NXformula class. The 2010 NIAC minutes mention
>>> "muParser"
>>> (http://muparser.beltoforion.de/mup_features.html#idFeatureOverview) for
>>> the syntax and it could be used to cover high-performance needs in C++,
>>> C and C#.
>> That is a really interesting project - was not aware of this.
>>
>>> I expect we should also support implementations in Python and
>>> Java (at least), which should be doable for the short list of muParser
>>> built-in functions and telling people needing high performance to use
>>> muParser instead.
>> Well -  I am not so sure if this is really that easy. I guess you would
>> provide a binding via the Python C-API and JNI. Both are not as easy to
>> handle as their documentation suggests.
>>
>> In fact I thought that NAPI is in maintenance mode and no new features
>> will be implemented (I think this decision was made at the NIAC meeting
>> in 2012). Who would write the code?
>>
>>> I would propose that NXformula be implemented as a group containing:
>>> * A field named "formula" containing a string representing the formula.
>>> * Links with names corresponding to the parameters of the formula and
>>> pointing to the appropriate datasets.
>>>
>>> E.g.
>>> Scaling:NXformula
>>>     formula="A*sin(B)+C/3.5"
>>>     A:NXlink -> /entry1/counter0/data
>>>     B:NXlink -> /entry1/sample/rotation_angle
>>>     C:NXlink -> /entry1/counter1/data
>>>
>>>
>>> Does this sound like a "good" way to do it?
>> Despite my concerns mentioned above this sounds absolutely fine to me.
>> One maybe needs an additional attribute to the data field of NXdetector
>> providing a path to the NXformula instance storing the transformation
>> 'code'.
>>
>> regards
>>    Eugen
>>
>>> Cheers,
>>> Ben
>>>
>>>
>>> On 19/02/14 14:28, Mark Koennecke wrote:
>>>> Hi,
>>>>
>>>>
>>>> On 02/19/2014 11:15 AM, Wintersberger, Eugen wrote:
>>>>> Hi folks
>>>>>
>>>>> On Wed, 2014-02-19 at 10:11 +0100, Benjamin Watts wrote:
>>>>>> Hi Jonathan,
>>>>>>       This kind of thing has been discussed at previous NIAC meetings.
>>>>>> Your proposal makes for a simple change, but does more to push the
>>>>>> problem further away from yourself than to actually solve it.
>>>>> If I understand Ben correctly what he means is that the code working
>>>>> with the stored data needs to do a lot of work to interpret it
>>>>> correctly.
>>>>>
>>>>>> What we
>>>>>> really need is some "standard" way to present mathematical formulae in
>>>>>> general that is accessible across different programming languages.
>>>>> That's the right way to go.
>>>>>
>>>>>> I
>>>>>> think maybe Eugen Wintersberger had a good suggestion, but I don't
>>>>>> recall any follow-up on it.
>>>>> Hm - I cannot recall that I made a good suggestion on this. However,
>>>>> what I would most probably do is to use Python syntax to represent math.
>>>>> This has two advantages:
>>>>>
>>>>> 1.) an actual language can immediately execute the code
>>>>> 2.) it might be easy to write a parser for Python code in any other
>>>>> language.
>>>>>
>>>>> In my opinion 2. is the critical point. Once you start with this you
>>>>> are basically inventing a new scripting language. Consequently, a
>>>>> program which wants to use the data needs to parse this code and take
>>>>> the appropriate action, which is not a trivial task.
>>>>>
>>>>> In addition one should be very careful with such things to not open
>>>>> Pandora's box. I would restrict this to plain math (no loops, no
>>>>> branching, etc.). Otherwise you really invent a new language.
>>>>>
>>>>> What is still open is how to fill the namespace for the code to execute.
>>>>> Where should all the data mentioned in
>>>>>
>>>>> offset  + data**c
>>>>>
>>>>> come from?
>>>>>
>>>>> At the end of the day there is one question we should seriously discuss:
>>>>> how far do we want to go with math in Nexus?
>>>> This is exactly the point why we never got down to anything in this.
>>>> In 2011, Amando was tasked to make a proposal but nothing ever
>>>> came from it. Without the manpower and support, we cannot
>>>> maintain our own data interpreter that is both cross-platform
>>>> and performs well.
>>>>
>>>> IMHO, the wisest solution seems to be to add what is
>>>> absolutely required. Required in the sense to cover the use case where
>>>> you do not have the time to do the conversion on the fly before
>>>> writing the data to disk.
>>>>
>>>> Regards,
>>>>
>>>>              Mark
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> regards
>>>>>     Eugen
>>>>>
>>>>>

-- 
----------------------------------------------------------
  Pete R. Jemian, Ph.D.                <jemian at anl.gov>
  Beam line Controls and Data Acquisition, Group Leader
  Advanced Photon Source,   Argonne National Laboratory
  Argonne, IL  60439                   630 - 252 - 3189
-----------------------------------------------------------
     Education is the one thing for which people
        are willing to pay yet not receive.
-----------------------------------------------------------