Occam's Razor and attributes

c.m.moreton-smith at rl.ac.uk
Thu Apr 23 09:43:48 BST 1998


Apologies for the brevity of the previous message.  I was just trying to
see how much of the original SoftNeSS group was represented on the NAPI
list, but mailed to the list itself by mistake.

Something that has been concerning me for some time is the duplication
in the NeXus standard between the functionality provided by
lists of attributes and that provided by (V)groups.  The principle of
Occam's razor is to avoid creating multiple pathways and extra
complexity when they are not required, because sooner or later doing so
is usually equivalent to shooting oneself in the foot!

In NeXus, this occurs when we provide two ways of storing/grouping named
data items, one by adding HDF attributes and one by storing extra fields
in a Vgroup.

The glossary sensibly restricts its use of attributes to NXData, but it
is clear that the expectation is that users could be adding a Units
attribute to each "measurable" quantity, as well as other attribute
types (even groups).  From the point of view of the programming
interface, adding a specific Unit in the form of an attribute is a nice
convenience, but should we really be doing this in the underlying HDF?
For example, is

Example 1a
=========
	my_entry	(Class=NXentry)
	{
		run234		:	NXData
		LRMECS	:	NXInstrument
		CuFeSO4	:	NXSample
	} :
	<attributes>
		My_Notes	: 	NXChar
		Sample_run	:	NXChar

really simpler than the version below?

Example 1b
=========
	my_entry	(Class=NXentry)
	{
		run234			:	NXData
		LRMECS		:	NXInstrument
		CuFeSO4		:	NXSample
		Jeffs_comments 	:	NXGroup (generic but could have a class)
		{
			My_Notes	: 	NXChar
			Sample_run	:	NXChar
		}
	}

Apart from only needing one type of call (NXmakegroup) in the
reading/writing code, we also have an indication (the name of the
sub-group) of why the information was entered separately rather than
just added to the group as two extra fields.  A browser reading this
does not need to sort out a different way of displaying attributes just
for the sake of it.
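
As a rough illustration of the browser point, here is a minimal Python
sketch of Example 1b held in memory.  The Group and Field classes and the
listing() function are hypothetical names invented for this sketch; they
are not part of NAPI or HDF.  With groups only, one recursive routine is
enough to display the whole tree.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Field:
    """Hypothetical stand-in for a leaf data item."""
    name: str
    nxclass: str


@dataclass
class Group:
    """Hypothetical stand-in for a Vgroup: a named, classed container."""
    name: str
    nxclass: str
    children: List[Union["Group", Field]] = field(default_factory=list)


def listing(node, depth=0):
    """Return one indented line per node; groups and fields share one path."""
    lines = ["  " * depth + f"{node.name} : {node.nxclass}"]
    for child in getattr(node, "children", []):
        lines.extend(listing(child, depth + 1))
    return lines


# Example 1b: the notes live in an ordinary sub-group, not in attributes.
my_entry = Group("my_entry", "NXentry", [
    Group("run234", "NXData"),
    Group("LRMECS", "NXInstrument"),
    Group("CuFeSO4", "NXSample"),
    Group("Jeffs_comments", "NXGroup", [
        Field("My_Notes", "NXChar"),
        Field("Sample_run", "NXChar"),
    ]),
])

print("\n".join(listing(my_entry)))
```

Holding Example 1a in the same way would need a second container on each
node for the attributes, plus a second branch in listing() to show them.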

What about a more awkward case?

Example 2a
=========
	run234		(Class=NXdata)
	{
		X	:	SDS, Rank = 2	<attributes>
							Units = "mm"
							Orientation = "1,1,0"
		Counts	:	SDS, Rank = 1 <attributes>
							Signal = 1
	}
	
This would become

Example 2b
=========
	run234		(Class=NXdata)
	{
		X	(Class=NXmeasurement)
		{
			default : SDS, Rank = 2
			Units = "mm"
			Orientation = "1,1,0"
		}
		Counts	(Class=NXmeasurement)
		{
			default : SDS, Rank = 1
			Units = "mm"
			Signal = 1
		}
	}

By adding NXputmeasurement and NXgetmeasurement calls, either to NAPI
or as helper functions, we would also provide a way in which
measurements could "require" units, a signal and any other attributes
which it would be desirable to have available.
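
To make the "require" idea concrete, here is a minimal Python sketch
using plain dictionaries in place of Vgroups.  The names put_measurement
and get_measurement, their signatures, and the choice of which fields are
mandatory are all assumptions made for illustration; no such routines
exist in NAPI yet.

```python
# Assumption: the value ("default") and its units are the mandatory fields.
REQUIRED_FIELDS = ("default", "Units")


def put_measurement(parent, name, value, units, **optional):
    """Store a measurement as a sub-group of `parent` (a plain dict here)."""
    if not units:
        raise ValueError(f"measurement {name!r} needs a units string")
    group = {"Class": "NXmeasurement", "default": value, "Units": units}
    group.update(optional)  # e.g. Signal=1 or Orientation="1,1,0"
    parent[name] = group
    return group


def get_measurement(parent, name):
    """Return (value, units, extras); refuse groups lacking required fields."""
    group = parent[name]
    missing = [key for key in REQUIRED_FIELDS if key not in group]
    if missing:
        raise KeyError(f"{name!r} is missing {missing}")
    extras = {key: val for key, val in group.items()
              if key not in REQUIRED_FIELDS and key != "Class"}
    return group["default"], group["Units"], extras


# Rebuild Example 2b with the two helpers.
run234 = {"Class": "NXdata"}
put_measurement(run234, "X", [[0.0, 1.0], [2.0, 3.0]], "mm",
                Orientation="1,1,0")
put_measurement(run234, "Counts", [10, 12], "mm", Signal=1)

value, units, extras = get_measurement(run234, "Counts")
```

The point of the sketch is only that the check for units lives in one
place, rather than being left to each program that writes a file.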

By tackling this problem now, we are also providing a means to integrate
with Big HDF.

Here is a proposal:

(1)
In NeXus 1.1 (not 1.0!) we modify NAPI to emulate NXputattr and
NXgetattr by storing the attributes in an "NXattributes" class Vgroup
within the current-level Vgroup.

	e.g. starting with

	X : SDS

doing an NXputattr with a units string would produce

	X :
	{
		default 		: SDS
		attributes	: NXattributes
	}

This is not a brilliant solution, but it provides a standard way to
provide backwards compatibility with the NXputattr and NXgetattr calls
which 1.0 "legacy" code might make!

(2)
We add a new class NXmeasurement to contain, at a minimum, the value
being measured as an SDS and a units field, the default measurement to
be called "default".  We also specify an optional "signal" field.

(3)
We deprecate the use of NXputattr and NXgetattr and recommend using two
new routines, NXputmeasurement and NXgetmeasurement.
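
Step (1) of the proposal can be sketched in a few lines of Python, again
with plain dictionaries standing in for Vgroups.  The functions put_attr
and get_attr are hypothetical stand-ins for an emulated NXputattr and
NXgetattr, not the real NAPI signatures, and the sketch assumes a bare
data item is anything that is not already a dictionary with a "default"
entry.

```python
def put_attr(level, name, attr_name, attr_value):
    """Emulated attribute write: promote a bare item to a group if needed."""
    item = level[name]
    if not (isinstance(item, dict) and "default" in item):
        # X : SDS  becomes  X : { default : SDS }
        item = {"default": item}
        level[name] = item
    # Create the NXattributes sub-group on first use, then store the value.
    attrs = item.setdefault("attributes", {"Class": "NXattributes"})
    attrs[attr_name] = attr_value


def get_attr(level, name, attr_name):
    """Emulated attribute read; raises KeyError if it was never set."""
    return level[name]["attributes"][attr_name]


level = {"X": [1.0, 2.0, 3.0]}   # starting point: X : SDS
put_attr(level, "X", "units", "mm")
put_attr(level, "X", "signal", 1)  # a second call must not re-wrap X
```

Legacy code keeps calling the same two routines, while the file itself
only ever contains groups and data items.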


Maybe this is controversial, but I don't believe that, while keeping
both attributes and groups in the interface, any of us could easily
write code in C, FORTRAN, FORTRAN-90, C++ or even Java to hold something
like Example 1a without going to a structure like 1b anyway!

What happens if we do nothing?  I for one am a little concerned about
having to write code to decode a file where someone decides to add a
tree of groups via attributes, maybe even by accident!

Please comment/flame here in NAPI.  I've posted this to NAPI instead of
NeXus as I don't want to undermine confidence in the underlying standard,
even though this question is fundamental and needs sorting out!

PS
There's a nice little page on William of Occam via
http://www.medinfo.ufl.edu/year1/bcs/interv/occam.html

--
Chris Moreton-Smith, Software Development Manager
ISIS Facility, Rutherford Appleton Lab, Chilton, Didcot, OXON OX11 0QX
Telephone: +44 (0) 1235 446544, Fax: +44 (0) 1235 445720
Email: C.Moreton-Smith at rl.ac.uk
