[NeXus-committee] Example of links
Paul Millar
paul.millar at desy.de
Thu Jan 30 18:05:46 GMT 2025
Hi Ray,
Thanks for sharing these examples and for talking about the "target" attribute.
For me, this is very interesting.
I took the opportunity to read through the description of groups and
links in the HDF5 manual. I've a background in storage and filesystem
programming, so the concepts in HDF5 make perfect sense to me: it's
(more or less) just the standard POSIX filesystem's namespace. HDF5
even reuses some of the POSIX vocabulary.
What confuses me is the "target" attribute in NeXus.
As the NeXus Design page itself describes, hard links (i.e., the same
object being linked to under multiple groups) are symmetric. There is no
sense of source and destination. Instead, hard linking simply means being
able to refer to the same object via two (or more) paths. Under HDF5,
these paths are equivalent: neither path is more important.
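The symmetry can be sketched with the POSIX analogy itself, using only the Python standard library. This illustrates filesystem hard links rather than HDF5, and the file names are hypothetical; it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    """Create one file, link it under a second name, and report what POSIX sees."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "polar_angle")        # hypothetical name
        second = os.path.join(tmp, "polar_angle_link")  # hypothetical name

        with open(first, "w") as fh:
            fh.write("v1")

        os.link(first, second)  # a second name for the same inode

        st_first, st_second = os.stat(first), os.stat(second)
        same_object = os.path.samestat(st_first, st_second)
        link_count = st_first.st_nlink

        # Write through the second name, then read through the first.
        with open(second, "w") as fh:
            fh.write("v2")
        with open(first) as fh:
            seen_via_first = fh.read()

        # Removing the first name leaves the data reachable via the second.
        os.unlink(first)
        with open(second) as fh:
            survives = fh.read()

        return same_object, link_count, seen_via_first, survives
```

Neither name is privileged here: the data written through one name is read back through the other, and deleting the name that was created first does not affect the remaining one.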
From what I see, the NeXus "target" attribute seeks to break this
symmetry. The "target" attribute's value is the absolute path of one of
these links. This makes the "target" path a preferred way of referring to
the object.
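A minimal h5py sketch of this, assuming h5py and NumPy are installed. The file name is hypothetical and the file lives only in memory; the two paths are the ones from the chopper.nxs example quoted below.

```python
import h5py
import numpy as np

DETECTOR_PATH = "/entry/instrument/detector/polar_angle"
DATA_PATH = "/entry/data/polar_angle"

# In-memory file (hypothetical name; nothing is written to disk).
with h5py.File("target_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset(DETECTOR_PATH, data=np.arange(4.0))
    f[DATA_PATH] = f[DETECTOR_PATH]  # hard link: a second path to the same object

    # Before NeXus adds anything, the object carries no attribute
    # that ranks one path above the other.
    before = sorted(f[DATA_PATH].attrs.keys())

    # The NeXus convention: record one of the paths on the object.
    f[DATA_PATH].attrs["target"] = DETECTOR_PATH

    # The attribute belongs to the object, so it is visible through both paths.
    target_via_data = f[DATA_PATH].attrs["target"]
    target_via_detector = f[DETECTOR_PATH].attrs["target"]
```

Which path ends up in "target" is purely the writer's choice; HDF5 would accept either value.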
What I'm missing is why having a preferred path is necessary in NeXus.
The NeXus Design page is somewhat coy about saying why a "target"
attribute is needed. There's some vague mention of people getting
confused when using a particular tool, but nothing concrete. If people
are confused, isn't this rather a problem with that tool or with how
NeXus is organising data?
The page also includes some rather confusing use of terminology. The
page seemingly confuses "links" (all objects are accessible through at
least one link; if not, they are garbage collected) with "hard linking"
(a common term for creating a new reference to an existing object).
The NeXus Design page also talks about the "original dataset". This is
arguably wrong. There is no "original dataset" since all hard links
refer to the same, single dataset. One might talk about the "original
path". However, given two paths, what is it that makes one path "original"?
As a counterexample using the "Linking in a NeXus file" diagram from
the NeXus Design page, with HDF5 semantics I could create the dataset in
one group (that happens to be NXdata) and then create a link to that
dataset under a different group (which happens to be
NXinstrument/NXdetector). In temporal order, the "original dataset" (or
original path, if you prefer) would be under the NXdata group, which
isn't what is shown on the NeXus Design page and (I suspect) not what is
intended.
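This counterexample can be sketched in h5py as follows, assuming h5py and NumPy are installed. The file name and array values are hypothetical, and the file lives only in memory.

```python
import h5py
import numpy as np

NXDATA_PATH = "/entry/data/polar_angle"
NXDETECTOR_PATH = "/entry/instrument/detector/polar_angle"


def create_in_nxdata_first():
    """Create the dataset under NXdata, then link it under NXdetector."""
    f = h5py.File("counter_example.h5", "w", driver="core", backing_store=False)

    # Step 1: the dataset is first created under the NXdata group.
    data = f.create_group("/entry/data")
    data.attrs["NX_class"] = "NXdata"
    data.create_dataset("polar_angle", data=np.linspace(0.0, 90.0, 4))

    # Step 2: only afterwards is it linked under the NXdetector group.
    detector = f.create_group("/entry/instrument/detector")
    detector.attrs["NX_class"] = "NXdetector"
    detector["polar_angle"] = f[NXDATA_PATH]

    f["/entry"].attrs["NX_class"] = "NXentry"
    f["/entry/instrument"].attrs["NX_class"] = "NXinstrument"
    return f


f = create_in_nxdata_first()

# Ask HDF5 what kind of link each path is.
link_kinds = {
    path: type(f.get(path, getlink=True)).__name__
    for path in (NXDATA_PATH, NXDETECTOR_PATH)
}

# A write through the later path is seen through the earlier one.
f[NXDETECTOR_PATH][0] = -1.0
first_value_via_nxdata = float(f[NXDATA_PATH][0])
f.close()
```

Both paths are reported as plain hard links, so the link type gives no indication of which path was created first.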
Cheers,
Paul.
On 28/01/2025 17.33, Osborn, Raymond via NeXus-committee wrote:
>
> Hi everyone,
>
> At yesterday’s Telco, someone asked if I could share the file I showed
> in NeXpy. If you already have NeXpy installed (‘pip install nexpy’),
> then you already have the file. Just open NeXpy and click on “Open
> Example File” in the Help menu. It was the one called “chopper.nxs”.
> Since there was some confusion even about the existence of links, I
> thought some notes might be helpful.
>
> The reason the “target” attribute doesn’t appear in the NXDL files is
> roughly the same reason that “NX_class” attribute doesn’t appear. It’s
> part of the standard that is documented elsewhere
> (https://manual.nexusformat.org/design.html#design-links). In fact, at
> one point, it was considered best practice (or even required) for
> everything in a NXdata group to be a link to an object elsewhere. If
> you look at the latest NeXus publication [M. Koennecke, et al, Physica
> B *385–86*, 1343–1345 (2006)], it states that “The NXdata group holds
> links to all the variables varied or collected during the scan.” It
> later softens this requirement, but the tree structure of chopper.nxs
> illustrates this idea:
>
> >>> print(chopper['entry/data'].tree)
> data:NXdata
>   @axes = ['polar_angle', 'time_of_flight']
>   @signal = 'data'
>   data = int32(148x750)
>   polar_angle -> /entry/instrument/detector/polar_angle
>   time_of_flight -> /entry/instrument/detector/time_of_flight
>
> In the absolutist version of NXdata, the data array should also have
> been linked to the corresponding NXdetector group.
>
> In the nexusformat API, I can add a link to a NXdata group (or
> anywhere) by typing:
>
>     chopper['entry/data/polar_angle'] = NXlink('/entry/instrument/detector/polar_angle')
>
> This automatically makes this a hard link, although you can add
> “soft=True” to make it a soft link. It then adds the “target”
> attribute to chopper['/entry/instrument/detector/polar_angle']. To
> do the same in h5py, you would have to type:
>
>     import h5py
>
>     with h5py.File('chopper.nxs', 'r+') as f:
>         # Hard link: a second path to the same object.
>         f['/entry/data/polar_angle'] = f['/entry/instrument/detector/polar_angle']
>         f['/entry/data/polar_angle'].attrs['target'] = '/entry/instrument/detector/polar_angle'
>
> This would add the ‘target’ attribute to the object in both groups, since
> it’s the same object.
>
> Personally, I think in the base classes, we should still only refer to
> fields, not links. In principle, any object, whether group or field,
> can be a link. In application definitions, if a field is contained
> within one group but also used in another, such as a NXdata group, it
> should be explicitly listed as a link. Whether the documentation needs
> improving is another debate.
>
> With best regards,
>
> Ray
>
> --
> Ray Osborn, Senior Scientist
> Materials Science Division
> Argonne National Laboratory
> Lemont, IL 60439, USA
> Phone: +1 (630) 252-9011
> Email: ROsborn at anl.gov