[NeXus-committee] Example of links

Aaron Brewster asbrewster at lbl.gov
Thu Jan 30 21:50:10 GMT 2025


In h5py, I had thought you could query a group or field to see if it's a
soft link and get its original location.  I don't know how to do the same
for a hard link but I presume it's possible.  Therefore the target
attribute would appear to be redundant.
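
For what it's worth, here is a minimal h5py sketch of that query. It assumes the getlink=True option of Group.get, which returns a link object rather than the linked item; the file and dataset names are made up for illustration. A soft link reports the path it points to. A hard link reports no path, so the closest check I know of is to compare two handles for equality.

```python
import h5py

# In-memory file so nothing is written to disk; all names are illustrative.
with h5py.File("links_demo.h5", "w", driver="core", backing_store=False) as f:
    det = f.create_group("entry/instrument/detector")
    det.create_dataset("polar_angle", data=[1.0, 2.0, 3.0])
    data = f.create_group("entry/data")

    # One soft link and one hard link to the same dataset.
    data["soft"] = h5py.SoftLink("/entry/instrument/detector/polar_angle")
    data["hard"] = det["polar_angle"]

    # getlink=True returns the link object instead of the dataset itself.
    soft = data.get("soft", getlink=True)
    hard = data.get("hard", getlink=True)

    soft_is_soft = isinstance(soft, h5py.SoftLink)
    soft_path = soft.path  # a soft link records the path it points to
    hard_is_hard = isinstance(hard, h5py.HardLink)  # carries no path at all

    # Two handles compare equal when they refer to the same HDF5 object.
    same_object = data["hard"] == det["polar_angle"]
```

So the soft-link case is recoverable from HDF5 alone, while the hard-link case only tells you that two paths share an object, not which path is meant to be the parent.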

However, to me, the most important reason to have @target is to avoid being
tied to HDF5.  It's useful to have it from a specification point of view.
-Aaron

On Thu, Jan 30, 2025 at 1:28 PM Raymond Osborn via NeXus-committee <
nexus-committee at shadow.nd.rl.ac.uk> wrote:

> Hi Paul,
> Thanks for the follow-up questions. I will try to answer them below.
>
> *From*: NeXus-committee <nexus-committee-bounces at shadow.nd.rl.ac.uk> on
> behalf of Paul Millar via NeXus-committee <
> nexus-committee at shadow.nd.rl.ac.uk>
> *Date*: Thursday, January 30, 2025 at 12:07 PM
> *To*: nexus-committee at nexusformat.org <nexus-committee at nexusformat.org>
> *Subject*: Re: [NeXus-committee] Example of links
>
> Hi Ray,
>
> Thanks for sharing these examples and for talking about the "target"
> attribute.
>
> For me, this is very interesting.
>
> I took the opportunity to read through the description of groups and
> links in the HDF5 manual.  I've a background in storage and filesystem
> programming, so the concepts in HDF5 make perfect sense to me: it's
> (more or less) just the standard POSIX filesystem's namespace.  HDF5
> even reuses some of the POSIX vocabulary.
>
> What confuses me is the "target" attribute in NeXus.
>
> As the NeXus Design page itself describes, hard links (i.e., the same
> object being linked to under multiple groups) are symmetric. There is no
> sense of source and destination.  Instead, hard links are simply being
> able to refer to the same object via two (or more) paths.  Under HDF5,
> these paths are equivalent: neither path is more important.
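
The symmetry described here can be checked directly. This is a minimal h5py sketch, assuming an in-memory file and made-up names (a/values, b/values). The dataset is created under one path, hard-linked under a second, and then the first path is deleted. The object stays reachable through the remaining path, so creation order confers no special status.

```python
import h5py

# In-memory file; the group and dataset names are purely illustrative.
with h5py.File("symmetry_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("a/values", data=[10, 20, 30])
    f.create_group("b")
    f["b/values"] = f["a/values"]  # second hard link to the same object

    # Remove the path that was created first. The object survives because
    # another hard link still refers to it.
    del f["a/values"]

    survivors = f["b/values"][...].tolist()
    first_path_gone = "a/values" not in f
```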
>
> From what I see, the NeXus "target" attribute seeks to break this
> symmetry.  The "target" attribute's value is one of these paths, given in
> absolute form.  This makes the "target" path a preferred way of referring
> to the object.
>
> What I'm missing is why having a preferred path is necessary in NeXus.
>
>
> If the reason for using links is to save space (e.g., adding the same
> sample information to multiple entries), then it probably doesn’t matter
> which is the parent and which the child. The purpose of the link could also
> be to ensure that, e.g., the sample lattice parameter is updated in every
> entry when it is changed in one of them. Again, none of the objects is
> obviously the parent.
>
> However, there are important structural reasons for adding links with one
> of the objects as the parent. The most common use of links is in the NXdata
> group, where the axes are stored elsewhere. Here’s a shortened version of
> chopper.nxs, for example.
>
> >>> print(chopper.tree)
> chopper:NXroot
>     entry:NXentry
>         data:NXdata
>             @axes = ['polar_angle', 'time_of_flight']
>             @signal = 'data'
>             data = int32(148x750)
>             polar_angle -> /entry/instrument/detector/polar_angle
>             time_of_flight -> /entry/instrument/detector/time_of_flight
>         instrument:NXinstrument
>             detector:NXdetector
>                 distance = float32(148)
>                 polar_angle = float32(148)
>                 time_of_flight = float32(751)
>                 type = 'He3 gas cylinder'
>
> Here the main NXdata group plots the data against polar angle and
> time-of-flight, both of which are properties of the detector and so are
> stored in ‘entry/instrument/detector’. If someone plotting the data wants
> to know about other detector properties, such as the sample-to-detector
> distance, those are also in the NXdetector group and the target attribute
> shows the user where to look. There could be multiple NXdetector groups,
> but the link identifies the right one. So the target attribute provides
> important functionality. A data reduction script that converts from
> time-of-flight to energy transfer must know in which group the relevant
> distance fields are stored. That is only possible by making the object in
> the NXdetector group the parent and using the ‘target’ attribute to point
> to it.
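
A minimal sketch of the lookup described above, assuming h5py and a toy file loosely modelled on the chopper.nxs layout. The distance and time-of-flight values are placeholders, not real data. Starting from the axis in the NXdata group, the script reads the ‘target’ attribute, takes its parent group, and finds the distance field there.

```python
import posixpath

import h5py

# In-memory toy file; the values are placeholders for illustration only.
with h5py.File("chopper_demo.h5", "w", driver="core", backing_store=False) as f:
    det = f.create_group("entry/instrument/detector")
    det.create_dataset("distance", data=[2.5] * 148)
    tof = det.create_dataset("time_of_flight", data=list(range(751)))

    # NeXus convention: the linked field records the path of its parent copy.
    tof.attrs["target"] = "/entry/instrument/detector/time_of_flight"

    data = f.create_group("entry/data")
    data["time_of_flight"] = tof  # hard link into the NXdata group

    # A reduction script that starts from NXdata can locate the detector group.
    axis = f["entry/data/time_of_flight"]
    target = axis.attrs["target"]
    detector_path = posixpath.dirname(target)
    distance_shape = f[detector_path]["distance"].shape
```

Without the attribute, the script would only know that the axis is some dataset with one or more paths, and it would have to search every NXdetector group to find the matching distances.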
>
> Ironically, I think this functional purpose is what led the Fairmat group
> to propose the ‘target’ attribute, so the original reasoning was sound, if
> now forgotten.
>
>
> The NeXus Design page is somewhat coy about saying why a "target"
> attribute is needed.  There's some vague mention of people getting
> confused when using a particular tool, but nothing concrete.  If people
> are confused, isn't this rather a problem with that tool or with how
> NeXus is organising data?
>
>
> The importance of links was crystal-clear to the original developers of
> NeXus twenty years ago for the reasons I described above. I hadn’t realized
> that this aspect of the standard was no longer understood. I guess we did a
> bad job of documenting it at the time.
>
> The page also includes some rather confusing use of terminology. The
> page seemingly confuses "links" (all objects are accessible through at
> least one link, if not they are garbage collected) with "hard linking"
> (a common term for creating a new reference to some existing objects).
>
>
> If documentation of NeXus links is intermingled with discussions of
> garbage collection, then it should be changed.
>
>
> The NeXus Design page also talks about the "original dataset". This is
> arguably wrong.  There is no "original dataset" since all hard links
> refer to the same, single dataset. One might talk about the "original
> path".  However, given two paths, what is it that makes one path
> "original"?
>
>
> This may be clumsy wording, but I think the meaning in the above example
> is that ‘/entry/instrument/detector/time_of_flight’ is the “original
> dataset.” It is reproduced in the NXdata group to make plotting more
> convenient.
>
>
> As a counterexample, using the "Linking in a NeXus file" diagram from
> the NeXus Design page, with HDF5 semantics I could create the dataset in
> one group (that happens to be NXdata) and then create a link to that
> dataset under a different group (which happens to be
> NXinstrument/NXdetector). In temporal order, the "original dataset" (or
> original path, if you prefer) would be under the NXdata group, which
> isn't what is shown on the NeXus Design page and (I suspect) not what is
> intended.
>
>
> The temporal order when writing the file is irrelevant.
>
> All your complaints about the documentation seem justified, so we should
> probably revise it, but the value of using the target attribute is still, I
> believe, valid.
>
> I hope this helps.
>
> With best regards,
> Ray
>  --
> Ray Osborn, Senior Scientist
> Materials Science Division
> Argonne National Laboratory
> Lemont, IL 60439, USA
> Phone: +1 (630) 252-9011
> Email: ROsborn at anl.gov
>