[NeXus-committee] NeXus paper V5
Joachim Wuttke
j.wuttke at fz-juelich.de
Wed Aug 13 17:00:41 BST 2014
Dear Mark,
Back from my summer vacation, I find your manuscript.
Thank you for this initiative. Getting this overview
fit for print will be a helpful exercise since it
obliges us to clarify a number of points. Please
find my questions and comments below.
What I like best is the praiseworthy modesty of the
first four words of the abstract: "NeXus is an effort".
Best regards, Joachim
p 1, title matter:
The minimal version of my full address is
"Forschungszentrum Jülich, JCNS at MLZ, Garching, Germany".
p 1, Sect I, col 1, bullet 1:
"then" -> "than"
Or: "more difficult than it needs to be" -> "unnecessarily difficult"
p 1, Sect II, sentence 1:
Who are the "users" of NeXus?
(A) The instrument users, or (B) the authors of data acquisition
and instrument control software?
End users of a fully automated instrument have little or no
choice as to which data are stored, which points to answer B.
However, "their experiment or data analysis" only makes sense with A.
p 2, col 1, lines 3-4:
"to read only one logical file" - in his lifetime? per experiment?
per scan?
p 2, col 1, paragraph 3 "Of course, NeXus strives ...":
I suggest deleting this paragraph as largely redundant with what
has been said in Sect I.
p 2, col 1, paragraph 4 "A NeXus container file...":
Here two things are mixed:
- A raw data file may contain some entries that are not needed
for data analysis (but may be valuable for long-term documentation
of the instrument used, or for organisational purposes: for
instance, the names of the experimentalists clearly belong in
a raw data file, but are not needed for data analysis).
- NeXus provides many different types of fields, of which any
given instrument only uses a subset.
The numbers 100 and 20 are too arbitrary. The paragraph is
not yet convincing as a rationale for application definitions.
To make the structure of the paper more transparent (what is described
where), I would suggest enriching Sects I and/or II with forward
references to the following sections.
p 2, col 2, "provided by the NeXus class name":
Perhaps indicate that this is realized as an HDF5 attribute.
p 2, col 2, last full paragraph:
A sample is not part of the instrument. Therefore I see no need
for justifying why NXsample is not part of NXinstrument.
On the other hand, I have long wondered why NXmonitor is
not part of NXinstrument. I am glad to see an
attempt at justification, albeit an unconvincing one.
I guess "quickly" refers not to "pulled", but to "access", and
should be moved accordingly.
I guess, "access" does not mean machine access (which should be
a matter of microseconds) but easy inspection by humans occasionally
browsing a raw data file.
Even with these corrections, I find this a terribly unconvincing
rationale for drawing NXmonitor out of its natural place in the
hierarchy. Therefore I suggest a wording like: "In the course of
NeXus history, the debatable decision was taken to move NXmonitor
out of NXinstrument to a higher hierarchy level, in order to
facilitate quick inspection by humans."
p 2/3, "Another requirement":
Requirements are listed in Sect I. A reader will expect this list
to be complete. We should not introduce further requirements later.
Also, we should clarify whether the NXdata section is obligatory
(for a file to be valid NeXus), or whether it is optional (required
only if generic plot tools are to be used).
p 3, Sect IIIA1:
The division into two steps does not appear natural to me.
Also, I see no need to justify the choice of some rules by citing
the problem for automatic tools to find their data. Everything
in NeXus is intended to enable automatic parsing, so why make
this point here?
The text under step 2 is unclear to me. "The NXsubentry groups
have the same hierarchy as the NXentry group"? No, according to
table II, NXsource is still in NXentry/NXinstrument, not in
NXentry/NXsubentry/NXinstrument.
"to avoid duplication" - duplication of which data? "links into
the main NXentry hierarchy" - links to which entities? This needs a
longer explanation. Is it possible to give an example of such
links in table II?
p 3, Sect IIIA2, bullet 1:
"appropriate place" - sounds like it is not entirely trivial
to determine where to store what. Should be elaborated.
"The array's first dimension is the number of ..." Categorically
impossible: a dimension (meant here in the sense of axis, not of
rank) can contain a given number of points, but it "is" not a number.
Perhaps: "The array's first dimension is the scan axis."
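To illustrate the distinction between an axis and the number of points
along it, here is a minimal plain-Python sketch. The array shape and all
names (N_SCAN, SCAN_AXIS, shape) are hypothetical and not taken from the
manuscript or from the NeXus API.

```python
# Hypothetical scan: 3 scan points, each yielding a 2 x 4 detector image.
N_SCAN, N_Y, N_X = 3, 2, 4
data = [[[0 for _ in range(N_X)] for _ in range(N_Y)] for _ in range(N_SCAN)]


def shape(nested):
    """Return the lengths along each axis of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# The first dimension (axis 0) is the scan axis ...
SCAN_AXIS = 0
# ... and the number of scan points is the length along that axis.
n_scan_points = shape(data)[SCAN_AXIS]
```

Here axis 0 is the scan axis, while n_scan_points is merely its length.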
Same subsection, last sentence:
Spurious "is" in "is behaviour is".
What exactly is undefined? The preceding sentence says rather clearly
what happens when a multi-d scan is interrupted. What more needs
to be defined?
p 4, Sect IV, paragraph 1, line 5:
"possible" -> "allowed"
Same paragraph, lines 8-11:
The text introduces
- "the NeXus dictionary" = the collection of all NeXus base classes
- "dictionaries of allowed names" = NeXus base classes.
This is utterly confusing, and incompatible with everyday usage
of the word "dictionary".
Same paragraph, line 11:
"names" -> "keywords to designate groups and fields"
Same paragraph, lines 12-13:
"for all the names in a NeXus base class"
-> "for all allowed keywords".
p 4, Sect IV, last paragraph:
"Procedures are in place":
Far too short for something that is critical for NeXus to function.
Or is it meant as a bridge towards Sect V? Then the connection
must be made explicit.
p 4, Sect V, paragraph 1, sentence 2:
"NeXus asks you to store .. information about your data"
I understand nothing.
What does the subject, "NeXus", stand for: The NIAC? The API?
Who is "you": a NeXus user? a contributor?
"information about data" is "metadata" for short. Where should
I store it? Why should I store as much of it as possible?
Next sentence:
"NeXus defines" - again: who??
Next sentence:
Fully redundant.
"This" refers to the subject of the previous sentence.
Hence the sentence says:
"A certain use case of NeXus, an application definition,
is the NeXus application definition."
Next sentence:
Within this sentence, "use cases" is used once correctly,
and then twice in a strange way that breaks the fixed
expression and uses "use" as a predicate.
Same sentence:
Omega -> $\Omega$
Same paragraph, last sentence:
Why is it noteworthy that data can be ignored?
Sect V, paragraph 2:
The expression "use case" continues to change its meaning.
In the previous paragraph it was equated first with "an application
definition", then with a type of applications like SAS or
powder diffraction. Now it's something that can be fulfilled.
p 5, col 1, line 5:
"Users will ... produce compatible NeXus files for data written ..."
So first the data are written, and then the NeXus files are produced?
And these are _compatible_ ones, as opposed to what?
p 5, Sect VII:
Calling some facilities "relevant" sounds unfriendly towards the
others. Suggestion: "In the NIAC, most major neutron, X-ray, and
muon facilities are represented. Other facilities are invited to
join."
p 5, Sect VIII:
"still exists and" is a void statement and can be deleted
(nothing can be done to discontinue the existence of published code).
"maintained at a bug fix level" can be understood as: should not
be used for new projects. What then is the recommended API for
new projects? A plain HDF5 emitter/parser?
p 5, Sect IX, last sentence:
Why should one hesitate to consult a "WWW-site"? Suggestion:
"More information, including a full PDF manual, can be found on
the project web site. Also, do not hesitate to contact members
of the NIAC."
p 5, Ref 1:
Should be formatted like a regular journal reference. And indicate
authors, not editors.