[Nexus] Fwd: Re: [MAHID] naming conventions
Pete R. Jemian
prjemian at gmail.com
Fri Jan 29 19:01:35 GMT 2010
Freddie:
Agreed.
NeXus.xsd is not used to validate NXDL files.
NXDL files validate against nxdl.xsd which includes nxdlTypes.xsd.
Can we merge these?
Or I could add the same rule to nxdl.xsd.
Currently, nxdl.xsd permits names to be any "xs:string", which is too permissive.
Pete
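
For illustration, here is a minimal Python sketch comparing the three
expressions discussed in this thread. The sample names and the check()
helper are made up for this example. fullmatch() is used because an XSD
pattern must match the whole value, not just a substring.

```python
import re

# Pattern reported in NeXus.xsd for "validName" (allows a leading digit).
CURRENT = re.compile(r"[_a-zA-Z0-9]+")
# Pattern proposed as the fix (first character must not be a digit).
PROPOSED = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")
# Pattern from the MAHID suggestion, which additionally allows ".".
WITH_DOT = re.compile(r"[_a-zA-Z][_a-zA-Z.0-9]*")


def check(name):
    """Report which of the three patterns accept the entire name."""
    return {
        "current": CURRENT.fullmatch(name) is not None,
        "proposed": PROPOSED.fullmatch(name) is not None,
        "with_dot": WITH_DOT.fullmatch(name) is not None,
    }


if __name__ == "__main__":
    # Hypothetical sample names, chosen to exercise each rule.
    for name in ["data", "_x1", "2theta", "polar.angle", " data "]:
        print(repr(name), check(name))
```

Only the current NeXus.xsd pattern accepts "2theta", only the MAHID
pattern accepts "polar.angle", and none of the three accepts " data ".
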
On 1/29/2010 12:51 PM, freddie.akeroyd at stfc.ac.uk wrote:
> Pete,
>
> I believe the NeXus aim was to stick to character sequences that were
> also valid as program variable names; this allows programming language
> classes/structures to be built that mirror a defined file structure. The
> scheme you enclose below allows "." which is usually invalid in program
> variable names, instead being reserved as an operator. The expression
> "[_a-zA-Z][_a-zA-Z0-9]*" fits with the NeXus aim and is probably the
> best to use, but I've just noticed a small mistake in
> http://svn.nexusformat.org/definitions/trunk/NeXus.xsd as it uses
> "[_a-zA-Z0-9]+" for the "validName" restriction thus allowing variable
> names to start with a digit rather than only contain them; we should
> update it to "[_a-zA-Z][_a-zA-Z0-9]*"
>
> Regards,
>
> Freddie
>
>> -----Original Message-----
>> From: nexus-bounces at nexusformat.org [mailto:nexus-
>> bounces at nexusformat.org] On Behalf Of Pete R. Jemian
>> Sent: 29 January 2010 18:17
>> To: NeXus
>> Subject: [Nexus] Fwd: Re: [MAHID] naming conventions
>>
>>
>> Matt Newville suggests a stronger declaration of our naming rules.
>> Our manual, in ClassDefinitions.xml, says:
>> -------------------% clip here %-------------------
>> <para>Short name of the data field.
>> Name must satisfy both HDF and XML
>> naming rules.</para>
>> -------------------% clip here %-------------------
>>
>>
>> Matt suggests something stronger, derived from the XML standard,
>> -------------------% clip here %-------------------
>> Names for Groups, Datasets, and attributes must match:
>> NameStartChar ::= _ | a..z | A..Z
>> NameChar ::= NameStartChar | . | 0..9
>> Name ::= NameStartChar (NameChar)*
>>
>> Or, as a regular expression: [_a-zA-Z][_a-zA-Z.0-9]*
>> -------------------% clip here %-------------------
>>
>> Are these names validated in any way?
>>
>> Also, we _must_ tighten up our examples (in the manual
>> and example data files)!
>>
>> Comments?
>>
>>
>>
>>
>> -------- Original Message --------
>> Subject: Re: [MAHID] naming conventions
>> Date: Fri, 29 Jan 2010 11:16:10 -0600
>> From: Matt Newville<newville at cars.uchicago.edu>
>> Reply-To: mahid at googlegroups.com
>> To: mahid at googlegroups.com
>>
>> Hi,
>>
>> I think that the Nexus approach toward names of (correct me if I have
>> this wrong)
>> A Name for Group, Dataset or attributes
>> must be a valid HDF5 and XML name.
>>
>> is a bit too weak. To verify a name is allowed, does one check both?
>> I don't actually see a simple grammar production for HDF5 names (I
>> believe it may simply be "char*"). Spaces and non-printable ASCII
>> characters are definitely allowed, and I suspect that unicode support
>> in names may vary with HDF5 versions and libraries.
>>
>> I think non-printable characters and whitespace should be avoided. If
>> I read it correctly, one of the examples in the Nexus doc has a
>> dataset named " data " (Example 3.1, page 16):
>> <NXdata name=" data ">
>>   <time_of_flight axis="1" primary="1"> 1500.0 1502.0 1504.0 ...
>>   </time_of_flight>
>>   <polar_angle axis="2" primary="1"> 15.0 15.6 16.2 ...
>>   </polar_angle>
>>   <data> 5 7 14 ...</data>
>> </NXdata>
>>
>> That could be unintentional, but (if I understand correctly) the
>> corresponding HDF5 file would have a Group named " data ", which is
>> allowed (both HDF5 and XML). That seems problematic to me (what if
>> there are Groups named 'data', ' data ', and ' data'?). I recommend a
>> much simplified variation of the XML grammar production that doesn't
>> allow whitespace, non-printable characters, or most punctuation in
>> names. Specifically, I suggest
>>
>> Names for Groups, Datasets, and attributes must match:
>> NameStartChar ::= _ | a..z | A..Z
>> NameChar ::= NameStartChar | . | 0..9
>> Name ::= NameStartChar (NameChar)*
>>
>> Or, as a regular expression: [_a-zA-Z][_a-zA-Z.0-9]*
>>
>> We could consider other punctuation characters, such as '@$&~|:-', but
>> I think we could easily live without these too.
>>
>> Any comments?
>>
>> Cheers,
>>
>> --Matt Newville
>> _______________________________________________
>> NeXus mailing list
>> NeXus at nexusformat.org
>> http://lists.nexusformat.org/mailman/listinfo/nexus
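
As a rough sketch of the concern raised in the quoted message above, the
following Python writes the suggested grammar out as explicit character
sets and applies it to three names that differ only in whitespace. The
dict is only a stand-in for a group's children (no HDF5 library is
involved), and is_valid_name() is a name invented for this example.

```python
import string

# Character classes taken from the grammar in the quoted message.
NAME_START_CHARS = set("_" + string.ascii_letters)
NAME_CHARS = NAME_START_CHARS | set("." + string.digits)


def is_valid_name(name):
    """Name ::= NameStartChar (NameChar)*"""
    if not name or name[0] not in NAME_START_CHARS:
        return False
    return all(ch in NAME_CHARS for ch in name[1:])


# Hypothetical sibling names; the values are placeholders.
groups = {"data": 1, " data ": 2, " data": 3}

if __name__ == "__main__":
    # All three names are distinct keys, so they can coexist.
    print(len(groups))
    # Only the name without surrounding whitespace passes the grammar.
    print([name for name in groups if is_valid_name(name)])
```

Under this grammar only "data" is accepted, so the ambiguity between the
three spellings cannot arise in a validated file.
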
--
----------------------------------------------------------
Pete R. Jemian, Ph.D. <jemian at anl.gov>
Beam line Controls and Data Acquisition, Group Leader
Advanced Photon Source, Argonne National Laboratory
Argonne, IL 60439 630 - 252 - 3189
-----------------------------------------------------------
Education is the one thing for which people
are willing to pay yet not receive.
-----------------------------------------------------------