[Nexus-developers] Handling of C-string null characters

Jens Krüger Jens.Krueger at frm2.tum.de
Wed Jul 6 07:08:46 BST 2005


On Tuesday, 5 July 2005 19:20, Ray Osborn wrote:
> There is one urgent thing that we need to clear up before we release NAPI
> v3.0, and that concerns how we handle string lengths.  Following problems
> with the XML API, Mark has now changed NXgetinfo so that it returns the
> length of the string in the Fortran API but adds one to the length in the C
> API to accommodate the NULL character.  I think this is the wrong way to
> approach this problem, and I think Freddie agreed with me when he wrote to
> confirm what the API now does.  We need to resolve this quickly so other
> opinions are welcomed.
> 
> So I'm raising the old question - how long is a string?
> 
> Current Behaviour:
> 
> NXgetinfo and NXmalloc add the extra byte to the length of character
> strings, when called in C, but it is removed in the Fortran API.  The length
> of "neutron" is 8 in C but 7 in Fortran (and presumably other APIs such as
> Python).  NXgetdata will return "neutron\0" in C, but "neutron" in Fortran.
> 

Hi Ray,

I agree with your proposal, because the strlen() function in C returns the
length of the string without the terminating '\0' byte. Every C programmer is
familiar with the '\0' byte and knows to allow one extra byte for it when
using malloc to allocate space for a string.
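
As a minimal sketch of what this means on the C side, assuming the reader
delivers exactly the stored characters with no terminator, as proposed: the
caller allocates length + 1 bytes and writes the '\0' itself. The names
read_chars and make_c_string are hypothetical helpers for illustration only
and are not part of NAPI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reader that delivers exactly `len`
 * characters and no terminating '\0'. Not a NAPI function. */
static void read_chars(char *dest, const char *stored, size_t len)
{
    memcpy(dest, stored, len);
}

/* Build a C string from `len` unterminated characters.
 * Returns NULL if the allocation fails; the caller frees the result. */
static char *make_c_string(const char *stored, size_t len)
{
    assert(stored != NULL);

    char *buf = malloc(len + 1);   /* one extra byte for the '\0' */
    if (buf == NULL)
        return NULL;

    read_chars(buf, stored, len);  /* copies exactly len characters */
    buf[len] = '\0';               /* the caller adds the terminator */
    return buf;
}
```

Calling make_c_string("neutron", 7) gives a buffer of 8 bytes for which
strlen() reports 7, so the reported length matches what the Fortran API
would see.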

Jens

> Proposal (my view, and I believe Freddie's):
> 
> The length of a character string returned by NXgetinfo should be the number
> of characters excluding the NULL character, and NXgetdata should return
> exactly those characters.  The documentation should warn the C-programmer to
> add one byte to the allocation, if they use malloc directly, and to add the
> NULL character to the string returned by NXgetdata to make a C-string.
> NXmalloc will automatically add the extra byte when allocating memory.
> 
> This ensures that the length does not depend on the language used to read
> the NeXus file.   C-programmers are used to dealing with this issue and
> don't need to be spoon-fed.  The average non-programming user will, however,
> be confused why "neutron" is 8 characters long according to NXbrowse and
> most other generic file readers, but only seven according to the Fortran
> API.  This will prevent such confusion in a well-documented way.
> 
> We may need to put this to a vote, but we should settle it before Friday if
> Nick's timetable is to be kept.
> 
> Regards,
> Ray 

-- 

Jens Krüger

Technische Universität München
ZWE FRM-II
Lichtenberg-Str. 1
D-85747 Garching

Tel: + 49 89 289 14 716
Fax: + 49 89 289 14 666
mailto:jens.krueger at frm2.tum.de
http://www.frm2.tum.de/




