[Nexus-developers] Handling of C-string null characters
Jens Krüger
Jens.Krueger at frm2.tum.de
Wed Jul 6 07:08:46 BST 2005
On Tuesday, 5 July 2005 19:20, Ray Osborn wrote:
> There is one urgent thing that we need to clear up before we release NAPI
> v3.0, and that concerns how we handle string lengths. Following problems
> with the XML API, Mark has now changed NXgetinfo so that it returns the
> length of the string in the Fortran API but adds one to the length in the C
> API to accommodate the NULL character. I think this is the wrong way to
> approach this problem, and I think Freddie agreed with me when he wrote to
> confirm what the API now does. We need to resolve this quickly so other
> opinions are welcomed.
>
> So I'm raising the old question - how long is a string?
>
> Current Behaviour:
>
> NXgetinfo and NXmalloc add the extra byte to the length of character
> strings, when called in C, but it is removed in the Fortran API. The length
> of "neutron" is 8 in C but 7 in Fortran (and presumably other APIs such as
> Python). NXgetdata will return "neutron\0" in C, but "neutron" in Fortran.
>
Hi Ray,
I agree with your proposal, since the strlen() function in C returns the length of
the string without the terminating '\0' byte. Every C programmer knows about the
'\0' byte and that it needs one extra byte whenever space for a string is malloc'ed.
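To make the strlen() point concrete, here is a minimal sketch in plain C. It does not call NAPI at all, and the two helper names are made up for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters in s, excluding the
 * terminating '\0'. This is exactly what strlen() reports. */
size_t text_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Hypothetical helper: bytes needed to hold s as a C string,
 * i.e. the characters plus one byte for the terminating '\0'. */
size_t storage_size(const char *s)
{
    return text_length(s) + 1;
}
```

For "neutron", text_length() gives 7 and storage_size() gives 8, so the "length" a C programmer already works with is the 7 that Fortran reports.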
Jens
> Proposal (my view, and I believe Freddie's):
>
> The length of a character string returned by NXgetinfo should be the number
> of characters excluding the NULL character, and NXgetdata should return
> exactly those characters. The documentation should warn the C-programmer to
> add one byte to the allocation, if they use malloc directly, and to add the
> NULL character to the string returned by NXgetdata to make a C-string.
> NXmalloc will automatically add the extra byte when allocating memory.
>
> This ensures that the length does not depend on the language used to read
> the NeXus file. C-programmers are used to dealing with this issue and
> don't need to be spoon-fed. The average non-programming user will, however,
> be confused why "neutron" is 8 characters long according to NXbrowse and
> most other generic file readers, but only seven according to the Fortran
> API. This will prevent such confusion in a well-documented way.
>
> We may need to put this to a vote, but we should settle it before Friday if
> Nick's timetable is to be kept.
>
> Regards,
> Ray
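What the documentation would ask of the C programmer under this proposal could look roughly like the sketch below. It is only an illustration: make_c_string is a hypothetical helper, not a NAPI function, and the data/len arguments merely stand in for the characters and the length that NXgetdata and NXgetinfo would hand back.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of NAPI. Copies exactly len characters
 * (no terminator assumed in data) into a freshly allocated buffer of
 * len + 1 bytes and appends '\0'. Returns NULL if allocation fails.
 * The caller must free() the result. */
char *make_c_string(const char *data, size_t len)
{
    assert(data != NULL || len == 0);

    char *s = malloc(len + 1);   /* one extra byte for the '\0' */
    if (s == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(s, data, len);    /* exactly the stored characters */
    }
    s[len] = '\0';               /* make it a proper C string */
    return s;
}
```

With seven stored characters "neutron" and len = 7, the result is an 8-byte buffer whose strlen() is again 7.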
--
Jens Krüger
Technische Universität München
ZWE FRM-II
Lichtenberg-Str. 1
D-85747 Garching
Tel: + 49 89 289 14 716
Fax: + 49 89 289 14 666
mailto:jens.krueger at frm2.tum.de
http://www.frm2.tum.de/