[Nexus] NXdir: SEG fault

Mark Rivers rivers at cars.uchicago.edu
Fri Jan 29 15:03:40 GMT 2010


Hi Mark,

> 80000 bytes is 80K, which does not seem to me particularly large,
> given that even a netbook has 1GB of memory these days.

In a typical single-threaded application an 80K stack is not an issue.
But in a multi-threaded application like the EPICS or Tango control
systems there could be hundreds of threads, each of which needs its own
stack.  Unless one knows for sure which threads can make the NeXus
calls, one might need to create a large stack for every thread.
And how large is large enough?  If the complexity of the NeXus file
grows, does the stack size increase?  These are important questions to
answer, because the consequence of getting it wrong is an application
that will crash during data taking.
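
To make the trade-off concrete, here is a minimal sketch, assuming POSIX
threads, of giving a larger stack only to the one thread that makes the
stack-hungry call, rather than to every thread.  The names run_with_stack
and stack_hungry_task are hypothetical and are not part of EPICS, Tango
or the NeXus API; the task is only a stand-in for a call such as
NXgetgroupinfo.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: run fn(arg) on a new thread whose stack size is
 * set explicitly, then wait for it to finish.
 * Returns 0 on success, or the pthread error code on failure. */
int run_with_stack(void *(*fn)(void *), void *arg, size_t stack_bytes,
                   void **result)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Fails with EINVAL if stack_bytes is below the platform minimum. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, result);
}

/* Stand-in for a stack-hungry library call (an assumption, not NeXus
 * code): touches a 40000-byte local buffer and reports success through
 * the int that arg points to. */
void *stack_hungry_task(void *arg)
{
    volatile char scratch[40000];
    size_t i;

    for (i = 0; i < sizeof scratch; i++)
        scratch[i] = (char)(i % 100);

    *(int *)arg = (scratch[250] == 50);
    return NULL;
}
```

A caller would use run_with_stack(stack_hungry_task, &ok, 256 * 1024,
NULL), leaving the other threads at their default size.  This only helps
when the calls can be funnelled through known threads, and it does not
answer the question of how large is large enough.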

I have been doing EPICS multi-threaded programming for years, and the
default thread stack size of about 10K has always been enough.  The only
time I have had a problem is in this NeXus call.  Recall that data
acquisition applications that create NeXus files could be running on
real-time dedicated processors, like VME, which do not necessarily have
large amounts of memory either.

> As the implementation is working nicely on many systems and there is a
> workaround, I am very much tempted to write this off as a peculiarity
> of that MS compiler.

I don't think this is true.  How many multi-threaded NeXus applications
do you know of?  How many are calling NXgetgroupinfo?

Mark


-----Original Message-----
From: nexus-bounces at nexusformat.org
[mailto:nexus-bounces at nexusformat.org] On Behalf Of Mark Koennecke
Sent: Friday, January 29, 2010 2:31 AM
To: nexus at nexusformat.org
Subject: Re: [Nexus] NXdir: SEG fault


Hi,

these are two separate issues.

Mark Rivers wrote:
> I have discovered a possibly related problem with stack overflow when
> calling the NeXus API from a multi-threaded C program.  I found that the
> function NXgetgroupinfo uses a very large amount of stack.  I would get
> a stack overflow when my thread had less than 40000 bytes of stack; it
> worked OK with 80000 bytes.  The stack overflow only happened for me on
> Windows with the MS VC++ compiler; it did not happen on Linux, or on
> Windows with the gcc compiler.  But I suspect that NXgetgroupinfo is
> doing some recursion that can use a large amount of stack depending on
> the structure of the HDF file.
>   
I assume this relates to HDF-5. NXgetgroupinfo uses H5Giterate in
order to get information about the group. It may recurse; I do not
know how H5Giterate is implemented. It does not allocate particularly
large data structures either. Now H5Giterate is one of the recommended
ways (by the HDF people) to search a group, and it was the only one
available when we implemented the HDF-5 part of NeXus. 80000 bytes
is 80K, which does not seem to me particularly large, given that even
a netbook has 1GB of memory these days.  I am also not sure that
an alternative implementation would use less stack space. As the
implementation is working nicely on many systems and there is a
workaround, I am very much tempted to write this off as a peculiarity
of that MS compiler.
>  
> Mark
>  
>
> ________________________________
>
>
> Using nxdir, I get a SEGV:
>
>
>
> 	nxdir -t multi mcstas.h5 -p "*" --data-mode script
> 	*** stack smashing detected ***: nxdir terminated
> 	Segmentation fault
> 	
>
>   
This is a real bug, caused by a string overrun when formatting numbers
in nxdir. I have fixed this; the updated code is in the NeXus
subversion repository.
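
For readers unfamiliar with this class of bug, the following is a
minimal sketch of bounded number formatting in C.  It is not the actual
nxdir fix, which is not shown in this thread; the helper name
format_number is hypothetical.  The point is that snprintf never writes
past the buffer size it is given, and its return value shows whether the
output was truncated.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format value into buf without ever writing more
 * than buf_size bytes (including the terminating NUL).
 * Returns 0 if the whole number fit, -1 on truncation or error. */
int format_number(char *buf, size_t buf_size, double value)
{
    /* snprintf returns the length the full output would have had,
     * not counting the terminating NUL. */
    int needed = snprintf(buf, buf_size, "%g", value);

    if (needed < 0 || (size_t)needed >= buf_size)
        return -1;
    return 0;
}
```

With an unbounded sprintf into a fixed-size buffer, a long number would
overwrite the stack past the end of the buffer, which is the kind of
error that the "stack smashing detected" check reports.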

Best Regards,

Mark Koennecke
> System is:
>
>
> 	NeXus 4.2.0 (from tarball)
> 	Linux Ubuntu 8.04 64 bits
> 	
>
>
> Generation of NeXus file with McStas performs OK (file attached),
> except for warning messages when trying to create/open already
> existing/opened groups/objects.
>
> The only apparent way to test if a group exists without creating with
> an error message is to call NXgetgroupinfo to get the current group and
> make a strcmp with the requested group.
>
> There is unfortunately no such routine for data sets. We cannot
> obtain the current data set name, only its dimension, rank and type
> with NXgetinfo.
>
> Another solution suggested by Mark K is to deactivate HDF error
> messages with NXMDisableErrorReporting(); and reset it with
> NXMEnableErrorReporting();
>
> If you think of a better way to inquire groups/data sets, and a
> solution for nxdir that fails showing data values/attributes, please
> tell me.
>
> Emmanuel.
>
>   
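
The disable/probe/enable pattern suggested above can be sketched as
follows.  To keep the sketch runnable without the NeXus library, the
three operations are passed in as callbacks; the struct quiet_probe_ops,
the function group_exists_quietly and the demo_* stand-ins are all
hypothetical.  In real code the callbacks would presumably wrap
NXMDisableErrorReporting, NXMEnableErrorReporting and an NXopengroup
attempt (closing the group again if it opened), but that wiring is an
assumption and is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback bundle for a "quiet" existence check. */
typedef struct {
    void (*mute)(void);                 /* switch error messages off  */
    void (*unmute)(void);               /* switch error messages on   */
    int  (*try_open)(const char *name); /* nonzero if the open worked */
} quiet_probe_ops;

/* Returns 1 if the named group could be opened, 0 otherwise.  Error
 * reporting is restored before returning, whatever the outcome. */
int group_exists_quietly(const quiet_probe_ops *ops, const char *name)
{
    int found;

    ops->mute();
    found = (ops->try_open(name) != 0);
    ops->unmute();
    return found;
}

/* Stand-ins so the sketch runs on its own; they only mimic a file that
 * contains a single group called "entry1". */
int demo_muted = 0;
void demo_mute(void)   { demo_muted = 1; }
void demo_unmute(void) { demo_muted = 0; }
int demo_try_open(const char *name)
{
    return strcmp(name, "entry1") == 0;
}
```

The same bracket would apply to data sets, since the probe is just
another callback; whether the underlying open call leaves the file in a
clean state after a failed attempt is something to check against the
NeXus API documentation.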

_______________________________________________
NeXus mailing list
NeXus at nexusformat.org
http://lists.nexusformat.org/mailman/listinfo/nexus

