[Nexus] NXdir: SEG fault

freddie.akeroyd at stfc.ac.uk freddie.akeroyd at stfc.ac.uk
Fri Jan 29 17:33:45 GMT 2010


Mark,

I've done a bit of digging in the NeXus code and noticed that one of the
parameters passed to the H5Giterate() call within NXgetgroupinfo() is
"group_info1", which is a user-supplied function used to store the
results and hence called recursively. This function unnecessarily
allocates a large structure on the stack, which is probably causing the
problems. I've now applied a fix to the NeXus code trunk. If you wish
to update your local copy, details of the lines to change are at:


http://trac.nexusformat.org/code/changeset/1419
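
For anyone who cannot apply the changeset right away, the general idea
of moving a callback's large scratch data off the stack can be sketched
as below. This is only an illustration and not the code from the
changeset: member_info, member_cb, iterate_members and
count_member_heap are hypothetical names, and the toy iterator merely
stands in for a library iteration call such as H5Giterate().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a large per-member record (about 2 KB). */
typedef struct {
    char name[1024];
    char type[1024];
    int  kind;
} member_info;

/* Hypothetical callback type: return 0 to continue, nonzero to stop. */
typedef int (*member_cb)(const char *name, void *user_data);

/* Toy iterator: calls cb once per name, stopping at the first nonzero
 * return value, which it passes back to the caller. */
int iterate_members(const char *const *names, size_t n,
                    member_cb cb, void *user_data)
{
    for (size_t i = 0; i < n; i++) {
        int rc = cb(names[i], user_data);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Callback that keeps its large scratch record on the heap, so each
 * invocation costs only a pointer's worth of stack for that record. */
int count_member_heap(const char *name, void *user_data)
{
    member_info *info = malloc(sizeof *info);
    if (info == NULL)
        return -1;                      /* report allocation failure */

    strncpy(info->name, name, sizeof info->name - 1);
    info->name[sizeof info->name - 1] = '\0';
    info->type[0] = '\0';
    info->kind = 0;

    (*(int *)user_data)++;              /* user_data points to an int counter */

    free(info);
    return 0;
}
```

Declaring "member_info info;" as a local variable inside the callback
would instead take sizeof(member_info) bytes of stack on every call,
on top of whatever the iterating library itself is using.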

Regards,

Freddie

> 
> Hi Mark,
> 
> > 80000 bytes is 80K, which does not seem particularly large to me,
> > given that even a netbook has 1GB of memory these days.
> 
> In a typical single-threaded application 80K for a stack is not an
> issue.  But in a multi-threaded application like the EPICS or Tango
> control systems there could be hundreds of threads, each of which needs
> its own stack.  Unless one knows for sure which threads can be making
> the Nexus calls, one might need to create a large stack for every
> thread.  And how large is large enough?  If the complexity of the Nexus
> file grows, does the stack size increase?  These are important
> questions to answer, because the consequence of getting it wrong is an
> application that will crash during data taking.
> 
> I have been doing EPICS multi-threaded programming for years, and the
> default thread stack size of about 10K has always been enough.  The
> only time I have had a problem is in this Nexus call.  Recall that data
> acquisition applications that create Nexus files could be running on
> real-time dedicated processors, like VME, which do not necessarily have
> large amounts of memory either.
> 
> > As the implementation is working nicely on many systems and there is
> > a workaround, I am very much tempted to write this off as a
> > peculiarity of that MS compiler.
> 
> I don't think this is true.  How many multi-threaded NeXus applications
> do you know of?  How many are calling NXgetgroupinfo?
> 
> Mark
> 
> 
> -----Original Message-----
> From: nexus-bounces at nexusformat.org
> [mailto:nexus-bounces at nexusformat.org] On Behalf Of Mark Koennecke
> Sent: Friday, January 29, 2010 2:31 AM
> To: nexus at nexusformat.org
> Subject: Re: [Nexus] NXdir: SEG fault
> 
> 
> Hi,
> 
> these are two separate issues.
> 
> Mark Rivers wrote:
> > I have discovered a possibly related problem with stack overflow
> > when calling the Nexus API from a multi-threaded C program.  I found
> > that the function NXgetgroupinfo uses a very large amount of stack.
> > I would get stack overflow when my thread had less than 40000 bytes
> > of stack; it worked OK with 80000 bytes.  The stack overflow only
> > happened for me on Windows with the MS VC++ compiler; it did not
> > happen on Linux, or on Windows with the gcc compiler.  But I suspect
> > that NXgetgroupinfo is doing some recursion that can use a large
> > amount of stack depending on the structure of the HDF file.
> >
> I assume this relates to HDF-5. NXgetgroupinfo uses H5Giterate in
> order to get information about the group. It may recurse; I do not
> know how H5Giterate is implemented. It does not allocate particularly
> large data structures either. Now H5Giterate is one of the recommended
> ways (by the HDF people) to search a group. And it was the only one
> available when we implemented the HDF-5 part of NeXus. 80000 bytes
> is 80K, which does not seem particularly large to me, given that even
> a netbook has 1GB of memory these days.  I am also not sure that
> an alternative implementation will use less stack space. As the
> implementation is working nicely on many systems and there is a
> workaround, I am very much tempted to write this off as a peculiarity
> of that MS compiler.
> >
> > Mark
> >
