Levels of NeXus compliance & More compression

C.M.Moreton-Smith at rl.ac.uk
Mon Feb 21 09:40:59 GMT 2000


Ray queried the times for compression in the tests I did.  NeXus with deflate
level 6 comes out well on top, as you can see below.  Level 9 appears hardly
worth considering for the tiny gain in compression!

	cnt1.none.nxs	    42s		(default compression)
	cnt1.none.sz	 1 min 19s
	cnt1.none.cab	 5 min 39s
	cnt1.none.rk	 7 min  6s
	cnt1.none.nx9	13 min 24s	(deflate level = 9)

Chris

(These were measured on a Dual PIII 550MHz system with plenty of memory)
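
For anyone wanting to repeat the level 6 versus level 9 comparison on their
own data, a minimal sketch follows.  It uses Python's zlib module as a
stand-in for the deflate compression in the NeXus/HDF library, and synthetic
count data rather than a real ISIS file; the helper names make_counts and
time_deflate are hypothetical.  Timings will differ by machine and by data,
so this is only a way to measure, not a reproduction of the figures above.

```python
import random
import time
import zlib


def make_counts(n_values, seed=0):
    """Build synthetic 32-bit big-endian detector counts (assumed data shape)."""
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(n_values):
        # Mostly small counts, as a rough guess at sparse histogram data.
        count = int(rng.expovariate(0.2))
        out += count.to_bytes(4, "big")
    return bytes(out)


def time_deflate(data, level):
    """Return (compressed size in bytes, elapsed seconds) for one deflate level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(packed), elapsed


if __name__ == "__main__":
    data = make_counts(200_000)
    for level in (1, 6, 9):
        size, secs = time_deflate(data, level)
        ratio = len(data) / size
        print(f"level {level}: {size} bytes, ratio {ratio:.2f}, {secs:.3f}s")
```

To try it on a real file, replace the make_counts call with the bytes read
from that file.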

> -----Original Message-----
> From: Ray Osborn [mailto:ROsborn at anl.gov]
> Sent: 16 February 2000 15:30
> To: NEXUS at anpns1.pns.anl.gov; C.M.Moreton-Smith at rl.ac.uk
> Subject: Re: Levels of NeXus compliance & More compression
> 
> 
> on 2000/02/16 4:12 AM, C.M.Moreton-Smith at rl.ac.uk wrote:
> 
> > I've done a bit more research into better compression for NeXus files and
> > thought it would be good to pass on what I've discovered.
> > 
> > Chasing around the web I've come across two lossless compression codes which
> > (on ISIS files at least) can give approximately 40-50% better compression
> > than LZW (level 9).  We have a real problem currently with datafiles of
> > 50-100MB each which can now be produced as rapidly as every 2 minutes!  With
> > this in mind, an improvement like this in compression of NeXus files could
> > save us 400GB of storage, which is quite a cost saving.
> > 
> 
> I find your results very encouraging.  They suggest that the standard NeXus
> compression works very well, giving you reduction factors of nearly 10.  It
> is, of course, possible to do better if you compress the whole HDF file
> because then you are compressing the HDF headers, address blocks, small
> SDSs etc.  If that space gain is more vital than immediate access to the
> data, then do both.  Use NeXus compression, and then compress the whole
> file.
> 
> > Contrary to my earlier belief, the codes work best on raw (uncompressed)
> > files and go well beyond the two-stage compression we are currently using
> > for archiving existing ISIS (.RAW) files.
> > 
> > I've attached a table giving the compression ratios achieved with the
> > different codes, firstly on a NeXus level 0 translation of a 54MB GEM data
> > file and secondly on a file containing just the integer count data
> > (CNT1) from the same file.
> > 
> > Tantalizingly, the source for the best codes is not available currently,
> > which makes them risky to rely on.  The best widely available compressor,
> > surprisingly, is the Microsoft CAB file kit!
> > 
> 
> I agree that it is risky using proprietary codes.  Someone recently
> described this as giving ownership of your files to the company that
> controls the data format.  In effect, they decide whether you can have
> access to your own data.
> 
> > One thing that is clear is that with a good compression code, multi-stage
> > compression is less satisfactory and that there is still a good case for
> > writing uncompressed NeXus files first, then compressing the whole file
> > afterwards.
> > 
> 
> I think the advantage of using NeXus compression before overall file
> compression is that you don't take such a big disk space hit when you
> decompress the file.  I'm sure you've had experience of users deciding to
> restore a dozen files at once.  With the double compression scheme, this
> will not be a problem; the user will still have efficiently compressed files
> that they can access as simply as if decompressed.
> 
> One issue that is only hinted at in the attached file is the speed of each
> of these schemes.  Can you say how long it took for NeXus to compress the
> 50MB files?  I suspect that this will be the deciding factor for most people
> in choosing compression strategies.
> 
> Ray
> -- 
> Dr Ray Osborn                Tel: +1 (630) 252-9011
> Materials Science Division   Fax: +1 (630) 252-7777
> Argonne National Laboratory  E-mail: ROsborn at anl.gov
> Argonne, IL 60439-4845
> 
> 
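
Ray's suggestion of doing both (compress each dataset, then compress the
whole file) can be sketched as follows.  This is an illustration only: the
toy container format and the names pack_container and read_dataset are
hypothetical and are not the HDF or NeXus layout, and zlib stands in for
both compression stages.  The point it shows is that after the outer layer
is removed on restore, the file on disk is still internally compressed and
individual datasets remain readable.

```python
import struct
import zlib

HEADER = ">HI"  # name length (2 bytes), compressed body length (4 bytes)


def pack_container(datasets, level=6):
    """Stage 1: deflate each dataset separately behind a small plain header."""
    parts = []
    for name, payload in datasets.items():
        encoded = name.encode("ascii")
        body = zlib.compress(payload, level)
        parts.append(struct.pack(HEADER, len(encoded), len(body)) + encoded + body)
    return b"".join(parts)


def read_dataset(container, wanted):
    """Walk the headers and decompress only the requested dataset."""
    offset = 0
    while offset < len(container):
        name_len, body_len = struct.unpack_from(HEADER, container, offset)
        offset += struct.calcsize(HEADER)
        name = container[offset:offset + name_len].decode("ascii")
        offset += name_len
        if name == wanted:
            return zlib.decompress(container[offset:offset + body_len])
        offset += body_len
    raise KeyError(wanted)


if __name__ == "__main__":
    # Synthetic stand-in for integer count data; not a real ISIS file.
    counts = b"".join((i % 7).to_bytes(4, "big") for i in range(50_000))
    datasets = {"cnt1": counts, "title": b"example run"}

    container = pack_container(datasets)       # stage 1: per-dataset deflate
    archive = zlib.compress(container, 9)      # stage 2: whole-file compression

    restored = zlib.decompress(archive)        # what a user restores to disk
    print("raw counts:      ", len(counts), "bytes")
    print("restored on disk:", len(restored), "bytes")
    print("archived:        ", len(archive), "bytes")
    assert read_dataset(restored, "cnt1") == counts
```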


