Levels of NeXus compliance & More compression
C.M.Moreton-Smith at rl.ac.uk
Mon Feb 21 09:40:59 GMT 2000
Ray asked about the compression times in the tests I did. NeXus with deflate
level 6 comes out well on top, as you can see below. Level 9 hardly appears
worth considering for the tiny gain in compression!
cnt1.none.nxs          42s   (default compression)
cnt1.none.sz     1 min 19s
cnt1.none.cab    5 min 39s
cnt1.none.rk     7 min  6s
cnt1.none.nx9   13 min 24s   (deflate level = 9)
Chris
(These were measured on a Dual PIII 550MHz system with plenty of memory)
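
For anyone who wants to repeat the level comparison on their own data, here is
a minimal sketch using Python's zlib module, which implements the same deflate
algorithm. The synthetic count data and the helper names make_counts and
time_deflate are assumptions for illustration only. It does not go through the
NeXus API or HDF, so absolute times will not match the figures above; it only
shows how the time and ratio trade off as the level rises.

```python
import random
import struct
import time
import zlib


def make_counts(n, seed=0):
    """Hypothetical stand-in for integer count data (not real ISIS data).

    Produces n small non-negative counts packed as little-endian 32-bit ints.
    """
    rng = random.Random(seed)
    values = [int(rng.expovariate(0.5)) for _ in range(n)]
    return struct.pack(f"<{n}i", *values)


def time_deflate(data, level):
    """Compress data with deflate at the given level.

    Returns (compressed size in bytes, elapsed seconds).
    """
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(packed), elapsed


if __name__ == "__main__":
    raw = make_counts(1_000_000)
    for level in (1, 6, 9):
        size, secs = time_deflate(raw, level)
        print(f"level {level}: ratio {len(raw) / size:5.2f} in {secs:.2f}s")
```

Swapping make_counts for a read of a real count array would give numbers that
are more representative than the synthetic data used here.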
> -----Original Message-----
> From: Ray Osborn [mailto:ROsborn at anl.gov]
> Sent: 16 February 2000 15:30
> To: NEXUS at anpns1.pns.anl.gov; C.M.Moreton-Smith at rl.ac.uk
> Subject: Re: Levels of NeXus compliance & More compression
>
>
> on 2000/02/16 4:12 AM, C.M.Moreton-Smith at rl.ac.uk wrote:
>
> > I've done a bit more research into better compression for NeXus files and
> > thought it would be good to pass on what I've discovered.
> >
> > Chasing around the web I've come across two lossless compression codes which
> > (on ISIS files at least) can give approximately 40-50% better compression
> > than LZW (level 9). We have a real problem currently with datafiles of
> > 50-100MB each which can now be produced as rapidly as every 2 minutes! With
> > this in mind, an improvement like this in compression of NeXus files could
> > save us 400GB of storage which is quite a cost saving.
> >
>
> I find your results very encouraging. They suggest that the standard NeXus
> compression works very well, giving you reduction factors of nearly 10. It
> is, of course, possible to do better if you compress the whole HDF file
> because then you are compressing the HDF headers, address blocks, small
> SDSs etc. If that space gain is more vital than immediate access to the
> data, then do both. Use NeXus compression, and then compress the whole
> file.
>
> > Contrary to my earlier belief, the codes work best on raw (uncompressed)
> > files and go well beyond the two-stage compression we are currently using
> > for archiving existing ISIS (.RAW) files.
> >
> > I've attached a table giving the compression ratios achieved with the
> > different codes, firstly on a NeXus level 0 translation of a 54MB GEM data
> > file and secondly on a file containing just the integer count data
> > (CNT1) from the same file.
> >
> > Tantalizingly, the source for the best codes is not available currently
> > which makes them risky to rely on. The best widely available compressor
> > surprisingly is the Microsoft CAB file kit!
> >
>
> I agree that it is risky using proprietary codes. Someone recently
> described this as giving ownership of your files to the company that
> controls the data format. In effect, they decide whether you can have
> access to your own data.
>
> > One thing that is clear is that with a good compression code, multi-stage
> > compression is less satisfactory and that there is still a good case for
> > writing uncompressed NeXus files first then compressing the whole file
> > afterwards.
> >
>
> I think the advantage of using NeXus compression before overall file
> compression is that you don't take such a big disk space hit when you
> decompress the file. I'm sure you've had experience of users deciding to
> restore a dozen files at once. With the double compression scheme, this
> will not be a problem; the user will still have efficiently compressed files
> that they can access as simply as if decompressed.
>
> One issue that is only hinted at in the attached file is the speed of each
> of these schemes. Can you say how long it took for NeXus to compress the
> 50MB files? I suspect that this will be the deciding factor for most people
> in choosing compression strategies.
>
> Ray
> --
> Dr Ray Osborn                 Tel: +1 (630) 252-9011
> Materials Science Division    Fax: +1 (630) 252-7777
> Argonne National Laboratory   E-mail: ROsborn at anl.gov
> Argonne, IL 60439-4845
>
>