[Nexus] NXnet proposal (development of a test SRB network for N/u/X data)
Moreton-Smith, CM (Christopher)
C.M.Moreton-Smith at rl.ac.uk
Mon Sep 13 07:56:06 BST 2004
In the last few months we've been experimenting, more and more seriously,
with a technology being developed by the San Diego Supercomputer Center
(SDSC) known as SRB (Storage Resource Broker). See
http://www.npaci.edu/DICE/SRB
The technology is also very well supported by the UK's e-Science
initiative in conjunction with SDSC:
http://www.e-science.clrc.ac.uk/web/projects/storage_resource_broker
At ISIS, we have decided to use the technology as the means for us to access
our data archive on the Atlas petabyte data store (where we currently store
our archived data but in a different format). This, however, is an almost
trivial use of the system; by its very nature it encompasses the concept of
a global file system set up between co-operating establishments and
accessible from anywhere through a variety of interfaces (web, PC client,
Java, command line on anything).
We are currently running a test system before we go live with real data
later in the autumn. It is, however, a bit sterile: although it does
all we need, we are only working within the local site. To really see the
potential of the system it would be nice to get a handful of people to join
servers to the network (in test data mode only!) and see how the exchange of
experimental (and other) data can really work between establishments. To
get a feel for SRB and how it works, I would entreat you to try _both_ the
Windows client (inQ) in demo mode and the MySRB web client demonstrator
(both available on the SDSC site). For more serious use, a command-line
interface and language APIs are also available.
If you are in the (relatively small) group of systems administrators who
have the job of archiving users' data and making it available to them, and
after having a play with the demos you are interested in joining up a test
server, please e-mail me for more info. You will need to be familiar with a
Linux-style manual build and install (or Windows if it's just a basic
server) and be able to open some specific firewall ports to the world. It
is, however, probably only a half-hour job to get a basic server running. We
would simply need to add the address of your server to our network to make
it a member and then create a user or two for you to try things out.
Best regards,
Chris Moreton-Smith
The Proposal
The motivation for NeXus is to allow the exchange of data between programs,
instruments, establishments and branches of science where the neutron, muon
and X-ray technologies are the common theme. Recently the NeXus community
has been greatly (and crucially) occupied with choosing what data to store
in a file and how it should be described in a common fashion.
This proposal tackles a complementary facet of the broader problem; namely
the exchange of the data files themselves between the programs, operating
systems, instruments and establishments.
Most scientists doing experiments on instruments at different establishments
will have experienced the joy of trying to copy data off a variety of
different computer systems, often being forced to network a laptop machine
at the last minute, write a CD or floppy disk before racing to catch a
flight, manually select and copy files one at a time via ftp and/or
negotiate a firewall to get at their data. If very lucky, maybe they have
been able to download data from a conveniently set up web site, where
remembering the password has been the only problem.
As if this state of affairs weren't bad enough, there are things that are
even harder to do than read your data. For example, you might want to send a
few calibration files and sample setup notes to a remote site ready for an
experiment. This would normally be seen as folly unless you have a trusted
colleague already on site to help you receive and look after the files. And
what if you've forgotten a file you needed on arrival? Wouldn't it be nice
if you could do a bit of data reduction on site and then continue at home,
all the time saving the reduced files in the same directory as the raw data
(and with no need to copy the data locally)? And then permit a colleague to
access a few of the files from their own laptop whilst at home (by setting
permissions as you do on a local Unix NFS file system or Windows share)?
There is really very little reason why we should accept such a desperate
state of affairs. With the help of the UK's e-Science centre we have been
experimenting with the San Diego Supercomputer Center's SRB (Storage
Resource Broker) for several months now. This provides, at a minimum, a very
credible and functional globally distributable file system.
The proposal is in three parts.
1) We adopt SRB as a working system on which to experiment with
building a global integrated network for sharing neutron, muon and X-ray
data between our establishments and our users. We do this pragmatically
(as we have done with HDF) because it currently seems to do the job, and
support is what standards like this need in order to develop.
2) More fundamentally, we extend our remit from defining and organizing
data types within the NeXus file to also standardising the organisation and
location of data within a global file system. Quite simply, this avoids
things being lost because everyone stores them under different names and in
different places (one example would be a naming convention for raw files).
3) Even more fundamentally, we spend some effort defining the sort of
meta-data which we might associate with each file (possibly not contained in
the NeXus file itself). This meta-data would enable a data-portal-style
search engine, a kind of super data-Google, to find relevant data quickly
by searching throughout this global file system. Some work of this kind
is already underway, but standardizing the type and contents of this
meta-data is very close to the standardization we are aiming at with the
NeXus file contents, and would make it much easier to search for and find
relevant information.
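
To make parts (2) and (3) above a little more concrete, here is a minimal
Python sketch of what a raw-file naming convention and a per-file meta-data
record might look like. Everything in it is an assumption made purely for
illustration: the path layout, the field names, the "demo" instrument and
the search helper are hypothetical, and none of it is part of NeXus or SRB.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RawFileRecord:
    """Hypothetical meta-data held alongside (not inside) a NeXus file."""
    facility: str     # establishment, e.g. "isis"
    instrument: str   # instrument name (illustrative only)
    run_number: int
    technique: str    # "neutron", "muon" or "xray"

    def path(self) -> PurePosixPath:
        # Assumed convention, for illustration only:
        #   /<facility>/<instrument>/raw/<INSTRUMENT><8-digit run>.nxs
        name = f"{self.instrument.upper()}{self.run_number:08d}.nxs"
        return PurePosixPath(
            "/", self.facility.lower(), self.instrument.lower(), "raw", name
        )


def search(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, key) == value for key, value in criteria.items())
    ]
```

With an agreed layout of this kind, anyone could work out where a given raw
file lives from its facility, instrument and run number, and a portal could
filter on the common fields (for example, search(records,
technique="muon")) without opening the files themselves.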
The sheer usability of this particular system, especially for quite a
young technology, is staggering. This is not something to be planning for
next year or the year after; it's something to be using now, and then
planning how to build a data storage and access strategy around it. NeXus
has taken a long time to grow to the point where we are able to agree on the
most difficult issue of what is common within our data files, because that
is a really difficult job. Taking some responsibility for (2) and (3) in
particular is a lot less work, but is something best tackled early.
--
Chris Moreton-Smith, Software Development Manager
ISIS Science Instrumentation, CCLRC, Chilton, Didcot, OXON OX11 0QX
Telephone: +44 (0) 1235 446544, Fax: +44 (0) 1235 445720
Email: C.Moreton-Smith at rl.ac.uk