[Nexus-developers] NXconvert-NXtranslate

Mark Koennecke Mark.Koennecke at psi.ch
Fri Nov 14 16:17:20 GMT 2003


  Hi,
  
  what Peter suggests is a more general tool than we envisaged at the
  NIAC meeting. His proposed NXtranslate converts not only NeXus files
  but can also convert from other types of files.
  In general, I like this extension of the original concept for
  NXconvert. 

  Peter suggested specifying data content through extra tags in the
  definition, which describe the data source in a URI-like scheme. This
  means that any extension to read different files would have to implement
  URI-like handlers. I feel this makes things unnecessarily complicated
  through another level of software indirection. Moreover, private URI-like
  handlers might conflict with, or be confused with, WWW standards.

  Instead, I would take up the idea of a single translation XML file
  describing the structure of the new file. However, the data section for
  each dataset would hold a command in a scripting language which copies
  the data into place.

  This makes the interface more programmatic. Moreover, most scripting
  languages come with extension mechanisms (even binary) built in, so
  we do not have to provide for that.     
  
  I reworked Peter's proposal to contain my ideas; please see the
  attachment for more details.

			Best Regards and have a Nice Weekend,

				Mark Koennecke
     
-------------- next part --------------
NXtranslate Whitepaper

NXtranslate is designed to be an extensible program to convert data
from existing source files to compliant NeXus target files. The source
files are not just those produced by the NeXus API (napi), but any
data files for which a reading library is available.

NXtranslate Tasks

The following problems have to be addressed by an automatic conversion
tool for NeXus files:
- It must be extensible.
- It should be possible to select the type of output file (HDF4, HDF5,
  or XML-based NeXus).
- Data must be copied from a source file to an arbitrary place in the
  target file structure.
- The tool must be capable of processing multiple entries in a file.
- Some data must be copied from attributes in the source file to data
  items in the target file.
- Links have to be created in the target file.
- It must be possible to supply missing data.
- Some values may need to be recalculated.
- It must allow for combining data from several files.
- It must be able to read in data from multiple formats, not just
  those accessible from the napi.

In addition, the NeXus file converter should preferably be a command
line utility in order to facilitate batch conversions of large numbers
of files.


NXtranslate Implementation Strategy

NXtranslate will use a single XML translation file. This translation file
will be a NeXus XML file describing the structure of the NeXus file to
be generated. Instead of data, it will contain statements in a scripting
language which fill in the necessary data. This scripting language will
be enhanced with functions to manipulate NeXus files directly and with
convenience functions to copy data easily from an input NeXus file to an
output NeXus file.

The main components of NXtranslate will be an XML parser and a scripting
language. For both, off-the-shelf software components will be used, with
Tcl being the preferred scripting language. The XML parser should be
of the DOM type for two reasons. First, the upcoming XML support for NeXus
requires a DOM parser anyway, so we introduce only one additional
library dependency. Second, it is easier to traverse a DOM tree multiple
times. These two components will be augmented by a loop which walks the
NXentry structure once for each entry in the first input file. For each
scientific dataset (SDS), this routine will call the scripting language
interpreter with the text provided in place of data for the SDS.
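
The loop described above can be sketched as follows. This is only an
illustration under stated assumptions, not the NXtranslate
implementation: it is written in Python with the standard
xml.dom.minidom DOM parser, the names translate, walk and interpret are
hypothetical, and a plain Python callback stands in for the Tcl
interpreter. The sketch treats every element without child elements as
an SDS and re-traverses the same DOM tree once per entry.

```python
from xml.dom import minidom

# A cut-down translation file body; the commands are opaque text here.
TRANSLATION = """\
<NXroot>
  <entry1 type="NXentry">
    <title type="NX_CHAR">copyFromNexus $inputFile /$entryName/title</title>
    <bank1 type="NXdata">
      <counts type="NX_FLOAT">copyFromNexus $inputFile /$entryName/detector1/counts</counts>
    </bank1>
  </entry1>
</NXroot>
"""

def element_children(node):
    """Return the child nodes that are elements (skips whitespace text)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]

def text_of(node):
    """Concatenate the text directly inside an element, stripped."""
    parts = [c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE]
    return "".join(parts).strip()

def walk(node, path, entry_name, interpret):
    """Hand the text of every leaf element (an SDS) to the interpreter."""
    for child in element_children(node):
        child_path = path + "/" + child.tagName
        if element_children(child):
            walk(child, child_path, entry_name, interpret)  # a group
        else:
            command = text_of(child)
            if command:  # empty elements carry no command
                interpret(child_path, command, entry_name)

def translate(xml_text, entry_names, interpret):
    """Traverse the same DOM tree once per entry of the input file."""
    dom = minidom.parseString(xml_text)
    template = element_children(dom.documentElement)[0]  # NXentry template
    for entry_name in entry_names:
        walk(template, "/" + entry_name, entry_name, interpret)

# Stand-in interpreter: record what would be passed to the scripting language.
calls = []
translate(TRANSLATION, ["entry1", "entry2"],
          lambda path, command, entry: calls.append((path, command, entry)))
for call in calls:
    print(call)
```

In a real tool the callback would evaluate the command in the scripting
language interpreter and write the result at the given path in the
target file.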


Translation File

As mentioned above, the translation file is an XML-based NeXus
file with some enhancements. The file will have two sections.

The first section will be delimited by <script> </script>. This
section is passed to the interpreter once. It is intended to contain
statements for loading additional scripting libraries (for example, to
access ISIS raw files) and for initializing access to files, databases,
etc. Command line arguments will be made available to the interpreter
before this script is processed.

The second section contains the description of an entry in the new NeXus
file. This section can look either like the NeXus meta-DTD or like a
NeXus-XML file. However, the data content of each scientific dataset
and global attribute is text which will be passed on to the
interpreter. NXtranslate will make the name of the entry currently
being processed available as a variable in the scripting language.
NXtranslate will add descriptive attributes to the dataset after the
dataset has been copied. Dimensions and data types are the job of the
copying script.
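
A minimal sketch of how the two sections might be separated, again in
Python with xml.dom.minidom and not the actual implementation. Two
assumptions are made here: a DOM parser needs a single root element, so
the sketch wraps the two sections in a hypothetical <translation>
element, and string.Template merely stands in for the variable
substitution that the scripting language itself would perform on
$entryName. The function name split_sections is hypothetical.

```python
from string import Template
from xml.dom import minidom

TRANSLATION = """\
<script>
  source nxsupport.tcl
  set inputFile [nx_open [lindex $argv 4] $NXACC_READ]
</script>
<NXroot>
  <entry1 type="NXentry">
    <title type="NX_CHAR">copyFromNexus $inputFile /$entryName/title</title>
  </entry1>
</NXroot>
"""

def split_sections(text):
    """Return (setup_script, structure_element) from a translation file."""
    # Assumption: wrap both sections in one root so the text parses as XML.
    dom = minidom.parseString("<translation>" + text + "</translation>")
    setup, structure = "", None
    for child in dom.documentElement.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # whitespace between the sections
        if child.tagName == "script":
            setup = "".join(n.data for n in child.childNodes
                            if n.nodeType == n.TEXT_NODE).strip()
        else:
            structure = child
    return setup, structure

setup, structure = split_sections(TRANSLATION)

# The setup script would be evaluated once, before any dataset command.
print(setup)

# Stand-in for the interpreter seeing entryName as a variable; other
# variables such as $inputFile are left untouched by safe_substitute.
title = structure.getElementsByTagName("title")[0]
command = title.firstChild.data.strip()
expanded = Template(command).safe_substitute(entryName="entry1")
print(expanded)
```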


Example Translation File

Here is a *partial* translation file as an example. Note that the
precise format will change as the XML base of NeXus is standardized.
copyFromNexus is a symbol for a provided utility routine which copies
data from one NeXus file into the current structure.

  <script>
    source nxsupport.tcl
    set inputFile [nx_open [lindex $argv 4] $NXACC_READ]
  </script>
  <NXroot>
    <entry1 type="NXentry">
      <title type="" nx_tag="title">
         copyFromNexus $inputFile /$entryName/title
      </title>
      <definition type="NX_CHAR" version="1.0"
                  URL="http://www.neutron.anl.gov/nexus/xml/NXtofnpd.xml">
        TOFNPD
      </definition>
      <run_number type="NX_INT" nx_tag="run_number"/>
      <user type="NXuser">
        <name type="NX_CHAR" role="PI">
          copyGlobalAttribute2SDS $inputFile owner
        </name>
        <affiliation type="NX_CHAR">
          copyGlobalAttribute2SDS $inputFile owner_address
        </affiliation>
      </user>
      <SEPD type="NXinstrument">
        <name type="NX_CHAR" short_name="SEPD">
          writeText SEPD
        </name>
        <source type="NX_CHAR">
          <name type="NX_CHAR" short_name="IPNS">
            writeText name
          </name>
          <distance type="NX_FLOAT" units="meter">
            writeFloat [expr [getFromNexus $inputFile \
              /$entryName/SEPD/source/distance] * -1]
          </distance>
        </source>
        <bank1 type="NXdetector" nx_tag="H1:bank1:geom"/>
      </SEPD>
      <bank1 type="NXdata">
        <time_of_flight type="NX_FLOAT" units="" axis="1">
          copyFromNexus $inputFile /$entryName/detector1/tof
        </time_of_flight>
        <counts type="NX_FLOAT" signal="1" units="">
          copyFromNexus $inputFile /$entryName/detector1/counts
        </counts>
      </bank1>
    </entry1>
  </NXroot>


Command Line

nxtranslate <translation file> -o <output filename> [-hdf4] [-hdf5]
[-xml] infile1 infile2 .....
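
The synopsis above could be parsed along the following lines. This is a
sketch using Python's standard argparse module, not the proposed tool
itself. It assumes that -o is mandatory (it is shown without brackets),
that at most one of the format flags may be given, and that no default
format is fixed yet; none of these points is settled by the text above.

```python
import argparse

def build_parser():
    """Build a parser mirroring the nxtranslate synopsis."""
    parser = argparse.ArgumentParser(prog="nxtranslate")
    parser.add_argument("translation_file")
    parser.add_argument("-o", dest="output", required=True)
    # Assumption: the three format flags exclude each other.
    formats = parser.add_mutually_exclusive_group()
    for name in ("hdf4", "hdf5", "xml"):
        formats.add_argument("-" + name, dest="format",
                             action="store_const", const=name)
    parser.add_argument("infiles", nargs="+")
    return parser

args = build_parser().parse_args(
    ["sepd.xml", "-o", "out.nxs", "-hdf5", "run1.nxs", "run2.nxs"])
print(args.translation_file, args.output, args.format, args.infiles)
```

With this invocation the first input file is the fifth argument after
the program name, which matches the [lindex $argv 4] lookup in the
example translation file; that index shifts if the format flag is
omitted.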
