[Question regarding Fortran NAPI-interface]

Sat Feb 28 23:59:40 GMT 1998

> 
> Right now however, i have a question concerning the compatibility of Fortran
> and C-arrays. As you probably are aware of, in Fortran the fastest moving index
> is the innermost one, in contrast to almost any other language ( C, Pascal.. ).
> ie. an array of rank 2 with dimension  (2,2) is stored in the following order
> Fortran: a(1,1), a(2,1), a(1,2), a(2,2)
> and in C: a(1,1), a(1,2), a(2,1), a(2,2).
> 
> Looking at the interfaces, i have the impression, that when a Array( rank>1
> ) is
> passed or retrieved through the Fortran-Napi,  the array will be in the
> wrong order.
> In the example program's ( napi_test.c and napif_test.f ) you avoid this
> problem
> by using a rank-1 array for the rank-2 data. However, i'm not sure whether this
> is very practical in real-life. So if i'm not mistaken, then a data-array
> of true
> rank > 1 ( i.e. A(i1=1..n1,i2=1..n2) ) put into a NeXus-File through the
> current Fortran-interface would actually be stored as the transpose of it (
> and it's getting
> worse for higher ranks). If this data is read back again through the
> Fortran-interface,
> then there's no problem. However, if the data is read with a C-program, the
> data
> could look pretty ugly ( i would have made a test on that if only i had any c-
> knowledge :-(  ). So the question remains what should be done about this:
> 
Stephan,

I'm not sure if we really have a problem here, and whether it would help
switching the arrays round between FORTRAN and C. An array is stored in
memory as contiguous columns (FORTRAN) or contiguous rows (C and others).
The array looks transposed in C as you would index it as [j][i] rather
than as (i,j), but by leaving things this way we can access the data
efficiently. For example, say we have a two D array that was, in fact,
a continuous set of 30 spectra and of FORTRAN dimensions S(5000,30).
As FORTRAN stores in column order, you can pass an individual spectrum M
to a routine by passing S(1,M). If we write arrays without transposing,
we would get a C array cs[30][5000] (and we have to remember to swap
the indices), but we can still access an individual spectrum array M as
&cs[M-1][0]. If we were to transpose the array to get the indices the
same way as in FORTRAN, we could then use cs[i-1][j-1] for S(i,j), but we
then get two problems: (1) We no longer have a pointer to a whole
spectrum array, and so would have to create an extra temporary array for
e.g. plotting a spectrum and (2) when we loop over i to look at spectrum M
in cs[i][M-1] we are jumping 30*sizeof(array element) bytes of memory,
which is probably OK for a size of 30, but could cause excessive paging
for larger arrays.

Thus i would propose we do nothing to the way data is written (except i
need to make a small modification to the FORTRAN interface to reverse the
dimensions array on a read or write), though we will have to be careful
about how we describe which array dimension represents a certain item
of data. We cannot say dimension 1 is X and dimension 2 is Y as that may
be true in FORTRAN, but in C dimension 2 would be X. What we can say in
both cases is that the FASTEST VARYING ARRAY INDEX IS X.  If we tag each
dimension with a name, then a program could look up the name and hence
find whether it needs to use dimension 1 or 2.  I guess a spectrum array
is a worse case example as it has only one "natural" way to access it -
for something on a (Qx,Qy) grid, cuts along either axis are equally likely
and it becomes important to make sure which is "Qx" is properly agreed on!

Does anybody else have any views on how arrays should be stored or indexed?

Freddie

-- 
Freddie Akeroyd                        Email:  faa at isise.rl.ac.uk
ISIS Facility                          Tel:    +44 1235 445457   
Rutherford Appleton Laboratory         Fax:    +44 1235 445720   
Chilton,   DIDCOT,   OX11 OQX          WWW:    http://www.isis.rl.ac.uk/