[foreign] enhancing pointer and array cast
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Feb 28 16:13:06 UTC 2019
Hi,
The recent discussion (see [1]) on Pointer vs. (nested) Array (thanks
Lev) triggered some new ideas. I think what we're fundamentally banging
our head against is the fact that C plays a loose game where arrays
decay into pointers, which makes it easier to express idioms that are
not so easily handled by the Panama API - which is primarily designed
for safety.
To recap that discussion - let's assume I want to model complex numbers;
to do that, I can model them as _pairs_ of double numbers. A possible
way to model a pair is by using a two-element array (which is the choice
the FFTW [2] library makes).
So, how do I talk about arrays of complex numbers? Well, these are
arrays of arrays of (two) doubles. Or, in Panama:
Array<Array<Double>>
This is all well and good, but most of the C code out there, when populating
such an array, will just allocate a slab of doubles big enough to contain
all the required data, and then fill it with a loop. In Panama, this is
not possible, at least not directly: for each complex number you have to
allocate an Array - and then set it at the right position of the
outermost array.
Panama does provide escape hatches - namely, the Pointer::cast and
Array::cast methods were created for precisely this reason. So one could
imagine doing this:
Array<Double> buffer = scope.allocateArray(NativeTypes.DOUBLE,
nComplexPairs * 2);
And then initialize the buffer contents as follows (we assume realPart
and imgPart are two arrays containing the real/imaginary part of the
complex numbers we want to store):
for (int i = 0 ; i < nComplexPairs ; i++) {
    // complex number i occupies two consecutive slots: real part first, then imaginary
    buffer.set(2 * i, realPart[i]);
    buffer.set(2 * i + 1, imgPart[i]);
}
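As a sanity check on the indexing, here is a small sketch that uses a plain Java double[] as a stand-in for the native buffer. It does not use the Panama API, and the class and method names are made up for illustration. With interleaved storage, complex number i lands in slots 2*i and 2*i + 1:

```java
import java.util.Arrays;

public class InterleaveSketch {
    // Packs real/imaginary parts into one flat array: [re0, im0, re1, im1, ...].
    public static double[] interleave(double[] realPart, double[] imgPart) {
        if (realPart.length != imgPart.length) {
            throw new IllegalArgumentException("real and imaginary parts differ in length");
        }
        double[] buffer = new double[realPart.length * 2];
        for (int i = 0; i < realPart.length; i++) {
            buffer[2 * i] = realPart[i];      // real part of complex number i
            buffer[2 * i + 1] = imgPart[i];   // imaginary part of complex number i
        }
        return buffer;
    }

    public static void main(String[] args) {
        double[] flat = interleave(new double[] {1.0, 3.0}, new double[] {2.0, 4.0});
        System.out.println(Arrays.toString(flat)); // prints [1.0, 2.0, 3.0, 4.0]
    }
}
```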
Now, we need to convert the buffer to the type the API expects, namely
an Array<Array<Double>>. We could do this:
Array<Array<Double>> arr = buffer.cast(LayoutType.DOUBLE.array(2),
nComplexPairs);
This is fine - in C this operation would have been implicit, but since
Panama focuses on safety, here we need an explicit cast.
Except, this doesn't work, because Panama's cast operation only allows
casts when the source and target layouts are the same! You could try to
work around the limitation by going through NativeTypes.VOID, which
disables the check - but that's very unfriendly - good as an escape
hatch, but is there a better way to get there?
So, I started to think about ways to enhance Panama's cast operation in
a safe fashion. After all, the main point here is that the source and
target layout have the same _flattened_ representation:
[ f64 f64 ............ f64 ] //repeated nComplexPairs * 2 times
It follows that it should be safe to go from one to the other - we are
just trying to enforce *more* structure on the same flattened layout!
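To make the notion of a flattened representation concrete, here is a rough sketch with a toy layout model. The Layout interface and the value/array/flatten helpers are hypothetical names invented for this example, not Panama classes. It flattens both the plain and the nested array layouts and compares the results:

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenSketch {
    // Toy layout: anything that can append its scalar leaves to a list.
    public interface Layout {
        void flattenInto(List<String> out);
    }

    // A scalar such as f64 or i32.
    public static Layout value(String name) {
        return out -> out.add(name);
    }

    // A fixed-size sequence: 'count' repetitions of 'element'.
    public static Layout array(int count, Layout element) {
        return out -> {
            for (int i = 0; i < count; i++) {
                element.flattenInto(out);
            }
        };
    }

    // Returns the flattened representation as a list of scalar names.
    public static List<String> flatten(Layout layout) {
        List<String> out = new ArrayList<>();
        layout.flattenInto(out);
        return out;
    }

    public static void main(String[] args) {
        int nComplexPairs = 3;
        Layout f64 = value("f64");
        Layout flat = array(nComplexPairs * 2, f64);          // models Array<Double>
        Layout nested = array(nComplexPairs, array(2, f64));  // models Array<Array<Double>>
        System.out.println(flatten(flat).equals(flatten(nested))); // prints true
    }
}
```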
But this is not the only thing that is safe to do; consider another
example, where I want to temporarily view the buffer above as a _single_
complex number (i.e. an array of size 1). In this case, we want to go from:
[ f64 f64 ............ f64 ] //repeated nComplexPairs * 2 times
to
[ f64 f64 ]
This is also safe, because the target flattened representation is the
start of the source flattened representation.
What about expanding a layout? Is that safe? E.g. go from:
[ f64 f64 ]
to
[ f64 f64 f64 f64 ... ]
In general no, this is not safe. But there is a particular use case in
which we might be interested in allowing this operation:
struct foo {
    int nelems;
    double elems[]; // incomplete array
};
In this case we know that, at runtime, there will be some storage
allocated immediately after the struct, containing the values for 'elems'.
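For illustration, the memory picture can be modeled in plain Java with a ByteBuffer standing in for the native region. This is only a sketch: the class and helper names are hypothetical, and the offset of 'elems' (8 bytes, i.e. the int plus 4 bytes of padding) is an assumption that holds on typical 64-bit ABIs but is not guaranteed everywhere.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FlexibleArraySketch {
    // Assumed offset of 'elems': 4 bytes of int plus 4 bytes of padding so that
    // the doubles are 8-byte aligned (an ABI assumption, see the lead-in).
    static final int ELEMS_OFFSET = 8;

    // Allocates header and trailing elements as one contiguous region.
    public static ByteBuffer allocateFoo(double... elems) {
        ByteBuffer region = ByteBuffer
                .allocate(ELEMS_OFFSET + elems.length * Double.BYTES)
                .order(ByteOrder.nativeOrder());
        region.putInt(0, elems.length); // nelems
        for (int i = 0; i < elems.length; i++) {
            region.putDouble(ELEMS_OFFSET + i * Double.BYTES, elems[i]);
        }
        return region;
    }

    // Reads elems[i], using nelems as the bound for the trailing storage.
    public static double elemAt(ByteBuffer region, int i) {
        int nelems = region.getInt(0);
        if (i < 0 || i >= nelems) {
            throw new IndexOutOfBoundsException("index " + i + " out of " + nelems);
        }
        return region.getDouble(ELEMS_OFFSET + i * Double.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer foo = allocateFoo(1.5, 2.5, 3.5);
        System.out.println(foo.getInt(0));   // prints 3
        System.out.println(elemAt(foo, 2));  // prints 3.5
    }
}
```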
What is the layout for foo? Something like this:
[ i32 [0: f64] ]
Now, if we flatten this to
[ i32 ]
it doesn't help much, as we have lost information on the fact that we
had an incomplete array at the end. So, let's say that the flattened
representation of an incomplete array is the incomplete array itself.
That is:
[ i32 [0: f64] ]
trivially flattens to:
[ i32 [0: f64] ]
Then, let's also say that, if the source layout *ends with an incomplete
array whose element layout is L*, it should be safe to cast this to a
target layout where the incomplete array is replaced by any number of L
(assuming we remain within the bounds of the memory region associated
with the layout).
So, the above layout would be compatible with this:
[ i32 ]
by shrinking (as seen above), but also with this:
[ i32 f64 f64 f64 ]
by expanding.
Concluding, we have seen 3 ways in which we can establish
'compatibility' between different layouts:
* flattened-equivalence: source and target have same flattened layout
* shrink: flattened source layout 'starts with' flattened target layout
* expand: a layout whose last element is an incomplete array of element
L is compatible with any layout where the trailing incomplete array is
replaced by any number of occurrences of L.
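The three rules can be sketched as a small check over flattened layouts. This is a minimal model, not the Panama implementation: Flat, Compat, complete, incomplete and check are hypothetical names, a flattened layout is reduced to a list of scalar names plus an optional trailing incomplete-array element, and the bounds check against the underlying memory region is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class CastCompatSketch {
    public enum Compat { EQUIVALENT, SHRINK, EXPAND, NONE }

    // Flattened layout: scalar names, optionally followed by an incomplete
    // array whose element is 'tail' (null when there is none).
    public static final class Flat {
        final List<String> scalars;
        final String tail;

        Flat(List<String> scalars, String tail) {
            this.scalars = scalars;
            this.tail = tail;
        }
    }

    public static Flat complete(String... scalars) {
        return new Flat(Arrays.asList(scalars), null);
    }

    public static Flat incomplete(String tail, String... scalars) {
        return new Flat(Arrays.asList(scalars), tail);
    }

    private static boolean startsWith(List<String> list, List<String> prefix) {
        return prefix.size() <= list.size()
                && list.subList(0, prefix.size()).equals(prefix);
    }

    public static Compat check(Flat source, Flat target) {
        // flattened-equivalence: identical flattened layouts
        if (source.scalars.equals(target.scalars) && Objects.equals(source.tail, target.tail)) {
            return Compat.EQUIVALENT;
        }
        // shrink: flattened source 'starts with' flattened target
        if (target.tail == null && startsWith(source.scalars, target.scalars)) {
            return Compat.SHRINK;
        }
        // expand: trailing incomplete array replaced by some number of its element
        if (source.tail != null && target.tail == null
                && startsWith(target.scalars, source.scalars)) {
            List<String> extra =
                    target.scalars.subList(source.scalars.size(), target.scalars.size());
            if (extra.stream().allMatch(source.tail::equals)) {
                return Compat.EXPAND;
            }
        }
        return Compat.NONE;
    }

    public static void main(String[] args) {
        Flat foo = incomplete("f64", "i32");                           // [ i32 [0: f64] ]
        System.out.println(check(foo, complete("i32")));               // prints SHRINK
        System.out.println(check(foo, complete("i32", "f64", "f64"))); // prints EXPAND
        System.out.println(check(complete("f64", "f64"),
                                 complete("f64", "f64", "f64")));      // prints NONE
    }
}
```

Note that shrinking is tested before expanding, so replacing the incomplete array with zero elements is reported as a shrink, matching the [ i32 ] case discussed earlier.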
I think that, if Pointer::cast and Array::cast supported these
transforms, it would be a lot easier to switch back and forth between
layout-equivalent representations, which would ultimately make it easier
to program in cases like the one discussed in [1].
Thoughts?
Cheers
Maurizio
[1] - https://mail.openjdk.java.net/pipermail/panama-dev/2019-February/004626.html
[2] - http://www.fftw.org/