[foreign] enhancing pointer and array cast

Thu Feb 28 16:13:06 UTC 2019

Hi,
The recent discussion (see [1]) on Pointer vs. (nested) Array (thanks 
Lev) triggered some new ideas. I think what we're fundamentally banging 
our head against is the fact that C plays a loose game where arrays 
decay into pointers, which makes it easier to express idioms that are 
not so easily handled by the Panama API - which is primarily designed 
for safety.

To recap that discussion - let's assume I want to model complex numbers; 
to do that, I can model them as _pairs_ of double numbers. A possible 
way to model a pair is by using a two-element array (which is the choice 
the FFTW [2] library makes).

So, how do I talk about arrays of complex numbers? Well, these are 
arrays of arrays of (two) doubles. Or, in Panama:

Array<Array<Double>>

This is all well and good, but most of the C code out there, when 
populating such an array, will just allocate a slab of doubles big enough 
to contain all the required data, and then fill it with a loop. In 
Panama, this is not possible, at least not directly: for each complex 
number you have to allocate an Array - and then set it at the right 
position of the outermost array.

Panama does provide escape hatches - namely, the Pointer::cast and 
Array::cast methods were created for precisely this reason. So one could 
imagine doing this:

Array<Double> buffer = scope.allocateArray(NativeTypes.DOUBLE, 
nComplexPairs * 2);

And then initialize the buffer contents as follows (we assume realPart 
and imgPart are two arrays containing the real/imaginary part of the 
complex numbers we want to store):

for (int i = 0 ; i < nComplexPairs ; i++) {
    // pair i occupies two consecutive slots: 2*i (real) and 2*i + 1 (imaginary)
    buffer.set(2 * i, realPart[i]);
    buffer.set(2 * i + 1, imgPart[i]);
}
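
To make the intended memory layout concrete, here is a minimal sketch of the same interleaving using a plain Java double[] as a stand-in for the native buffer. It does not use the Panama API; the class and method names (InterleaveSketch, interleave) are hypothetical and only serve the illustration.

```java
import java.util.Arrays;

final class InterleaveSketch {
    // Packs the two input arrays into one flat buffer: [re0, im0, re1, im1, ...].
    // Assumption: a heap double[] stands in for the native Array<Double>.
    static double[] interleave(double[] realPart, double[] imgPart) {
        if (realPart.length != imgPart.length) {
            throw new IllegalArgumentException("real and imaginary parts differ in length");
        }
        int nComplexPairs = realPart.length;
        double[] buffer = new double[nComplexPairs * 2];
        for (int i = 0; i < nComplexPairs; i++) {
            buffer[2 * i] = realPart[i];      // even slot: real part of pair i
            buffer[2 * i + 1] = imgPart[i];   // odd slot: imaginary part of pair i
        }
        return buffer;
    }

    public static void main(String[] args) {
        double[] buffer = interleave(new double[] {1.0, 3.0}, new double[] {2.0, 4.0});
        System.out.println(Arrays.toString(buffer)); // prints [1.0, 2.0, 3.0, 4.0]
    }
}
```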

Now, we need to convert the buffer to the type the API expects, namely 
an Array<Array<Double>>. We could do this:

Array<Array<Double>> arr = buffer.cast(LayoutType.DOUBLE.array(2), 
nComplexPairs);

This is fine - in C this operation would have been implicit, but since 
Panama focuses on safety, here we need an explicit cast.

Except, this doesn't work, because Panama's cast operation only allows 
casts when the source and target layouts are the same! You could try to 
work around the limitation by going through NativeTypes.VOID, which 
disables the check - but that's very unfriendly - good as an escape 
hatch, but is there a better way to get there?

So, I started to think about ways to enhance Panama's cast operation in 
a safe fashion. After all, the main point here is that the source and 
target layout have the same _flattened_ representation:

[ f64 f64 ............ f64 ] //repeated nComplexPairs * 2 times

It follows that it should be safe to go from one to the other - we are 
just trying to enforce *more* structure on the same flattened layout!
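
As a rough illustration of what 'same flattened representation' could mean, the sketch below models layouts with a toy class and compares their flattened forms. Everything here (FlattenSketch, Layout, flatten, flattenedEquivalent) is a hypothetical model, not the Panama layout API.

```java
import java.util.ArrayList;
import java.util.List;

final class FlattenSketch {
    // Toy layout: either a scalar such as "f64", or `count` repetitions of an element layout.
    static final class Layout {
        final String scalar;   // non-null for scalar layouts
        final Layout element;  // non-null for array layouts
        final int count;

        private Layout(String scalar, Layout element, int count) {
            this.scalar = scalar;
            this.element = element;
            this.count = count;
        }

        static Layout scalar(String name) { return new Layout(name, null, 1); }
        static Layout array(int count, Layout element) { return new Layout(null, element, count); }
    }

    // Flattens a layout into the sequence of scalars it is made of.
    static List<String> flatten(Layout layout) {
        List<String> out = new ArrayList<>();
        flattenInto(layout, out);
        return out;
    }

    private static void flattenInto(Layout layout, List<String> out) {
        if (layout.scalar != null) {
            out.add(layout.scalar);
            return;
        }
        for (int i = 0; i < layout.count; i++) {
            flattenInto(layout.element, out);
        }
    }

    // Two layouts are flattened-equivalent when their scalar sequences match.
    static boolean flattenedEquivalent(Layout a, Layout b) {
        return flatten(a).equals(flatten(b));
    }

    public static void main(String[] args) {
        int nComplexPairs = 3;
        Layout f64 = Layout.scalar("f64");
        Layout flat = Layout.array(nComplexPairs * 2, f64);                 // Array<Double>
        Layout nested = Layout.array(nComplexPairs, Layout.array(2, f64));  // Array<Array<Double>>
        System.out.println(flattenedEquivalent(flat, nested)); // prints true
    }
}
```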

But this is not the only thing that is safe to do; consider another 
example, where I want to temporarily view the buffer above as a _single_ 
complex number (i.e. an array of size 1). In this case, we want to go from:

[ f64 f64 ............ f64 ] //repeated nComplexPairs * 2 times

to

[ f64 f64 ]

This is also safe, because the target flattened representation is the 
start of the source flattened representation.
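
The 'starts with' condition amounts to a prefix test on the flattened forms. A minimal sketch, assuming flattened layouts are represented as lists of scalar names (ShrinkSketch and canShrink are hypothetical names, not part of Panama):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ShrinkSketch {
    // True if the flattened target is a prefix of the flattened source.
    // Identical lists also pass, so flattened-equivalence is a special case here.
    static boolean canShrink(List<String> flatSource, List<String> flatTarget) {
        return flatTarget.size() <= flatSource.size()
                && flatSource.subList(0, flatTarget.size()).equals(flatTarget);
    }

    public static void main(String[] args) {
        List<String> source = Collections.nCopies(6, "f64"); // nComplexPairs = 3
        List<String> single = Arrays.asList("f64", "f64");   // one complex number
        System.out.println(canShrink(source, single)); // prints true
        System.out.println(canShrink(single, source)); // prints false
    }
}
```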

What about expanding a layout? Is that safe? E.g. go from:

[ f64 f64 ]

to

[ f64 f64 f64 f64 ... ]

In general no, this is not safe. But there is a particular use case in 
which we might be interested in allowing this operation:

struct foo {
     int nelems;
     double elems[]; //incomplete array
};

In this case we know that, at runtime, there will be some storage 
allocated immediately after the struct, containing the values for 'elems'.

What is the layout for foo? Something like this:

[ i32 [0: f64] ]

Now, if we flatten this to

[ i32 ]

it doesn't help much, as we have lost information on the fact that we 
had an incomplete array at the end. So, let's say that the flattened 
representation of an incomplete array is the incomplete array itself. 
That is:

[ i32 [0: f64] ]

trivially flattens to:

[ i32 [0: f64] ]

Then, let's also say that, if the source layout *ends with an incomplete 
array whose element layout is L*, it should be safe to cast this to a 
target layout where the incomplete array is replaced by any number of L 
(assuming we remain within the bounds of the memory region associated 
with the layout).
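
A possible way to check the expansion rule is sketched below. It assumes a flattened layout is given as a fixed part plus the element name of the trailing incomplete array, that scalar names encode their bit width (e.g. "f64"), and that alignment padding is ignored; ExpandSketch and its methods are hypothetical, not Panama API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExpandSketch {
    // Replaces the trailing incomplete array of `elem` with n copies of `elem`.
    static List<String> expand(List<String> fixedPart, String elem, int n) {
        List<String> out = new ArrayList<>(fixedPart);
        for (int i = 0; i < n; i++) {
            out.add(elem);
        }
        return out;
    }

    // Size of a flattened layout; assumes names like "i32"/"f64" and no padding.
    static long sizeInBytes(List<String> flat) {
        long total = 0;
        for (String scalar : flat) {
            total += Long.parseLong(scalar.substring(1)) / 8;
        }
        return total;
    }

    // True if `target` is the fixed part followed by some number of `elem`,
    // and the result still fits in the memory region backing the source.
    static boolean canExpand(List<String> fixedPart, String elem,
                             List<String> target, long regionBytes) {
        int extra = target.size() - fixedPart.size();
        return extra >= 0
                && target.equals(expand(fixedPart, elem, extra))
                && sizeInBytes(target) <= regionBytes;
    }

    public static void main(String[] args) {
        List<String> fixedPart = Arrays.asList("i32"); // [ i32 [0: f64] ]
        List<String> target = Arrays.asList("i32", "f64", "f64", "f64");
        System.out.println(canExpand(fixedPart, "f64", target, 28)); // prints true
        System.out.println(canExpand(fixedPart, "f64", target, 27)); // prints false
    }
}
```

The region-size argument reflects the bounds condition stated above: the cast is only acceptable while the expanded layout stays inside the memory actually allocated.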

So, the above layout would be compatible with this:

[ i32 ]

by shrinking (as seen above), but also with this:

[ i32 f64 f64 f64 ]

by expanding.

Concluding, we have seen 3 ways in which we can establish 
'compatibility' between different layouts:

* flattened-equivalence: source and target have same flattened layout
* shrink: flattened source layout 'starts with' flattened target layout
* expand: a layout whose last element is an incomplete array of element 
L is compatible with any layout where the trailing incomplete array is 
replaced by any number of occurrences of L.

I think that, if Pointer::cast and Array::cast supported these 
transforms, it would be a lot easier to switch back and forth between 
layout-equivalent representations, which would ultimately make it easier 
to program in cases like the one discussed in [1].

Thoughts?

Cheers
Maurizio

[1] - 
https://mail.openjdk.java.net/pipermail/panama-dev/2019-February/004626.html
[2] - http://www.fftw.org/