[nicl] branch is now closed

Tue Aug 21 09:34:32 UTC 2018

On 18/08/18 00:16, Samuel Audet wrote:
> About the Array type, it's still not clear to me why we can't just 
> extend the Pointer with a "length"? It's not like pointers in C don't 
> typically point to memory regions with length > 0: It's just often 
> unknown. I've modeled the Pointer type in JavaCPP after Buffer (with 
> position, limit, and capacity) and I haven't had any issues at all in 
> years. None whatsoever. What is the benefit you perceive in mapping 
> artificially 1 C concept into 2 Java concepts? I was not able to find 
> the rationale from your document... If the idea is to provide safe 
> get()/set() methods, why not simply use a layout? What does the Array 
> type do that a layout could not do? That's basically what I do with 
> the indexer package and it works fine:
> http://bytedeco.org/news/2014/12/23/third-release/
> Again, no issues whatsoever in years, without complicating the API! 

I'm not sure I follow your argument fully. In the C language spec 
there's not one concept, but two - we have array types and pointer 
types. Their behavior is quite different; for instance, sizeof(int[3]) 
and sizeof(int*) return quite different things. More generally, an 
array refers to a contiguous region of memory filled by N objects of 
the same kind. Layout-wise, an array is much more similar to a struct 
than it is to a pointer. E.g. int[3] and struct { int a, b, c; } have 
the same layout.
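
To make the size comparison concrete, here is a minimal C sketch. The struct name three_ints and the helper functions are made up for illustration, and the claim that the struct and the array have equal size assumes a mainstream ABI that inserts no padding between int members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only for this comparison. */
struct three_ints { int a, b, c; };

/* The size of an array type covers all of its elements. */
static_assert(sizeof(int[3]) == 3 * sizeof(int), "int[3] holds three ints");

/* Size of the array type: three ints stored back to back. */
static size_t array_size(void) { return sizeof(int[3]); }

/* Size of the pointer type: one address, regardless of the pointee count. */
static size_t pointer_size(void) { return sizeof(int *); }

/* Size of the struct: on common ABIs, the same as the array. */
static size_t struct_size(void) { return sizeof(struct three_ints); }
```

On a typical 64-bit platform array_size() and struct_size() would both be 12 while pointer_size() would be 8, but the exact numbers depend on the target.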

Of course there is a relationship between arrays and pointers, and that 
is caused by the fact that an array always decays to a 'base pointer' - 
when the context requires it. But the layout of this base pointer is 
rather different - e.g. if you take the base pointer of an int[3], you 
end up with a pointer whose pointee is 'int'.
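
The decay described above can be sketched in C as follows. The function names are hypothetical; the point is only that the array object knows about all three elements, while the pointer obtained from it knows about a single int.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the array object itself: all three elements. */
static size_t size_before_decay(void) {
    int a[3] = {10, 20, 30};
    assert(a[0] == 10);
    return sizeof a;
}

/* Size of what the decayed pointer points to: a single int. */
static size_t pointee_size_after_decay(void) {
    int a[3] = {10, 20, 30};
    int *p = a;            /* decay: int[3] becomes int* */
    assert(p == &a[0]);    /* the base pointer addresses the first element */
    return sizeof *p;
}
```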

Now, if we had a calculus for layouts which allowed us to specify sizes 
as well - and maybe ours can do that with layout annotations:

u64(length=3):i32

Then it kind of seems that it would be possible to lump arrays and 
pointers together. But there's one detail that is omitted from this 
layout description: this description does not mandate that the elements 
occur 'right here right now'. That is, if I have a struct like this:

[ u64(length=3):i32 ]

Does the struct have a field of size 32 * 3 bits inline, or does the 
struct have a field of size 64 bits which points to some contiguous 
region of memory storing 3 ints? I believe the layout expression above 
suggests the latter.
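
In C terms, the two readings correspond to two different struct declarations. This is only a sketch: the struct and function names are invented for illustration, and the sizes in the comments assume a 32-bit int and 64-bit pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Reading 1: the three ints live inside the struct (32 * 3 bits). */
struct inline_elems { int elems[3]; };

/* Reading 2: the struct holds one address (64 bits); the ints live elsewhere. */
struct pointed_elems { int *elems; };

static_assert(sizeof(struct inline_elems) >= 3 * sizeof(int),
              "inline storage must hold all three elements");

static size_t inline_size(void)  { return sizeof(struct inline_elems); }
static size_t pointed_size(void) { return sizeof(struct pointed_elems); }
```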

The Array API is there to recognize this distinction: an array is 
a lump of memory; if you want you can take a pointer to the first array 
element out of it, and work that way - but the fact that it's easy to go 
in that direction doesn't mean there must be an 'is-a' relationship 
between the two entities. Of course, when you design an API there are 
many such decisions - John aptly calls this 'splitting vs. lumping'. 
At the time we considered both lumping (e.g. just use 
pointers) and splitting (the model we have now) and found that the 
latter offered much more clarity to the programmer. While arrays and 
pointers do overlap, there are certain differences (e.g. copy semantics 
on array struct field access) that, at the time we looked at them, 
pushed us towards the 'split' choice. It's not, of course, a choice set 
in stone, but I think it's one we're pretty happy with at the moment 
(unlike, e.g., the one for automatic function pointer conversion).
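
As a rough C analogy for the copy-semantics difference (this is plain C, not the Panama API, and all names are hypothetical): copying a struct with an array field copies the elements, whereas copying a struct with a pointer field copies only the address.

```c
#include <assert.h>

struct with_array   { int elems[3]; };
struct with_pointer { int *elems; };

/* Mutating a copy leaves the original's inline elements untouched. */
static int original_after_array_copy(void) {
    struct with_array original = {{1, 2, 3}};
    struct with_array copy = original;   /* copies all three ints */
    copy.elems[0] = 99;
    assert(copy.elems[0] == 99);
    return original.elems[0];
}

/* Mutating through a copy is visible via the original: storage is shared. */
static int original_after_pointer_copy(void) {
    int storage[3] = {1, 2, 3};
    struct with_pointer original = { storage };
    struct with_pointer copy = original; /* copies only the address */
    copy.elems[0] = 99;
    return original.elems[0];
}
```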

As for your point on the relationship between Panama and Java arrays - 
Java arrays are essentially unsuitable for the kind of raw mapping we 
wanted to achieve at this stage. This does not preclude us from adding 
a 'civilized' binding layer which autoconverts native arrays into Java 
arrays - but since that's an expensive move, it will be opt-in.

The fact that Java arrays are not interfaces is biting us here, as well 
- Array 2.0 explored the possibility of having an Array<any X> 
interface, specialized in primitive types; with such support available 
we could indeed treat Panama arrays as specific instances of such a 
generalized interface. Until that support is available, we have to paper 
over the differences in other ways, which is what this API is attempting 
to do.

Cheers
Maurizio