[nicl] branch is now closed

Sat Aug 25 08:34:48 UTC 2018

On 08/21/2018 06:34 PM, Maurizio Cimadamore wrote:
> 
> 
> On 18/08/18 00:16, Samuel Audet wrote:
>> About the Array type, it's still not clear to me why we can't just 
>> extend the Pointer with a "length"? It's not like pointers in C don't 
>> typically point to memory regions with length > 0: It's just often 
>> unknown. I've modeled the Pointer type in JavaCPP after Buffer (with 
>> position, limit, and capacity) and I haven't had any issues at all in 
>> years. None whatsoever. What is the benefit you perceive in mapping 
>> artificially 1 C concept into 2 Java concepts? I was not able to find 
>> the rationale from your document... If the idea is to provide safe 
>> get()/set() methods, why not simply use a layout? What does the Array 
>> type do that a layout could not do? That's basically what I do with 
>> the indexer package and it works fine:
>> http://bytedeco.org/news/2014/12/23/third-release/
>> Again, no issues whatsoever in years, without complicating the API! 
> I'm not sure I follow your argument fully. In the C language spec 
> there's not one concept, but two - we have array types and pointer 
> types. Their behavior is quite different, so, for instance sizeof 
> returns quite different things if you compare sizeof(int[]) with 
> sizeof(int*). More generally, an array refers to a contiguous region of 
> memory filled by N objects of the same kind. Layout-wise, an array is 
> much more similar to a struct than it is to a pointer. E.g. int[3] and 
> struct { int, int, int } have the same layout.
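
To make the distinction in the quoted paragraph concrete, here is a minimal C sketch. The struct and function names are hypothetical, chosen only for illustration, and the struct/array size equality is an assumption that holds on mainstream ABIs rather than something the C standard guarantees.

```c
#include <stddef.h>

/* Hypothetical struct, named here only for illustration. */
struct three_ints { int a, b, c; };

/* Size of the array type int[3]: three ints stored back to back. */
size_t array_size(void) { return sizeof(int[3]); }

/* Size of a pointer to int: one address, regardless of what it points at. */
size_t pointer_size(void) { return sizeof(int *); }

/* Size of a struct of three ints (equal to int[3] on mainstream ABIs). */
size_t struct_size(void) { return sizeof(struct three_ints); }

/* Returns 1 if an int[3] decays to a pointer to its first element. */
int decays_to_first_element(void) {
    int values[3] = {1, 2, 3};
    int *base = values;            /* array-to-pointer decay */
    return base == &values[0] && *base == 1;
}
```

The array and the struct report the same size, while the pointer size depends only on the platform's address width.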

I fully agree there are a few differences that we can't handle with just 
pointers, but my point is, why not leave that up to a layout instead of 
introducing one more class to the mix? What precisely are the layouts 
not capable of accommodating that you had to introduce the Array class for?

> Of course there is a relationship between arrays and pointers, and that 
> is caused by the fact that an array always decays to a 'base pointer' - 
> when the context requires it. But the layout of this base pointer is 
> rather different - e.g. if you take the base pointer of an int[3], you 
> end up with a pointer whose pointee is 'int'.
> 
> Now, if we had a calculus for layouts which allowed to specify sizes as 
> well - and maybe ours can do that with layout annotations:
> 
> u64(length=3):i32
> 
> Then it kind of seems that it would be possible to lump arrays and 
> pointers together. But there's one detail that is omitted from this 
> layout description: this description does not mandate that the elements 
> occur 'right here right now'. That is, if I have a struct like this:
> 
> [ u64(length=3):i32 ]
> 
> Does the struct have a field of size 32 * 3 inline, or does the struct 
> have a field of size 64 which points to some contiguous region of memory 
> storing 3 ints? I believe the layout expression above suggests the latter.
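
The two readings of that layout correspond to two different C structs. The following is only a sketch with hypothetical names, assuming a typical ABI where neither struct needs padding:

```c
#include <stddef.h>

/* Reading 1: the three ints live inside the struct itself. */
struct inline_field { int values[3]; };

/* Reading 2: the struct holds only an address; the ints live elsewhere. */
struct pointer_field { int *values; };

/* Size of the struct whose elements are stored inline. */
size_t inline_field_size(void) { return sizeof(struct inline_field); }

/* Size of the struct that merely points at the elements. */
size_t pointer_field_size(void) { return sizeof(struct pointer_field); }
```

On a 64-bit platform with 32-bit int, the first struct is 96 bits wide and the second is 64 bits wide, which is exactly the ambiguity described in the quoted text.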

So, maybe we should fix this in the layouts instead of coming up with 
Array just to patch over this?

> The Array API point is there to recognize this distinction: an array is 
> a lump of memory; if you want you can take a pointer to the first array 
> element out of it, and work that way - but the fact that it's easy to go 
> in that direction doesn't mean there must be an 'is-a' relationship 
> between the two entities. Of course, when you design an API there are 
> many such decisions - John aptly calls this 'splitting vs. lumping'. 
> At the time we considered both lumping (e.g. just use 
> pointers) and splitting (the model we have now) and found that the 
> latter offered much more clarity to the programmer. While arrays and 
> pointers do overlap, there are certain differences (e.g. copy semantics 
> on array struct field access) that, at the time we looked at them, 
> pushed us towards the 'split' choice. It's not, of course, a choice set 
> in stone, but I think it's one we're pretty happy with at the moment 
> (unlike that, e.g. for function pointer automatic conversion).
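
For readers unfamiliar with the copy-semantics difference, here is one C-level analogue, under the assumption that this is the kind of difference meant in the quoted text. The struct and function names are hypothetical.

```c
struct with_array   { int values[3]; };  /* elements stored inline */
struct with_pointer { int *values; };    /* only an address is stored */

/* Returns 1 if writing through a copy leaves the original array untouched. */
int array_field_is_copied(void) {
    struct with_array original = { {1, 2, 3} };
    struct with_array copy = original;   /* copies all three ints */
    copy.values[0] = 42;
    return original.values[0] == 1;
}

/* Returns 1 if writing through a copy is visible via the original. */
int pointer_field_is_shared(void) {
    int storage[3] = {1, 2, 3};
    struct with_pointer original = { storage };
    struct with_pointer copy = original; /* copies only the address */
    copy.values[0] = 42;
    return original.values[0] == 42;
}
```

Both functions return 1: assigning a struct with an inline array duplicates the elements, whereas assigning a struct with a pointer field leaves both copies aliasing the same memory.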

Hum, it sounds like you're effectively saying that layouts are 
inappropriate to map, for example, Protocol Buffers or FlatBuffers, 
which are fast becoming the industry standard. Have you given any 
thought to Protocol Buffers or FlatBuffers and if so where can we find 
preliminary notes about those?

> As for your point on the relationship between Panama and Java arrays - 
> Java arrays are essentially unsuitable for the kind of raw mapping we 
> wanted to achieve at this stage. This does not preclude us from adding a 
> 'civilized' binding layer which autoconverts native arrays into Java 
> arrays - but since that's an expensive move, it will be an opt in.

Good to hear that!

> The fact that Java arrays are not interfaces is biting us here, as well 
> - Array 2.0 explored the possibility of having an Array<any X> 
> interface, specialized in primitive types; with such support available 
> we could indeed treat Panama arrays as specific instances of such a 
> generalized interface. Until that support is available, we have to paper 
> over the differences in other ways, which is what this API is attempting 
> to do.

Right, but by not supporting normal arrays, you're preventing, for 
example, the vector team from benefiting from your work.

Vladimir expressed interest in the SVML intrinsics offered by the Intel 
C++ Compiler. That's basically a library of inline functions. We could 
have a header file with something like:

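#include <immintrin.h>  /* declares _mm_sin_ps etc. with the Intel C++ Compiler */

/* Overloaded on the vector type, so this has to be compiled as C++. */
static inline __m128 fast_sin(__m128 x) {
     return _mm_sin_ps(x);
}

static inline __m256 fast_sin(__m256 x) {
     return _mm256_sin_ps(x);
}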

And if Panama could inline this in Java, that's it! You've got a fast 
parallel version of sin() that is inlined with zero overhead when called 
from Java.

So, you see, this kind of feature is in strong demand, and by your own 
team! I say we should do some dogfooding. Create something that others 
will actually want to use!

If Panama could provide that kind of feature, many other users, at 
Skymind and elsewhere, could also use it to implement their own 
libraries of "fast math", so we wouldn't have to bother you guys all the 
time to implement some fast version of math. We'd just do it all 
ourselves as libraries! That would be great, wouldn't it?

As always, please let us know how we can help. As Johan keeps reminding 
me, we could write JEPs and whatnot, but before we get there I would 
need to understand what would be required to make you interested in such 
a feature. In other words, what would you be looking for in the JEP to 
accept it yourself?

Samuel