[nicl] branch is now closed

Sun Aug 26 14:19:49 UTC 2018

I'm sorry, the code snippet in quotes below should actually be something 
more like this:

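#include <immintrin.h>  /* _mm_sin_ps and _mm256_sin_ps are SVML intrinsics (Intel C++ Compiler) */

/* Unaligned loads/stores: x and y need not be 16- or 32-byte aligned. */
void fast_sin(float x[4], float y[4]) {
     _mm_storeu_ps(y, _mm_sin_ps(_mm_loadu_ps(x)));
}

void fast_sin256(float x[8], float y[8]) {
     _mm256_storeu_ps(y, _mm256_sin_ps(_mm256_loadu_ps(x)));
}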

This maps well to Java arrays (and thus to the vector API) with JNI, but 
not with Panama. Please correct me if I'm wrong, but since Panama does 
not support inline functions either, JNI is probably going to be faster 
in this case, right?

Anyway, I plan to start benchmarking things like that when the early 
access binaries start shipping...

Samuel

On 08/25/2018 05:34 PM, Samuel Audet wrote:
> On 08/21/2018 06:34 PM, Maurizio Cimadamore wrote:
>>
>>
>> On 18/08/18 00:16, Samuel Audet wrote:
>>> About the Array type, it's still not clear to me why we can't just 
>>> extend the Pointer with a "length"? It's not like pointers in C don't 
>>> typically point to memory regions with length > 0: It's just often 
>>> unknown. I've modeled the Pointer type in JavaCPP after Buffer (with 
>>> position, limit, and capacity) and I haven't had any issues at all in 
>>> years. None whatsoever. What is the benefit you perceive in mapping 
>>> artificially 1 C concept into 2 Java concepts? I was not able to find 
>>> the rationale from your document... If the idea is to provide safe 
>>> get()/set() methods, why not simply use a layout? What does the Array 
>>> type do that a layout could not do? That's basically what I do with 
>>> the indexer package and it works fine:
>>> http://bytedeco.org/news/2014/12/23/third-release/
>>> Again, no issues whatsoever in years, without complicating the API! 
>> I'm not sure I follow your argument fully. In the C language spec 
>> there's not one concept, but two - we have array types and pointer 
>> types. Their behavior is quite different, so, for instance sizeof 
>> returns quite different things if you compare sizeof(int[3]) with 
>> sizeof(int*). More generally, an array refers to a contiguous region 
>> of memory filled by N objects of the same kind. Layout-wise, an array 
>> is much more similar to a struct than it is to a pointer. E.g. int[3] 
>> and struct { int, int, int } have the same layout.
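
To make the sizeof and layout point concrete, here is a minimal C sketch. The struct name three_ints and the helper functions are made up for illustration, and the equality between the struct size and the array size is what common ABIs give you, not something the C standard guarantees:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical struct, used only for this illustration. */
struct three_ints { int a, b, c; };

/* An array type carries its element count in its size. */
static_assert(sizeof(int[3]) == 3 * sizeof(int), "array size counts all elements");

/* Size of the array type int[3]. */
size_t array_size(void)   { return sizeof(int[3]); }

/* Size of a pointer to int; it says nothing about how many ints follow. */
size_t pointer_size(void) { return sizeof(int *); }

/* Size of a struct with three int fields (same as int[3] on common ABIs). */
size_t struct_size(void)  { return sizeof(struct three_ints); }
```

Comparing array_size() with struct_size() and with pointer_size() shows the asymmetry being discussed: the first two agree on typical platforms, while the pointer size depends only on the address width.
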
> 
> I fully agree there are a few differences that we can't deal with just 
> pointers, but my point is, why not leave that up to a layout instead of 
> introducing one more class to the mix? What precisely are the layouts 
> not capable of accommodating that you had to introduce the Array class for?
> 
>> Of course there is a relationship between arrays and pointers, and 
>> that is caused by the fact that an array always decays to a 'base 
>> pointer' - when the context requires it. But the layout of this base 
>> pointer is rather different - e.g. if you take the base pointer of an 
>> int[3], you end up with a pointer whose pointee is 'int'.
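
The decay described here can be sketched in a few lines of C. The function names are hypothetical and only serve the illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the decayed pointer is the address of the first element. */
int decays_to_first_element(void) {
    int a[3] = {10, 20, 30};
    int *p = a;               /* array-to-pointer conversion ("decay") */
    assert(p[1] == a[1]);     /* indexing through p reaches the same elements */
    return p == &a[0];
}

/* After decay the pointee type is int, so the length 3 is no longer visible. */
size_t pointee_size_after_decay(void) {
    int a[3] = {10, 20, 30};
    int *p = a;
    (void)a;
    return sizeof *p;         /* sizeof(int), not sizeof(int[3]) */
}
```
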
>>
>> Now, if we had a calculus for layouts which allowed us to specify sizes 
>> as well - and maybe ours can do that with layout annotations:
>>
>> u64(length=3):i32
>>
>> Then it kind of seems that it would be possible to lump arrays and 
>> pointers together. But there's one detail that is omitted from this 
>> layout description: this description does not mandate that the 
>> elements occur 'right here right now'. That is, if I have a struct 
>> like this:
>>
>> [ u64(length=3):i32 ]
>>
>> Does the struct have a field of size 32 * 3 inline, or does the struct 
>> have a field of size 64 which points to some contiguous region of 
>> memory storing 3 ints? I believe the layout expression above suggests 
>> the latter.
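
In C terms, the two readings of that layout correspond to two different structs. This is only a sketch of the C side, with hypothetical struct and function names; it does not claim anything about how the Panama layout language resolves the question:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical structs for illustration only. */
struct inline_ints  { int  values[3]; };  /* three ints stored inside the struct */
struct pointer_ints { int *values;    };  /* one pointer; the ints live elsewhere */

static_assert(sizeof(struct inline_ints) >= 3 * sizeof(int),
              "the elements are part of the struct itself");

/* Size of the inline array field: room for all three elements. */
size_t inline_field_size(void) {
    return sizeof(((struct inline_ints *)0)->values);   /* operand is not evaluated */
}

/* Size of the pointer field: one address, whatever it points to. */
size_t pointer_field_size(void) {
    return sizeof(((struct pointer_ints *)0)->values);  /* operand is not evaluated */
}
```
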
> 
> So, maybe we should fix this in the layouts instead of coming up with 
> Array just to patch over this?
> 
>> The Array API point is there to recognize this distinction: an array 
>> is a lump of memory; if you want you can take a pointer to the first 
>> array element out of it, and work that way - but the fact that it's 
>> easy to go in that direction doesn't mean there must be an 'is-a' 
>> relationship between the two entities. Of course, when you design an 
>> API there are many such decisions - John aptly calls this 
>> splitting vs. lumping. At the time we considered both 
>> lumping (e.g. just use pointers) and splitting (the model we have now) 
>> and found that the latter offered much more clarity to the programmer. 
>> While arrays and pointers do overlap, there are certain differences 
>> (e.g. copy semantics on array struct field access) that, at the time 
>> we looked at them, pushed us towards the 'split' choice. It's not, of 
>> course, a choice set in stone, but I think it's one we're pretty happy 
>> with at the moment (unlike that, e.g. for function pointer automatic 
>> conversion).
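
One C-level analogue of the copy-semantics difference mentioned here is what happens when a struct is copied. The sketch below uses hypothetical struct and function names and illustrates plain C behavior only, which is not necessarily how the Panama API exposes array fields:

```c
#include <assert.h>

/* Hypothetical structs for illustration only. */
struct inline_ints  { int  values[3]; };
struct pointer_ints { int *values;    };

/* Copying a struct with an inline array copies the elements themselves. */
int inline_copy_is_independent(void) {
    struct inline_ints a = { {1, 2, 3} };
    struct inline_ints b = a;          /* copies all three ints */
    assert(b.values != a.values);      /* two distinct storage regions */
    b.values[0] = 99;
    return a.values[0];                /* a is unchanged */
}

/* Copying a struct with a pointer field copies only the address. */
int pointer_copy_is_shared(void) {
    int storage[3] = {1, 2, 3};
    struct pointer_ints a = { storage };
    struct pointer_ints b = a;         /* copies the pointer, not the ints */
    assert(b.values == a.values);      /* both refer to the same storage */
    b.values[0] = 99;
    return a.values[0];                /* the write through b is visible via a */
}
```
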
> 
> Hmm, it sounds like you're effectively saying that layouts are 
> inappropriate to map, for example, Protocol Buffers or FlatBuffers, 
> which are fast becoming the industry standard. Have you given any 
> thought to Protocol Buffers or FlatBuffers, and if so, where can we find 
> preliminary notes about those?
> 
>> As for your point on the relationship between Panama and Java arrays - 
>> Java arrays are essentially unsuitable for the kind of raw mapping we 
>> wanted to achieve at this stage. This does not preclude us from adding a 
>> 'civilized' binding layer which autoconverts native arrays into Java 
>> arrays - but since that's an expensive move, it will be an opt in.
> 
> Good to hear that!
> 
>> The fact that Java arrays are not interfaces is biting us here, as 
>> well - Array 2.0 explored the possibility of having an Array<any X> 
>> interface, specialized in primitive types; with such support available 
>> we could indeed treat Panama arrays as specific instances of such a 
>> generalized interface. Until that support is available, we have to 
>> paper over the differences in other ways, which is what this API is 
>> attempting to do.
> 
> Right, but by not supporting normal arrays, you're preventing, for 
> example, the vector team from benefiting from your work.
> 
> Vladimir expressed interest in the SVML intrinsics offered by the Intel 
> C++ Compiler. That's basically a library of inline functions. We could 
> have a header file with something like:
> 
> __m128 fast_sin(__m128 x) {
>      return __m128 _mm_sin_ps(x)
> }
> 
> __m256 fast_sin(__m256 x) {
>      return __m256 _mm256_sin_ps(x)
> }
> 
> And if Panama could inline this in Java, that's it! You've got a fast 
> parallel version of sin() that is inlined with zero overhead when called 
> from Java.
> 
> So, you see, this kind of feature is in strong demand, and by your own 
> team! I say we should do some dogfooding. Create something that others 
> will actually want to use!
> 
> If Panama could provide that kind of feature, many other users, at 
> Skymind and elsewhere, could also use it to implement their own 
> libraries of "fast math", so we wouldn't have to bother you guys all the 
> time to implement some fast version of math. We'd just do it all 
> ourselves as libraries! That would be great, wouldn't it?
> 
> As always, please let us know how we can help. As Johan keeps reminding 
> me, we could write JEPs and what not, but before we get there I would 
> need to understand what would be required to make you interested in such 
> a feature. In other words, what would you be looking for in the JEP to 
> accept it yourself?
> 
> Samuel