an opencl binding - zcl/panama
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Jan 28 23:28:53 UTC 2020
On 28/01/2020 23:12, Michael Zucchi wrote:
>
> On 28/1/20 11:10 pm, Maurizio Cimadamore wrote:
>> So, I took a better look and I have some news.
>>
>> The first thing tripping the benchmark up is this:
>>
>> int len = (int)(seg.byteSize() >>> 3);
>>
>> If you replace it with:
>>
>> int len = ((int)seg.byteSize() >>> 3);
>>
>> Or, even better, with:
>>
>> int len = ((int)seg.byteSize() / 8);
>>
>
> A bit pedantic perhaps, but the first allows a maximum equivalent long
> length of 0x7fffffff entries (matching the long[].length max), the second
> 0x1fffffff, and the third 0x0fffffff.
>
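For reference, the three limits quoted above can be checked with a small Java sketch. The class and method names (LenCast, shiftThenCast, castThenShift, castThenDivide) are made up for illustration, and a plain long argument stands in for seg.byteSize(); nothing here touches the segment API itself.

```java
// Hypothetical helper, for illustration only: byteSize stands in for seg.byteSize().
final class LenCast {
    // Form 1: shift the long, then narrow to int.
    static int shiftThenCast(long byteSize) {
        return (int) (byteSize >>> 3);
    }

    // Form 2: narrow to int first, then unsigned-shift the int.
    static int castThenShift(long byteSize) {
        return ((int) byteSize) >>> 3;
    }

    // Form 3: narrow to int first, then signed division.
    static int castThenDivide(long byteSize) {
        return ((int) byteSize) / 8;
    }

    public static void main(String[] args) {
        // Largest results each form can produce.
        System.out.println(Integer.toHexString(shiftThenCast(0x7fffffffL << 3))); // 7fffffff
        System.out.println(Integer.toHexString(castThenShift(0xffffffffL)));      // 1fffffff
        System.out.println(Integer.toHexString(castThenDivide(0x7fffffffL)));     // fffffff

        // At a byte size of 2^31 the cast in form 3 wraps to a negative int,
        // while forms 1 and 2 still agree.
        long twoGiB = 0x80000000L;
        System.out.println(shiftThenCast(twoGiB));  // 268435456
        System.out.println(castThenShift(twoGiB));  // 268435456
        System.out.println(castThenDivide(twoGiB)); // -268435456
    }
}
```

The sketch only shows where the three expressions stop agreeing; it says nothing about which one the JIT compiles best.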
>>
>> Then the segment version comes out on top:
>>
>> 0.497758726 array
>> 0.836574479 bb stream
>> 0.446651107 segment
>> 0.482202441 bb index
>> 2.767206835 bb over segment
>>
>> Of course I'm not suggesting that the code you wrote doesn't make
>> sense - I think this shows that (a) segments have the potential to be
>> very fast but (b) we have some work to do on the VM side to smooth
>> out the performance side of things.
>>
> Nice! And thanks for the detail.
>
> I guess it means the bulk interface isn't really necessary if it isn't
> otherwise more convenient or you're doing more than a copy. (with the
> obvious caveat that bulk copies can hide internal / jvm specific
> details like the long opcode issue).
>
>
It depends on what you are doing - if all you are doing is moving
data from A to B, doing it in bulk is way faster than doing it element
by element. For instance, if you copy a 1000-element heap array
off heap and compare copying element by element against
copying in bulk, you should see a difference. This is why the ByteBuffer
API also has bulk get/put methods.
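
As a rough illustration of the two copy styles, here is a minimal sketch using the standard java.nio buffers rather than the segment API. The class and method names (BulkCopy, offHeap, copyElementwise, copyBulk) are hypothetical. It only checks that both styles produce the same off-heap contents; measuring the difference properly would need a benchmark harness such as JMH.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

// Hypothetical helper, for illustration only.
final class BulkCopy {
    // Allocate an off-heap (direct) buffer viewed as longs.
    static LongBuffer offHeap(int elements) {
        return ByteBuffer.allocateDirect(elements * Long.BYTES)
                .order(ByteOrder.nativeOrder())
                .asLongBuffer();
    }

    // One absolute put per element.
    static void copyElementwise(long[] src, LongBuffer dst) {
        for (int i = 0; i < src.length; i++) {
            dst.put(i, src[i]);
        }
    }

    // A single relative bulk put of the whole array, starting at position 0.
    static void copyBulk(long[] src, LongBuffer dst) {
        dst.clear();
        dst.put(src);
    }

    public static void main(String[] args) {
        long[] src = new long[1000];
        for (int i = 0; i < src.length; i++) {
            src[i] = i * 3L;
        }

        LongBuffer a = offHeap(src.length);
        LongBuffer b = offHeap(src.length);
        copyElementwise(src, a);
        copyBulk(src, b);

        // Both destinations should now hold the same data as the source array.
        boolean same = true;
        for (int i = 0; i < src.length; i++) {
            same &= a.get(i) == src[i] && b.get(i) == src[i];
        }
        System.out.println(same); // true
    }
}
```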
Maurizio
>
>