Foreign + Vectors - benchmarks for copying and swapping

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Jun 17 16:12:22 UTC 2021


It would be interesting to see if passing in regular byte buffers (i.e. not 
derived from segments) improves things. A regular byte buffer should not 
have any liveness check, so the overhead might be somewhat lower, 
although it seems, as Paul says, that the benchmark is affected by 
non-optimized bounds checks.

Also - maybe it's a silly comment - but did you double check the 
endianness of the returned buffer? The memory segment API returns 
BIG_ENDIAN buffers (as is the case for buffers allocated with 
ByteBuffer.allocateDirect). Is it possible you are using mismatched 
endianness?
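
A quick way to check this, as a minimal sketch using only java.nio: the 
class and helper names below (BufferOrderCheck, hasNativeOrder, 
readSwapped) are made up for illustration and are not part of the 
benchmark. It only shows that a freshly allocated direct buffer reports 
BIG_ENDIAN regardless of the platform, and what a mismatched read looks 
like.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BufferOrderCheck {
    /** Hypothetical helper: true when the buffer's order matches the platform's native order. */
    static boolean hasNativeOrder(ByteBuffer buffer) {
        return buffer.order() == ByteOrder.nativeOrder();
    }

    /** Hypothetical helper: the int at index 0 as it would be read under the opposite byte order. */
    static int readSwapped(ByteBuffer buffer) {
        return Integer.reverseBytes(buffer.getInt(0));
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);

        // A new byte buffer is always BIG_ENDIAN, whatever the hardware uses.
        System.out.println("default order: " + direct.order());
        System.out.println("native order:  " + ByteOrder.nativeOrder());
        System.out.println("matches native: " + hasNativeOrder(direct));

        // Write with one order, read with the other: the bytes come back reversed.
        direct.putInt(0, 0x01020304);
        direct.order(ByteOrder.LITTLE_ENDIAN);
        System.out.println("mismatched read: 0x" + Integer.toHexString(direct.getInt(0)));

        // Explicitly selecting the native order avoids the mismatch.
        direct.order(ByteOrder.nativeOrder());
        System.out.println("matches native: " + hasNativeOrder(direct));
    }
}
```

If the buffer order and the order assumed by the vector load/store 
differ, the copy would silently include a byte swap, which could skew 
the comparison.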

Maurizio

On 17/06/2021 17:00, Paul Sandoz wrote:
> Hi Rado,
>
> Thanks, an interesting experiment.
>
> We would need to look at the generated code to spot issues. It is hard to compete with a specialized and highly optimized intrinsic copy. Still, it seems we should be able to do better.
>
> I suspect there might be some un-hoisted bounds checks, or non-optimal addressing of loads/stores.
>
> Something odd is going on with shared access.
>
> I doubt the use of shuffle can compete in general with the byte swapping in the intrinsic copy, but we are still ironing out performance issues with shuffle so maybe there is some room for improvement. (Further, we don’t do anything special for certain constant shuffle patterns from which we might be able to select more optimal instructions.)
>
> Paul.
>
>> On Jun 16, 2021, at 5:15 PM, Radosław Smogura <mail at smogura.eu> wrote:
>>
>> Hi all,
>>
>> I could not stop myself from trying this simple experiment of gluing together foreign with vectors, at least via byte buffers for now.
>>
>> The results are not the best, but they could still be interesting, as there was some interest in this.
>>
>> Please find the results and a link to the benchmark below:
>>
>> Benchmark                                 Mode  Cnt   Score   Error  Units
>> VectorCopySegments.copyWithNative         avgt   10  20.987 ± 1.819  ns/op
>> VectorCopySegments.copyWithNativeShared   avgt   10  12.528 ± 0.183  ns/op
>> VectorCopySegments.copyWithNativeToArray  avgt   10  19.800 ± 3.985  ns/op
>> VectorCopySegments.copyWithVector         avgt   10  31.151 ± 1.929  ns/op
>> VectorCopySegments.copyWithVectorShared   avgt   10  56.752 ± 1.754  ns/op
>> VectorCopySegments.copyWithVectorShuffle  avgt   10  52.409 ± 0.390  ns/op
>> VectorCopySegments.copyWithVectorToArray  avgt   10  29.573 ± 0.485  ns/op
>>
>> https://github.com/rsmogura/panama-foreign/blob/foreign_and_vectors/test/micro/org/openjdk/bench/jdk/incubator/foreign/VectorCopySegments.java
>>
>> Feedback is welcome.
>>
>> Kind regards,
>> Rado
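
For reference, the byte-swapping copy mentioned in the thread can be 
written in scalar form with plain byte buffers. This is only a sketch 
under assumptions: SwappedCopy and copySwappingInts are hypothetical 
names, 4-byte elements are assumed, and it is not the code used in the 
linked benchmark. It might serve as a correctness baseline when 
comparing against a shuffle-based vector version.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SwappedCopy {
    /**
     * Hypothetical helper: copies src into dst, reversing the bytes of each
     * 4-byte element. Trailing bytes that do not fill a whole int are not
     * copied. dst must have at least as much room as src has whole ints.
     */
    static void copySwappingInts(ByteBuffer src, ByteBuffer dst) {
        // Work on duplicates so the callers' position, limit and order stay untouched.
        ByteBuffer in = src.duplicate().order(ByteOrder.BIG_ENDIAN);
        ByteBuffer out = dst.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        // Reading big-endian and writing little-endian reverses each element's bytes.
        while (in.remaining() >= Integer.BYTES) {
            out.putInt(in.getInt());
        }
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.allocateDirect(8);
        ByteBuffer dst = ByteBuffer.allocateDirect(8);
        for (int i = 0; i < 8; i++) {
            src.put(i, (byte) i); // absolute put, position stays at 0
        }

        copySwappingInts(src, dst);

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            sb.append(dst.get(i)).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "3 2 1 0 7 6 5 4"
    }
}
```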

