Foreign + Vectors - benchmarks for copying and swapping
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Jun 17 16:12:22 UTC 2021
It would be interesting to see whether passing in regular byte buffers
(i.e. not derived from segments) improves things. A regular byte buffer
should not have any liveness check, so the overhead might be somewhat
lower, although it seems, as Paul says, that the benchmark is affected by
non-optimized bounds checks.
Also - maybe it's a silly comment - but did you double check the
endianness of the returned buffer? The memory segment API returns
BIG_ENDIAN buffers (as is the case for buffers allocated with
ByteBuffer.allocateDirect). Is it possible you are using mismatched
endianness?
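
To make the endianness question concrete, here is a minimal sketch using
only java.nio. It is not taken from the benchmark, and the class and method
names (EndiannessCheck, defaultOrder, writeThenRead) are made up for
illustration. It shows that a freshly allocated direct buffer reports
BIG_ENDIAN regardless of the platform, and what a value looks like when it
is written with one order and read back with the other:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
final class EndiannessCheck {

    // Byte order reported by a freshly allocated direct buffer.
    // The ByteBuffer specification fixes the initial order to BIG_ENDIAN.
    static ByteOrder defaultOrder() {
        return ByteBuffer.allocateDirect(Integer.BYTES).order();
    }

    // Writes an int using writeOrder, then reads the same four bytes
    // back using readOrder.
    static int writeThenRead(int value, ByteOrder writeOrder, ByteOrder readOrder) {
        ByteBuffer buf = ByteBuffer.allocateDirect(Integer.BYTES).order(writeOrder);
        buf.putInt(0, value);
        return buf.order(readOrder).getInt(0);
    }

    public static void main(String[] args) {
        System.out.println("default buffer order: " + defaultOrder());
        System.out.println("platform native order: " + ByteOrder.nativeOrder());

        int value = 0x01020304;
        int mismatched = writeThenRead(value, ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN);
        System.out.printf("0x%08x read with the other order: 0x%08x%n", value, mismatched);
    }
}
```

If the two orders differ, the value comes back byte-reversed (the same
result as Integer.reverseBytes), so a mismatch between the buffer's order
and the order the benchmark assumes would mean one side is paying for a
byte swap. Calling order(ByteOrder.nativeOrder()) on the buffer is the
usual way to rule that out.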
Maurizio
On 17/06/2021 17:00, Paul Sandoz wrote:
> Hi Rado,
>
> Thanks, an interesting experiment.
>
> We would need to look at the generated code to spot issues. It is hard to compete with a specialized and highly optimized intrinsic copy. Still, it seems we should be able to do better.
>
> I suspect there might be some un-hoisted bounds checks, or non-optimal addressing of loads/stores.
>
> Something odd is going on with shared access.
>
> I doubt the use of shuffle can compete in general with the byte swapping in the intrinsic copy, but we are still ironing out performance issues with shuffle, so maybe there is some room for improvement. (Further, we don't do anything special for certain constant shuffle patterns, for which we might be able to select more optimal instructions.)
>
> Paul.
>
>> On Jun 16, 2021, at 5:15 PM, Radosław Smogura <mail at smogura.eu> wrote:
>>
>> Hi all,
>>
>> I could not stop myself from trying this simple experiment of gluing foreign together with vectors, at least via byte buffers for now.
>>
>> The results are not the best, but they could still be interesting, as there was some interest in this.
>>
>> Below please find the results and the link to the benchmark:
>>
>> Benchmark                                  Mode  Cnt   Score   Error  Units
>> VectorCopySegments.copyWithNative          avgt   10  20.987 ± 1.819  ns/op
>> VectorCopySegments.copyWithNativeShared    avgt   10  12.528 ± 0.183  ns/op
>> VectorCopySegments.copyWithNativeToArray   avgt   10  19.800 ± 3.985  ns/op
>> VectorCopySegments.copyWithVector          avgt   10  31.151 ± 1.929  ns/op
>> VectorCopySegments.copyWithVectorShared    avgt   10  56.752 ± 1.754  ns/op
>> VectorCopySegments.copyWithVectorShuffle   avgt   10  52.409 ± 0.390  ns/op
>> VectorCopySegments.copyWithVectorToArray   avgt   10  29.573 ± 0.485  ns/op
>>
>> https://github.com/rsmogura/panama-foreign/blob/foreign_and_vectors/test/micro/org/openjdk/bench/jdk/incubator/foreign/VectorCopySegments.java
>>
>> Feedback is welcome.
>>
>> Kind regards,
>> Rado