[foreign-memaccess+abi] RFR: Add benchmarks to MemorySegmentVsBits [v2]
Maurizio Cimadamore
mcimadamore at openjdk.org
Thu Jan 5 21:51:09 UTC 2023
On Wed, 4 Jan 2023 08:00:15 GMT, Per Minborg <pminborg at openjdk.org> wrote:
>> This PR proposes the addition of some benchmarks, for example using a LongBuffer and a VarHandle.
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
>
> Change to big endian for some variants
Bulk copy seems the best. It is likely that using a bulk "put" on ByteBuffer is also way better than using separate stores.
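
To make the "bulk put vs. separate stores" distinction concrete, here is a minimal sketch. The class and method names are made up for illustration and are not taken from the benchmark in the PR; it only shows that the two approaches produce the same bytes, and makes no performance claim by itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration, not the benchmark code from the PR.
public class BulkPutSketch {

    // Separate stores: one absolute putLong call per element.
    static ByteBuffer copyWithSeparateStores(long[] src) {
        ByteBuffer dst = ByteBuffer.allocate(src.length * Long.BYTES)
                                   .order(ByteOrder.BIG_ENDIAN);
        for (int i = 0; i < src.length; i++) {
            dst.putLong(i * Long.BYTES, src[i]);
        }
        return dst;
    }

    // Bulk put: a single put(long[]) through a LongBuffer view of the same bytes.
    static ByteBuffer copyWithBulkPut(long[] src) {
        ByteBuffer dst = ByteBuffer.allocate(src.length * Long.BYTES)
                                   .order(ByteOrder.BIG_ENDIAN);
        dst.asLongBuffer().put(src);
        return dst;
    }

    public static void main(String[] args) {
        long[] src = new long[256]; // 256 longs * 8 bytes = 2048 bytes
        for (int i = 0; i < src.length; i++) {
            src[i] = i * 31L;
        }
        ByteBuffer a = copyWithSeparateStores(src);
        ByteBuffer b = copyWithBulkPut(src);
        System.out.println("same contents: " + a.equals(b));
    }
}
```

Measuring which variant is faster would need a JMH harness like the one in the PR; this sketch only pins down what the two code shapes look like.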
I arrived at bulk copy because I noticed that the per-element code was not being vectorized, whereas the code C2 emits for a bulk copy has vector instructions in it, so it should perform optimally (which it does). Since ByteBuffer and MemorySegment use the same underlying primitive for bulk copy (Unsafe::copyMemory), I'd expect the two to behave the same.
That said, I don't think there's any reason for the non-bulk versions to be slower, and I hope that it turns out to be some "near miss" in the autovectorization code (we're looking into it).
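
For reference, the two code shapes being compared on the MemorySegment side look roughly like the sketch below. It assumes a JDK where the FFM API is final (JDK 22 or later); the class and method names are hypothetical and this is not the benchmark code from the PR.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration; assumes JDK 22+ (final FFM API).
public class SegmentCopySketch {

    // Non-bulk version: one store per long, which relies on C2 autovectorization.
    static void copyElementwise(long[] src, MemorySegment dst) {
        for (int i = 0; i < src.length; i++) {
            dst.setAtIndex(ValueLayout.JAVA_LONG, i, src[i]);
        }
    }

    // Bulk version: a single MemorySegment.copy call for the whole array.
    static void copyBulk(long[] src, MemorySegment dst) {
        MemorySegment.copy(src, 0, dst, ValueLayout.JAVA_LONG, 0L, src.length);
    }

    // Fills two native segments, one per strategy, and checks they hold the same bytes.
    static boolean bulkMatchesElementwise(int elementCount) {
        long[] src = new long[elementCount];
        for (int i = 0; i < src.length; i++) {
            src[i] = i * 31L;
        }
        try (Arena arena = Arena.ofConfined()) {
            long byteSize = (long) elementCount * Long.BYTES;
            MemorySegment a = arena.allocate(byteSize, Long.BYTES);
            MemorySegment b = arena.allocate(byteSize, Long.BYTES);
            copyElementwise(src, a);
            copyBulk(src, b);
            return a.mismatch(b) == -1; // -1 means no differing byte was found
        }
    }

    public static void main(String[] args) {
        System.out.println("same contents: " + bulkMatchesElementwise(256));
    }
}
```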
> According to this data, if your application primarily works with reading/writing/copying of buffers & binary data
>
> Then you're better off (perf-wise) using Panama FFM `MemorySegment` with `MemorySegment#copy` rather than ByteBuffers or byte-arrays?
>
> I'd be curious whether you lose any of this benefit if you have to adapt the MemorySegment with `.asByteBuffer()` for calling existing APIs that require ByteBuffers too, like for example `java.nio.AsynchronousSocketServer` callbacks and the like.
Turning a segment into a BB is a rather cheap O(1) operation, so I wouldn't expect that to result in performance degradation.
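
A small sketch of what that adaptation looks like, again assuming JDK 22 or later and using made-up names: the buffer returned by `asByteBuffer()` is a view over the segment's memory rather than a copy, so writes on either side are visible on the other.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; assumes JDK 22+ (final FFM API).
public class SegmentAsBufferSketch {

    // Returns { value read through the buffer view, value read back through the segment }.
    static long[] roundTrip(long viaSegment, long viaBuffer) {
        try (Arena arena = Arena.ofConfined()) {
            // 256 longs * 8 bytes = 2048 bytes
            MemorySegment segment = arena.allocate(256L * Long.BYTES, Long.BYTES);

            // The view starts out big endian; switch it to native order
            // so that it agrees with ValueLayout.JAVA_LONG.
            ByteBuffer view = segment.asByteBuffer().order(ByteOrder.nativeOrder());

            segment.set(ValueLayout.JAVA_LONG, 0L, viaSegment);
            long seenByBuffer = view.getLong(0);

            view.putLong(8, viaBuffer);
            long seenBySegment = segment.get(ValueLayout.JAVA_LONG, 8L);

            return new long[] { seenByBuffer, seenBySegment };
        }
    }

    public static void main(String[] args) {
        long[] r = roundTrip(42L, 7L);
        System.out.println(r[0] + " " + r[1]);
    }
}
```

Note that the buffer view is only usable while the arena backing the segment is still open.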
>
> (The size here is in `long-bytes`, right? So it's x8, with `256` being a 2 KB buffer?)
Yes.
-------------
PR: https://git.openjdk.org/panama-foreign/pull/762