performance: arrayElementVarHandle / calculated index / aligned vs unaligned
Matthias Ernst
matthias at mernst.org
Wed Dec 18 08:26:42 UTC 2024
Hi,
I'm trying to use the foreign memory api to interpret some variable-length
encoded data, where an offset vector encodes the start offset of each
stride. Accessing element `i` in this case involves reading `offset[i+1]`
in addition to `offset[i]`. The offset vector is modeled as a
`JAVA_LONG.arrayElementVarHandle()`.
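
A minimal sketch of this access pattern, for illustration only. The class
and method names (`OffsetVector`, `strideLength`, `demo`) are made up, and
the code assumes the Java 22+ API, where the handle's coordinates are
(segment, base offset in bytes, index):

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

public class OffsetVector {
    // Coordinates (Java 22+): (MemorySegment segment, long baseOffset, long index).
    static final VarHandle OFFSET = ValueLayout.JAVA_LONG.arrayElementVarHandle();

    // Length of stride i, computed as offset[i+1] - offset[i].
    static long strideLength(MemorySegment offsets, long i) {
        long start = (long) OFFSET.get(offsets, 0L, i);
        long end = (long) OFFSET.get(offsets, 0L, i + 1);
        return end - start;
    }

    // Hypothetical example data: four offsets describing three strides.
    static long demo() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment offsets =
                    arena.allocateFrom(ValueLayout.JAVA_LONG, 0L, 3L, 7L, 12L);
            return strideLength(offsets, 1); // 7 - 3
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```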
Just out of curiosity about bounds and alignment checks, I switched the
layout to JAVA_LONG_UNALIGNED for reading (the data is still aligned) and
saw a large difference in performance where I didn't expect one. It seems
to boil down to the computed-index access `offset[i+1]`; the plain `[i]`
case is not affected. My expectation would have been that all variants
exhibit the same performance, since the alignment checks would be moved
out of the loop.
A micro-benchmark (attached) to demonstrate. It uses a long-aligned memory
segment and loops over the same elements in 6 different ways:
{aligned, unaligned} x {segment[i], segment[i+1], segment[i+1] (w/ base
offset)}. This gives very different results for aligned[i+1] (but not for
aligned[i]):
Benchmark                        Mode  Cnt    Score  Error  Units
Alignment.findAligned           thrpt       217.050         ops/s
Alignment.findAlignedPlusOne    thrpt       110.366         ops/s  <= #####
Alignment.findAlignedNext       thrpt       110.377         ops/s  <= #####
Alignment.findUnaligned         thrpt       216.591         ops/s
Alignment.findUnalignedPlusOne  thrpt       215.843         ops/s
Alignment.findUnalignedNext     thrpt       216.483         ops/s
openjdk version "23.0.1" 2024-10-15
OpenJDK Runtime Environment (build 23.0.1+11-39)
OpenJDK 64-Bit Server VM (build 23.0.1+11-39, mixed mode, sharing)
MacBook Air M3
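
A rough sketch of the three access shapes, for readers without the
attachment. This is not the attached Alignment.java: the class and method
names (`AccessVariants`, `atIndex`, `plusOne`, `next`, `demo`) are made up
for illustration, and it assumes the Java 22+ coordinates (segment, base
offset in bytes, index). All three shapes read the same element; they only
differ in how the address is expressed:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

public class AccessVariants {
    static final VarHandle ALIGNED =
            ValueLayout.JAVA_LONG.arrayElementVarHandle();
    static final VarHandle UNALIGNED =
            ValueLayout.JAVA_LONG_UNALIGNED.arrayElementVarHandle();
    // Byte offset of the following element, used as a base offset.
    static final long NEXT = ValueLayout.JAVA_LONG.byteSize();

    // segment[i]
    static long atIndex(MemorySegment s, long i) {
        return (long) ALIGNED.get(s, 0L, i);
    }

    // segment[i+1], with the +1 folded into the index
    static long plusOne(MemorySegment s, long i) {
        return (long) ALIGNED.get(s, 0L, i + 1);
    }

    // segment[i+1], with the +1 expressed as a base offset of 8 bytes
    static long next(MemorySegment s, long i) {
        return (long) ALIGNED.get(s, NEXT, i);
    }

    // The unaligned variants only swap the handle, e.g. segment[i+1]:
    static long plusOneUnaligned(MemorySegment s, long i) {
        return (long) UNALIGNED.get(s, 0L, i + 1);
    }

    // Checks that every shape reads the same element on example data.
    static boolean demo() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment s =
                    arena.allocateFrom(ValueLayout.JAVA_LONG, 10L, 20L, 30L, 40L);
            long expected = atIndex(s, 2);
            return plusOne(s, 1) == expected
                    && next(s, 1) == expected
                    && plusOneUnaligned(s, 1) == expected;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```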
Needless to say, the difference was smaller with more app code in play,
but large enough to give me pause. Likely it wouldn't matter at all, but I
want to have a better idea of which design choices to pay attention to.
With the foreign memory API, I find it a bit difficult to distinguish
convenience options from performance-relevant ones (e.g. using path
expressions vs. computed offsets vs. using a base offset). Besides "make
layouts and varhandles static final", what would be other rules of thumb?
Thx
Matthias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Alignment.java
Type: application/octet-stream
Size: 3569 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/panama-dev/attachments/20241218/139228d2/Alignment-0001.java>