RFR: 8310843: Reimplement ByteArray and ByteArrayLittleEndian with Unsafe [v10]

Thu Jul 20 17:01:45 UTC 2023

On Thu, 20 Jul 2023 16:47:14 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> On a newer processor/OS (Alder Lake/Ubuntu 22.04) I see a bit more difference:
>> 
>> 
>> Benchmark                      Mode  Cnt       Score       Error   Units
>> ByteArray.readByte            thrpt    5  680397.046 ± 27504.022  ops/ms
>> ByteArray.readByteFromBuffer  thrpt    5  576449.569 ±  3633.135  ops/ms
>> ByteArray.readInt             thrpt    5  685419.089 ±  6740.268  ops/ms
>> ByteArray.readIntFromBuffer   thrpt    5  542887.418 ±  2863.907  ops/ms
>> ByteArray.readLong            thrpt    5  687949.037 ±  3510.613  ops/ms
>> ByteArray.readLongFromBuffer  thrpt    5  548120.950 ±  5461.145  ops/ms
>> 
>> 
>> But still far from 2x.
>
> For full disclosure, I'm running the benchmark using the JDK microbenchmark support. My JMH version is a bit behind. I've updated it to 1.35 (which is the latest I see being used around here) and getting similar results.

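By the way, I ran `LoopOverNonConstantHeap` on the 3700x platform, and ByteBuffer was also slow there for single accesses: `BB_get` takes about 1.7 to 1.9 ns/op versus roughly 0.63 ns/op for `unsafe_get`, while the loop variants all land at about 0.25 ms/op: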

Benchmark                                                (polluteProfile)  Mode  Cnt  Score    Error  Units
LoopOverNonConstantHeap.BB_get                                      false  avgt   30  1.855 ±  0.092  ns/op
LoopOverNonConstantHeap.BB_get                                       true  avgt   30  1.655 ±  0.010  ns/op
LoopOverNonConstantHeap.BB_loop                                     false  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.BB_loop                                      true  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_get                                 false  avgt   30  2.333 ±  0.043  ns/op
LoopOverNonConstantHeap.segment_get                                  true  avgt   30  2.362 ±  0.006  ns/op
LoopOverNonConstantHeap.segment_loop                                false  avgt   30  0.251 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop                                 true  avgt   30  0.251 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_instance                       false  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_instance                        true  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_instance_unaligned             false  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_instance_unaligned              true  avgt   30  0.254 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_readonly                       false  avgt   30  0.252 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_readonly                        true  avgt   30  0.251 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_slice                          false  avgt   30  0.251 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_slice                           true  avgt   30  0.252 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_unaligned                      false  avgt   30  0.251 ±  0.001  ms/op
LoopOverNonConstantHeap.segment_loop_unaligned                       true  avgt   30  0.252 ±  0.001  ms/op
LoopOverNonConstantHeap.unsafe_get                                  false  avgt   30  0.628 ±  0.004  ns/op
LoopOverNonConstantHeap.unsafe_get                                   true  avgt   30  0.628 ±  0.003  ns/op
LoopOverNonConstantHeap.unsafe_loop                                 false  avgt   30  0.255 ±  0.001  ms/op
LoopOverNonConstantHeap.unsafe_loop                                  true  avgt   30  0.256 ±  0.001  ms/op
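
For anyone following along without the benchmark sources open, here is a minimal sketch of the kind of single-element heap read being compared. It is not the code from `LoopOverNonConstantHeap` or from the `ByteArray` benchmark; the class and method names (`HeapReadSketch`, `readIntViaBuffer`, `readIntViaVarHandle`) are made up for illustration, and the `Unsafe` and `MemorySegment` variants are left out to keep it portable across JDK versions.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only: two ways to read a big-endian int from a byte[].
public class HeapReadSketch {

    // Views a byte[] as big-endian ints; the coordinate passed to get() is a byte offset.
    private static final VarHandle INT_BE =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    // Wraps the array in a heap ByteBuffer on every call, then does an absolute read.
    public static int readIntViaBuffer(byte[] array, int offset) {
        return ByteBuffer.wrap(array).order(ByteOrder.BIG_ENDIAN).getInt(offset);
    }

    // Reads through a shared VarHandle; plain get() tolerates unaligned offsets.
    public static int readIntViaVarHandle(byte[] array, int offset) {
        return (int) INT_BE.get(array, offset);
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
        System.out.println(Integer.toHexString(readIntViaBuffer(data, 0)));
        System.out.println(Integer.toHexString(readIntViaVarHandle(data, 2)));
    }
}
```

Both methods return the same value for the same offset; any difference would be in how well the JIT removes the wrapper allocation and the bounds and type checks, which is what the `*_get` numbers above are sensitive to.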

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14636#discussion_r1269733363