RFR: 8310843: Reimplement ByteArray and ByteArrayLittleEndian with Unsafe [v10]
Glavo
duke at openjdk.org
Thu Jul 20 15:19:45 UTC 2023
On Thu, 20 Jul 2023 14:53:36 GMT, Glavo <duke at openjdk.org> wrote:
>> @mcimadamore I compared the performance of `ByteBuffer` and `VarHandle` using a JMH benchmark:
>>
>>
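>> import java.lang.invoke.MethodHandles;
>> import java.lang.invoke.VarHandle;
>> import java.nio.ByteBuffer;
>> import java.util.Random;
>>
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.Scope;
>> import org.openjdk.jmh.annotations.Setup;
>> import org.openjdk.jmh.annotations.State;
>>
>> import static java.nio.ByteOrder.LITTLE_ENDIAN;
>>
>> @State(Scope.Benchmark)
>> public class ByteArray {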
>>
>> private byte[] array;
>> private ByteBuffer byteBuffer;
>>
>> private static final VarHandle INT = MethodHandles.byteArrayViewVarHandle(int[].class, LITTLE_ENDIAN);
>> private static final VarHandle LONG = MethodHandles.byteArrayViewVarHandle(long[].class, LITTLE_ENDIAN);
>>
>> @Setup
>> public void setup() {
>> array = new byte[8];
>> byteBuffer = ByteBuffer.wrap(array).order(LITTLE_ENDIAN);
>>
>> new Random(0).nextBytes(array);
>> }
>>
>> @Benchmark
>> public byte readByte() {
>> return array[0];
>> }
>>
>> @Benchmark
>> public byte readByteFromBuffer() {
>> return byteBuffer.get(0);
>> }
>>
>> @Benchmark
>> public int readInt() {
>> return (int) INT.get(array, 0);
>> }
>>
>> @Benchmark
>> public int readIntFromBuffer() {
>> return byteBuffer.getInt(0);
>> }
>>
>>
>> @Benchmark
>> public long readLong() {
>> return (long) LONG.get(array, 0);
>> }
>>
>> @Benchmark
>> public long readLongFromBuffer() {
>> return byteBuffer.getLong(0);
>> }
>> }
>>
>>
>> Result:
>>
>> Benchmark                      Mode  Cnt        Score       Error   Units
>> ByteArray.readByte            thrpt    5  1270230.180 ± 29172.551  ops/ms
>> ByteArray.readByteFromBuffer  thrpt    5   623862.080 ± 12167.410  ops/ms
>> ByteArray.readInt             thrpt    5  1252719.463 ± 77598.672  ops/ms
>> ByteArray.readIntFromBuffer   thrpt    5   571070.474 ±  1500.426  ops/ms
>> ByteArray.readLong            thrpt    5  1262720.686 ±   728.100  ops/ms
>> ByteArray.readLongFromBuffer  thrpt    5   571594.800 ±  3376.735  ops/ms
>>
>>
>> In these results, `ByteBuffer` reaches roughly half the throughput of `VarHandle`. Am I doing something wrong? What conditions are needed to bring the performance of `ByteBuffer` close to that of `Unsafe`?
>
> I tried a few more variants. It looks like the JIT can optimize the `ByteBuffer` away quite well as long as it is kept only in a local variable and does not escape.
It seems that as long as the `ByteBuffer` is stored in a field (even if it is `static final`), the JIT compiler cannot completely eliminate the overhead of the `ByteBuffer`.
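For reference, the three access patterns compared in this thread can be put side by side in plain Java. This is only a minimal sketch, not the benchmark itself: the class name `LittleEndianReads` and its method names are made up for illustration, and it only shows that the patterns read the same value, not how fast each one is.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; the names are not from the PR.
class LittleEndianReads {
    private static final VarHandle INT =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    private final byte[] array;
    // Buffer stored in a field: the pattern that stayed slow in the benchmark above.
    private final ByteBuffer fieldBuffer;

    LittleEndianReads(byte[] array) {
        this.array = array;
        this.fieldBuffer = ByteBuffer.wrap(array).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Direct view of the array through a VarHandle.
    int readViaVarHandle() {
        return (int) INT.get(array, 0);
    }

    // Read through the buffer held in a field.
    int readViaFieldBuffer() {
        return fieldBuffer.getInt(0);
    }

    // Buffer exists only as a non-escaping temporary, so the JIT has a chance
    // to remove the allocation once the method is compiled.
    int readViaLocalBuffer() {
        return ByteBuffer.wrap(array).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
    }

    public static void main(String[] args) {
        byte[] data = {0x78, 0x56, 0x34, 0x12, 0, 0, 0, 0};
        LittleEndianReads reads = new LittleEndianReads(data);
        System.out.println(Integer.toHexString(reads.readViaVarHandle()));   // 12345678
        System.out.println(Integer.toHexString(reads.readViaFieldBuffer())); // 12345678
        System.out.println(Integer.toHexString(reads.readViaLocalBuffer())); // 12345678
    }
}
```

Whether the local-buffer variant actually matches the `VarHandle` variant in throughput depends on the JIT compiling and inlining the method, so it still has to be measured with JMH.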
@mcimadamore I think your suggested change to `DataInputStream` is dubious: it is likely to introduce non-trivial additional overhead. The correct change may look like this:
 public final double readDouble() throws IOException {
     readFully(readBuffer, 0, 8);
-    return ByteArray.getDouble(readBuffer, 0);
+    return ByteBuffer.wrap(readBuffer).getDouble(0);
 }
However, this change can also increase warmup time and allocate many small objects before C2 compiles the method.
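As a sanity check on the equivalence of the two forms, here is a minimal sketch. It assumes big-endian decoding, which is the default order of `ByteBuffer` and the order `DataInputStream` uses. Since `ByteArray` is an internal class, the hypothetical `viaShifts` method below stands in for it with a manual big-endian decode; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of the PR.
class ReadDoubleSketch {
    // Wrap-per-call form: the buffer is a temporary that never escapes.
    static double viaWrap(byte[] buf) {
        return ByteBuffer.wrap(buf).getDouble(0);
    }

    // Stand-in for an internal big-endian helper: assemble the eight bytes
    // most-significant first, then reinterpret the bits as a double.
    static double viaShifts(byte[] buf) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (buf[i] & 0xFFL);
        }
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        // Encode a known value in big-endian order (the ByteBuffer default).
        byte[] buf = ByteBuffer.allocate(8).putDouble(Math.PI).array();
        System.out.println(viaWrap(buf) == Math.PI);   // true
        System.out.println(viaShifts(buf) == Math.PI); // true
    }
}
```

Both methods decode the same value. The difference is that `viaWrap` allocates a `ByteBuffer` on every call until the JIT can eliminate it, which is the warmup and allocation cost mentioned above.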
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14636#discussion_r1269617039