RFR: 8292698: Improve performance of DataInputStream
Alan Bateman
alanb at openjdk.org
Sun Aug 21 07:24:30 UTC 2022
On Sun, 21 Aug 2022 06:29:43 GMT, Сергей Цыпанов <duke at openjdk.org> wrote:
> I found out that reading from a `DataInputStream` wrapping a `ByteArrayInputStream` (as well as a `BufferedInputStream` or any `InputStream` backed by a `byte[]`) can be significantly improved by accessing the volatile `in` field only once per operation.
>
> The current implementation reads the field for each call of `in.read()`, i.e. in the `readInt()` method we do it 4 times:
>
> public final int readInt() throws IOException {
>     int ch1 = in.read();
>     int ch2 = in.read();
>     int ch3 = in.read();
>     int ch4 = in.read();
>     if ((ch1 | ch2 | ch3 | ch4) < 0)
>         throw new EOFException();
>     return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
> }
>
> Apparently, accessing a volatile reference to a stream with an underlying `byte[]` prevents the runtime from doing some optimizations, so dereferencing a local variable should be more efficient.
>
> Benchmarking:
>
> baseline:
>
> Benchmark                     Mode  Cnt   Score   Error  Units
> DataInputStreamTest.readChar  avgt   20  22,889 ± 0,648  us/op
> DataInputStreamTest.readInt   avgt   20  21,804 ± 0,197  us/op
>
> patch:
>
> Benchmark                     Mode  Cnt   Score   Error  Units
> DataInputStreamTest.readChar  avgt   20  11,018 ± 0,089  us/op
> DataInputStreamTest.readInt   avgt   20   5,608 ± 0,087  us/op

'in' is a protected field, so this requires thinking about subclasses that might change 'in', say when the input stream is asynchronously closed. BufferedInputStream is an example of a FilterInputStream that sets 'in' to null when asynchronously closed. There are other examples that change the underlying input stream to one that returns EOF when closed. So it might be that changing readChar/readInt/readLong is okay, but it would have a bigger effect on readFully, skip and the other methods. So I think this is a case of proceeding with caution, as there may be more going on.
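
To make the concern concrete, here is a minimal sketch of how caching 'in' in a local can change what a caller observes. Everything in it is hypothetical and for illustration only: `CachedInDemo`, `SwappingStream` and the two `readIntVia...` methods are not JDK code, and the `Runnable` merely stands in for a `close()` arriving from another thread between two byte reads. It assumes Java 11 or later for `InputStream.nullInputStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CachedInDemo {

    /** Hypothetical stream that replaces 'in' with an always-EOF stream when closed. */
    static class SwappingStream extends FilterInputStream {
        SwappingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            in = InputStream.nullInputStream(); // read() returns -1 from now on
        }

        /** Re-reads the 'in' field for every byte, like the current readInt(). */
        int readIntViaField(Runnable betweenBytes) throws IOException {
            int ch1 = in.read();
            betweenBytes.run(); // stands in for an asynchronous close()
            int ch2 = in.read();
            int ch3 = in.read();
            int ch4 = in.read();
            return combine(ch1, ch2, ch3, ch4);
        }

        /** Reads 'in' once into a local, like the proposed patch. */
        int readIntViaLocal(Runnable betweenBytes) throws IOException {
            InputStream s = in;
            int ch1 = s.read();
            betweenBytes.run(); // the swap of 'in' is not seen by 's'
            int ch2 = s.read();
            int ch3 = s.read();
            int ch4 = s.read();
            return combine(ch1, ch2, ch3, ch4);
        }
    }

    static int combine(int ch1, int ch2, int ch3, int ch4) throws EOFException {
        if ((ch1 | ch2 | ch3 | ch4) < 0)
            throw new EOFException();
        return (ch1 << 24) + (ch2 << 16) + (ch3 << 8) + ch4;
    }

    /** Runs one read with a close() injected after the first byte. */
    static String run(boolean useLocal) {
        SwappingStream s =
            new SwappingStream(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}));
        try {
            int v = useLocal ? s.readIntViaLocal(s::close)
                             : s.readIntViaField(s::close);
            return Integer.toHexString(v);
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("field per byte: " + run(false)); // EOFException
        System.out.println("cached local:   " + run(true));  // 1020304
    }
}
```

With the field re-read per byte, the close is observed mid-operation and the read fails with EOFException; with the cached local, the read completes against the old stream. For a four-byte read that window is tiny, but a loop such as readFully or skip would keep using the stale stream for much longer.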
-------------
PR: https://git.openjdk.org/jdk/pull/9956