RFR: 8299807: String.newStringUTF8NoRepl and getBytesUTF8NoRepl always copy arrays
Glavo
duke at openjdk.org
Wed Jan 11 13:12:14 UTC 2023
On Wed, 11 Jan 2023 09:56:58 GMT, Alan Bateman <alanb at openjdk.org> wrote:
>> `JavaLangAccess::newStringUTF8NoRepl` and `JavaLangAccess::getBytesUTF8NoRepl` are not implemented correctly. They always copy arrays, rather than avoiding copying as much as possible as javadoc says.
>>
>> I ran the tier1 test without any new errors.
>
> Would it be possible to provide some context on which public API you are testing with and the micro benchmark that you are using?
@AlanBateman
This PR mainly affects `Files.readString` and `java.util.zip.ZipCoder`.
I designed a micro benchmark for `Files.readString`: [NoRepl.java](https://gist.github.com/Glavo/0aa47d47f329ceabf7dd4c3b9d2848e2).
This benchmark measures the performance of `Files.readString`. To avoid interference from disk I/O, the test runs against an in-memory file system.
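For anyone who wants to exercise the same public API without a JMH setup, a minimal sketch could look like the following. It is not the linked benchmark: it uses a temporary file on the default file system instead of a memory file system, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadStringSketch {
    // Hypothetical helper: writes the text as UTF-8 to a temporary file
    // and reads it back through Files.readString.
    static String roundTrip(String text) throws IOException {
        Path file = Files.createTempFile("norepl-sketch", ".txt");
        try {
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            // Files.readString(Path) decodes the file content as UTF-8.
            return Files.readString(file);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String ascii = "a".repeat(1024);         // pure ASCII content
        String nonAscii = "\u4e2d".repeat(1024); // multi-byte UTF-8 content
        System.out.println(roundTrip(ascii).equals(ascii));
        System.out.println(roundTrip(nonAscii).equals(nonAscii));
    }
}
```

This only checks that both kinds of content round-trip correctly; it says nothing about timing, which needs a proper harness such as JMH.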
This is the baseline:
Benchmark             (length)  Mode  Cnt         Score        Error  Units
NoRepl.testReadAscii         0  avgt    5       192.584 ±      1.670  ns/op
NoRepl.testReadAscii      1024  avgt    5       296.760 ±      2.599  ns/op
NoRepl.testReadAscii      8192  avgt    5       427.220 ±      0.809  ns/op
NoRepl.testReadAscii   1048576  avgt    5     29082.579 ±     34.780  ns/op
NoRepl.testReadAscii  33554432  avgt    5   1168901.308 ± 240228.024  ns/op
NoRepl.testReadUTF8          0  avgt    5       206.196 ±      2.296  ns/op
NoRepl.testReadUTF8       1024  avgt    5      1290.403 ±      3.920  ns/op
NoRepl.testReadUTF8       8192  avgt    5      9371.318 ±     55.165  ns/op
NoRepl.testReadUTF8    1048576  avgt    5   1203194.297 ±   5787.171  ns/op
NoRepl.testReadUTF8   33554432  avgt    5  44567374.591 ± 170568.947  ns/op
This is the result based on this PR:
Benchmark             (length)  Mode  Cnt         Score        Error  Units
NoRepl.testReadAscii         0  avgt    5       210.050 ±     22.174  ns/op
NoRepl.testReadAscii      1024  avgt    5       285.811 ±      4.448  ns/op
NoRepl.testReadAscii      8192  avgt    5       350.318 ±      0.504  ns/op
NoRepl.testReadAscii   1048576  avgt    5     19565.571 ±     33.153  ns/op
NoRepl.testReadAscii  33554432  avgt    5    857566.083 ±  18352.548  ns/op
NoRepl.testReadUTF8          0  avgt    5       196.632 ±      0.633  ns/op
NoRepl.testReadUTF8       1024  avgt    5      1295.354 ±      4.450  ns/op
NoRepl.testReadUTF8       8192  avgt    5      9381.675 ±    127.045  ns/op
NoRepl.testReadUTF8    1048576  avgt    5   1200648.741 ±   4259.763  ns/op
NoRepl.testReadUTF8   33554432  avgt    5  44481499.656 ± 284353.880  ns/op
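For non-empty files containing non-ASCII characters, this PR makes no meaningful difference: the scores move by less than 0.5% in either direction (slightly slower at 1024 and 8192 bytes, slightly faster at the two largest sizes), which is close to the measurement error.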
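For large ASCII files (1 MiB and 32 MiB), the average time per operation drops by about 33% and 27% respectively, which corresponds to roughly 49% and 36% higher throughput.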
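The relative change for the large ASCII cases can be recomputed from the `testReadAscii` rows of the two tables above. The sketch below is only arithmetic on the reported scores; the class and method names are hypothetical.

```java
public class SpeedupSketch {
    // Percentage by which the average time per operation went down.
    static double timeReductionPercent(double baselineNs, double patchedNs) {
        return (baselineNs - patchedNs) / baselineNs * 100.0;
    }

    // Percentage by which throughput (operations per unit time) went up.
    static double throughputGainPercent(double baselineNs, double patchedNs) {
        return (baselineNs / patchedNs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ns/op, copied from the 1048576 and 33554432 ASCII rows.
        double[][] rows = {
            {29082.579, 19565.571},
            {1168901.308, 857566.083},
        };
        for (double[] row : rows) {
            System.out.printf("time -%.1f%%, throughput +%.1f%%%n",
                    timeReductionPercent(row[0], row[1]),
                    throughputGainPercent(row[0], row[1]));
        }
    }
}
```

Whether the improvement reads as roughly 30% or roughly 50% depends on which of the two measures is used, so it helps to state the measure alongside the number.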
An improvement this large on a memory file system will not carry over in full to a disk-based file system, but this PR still eliminates one array copy and the associated temporary allocation for ASCII files.
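The reason ASCII input is the case that benefits can be illustrated with public APIs: for pure-ASCII text the UTF-8 bytes are byte-for-byte identical to the Latin-1 encoding, so no transcoding is needed and the array can in principle be reused as-is. The following is a sketch of that property only; `isAscii` and `sameInLatin1` are hypothetical helpers, not the JDK-internal methods touched by this PR, and the public `String` constructors still copy their input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiCheckSketch {
    // Hypothetical stand-in for an "all bytes are ASCII" check:
    // any byte with the high bit set is negative as a Java byte.
    static boolean isAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // True when the UTF-8 and Latin-1 encodings of the string are identical.
    static boolean sameInLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"hello", "h\u00e9llo", "\u4e2d"}) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            System.out.println(isAscii(utf8) + " " + sameInLatin1(s));
        }
    }
}
```

For non-ASCII content the two encodings differ, so decoding into a new array is unavoidable, which is consistent with the unchanged `testReadUTF8` scores.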
-------------
PR: https://git.openjdk.org/jdk/pull/11897