RFR: 8301958: Avoid Arrays.copyOfRange overhead in java.lang.String [v5]
Francesco Nigro
duke at openjdk.org
Tue Feb 7 21:07:05 UTC 2023
On Tue, 7 Feb 2023 20:32:11 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> src/java.base/share/classes/java/lang/String.java line 698:
>>
>>> 696: }
>>> 697:
>>> 698: static byte[] copyBytes(byte[] bytes, int offset, int length) {
>>
>> Given that the stub generated for array copy seems highly dependent on the call site constraints, did you try adding a check for offset == 0 and/or length == bytes.length?
>>
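>> byte[] dst = new byte[length];
>> if (offset == 0 && bytes.length == length) {
>>     // whole-array copy: both positions and the length are fixed by the branch
>>     System.arraycopy(bytes, 0, dst, 0, bytes.length);
>> }
>> // etc etc the other combinations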
>>
>> This should produce different generated stubs with much smaller ASM depending on the enforced constraints (and shouldn't terribly affect the code size of the method, given that the stub won't be inlined AFAIK)
>>
>> Beware, as noted by others, I'm not suggesting that's the way to fix this, but it would be interesting to check how much perf we leave on the table due to this supposedly "inefficient" stub generation (if that's the issue).
>
> I did some quick experiments but saw no clear win from doing anything like this here. Feel free to experiment and see if there's some particular configuration that comes out ahead.
>
> FTR I did not intend for this RFE to solve https://bugs.openjdk.org/browse/JDK-8295496 completely, but provide a small, partial win that might possibly clear a path to solving that likely orthogonal issue.
I've created a separate benchmark for this (it has the same name as yours by accident, since I used yours as a blueprint):
https://gist.github.com/franz1981/658c2bf6796aab4ae04a84bef1ef34b6
The results are:
Benchmark                             (offset)  (size)  Mode  Cnt   Score   Error  Units
StringConstructor.arrayCopy                  0       7  avgt   10   9.519 ± 0.131  ns/op
StringConstructor.arrayCopy                  1       7  avgt   10   9.194 ± 0.232  ns/op
StringConstructor.copyOf                     0       7  avgt   10  11.548 ± 0.133  ns/op
StringConstructor.copyOf                     1       7  avgt   10   9.812 ± 0.018  ns/op
StringConstructor.optimizedArrayCopy         0       7  avgt   10   6.854 ± 0.355  ns/op  <---- THAT'S COOL
StringConstructor.optimizedArrayCopy         1       7  avgt   10   9.088 ± 0.049  ns/op
The optimized array copy is helping C2 with stub generation.
I haven't yet checked whether this applies to the `String` case, and I haven't created a dataset array long enough to check the effect of the newly introduced conditions on the branch predictor, but in terms of the generated stub there is a difference.
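
For readers who don't want to open the gist, here is a minimal sketch of the kind of call-site specialization discussed above. It is not the benchmark code itself: the class name `CopyBytesSketch` and the method names `plainCopy` and `specializedCopy` are made up for illustration, and the exact set of branches is an assumption. The idea is only that each branch hands `System.arraycopy` more constant arguments, which may let C2 pick a simpler copy stub.

```java
import java.util.Arrays;

// Hypothetical illustration; names and branch layout are assumptions,
// not the code from the gist or from java.lang.String.
class CopyBytesSketch {

    // Baseline: a single call site that has to handle any offset and length.
    static byte[] plainCopy(byte[] bytes, int offset, int length) {
        byte[] dst = new byte[length];
        System.arraycopy(bytes, offset, dst, 0, length);
        return dst;
    }

    // Variant: separate call sites for the common argument shapes, so each
    // call to System.arraycopy sees constant positions where possible.
    static byte[] specializedCopy(byte[] bytes, int offset, int length) {
        byte[] dst = new byte[length];
        if (offset == 0 && length == bytes.length) {
            // Whole-array copy: source position, destination position and
            // length are all tied to the source array.
            System.arraycopy(bytes, 0, dst, 0, bytes.length);
        } else if (offset == 0) {
            // Prefix copy: both positions are the constant 0.
            System.arraycopy(bytes, 0, dst, 0, length);
        } else {
            // General case: same as the baseline.
            System.arraycopy(bytes, offset, dst, 0, length);
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5, 6, 7, 8};
        // Both variants must return the same bytes; only the generated code differs.
        System.out.println(Arrays.equals(plainCopy(src, 0, 8), specializedCopy(src, 0, 8)));
        System.out.println(Arrays.equals(plainCopy(src, 0, 7), specializedCopy(src, 0, 7)));
        System.out.println(Arrays.equals(plainCopy(src, 1, 7), specializedCopy(src, 1, 7)));
    }
}
```

Both methods are functionally identical, so any difference would only show up in a JMH run or in the generated assembly, and the extra branches have their own cost once the inputs stop being predictable.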
-------------
PR: https://git.openjdk.org/jdk/pull/12453