RFR: 8345465: Fix performance regression on x64 after JDK-8345120 [v3]
Jorn Vernee
jvernee at openjdk.org
Thu Dec 5 18:15:39 UTC 2024
On Thu, 5 Dec 2024 11:43:16 GMT, Per Minborg <pminborg at openjdk.org> wrote:
>> This PR proposes to fix a performance regression (on x64 platforms) for 32-bit strings introduced by [JDK-8345120](https://bugs.openjdk.org/browse/JDK-8345120).
>>
>> The PR also fixes a performance regression in the benchmarks caused by using the wrong type for `MemorySegment`.
>>
>> Regrettably, this PR uses different code paths for various architectures. This gives optimum performance for all platforms at the expense of slightly more code complexity.
>>
>> Base (macOS, M1) (Before https://github.com/openjdk/jdk/pull/22451)
>>
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> InternalStrLen.changedElementQuad 1 avgt 30 2.057 ± 0.012 ns/op
>> InternalStrLen.changedElementQuad 4 avgt 30 3.776 ± 0.031 ns/op
>> InternalStrLen.changedElementQuad 16 avgt 30 6.690 ± 0.060 ns/op
>> InternalStrLen.changedElementQuad 251 avgt 30 48.581 ± 0.764 ns/op
>> InternalStrLen.changedElementQuad 1024 avgt 30 196.188 ± 3.484 ns/op
>> InternalStrLen.chunkedDouble 1 avgt 30 1.903 ± 0.013 ns/op
>> InternalStrLen.chunkedDouble 4 avgt 30 3.446 ± 0.025 ns/op
>> InternalStrLen.chunkedDouble 16 avgt 30 5.759 ± 0.062 ns/op
>> InternalStrLen.chunkedDouble 251 avgt 30 26.892 ± 0.141 ns/op
>> InternalStrLen.chunkedDouble 1024 avgt 30 72.940 ± 1.562 ns/op
>> InternalStrLen.chunkedSingle 1 avgt 30 1.897 ± 0.015 ns/op
>> InternalStrLen.chunkedSingle 4 avgt 30 5.357 ± 0.560 ns/op
>> InternalStrLen.chunkedSingle 16 avgt 30 3.821 ± 0.052 ns/op
>> InternalStrLen.chunkedSingle 251 avgt 30 19.482 ± 0.190 ns/op
>> InternalStrLen.chunkedSingle 1024 avgt 30 38.938 ± 0.411 ns/op
>> InternalStrLen.chunkedSingleMisaligned 1 avgt 30 2.230 ± 0.147 ns/op
>> InternalStrLen.chunkedSingleMisaligned 4 avgt 30 5.424 ± 0.688 ns/op
>> InternalStrLen.chunkedSingleMisaligned 16 avgt 30 9.573 ± 0.063 ns/op
>> InternalStrLen.chunkedSingleMisaligned 251 avgt 30 22.242 ± 0.182 ns/op
>> InternalStrLen.chunkedSingleMisaligned 1024 avgt 30 45.442 ± 0.252 ns/op
>> InternalStrLen.elementByteMisaligned 1 avgt 30 1.616 ± 0.041 ns/op
>> InternalStrLen.elementByteMisaligned 4 avgt 30 2.982 ± 0.018 ns/op
>> InternalStrLen.elementByteMis...
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
>
> Improve short string cases
I'm a little bit confused by the numbers, which are for AArch64, while the patch fixes a regression on x64. Can you share any numbers on x64? Do you have an idea why long scanning doesn't help on x64?
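For context on what "long scanning" refers to, here is a minimal sketch of the general idea: read eight bytes at a time and test the whole word for a zero byte before falling back to a byte loop. This is not the code in `StringSupport`; the class name `LongScanSketch`, the method names, and the use of a `byte[]` instead of a `MemorySegment` are assumptions made purely for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the JDK implementation.
public class LongScanSketch {

    // Views a byte[] as longs in native byte order (plain reads may be unaligned).
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    // Classic SWAR test: the result is non-zero iff at least one of the
    // eight bytes in 'l' is zero.
    static boolean containsZeroByte(long l) {
        return ((l - 0x0101010101010101L) & ~l & 0x8080808080808080L) != 0;
    }

    // Returns the index of the first zero byte, or -1 if there is none.
    static int strlen(byte[] data) {
        int i = 0;
        // Scan eight bytes per iteration while a full long still fits.
        for (; i + Long.BYTES <= data.length; i += Long.BYTES) {
            long chunk = (long) LONG_VIEW.get(data, i);
            if (containsZeroByte(chunk)) {
                break; // the terminator is somewhere in this chunk
            }
        }
        // Pinpoint the terminator (or handle the tail) one byte at a time.
        for (; i < data.length; i++) {
            if (data[i] == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

For short inputs the word-sized loop never runs and only the byte loop is executed, which is one reason the short-string cases can behave differently from the long ones in benchmarks.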
src/java.base/share/classes/jdk/internal/foreign/StringSupport.java line 134:
> 132: segment.checkBounds(fromOffset, length);
> 133: if (length < 3) {
> 134: switch ((int) length) {
How much do things like this actually help? I'd think that the added bytecode size might adversely affect inlining as well.
-------------
PR Review: https://git.openjdk.org/jdk/pull/22539#pullrequestreview-2482534043
PR Review Comment: https://git.openjdk.org/jdk/pull/22539#discussion_r1871877536