RFR: 8360520: RISC-V: C1: Fix primitive array clone intrinsic regression after JDK-8333154 [v5]
Galder Zamarreño
galder at openjdk.org
Mon Jul 21 08:30:41 UTC 2025
On Fri, 11 Jul 2025 11:50:33 GMT, Feilong Jiang <fjiang at openjdk.org> wrote:
>> Hi, please consider.
>> [JDK-8333154](https://bugs.openjdk.org/browse/JDK-8333154) Implemented C1 clone intrinsic that reuses arraycopy code for primitive arrays for RISC-V.
>> The new instruction flag `OmitChecksFlag` (introduced by [JDK-8302850](https://bugs.openjdk.org/browse/JDK-8302850)) is used to avoid instantiation of array copy stubs for primitive array clones.
>> If `OmitChecksFlag` is set, all flags (including the `unaligned` flag) will be cleared before generating the `LIR_OpArrayCopy` node.
>> This may lead to incorrect selection of the arraycopy function when `-XX:+UseCompactObjectHeaders` is enabled, causing the `unaligned` flag to be set for arraycopy.
>> We observed performance regression on P550 SBC through the corresponding JMH tests when COH is enabled.
>>
>> This pr keeps the `unaligned` flag on RISC-V to ensure the arraycopy function is selected correctly.
>> The other platforms are not affected as the flag is always `0` when `OmitChecksFlag` is true.
>>
>> Test on linux-riscv64:
>> - [x] Tier1-3
>>
>> JMH data on P550 SBC for reference (w/o and w/ the patch):
>>
>> Before:
>>
>> Without COH:
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> ArrayClone.byteArraycopy 0 avgt 15 50.854 ± 0.379 ns/op
>> ArrayClone.byteArraycopy 10 avgt 15 74.294 ± 0.449 ns/op
>> ArrayClone.byteArraycopy 100 avgt 15 81.847 ± 0.082 ns/op
>> ArrayClone.byteArraycopy 1000 avgt 15 480.106 ± 0.369 ns/op
>> ArrayClone.byteClone 0 avgt 15 90.146 ± 0.299 ns/op
>> ArrayClone.byteClone 10 avgt 15 130.525 ± 0.384 ns/op
>> ArrayClone.byteClone 100 avgt 15 251.942 ± 0.122 ns/op
>> ArrayClone.byteClone 1000 avgt 15 407.580 ± 0.318 ns/op
>> ArrayClone.intArraycopy 0 avgt 15 49.984 ± 0.436 ns/op
>> ArrayClone.intArraycopy 10 avgt 15 76.302 ± 1.388 ns/op
>> ArrayClone.intArraycopy 100 avgt 15 267.487 ± 0.329 ns/op
>> ArrayClone.intArraycopy 1000 avgt 15 1157.444 ± 1.588 ns/op
>> ArrayClone.intClone 0 avgt 15 90.130 ± 0.257 ns/op
>> ArrayClone.intClone 10 avgt 15 183.619 ± 0.588 ns/op
>> ArrayClone.intClone 100 avgt 15 296.491 ± 0.246 ns/op
>> ArrayClone.intClone 1000 avgt 15 828.695 ± 1.501 ns/op
>>
>> -------------------------------------------------------------------------
>> With COH:
>>
>> Benchmark (size) Mode Cnt Score Error Un...
>
> Feilong Jiang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>
> - Merge branch 'master' of https://github.com/openjdk/jdk into riscv-fix-c1-primitive-clone
> - also keep overlapping flag
> - Merge branch 'master' of https://github.com/openjdk/jdk into riscv-fix-c1-primitive-clone
> - Merge branch 'master' of https://github.com/openjdk/jdk into riscv-fix-c1-primitive-clone
> - Revert RISCV Macro modification
> - Merge branch 'master' of https://github.com/openjdk/jdk into riscv-fix-c1-primitive-clone
> - check unaligned flag at LIR_OpArrayCopy to avoid using AvoidUnalignedAccesses
> - riscv: fix c1 primitive array clone intrinsic regression
Changes requested by galder (Author).
Also, some comments on why these flags are needed when doing array copies would be good for future reference.
src/hotspot/cpu/riscv/c1_LIRGenerator_riscv.cpp line 775:
> 773: arraycopy_helper(x, &flags, &expected_type);
> 774: if (x->check_flag(Instruction::OmitChecksFlag)) {
> 775: flags = (flags & (LIR_OpArrayCopy::unaligned | LIR_OpArrayCopy::overlapping));
The changes in the two files need to be in synch, so I wonder if `LIR_OpArrayCopy::unaligned | LIR_OpArrayCopy::overlapping` could be abstracted away within a function in `LIR_OpArrayCopy`.
So something like this (apologies for any syntactic/semantic errors):
flags = (flags & LIR_OpArrayGopy::get_array_copy_flags());
Then on the other method something like:
((flags & ~(LIR_OpArrayGopy::get_array_copy_flags())) == 0)
Function name is just an example, feel free to suggest some other if you think it fits better.
Thoughts?
-------------
PR Review: https://git.openjdk.org/jdk/pull/25976#pullrequestreview-3037040733
PR Comment: https://git.openjdk.org/jdk/pull/25976#issuecomment-3095714032
PR Review Comment: https://git.openjdk.org/jdk/pull/25976#discussion_r2218510943
More information about the hotspot-compiler-dev
mailing list