RFR: 8351468: C2: array fill optimization assigns wrong type to intrinsic call [v3]

Emanuel Peter epeter at openjdk.org
Thu Mar 20 13:40:14 UTC 2025


On Tue, 18 Mar 2025 12:34:39 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:

>> The [array fill optimization](https://github.com/openjdk/jdk/blob/1d147ccb4cfcb1da23664ac941e56ac542a7ac61/src/hotspot/share/opto/loopTransform.cpp#L3533) replaces simple innermost loops that fill an array with copies of the same primitive value:
>> 
>> 
>> for (int i = 0; i < array.length; i++) {
>>     array[i] = 0;
>> }
>> 
>> with a call to an [array filling intrinsic](https://github.com/openjdk/jdk/blob/1d147ccb4cfcb1da23664ac941e56ac542a7ac61/src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp#L1665) that is specialized for the array element type:
>> 
>> 
>> arrayof_jint_fill(array, 0, array.length)
>> 
>> 
>> The optimization retrieves the (basic) array element type from calling `MemNode::memory_type()` on the original filling store. This is incorrect for stores of `short` values, since these are represented by `StoreC` nodes [whose `memory_type()` is `T_CHAR`](https://github.com/openjdk/jdk/blob/1fe45265e446eeca5dc496085928ce20863a3172/src/hotspot/share/opto/memnode.hpp#L680). As a result, the optimization wrongly assigns the address type `char[]` to `short` array fill loops. This can cause miscompilations due to missing anti-dependences, see the [issue description for further detail](https://bugs.openjdk.org/projects/JDK/issues/JDK-8351468).
>> 
>> This changeset proposes retrieving the (basic) array element type from the store address type instead. This ensures that the accurate address type is assigned to the intrinsic call, preventing missed anti-dependences and other potential issues caused by mismatching types. Additionally, the changeset makes it easier to reason about correctness by explicitly disabling the optimization for mismatched stores (where the type of the value to be stored differs from the element type of the destination array). Such stores were not optimized before, but only due to pattern matching limitations.
>> 
>> Assuming mismatched stores are discarded (as proposed here), an alternative solution would be to define a StoreS node returning the appropriate `memory_type()`. This could be desirable even as a complement to this fix, to prevent similar bugs in the future. I propose to investigate the introduction of a StoreS node in a separate RFE, because it is a much larger and more intrusive changeset, and go with this minimal, local, and non-intrusive fix for backportability.
>> 
>> **Testing:** tier1-5, stress testing (linux-x64, windows-x64, macosx-x64, linux-aarch64, macosx-aarch64).
>
> Roberto Castañeda Lozano has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Explicitly disable optimization for mismatching stores; add positive and negative tests

@robcasloz Nice catch, I'm glad you dug this up and found a reproducer :partying_face: 

Yes, taking the element type from the address is the best, that way you actually depend on the array, not the type of the store.

test/hotspot/jtreg/compiler/loopopts/TestArrayFillAntiDependence.java line 34:

> 32:  *          scheduled correctly, for different load and array fill types.
> 33:  *          See detailed comments in testShort() below.
> 34:  * @requires vm.compiler2.enabled

Is the test so expensive that C2 is required? Or can you just put `-XX:+IgnoreUnrecognizedVMOptions` in the run that has C2 flags?

test/hotspot/jtreg/compiler/loopopts/TestArrayFillIntrinsic.java line 46:

> 44:         // compiler might decide to unroll the array-filling loop instead of
> 45:         // replacing it with an intrinsic call even if OptimizeFill is enabled.
> 46:         TestFramework.runWithFlags("-XX:LoopUnrollLimit=0", "-XX:+OptimizeFill");

What about a run without flags just in case?

test/hotspot/jtreg/compiler/loopopts/TestArrayFillIntrinsic.java line 46:

> 44:         // compiler might decide to unroll the array-filling loop instead of
> 45:         // replacing it with an intrinsic call even if OptimizeFill is enabled.
> 46:         TestFramework.runWithFlags("-XX:LoopUnrollLimit=0", "-XX:+OptimizeFill");

What about a run without flags just in case?

-------------

Marked as reviewed by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24005#pullrequestreview-2702771147
PR Review Comment: https://git.openjdk.org/jdk/pull/24005#discussion_r2005661671
PR Review Comment: https://git.openjdk.org/jdk/pull/24005#discussion_r2005664073
PR Review Comment: https://git.openjdk.org/jdk/pull/24005#discussion_r2005664394


More information about the hotspot-compiler-dev mailing list