RFR: 8324751: C2 SuperWord: Aliasing Analysis runtime check [v22]

Emanuel Peter epeter at openjdk.org
Mon Aug 25 10:41:13 UTC 2025


On Fri, 22 Aug 2025 13:34:58 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> TODO work that arose during review process / recent merges with master:
>> 
>> - Vladimir asked for a benchmark where the predicate is disabled and only multiversioning is used. It should show that peak performance is identical but compilation time is a bit higher. Investigation ongoing.
>> - See if we can harden some of the IR rules in `TestAliasingFuzzer.java` after JDK-8356176. Probably file a follow-up RFE.
>> 
>> ---------------
>> 
>> This is a big patch, but about 3.5k lines are tests. And a large part of the VM changes is comments / proofs.
>> 
>> I am adding a dynamic (runtime) aliasing check to the auto-vectorizer (SuperWord). We use the infrastructure from https://github.com/openjdk/jdk/pull/22016:
>> - Use the auto-vectorization `predicate` when available: we speculate that there is no aliasing, else we trap and re-compile without the predicate.
>> - If the predicate is not available, we use `multiversioning`, i.e. we have a `fast_loop` where there is no aliasing, and hence vectorization, and a `slow_loop` without vectorization that is taken if the check fails.
>> 
>> --------------------------
>> 
>> **Where to start reviewing**
>> 
>> - `src/hotspot/share/opto/mempointer.hpp`:
>>   - Read the class comment for `MemPointerRawSummand`.
>>   - Familiarize yourself with the `MemPointer Linearity Corrolary`. We need it for the proofs of the aliasing runtime checks.
>> 
>> - `src/hotspot/share/opto/vectorization.cpp`:
>>   - Read the explanations and proofs above `VPointer::can_make_speculative_aliasing_check_with`. It explains how the aliasing runtime check works.
>> 
>> - `src/hotspot/share/opto/vtransform.hpp`:
>>   - Understand the difference between weak and strong edges.
>> 
>> If you need to see some examples, then look at the tests:
>> - `test/hotspot/jtreg/compiler/loopopts/superword/TestAliasing.java`: simple array cases. IR rules that check for vectors and in some cases whether we used multiversioning.
>> - `test/micro/org/openjdk/bench/vm/compiler/VectorAliasing.java`: the micro-benchmarks I show below. Simple array cases.
>> - `test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegmentAliasing.java`: a bit advanced, but similar cases.
>> - `test/hotspot/jtreg/compiler/loopopts/superword/TestAliasingFuzzer.java`: very large and rather complex. Generates random loops, some with and some without aliasing at runtime. IR verification, but currently mostly only for array cases; MemorySegment cases have some issues (see comments).
>> --------------------------
>> 
>> **Details**
>> 
>> Most fundamentally:
>> - I had to...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add test for related report for JDK-8365982

I think the results in https://github.com/openjdk/jdk/pull/24278#issuecomment-3213393035 already motivate the 2-staged approach:
- First use predicate and only generate vectorized loop
- If the predicate deopts, then use multiversioning

I expect the real-world cases to look like this:
- In most cases, we never have an aliasing case, so the predicate never leads to a deopt. We don't want to pay the extra compile time for multiversioning.
- In a few cases, we will have occasional aliasing cases, and we have to pay the price of deopt/recompile with multiversioning. Recompilation has a cost, but it is more than worth it in the long run, given that we can now get vectorized performance in most cases.
- In rare cases, we only have aliasing cases. We have to recompile, and could suffer from the regressions mentioned above. Speculative compilation always has a price, but that's ok if it affects only edge cases.
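
To make "aliasing case" concrete, here is a minimal Java sketch. It is an illustration only, not code from the patch: the class and method names (`AliasingSketch`, `slowLoop`, `fastLoop`, `mayAlias`, `multiversioned`) are hypothetical, and the simple range-overlap test stands in for the more general runtime check that the patch derives from `VPointer`s. It models a vector iteration as "load a whole chunk, then store it" to show why the vectorized loop is only safe when the accessed ranges do not overlap.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from the patch.
final class AliasingSketch {
    // Reference semantics: iteration i sees the stores of all earlier iterations.
    static void slowLoop(int[] dst, int dstOff, int[] src, int srcOff, int n) {
        for (int i = 0; i < n; i++) {
            dst[dstOff + i] = src[srcOff + i] + 1;
        }
    }

    // Models vector execution: load a chunk of `width` elements, then store it.
    // Only guaranteed to match slowLoop if the accessed ranges do not overlap.
    static void fastLoop(int[] dst, int dstOff, int[] src, int srcOff, int n, int width) {
        int[] chunk = new int[width];
        for (int i = 0; i < n; i += width) {
            int len = Math.min(width, n - i);
            for (int j = 0; j < len; j++) {
                chunk[j] = src[srcOff + i + j] + 1;
            }
            for (int j = 0; j < len; j++) {
                dst[dstOff + i + j] = chunk[j];
            }
        }
    }

    // Conservative check, evaluated once before the loop (assumed form):
    // same array and overlapping index ranges means the accesses may alias.
    static boolean mayAlias(int[] dst, int dstOff, int[] src, int srcOff, int n) {
        return dst == src && dstOff < srcOff + n && srcOff < dstOff + n;
    }

    // Hand-written analogue of multiversioning: pick the loop version at runtime.
    static void multiversioned(int[] dst, int dstOff, int[] src, int srcOff, int n) {
        if (!mayAlias(dst, dstOff, src, srcOff, n)) {
            fastLoop(dst, dstOff, src, srcOff, n, 4); // fast_loop: no aliasing
        } else {
            slowLoop(dst, dstOff, src, srcOff, n);    // slow_loop: check failed
        }
    }

    public static void main(String[] args) {
        int[] a = new int[8];
        multiversioned(a, 1, a, 0, 7);   // overlapping ranges, takes slowLoop
        System.out.println(Arrays.toString(a));     // [0, 1, 2, 3, 4, 5, 6, 7]

        int[] wrong = new int[8];
        fastLoop(wrong, 1, wrong, 0, 7, 4);          // chunked loop despite aliasing
        System.out.println(Arrays.toString(wrong)); // [0, 1, 1, 1, 1, 2, 1, 1]
    }
}
```

With the predicate, the same kind of check is done before entering the loop, but a failing check leads to a deopt and recompilation instead of a branch to a `slow_loop`.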

Here are some `CITime` numbers, with `-XX:RepeatCompilation=100`:
- Never aliasing, aliasing runtime check never fails:
  - `patch` (only predicate): `3.454` in C2 (`2.427` in IdealLoop, `0.368` in AutoVectorize)
  - `no_predicate` (directly multiversion): `4.709` in C2 (`3.252` in IdealLoop, `0.425` in AutoVectorize)
- With aliasing, runtime check fails:
  - `patch` (first predicate, then multiversioning): `5.956` in C2 (`4.198` in IdealLoop, `0.620` in AutoVectorize)
  - `no_predicate` (directly multiversion): `4.633` in C2 (`3.205` in IdealLoop, `0.418` in AutoVectorize)

(I used the [example](https://github.com/openjdk/jdk/pull/24278#issuecomment-3210290629) and extended it with a non-aliasing case)
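
As a back-of-the-envelope reading of these numbers, the sketch below computes the relative C2 compile-time overheads and the aliasing fraction at which both strategies would cost the same. This rests on a simplifying assumption that is not from the measurements: that a fraction `p` of compiled loops hits the aliasing case and that the four C2 totals above carry over unchanged to each loop. The class and method names are hypothetical.

```java
// Hypothetical helper; the four constants are the C2 totals quoted above.
final class CompileTimeBreakEven {
    static final double PATCH_NO_ALIAS = 3.454; // predicate only
    static final double MULTI_NO_ALIAS = 4.709; // directly multiversion
    static final double PATCH_ALIAS    = 5.956; // predicate, deopt, then multiversion
    static final double MULTI_ALIAS    = 4.633; // directly multiversion

    // Expected cost of the two-stage approach if a fraction p of loops alias.
    static double expectedPatch(double p) {
        return (1 - p) * PATCH_NO_ALIAS + p * PATCH_ALIAS;
    }

    // Expected cost of always multiversioning.
    static double expectedDirect(double p) {
        return (1 - p) * MULTI_NO_ALIAS + p * MULTI_ALIAS;
    }

    // Solve expectedPatch(p) == expectedDirect(p) for p.
    static double breakEven() {
        double savedWithoutAliasing = MULTI_NO_ALIAS - PATCH_NO_ALIAS;
        double lostWithAliasing = PATCH_ALIAS - MULTI_ALIAS;
        return savedWithoutAliasing / (savedWithoutAliasing + lostWithAliasing);
    }

    public static void main(String[] args) {
        System.out.printf("never aliasing: no_predicate / patch = %.3f%n",
                MULTI_NO_ALIAS / PATCH_NO_ALIAS);
        System.out.printf("with aliasing:  patch / no_predicate = %.3f%n",
                PATCH_ALIAS / MULTI_ALIAS);
        System.out.printf("break-even aliasing fraction p = %.3f%n", breakEven());
    }
}
```

Under that assumption, direct multiversioning costs roughly 36% more compile time when there is never aliasing, the two-stage approach costs roughly 29% more when the check fails, and the two-stage approach stays cheaper in total as long as fewer than about 49% of loops hit the aliasing case.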

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24278#issuecomment-3219728235

