RFR: 8324751: C2 SuperWord: Aliasing Analysis runtime check [v9]

Fri Aug 22 16:17:04 UTC 2025

On Fri, 22 Aug 2025 13:34:56 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Thank you for addressing my feedback! This looks good to me now.
>
> @mhaessig @vnkozlov 
> Update: I also had to fix the `TestAliasingFuzzer.java`: I can no longer assert that there is no `multiversioning` because there are some edge-cases where we have issues. I filed bugs for those, and already integrated an IR test for each.
> https://bugs.openjdk.org/browse/JDK-8359688
> https://bugs.openjdk.org/browse/JDK-8360204
> https://bugs.openjdk.org/browse/JDK-8365982
> 
> So if anybody accidentally, or intentionally fixes those, we should come back to `TestAliasingFuzzer.java` and tighten the IR rules.
> 
> Asserting that there is no `multiversioning` in the IR rules makes sure that we made the runtime check as exact as possible, and do not fail in cases where it would have been safe to keep the predicate, rather than deoptimizing and compiling with multiversioning (more compile time, more code -> just worse).
> 
> I also filed an RFE to eventually fix the IR rules in the test `TestAliasingFuzzer.java`:
> https://bugs.openjdk.org/browse/JDK-8365985
> 
> Note: for now, `TestAliasingFuzzer.java` still has some IR rules, but just for the `array` examples, see `generateIRRulesArray`. These should already work well with RCE. We are mostly having issues with long-address MemorySegments currently, see the filed RFEs above.

@eme64 I noticed that the first (no_patch, fastest) assembly listing has no "strip mining" outer loop, while the other cases do have one. Do you know why?

Yes, there could be a lot of reasons why we get such a regression.

Did you try **reducing** the unrolling of the slow path?

> Might it be the runtime check and related branch misprediction? 

Could be, since you added an outer loop in the slow path.

> tma_backend_bound: 21.3 vs 24.8 - there seems to be a bottleneck in the backend for patch of 10% 

This seems to indicate that more time is spent on data access. Does the main loop start copying from the same offset/element in the no_patch and patch versions?
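
To make the fast-path/slow-path terminology in this thread concrete, here is a minimal Java-level sketch of the idea behind an aliasing runtime check. It is only an illustration under stated assumptions, not how C2 implements it: C2 emits the check in compiled code, whereas here it is written out by hand. The class `AliasingCheckSketch` and the helper `rangesOverlap` are hypothetical names, and the overlap test assumes `len >= 0` with no `int` overflow.

```java
import java.util.Arrays;

public class AliasingCheckSketch {
    // Hypothetical helper: do [aStart, aStart+len) and [bStart, bStart+len) intersect?
    // Assumes len >= 0 and that the additions do not overflow.
    static boolean rangesOverlap(int aStart, int bStart, int len) {
        return aStart < bStart + len && bStart < aStart + len;
    }

    // Forward element-wise copy. Both branches have identical Java semantics;
    // the split only models which loop a compiler could safely vectorize.
    static void copy(int[] src, int srcOff, int[] dst, int dstOff, int len) {
        boolean mayAlias = (src == dst) && rangesOverlap(srcOff, dstOff, len);
        if (!mayAlias) {
            // "Fast path": loads and stores never touch the same element,
            // so reordering them (as vectorization does) is safe.
            for (int i = 0; i < len; i++) {
                dst[dstOff + i] = src[srcOff + i];
            }
        } else {
            // "Slow path": a store may feed a later load, so the scalar
            // iteration order must be preserved.
            for (int i = 0; i < len; i++) {
                dst[dstOff + i] = src[srcOff + i];
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        copy(a, 0, a, 1, 4); // overlapping ranges: takes the slow path
        System.out.println(Arrays.toString(a)); // prints [1, 1, 1, 1, 1]
    }
}
```

The overlapping call shows why the check matters: each store feeds the next load, so the first element propagates through the array, and a loop that loaded several elements before storing them would produce a different result.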

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24278#issuecomment-3214913908