RFR: 8369902: C2 SuperWord: wrong result because filterin NaN instead of zero in MemPointerParser::canonicalize_raw_summands

Manuel Hässig mhaessig at openjdk.org
Fri Oct 17 13:19:04 UTC 2025


On Thu, 16 Oct 2025 14:20:29 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> **TLDR** `is_NaN` -> `is_zero`, just like the code comment says.
> 
> Thanks to @mhaessig for debugging the ARM32 bug below. He found the buggy line of code.
> 
> ----------------------------------------
> 
> **Details**
> 
> It seems there is a little "typo" (logic error) in `MemPointerParser::canonicalize_raw_summands` that slipped through the cracks in https://github.com/openjdk/jdk/pull/24278. The JavaFuzzer now found an example, and independently the issue was also reported on ARM32 [JDK-8368578](https://bugs.openjdk.org/browse/JDK-8368578).
> 
> Filtering out `NaN` instead of `zero` for the `scaleL` has two manifestations:
> - If `scaleL` is zero, but does not get filtered out even though it should be: we hit the assert in `MemPointerSummand` constructor, `assert(!_scale.is_zero(), "non-zero scale");`.
>   - See [JDK-8368578](https://bugs.openjdk.org/browse/JDK-8368578), though those tests seem to only fail on ARM32, and nowhere else.
>   - I was able to construct a `MemorySegment` regression test, see `TestMemorySegmentFilterSummands.test1`. I suspect that the ARM32 failures happened on an array, as it failed in places like `BigInteger::implMultiplyToLen`. But now I was able to reproduce it with native memory, to get a pointer expression that has the same cancellation issue.
> - If `scaleL` is `NaN`, and gets filtered even though it should not be: We get a non-trivial MemPointer that is missing a summand. So we will succeed in optimizing, but with wrong assumptions. We generate a runtime aliasing check that is incorrect, leading to wrong results.
>   - This was reported by the fuzzer, see attached `TestDoNotFilterNaNSummands`.
>   - I was also able to create a simpler example with `MemorySegments`, see attached `TestMemorySegmentFilterSummands.test2`.
> 
> **Why did this slip through the cracks?**
> 
> In https://github.com/openjdk/jdk/pull/24278 I added pretty extensive testing, even fuzzer style tests, see `TestAliasingFuzzer.java`. But I think all of those tests exercise `scale` that are in "nice" [int ranges](https://github.com/openjdk/jdk/pull/24278/files#diff-26de03e864a492fe8aa8178818968f2097b99cf36a763605e2fb11fbc04eedffR303-R322). Also the JavaFuzzer does not directly generate such long constants for array accesses (not possible without Unsafe I think), we were lucky that it generated the index with `%` that got optimized to some magic long constant.
> 
> There is already an RFE filed for improvements to `TestAliasingFuzzer.java`: [JDK-8365985](https://bugs.openjdk.org/browse/JDK-836...

Thank you for fixing my bug as well and doing the work to find a 64-bit reproducer, @eme64! Also, thanks for providing an explanation for the NaNs in the MemPointer parsing.

The change looks good to me. I only have a suggestion to simplify your scenarios.

test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegmentFilterSummands.java line 71:

> 69:                        new Scenario(1, "-XX:+AlignVector", "-XX:-ShortRunningLongLoop"),
> 70:                        new Scenario(2, "-XX:-AlignVector", "-XX:+ShortRunningLongLoop"),
> 71:                        new Scenario(3, "-XX:+AlignVector", "-XX:+ShortRunningLongLoop"));

That might be the perfect opportunity to break out the cross-product scenario:
Suggestion:

        f.addCrossProductScenarios(Set.of("-XX:-AlignVector", "-XX:+AlignVector"),
                                   Set.of("-XX:-ShortRunningLongLoop", "-XX:+ShortRunningLoop);

test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegmentFilterSummands.java line 82:

> 80:         applyIfPlatform = {"64-bit", "true"},
> 81:         applyIf = {"AlignVector", "false"},
> 82:         applyIfCPUFeatureOr = {"avx", "true", "asimd", "true"})

I always forget what the best practice is regarding detecting CPU features to not break the CIs of riscv and others. But this should be fine, since you are matching for the CPU features, right?

-------------

Changes requested by mhaessig (Committer).

PR Review: https://git.openjdk.org/jdk/pull/27848#pullrequestreview-3350310410
PR Review Comment: https://git.openjdk.org/jdk/pull/27848#discussion_r2439983427
PR Review Comment: https://git.openjdk.org/jdk/pull/27848#discussion_r2439992482


More information about the hotspot-compiler-dev mailing list