RFR: 8346664: C2: Optimize mask check with constant offset [v7]

Emanuel Peter epeter at openjdk.org
Thu Jan 30 08:37:54 UTC 2025


On Wed, 29 Jan 2025 17:23:40 GMT, Matthias Ernst <duke at openjdk.org> wrote:

>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>> 
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>> 
>> 
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>> 
>> 
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>> 
>> 
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>> 
>> 
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1"  (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision:
> 
>  - Merge branch 'openjdk:master' into mernst/JDK-8346664
>  - make the check more clear: shift >= mask_width
>  - fully randomized
>  - JLS: only the lower bits of the shift are taken into account (aka we don't assert).
>  - (c)
>  - (c)
>  - Assert that MulNode::Ideal already masks constant shift amounts for us.
>    Avoid accidental zero mask breaking test.
>  - "element".
>  - avoid redundant comment
>  - addConstNonConstMaskLong
>  - ... and 10 more: https://git.openjdk.org/jdk/compare/52ca3118...490cc2fb

There is also this failure:
`compiler/ciReplay/TestDumpReplayCommandLine.java`
With flags: `-XX:-MonomorphicArrayCheck -XX:-UncommonNullCast`


java.lang.RuntimeException: should find a C1 and a C2 replay file expected: 2 but was: 3
	at jdk.test.lib.Asserts.fail(Asserts.java:689)
	at jdk.test.lib.Asserts.assertEquals(Asserts.java:204)
	at jdk.test.lib.Asserts.assertEQ(Asserts.java:180)
	at compiler.ciReplay.TestDumpReplayCommandLine.testAction(TestDumpReplayCommandLine.java:62)
	at compiler.ciReplay.DumpReplayBase.runTest(DumpReplayBase.java:54)
	at compiler.ciReplay.TestDumpReplayCommandLine.main(TestDumpReplayCommandLine.java:56)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1447)


No idea what this is. Maybe some crash, or bailout?

Similar here:
`compiler/ciReplay/TestIncrementalInlining.java`


java.lang.Error: Can't run replay process: java.lang.RuntimeException: Test should only dump 1 replay file when trying to replay compile expected: 2 but was: 1
	at compiler.ciReplay.CiReplayBase.startTest(CiReplayBase.java:240)
	at compiler.ciReplay.CiReplayBase.positiveTest(CiReplayBase.java:280)
	at compiler.ciReplay.TestIncrementalInlining.testAction(TestIncrementalInlining.java:77)
	at compiler.ciReplay.DumpReplayBase.runTest(DumpReplayBase.java:54)
	at compiler.ciReplay.InliningBase.runTest(InliningBase.java:62)
	at compiler.ciReplay.TestIncrementalInlining.<init>(TestIncrementalInlining.java:72)
	at compiler.ciReplay.TestIncrementalInlining.main(TestIncrementalInlining.java:57)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1447)
Caused by: java.lang.RuntimeException: Test should only dump 1 replay file when trying to replay compile expected: 2 but was: 1
	at jdk.test.lib.Asserts.fail(Asserts.java:689)
	at jdk.test.lib.Asserts.assertEquals(Asserts.java:204)
	at jdk.test.lib.Asserts.assertEQ(Asserts.java:180)
	at compiler.ciReplay.DumpReplayBase.getReplayFileName(DumpReplayBase.java:68)
	at compiler.ciReplay.CiReplayBase.startTest(CiReplayBase.java:234)
	... 10 more

More IR rule failures:
`compiler/loopopts/superword/TestEquivalentInvariants.java`

**Actually, these are exciting - we now vectorize more cases that we did not vectorize before!**

Example:

Failed IR Rules (6) of Methods (6)
----------------------------------
1) Method "static java.lang.Object[] compiler.loopopts.superword.TestEquivalentInvariants.testMemorySegmentIInvarL3d(java.lang.foreign.MemorySegment,int,int,int,int)" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={"sse4.1", "true", "asimd", "true"}, counts={"_#V#LOAD_VECTOR_I#_", "= 0", "_#STORE_VECTOR#_", "= 0"}, applyIfPlatform={"64-bit", "true"}, applyIfPlatformOr={}, failOn={}, applyIfOr={}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<I,\\d+>)"
           - Failed comparison: [found] 10 = 0 [given]
             - Matched nodes (10):
               * 1737  LoadVector  === 2000 1843 1601  |448  [[ 1738 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1740  LoadVector  === 2000 1843 1443  |448  [[ 1741 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1771  LoadVector  === 1782 1786 1772  |448  [[ 1768 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1777  LoadVector  === 1782 1786 1778  |448  [[ 1770 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1847  LoadVector  === 2000 1969 1848  |448  [[ 1844 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1853  LoadVector  === 2000 1969 1854  |448  [[ 1846 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1973  LoadVector  === 2000 1975 1979  |448  [[ 1972 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1974  LoadVector  === 2000 1975 1988  |448  [[ 1970 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1990  LoadVector  === 2000 2003 1991  |448  [[ 1976 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>
               * 1993  LoadVector  === 2000 2003 1994  |448  [[ 1978 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=12; mismatched #vectory<I,8>

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22856#issuecomment-2623842168
PR Comment: https://git.openjdk.org/jdk/pull/22856#issuecomment-2623844134
PR Comment: https://git.openjdk.org/jdk/pull/22856#issuecomment-2623847275


More information about the hotspot-compiler-dev mailing list