RFR: 8346664: C2: Optimize mask check with constant offset [v10]

Emanuel Peter epeter at openjdk.org
Thu Jan 30 15:01:05 UTC 2025


On Thu, 30 Jan 2025 14:37:17 GMT, Matthias Ernst <duke at openjdk.org> wrote:

>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>> 
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>> 
>> 
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>> 
>> 
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>> 
>> 
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>> 
>> 
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1"  (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
> 
>   expect vectorization of i|7 instead of i&7
>   i&7 variant expect "= 0".

Did you split the method to have an additional `@IR` rule, for the `AlilgnVector` true and false case? If so, you can actually have multiple `@IR` annotations per method. That would be preferrable ;)

Also: the exact number of vectors that are generated can vary on platforms, it depends on how much we unroll the vectorized loop for example. So it is best not to have an exact count like `= 10`, but rather just a `> 0`, so that we don't get issues on the various platforms we run these tests on ;)

Background to `AlignVector`:
It is about strict alignment requirements for vector loads / stores. If we cannot prove that at runtime the vectors are alignable, then we refuse to vectorize under `AlignVector=true`. Having invariants often already gets us to that unfortunate case.

Why don't you reorganize the IR rules so that we have one for true/false, and then I can help with additional comment lines that explain why they are correct :)

-------------

PR Review: https://git.openjdk.org/jdk/pull/22856#pullrequestreview-2584140912


More information about the hotspot-compiler-dev mailing list