RFR: 8346664: C2: Optimize mask check with constant offset [v12]
Matthias Ernst
duke at openjdk.org
Thu Jan 30 19:30:51 UTC 2025
On Thu, 30 Jan 2025 18:14:21 GMT, Matthias Ernst <duke at openjdk.org> wrote:
>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>>
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>>
>>
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>>
>>
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>>
>>
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>>
>>
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if you would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
>
> "should never vectorize" only holds for long[] input.
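For reference, the reduction chain quoted above rests on a simple arithmetic fact: adding any multiple of 256 (whether `index << 8` or the constant `256`) cannot change the low 8 bits selected by `& 255`. A minimal Java sketch of that identity follows. The class and method names are hypothetical and are not taken from the PR or from C2. Note also that the quoted notation `base + index << 8` is read here as `base + (index << 8)`; in actual Java source, `+` binds tighter than `<<`, so the parentheses are required.

```java
// Hypothetical illustration only: checks the arithmetic identity behind the
// quoted reduction. It does not exercise C2 or the code changed by the PR.
final class MaskedSumCheck {
    // Shape before the reduction: shifted index plus a constant offset, then masked.
    static long withOffset(long base, long index) {
        return ((base + (index << 8)) + 256) & 255;
    }

    // Fully reduced, loop-invariant form.
    static long reduced(long base) {
        return base & 255;
    }

    public static void main(String[] args) {
        long[] bases = {0L, 1L, 255L, 256L, -1L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long base : bases) {
            for (long index = -1000; index <= 1000; index++) {
                // Long arithmetic wraps modulo 2^64, which preserves the low 8 bits.
                if (withOffset(base, index) != reduced(base)) {
                    throw new AssertionError("mismatch for base=" + base + ", index=" + index);
                }
            }
        }
        System.out.println("identity holds for all sampled inputs");
    }
}
```

The sampled inputs include negative and extreme values because the identity is expected to hold under two's-complement wraparound as well, not only for small non-negative operands.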
Almost(?) there: the `| 7` PopulateIndex test doesn't seem to vectorize yet.
> regular aarch64 machine
That's correct, M3 MBA.
> GitHub Actions
🤯 That's good to learn; I had been wondering... I have been working from https://openjdk.org/groups/build/doc/testing.html; maybe it would be worth adding a link there first thing.
test/hotspot/jtreg/compiler/vectorization/TestPopulateIndex.java line 84:
> 82: public void exprWithIndex1() {
> 83: for (int i = 0; i < count; i++) {
> 84: dst[i] = src[i] * (i | 7);
This doesn't want to vectorize.
-------------
PR Review: https://git.openjdk.org/jdk/pull/22856#pullrequestreview-2584813095
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1936153606