RFR: 8346664: C2: Optimize mask check with constant offset [v17]
Emanuel Peter
epeter at openjdk.org
Wed Feb 5 09:21:17 UTC 2025
On Sun, 2 Feb 2025 21:36:03 GMT, Matthias Ernst <duke at openjdk.org> wrote:
>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>>
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>>
>>
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>>
>>
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>>
>>
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>>
>>
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
>
> jlong, not long
It would be nice if we had some kind of "proof" here.
`is zero wrt addition under mask`:
Maybe this is a well-known mathematical property I'm not super familiar with, and then it would be good to define it more properly here for everybody ;)
I'm a little worried that it is not straight forward to prove without making some assumptions about `mask` and `add1` that we only check inside of `AndIL_is_zero_element_under_mask`.
That may be ok for now, but what if we somehow can improve `AndIL_is_zero_element_under_mask` at some point to also handle this case:
mask = 0000 1110
add1 = 1111 0001
We could probably strenghten `AndIL_is_zero_element_under_mask` to detect that `add1 & mask = 0`. It is the zero element with the mask, so that seems ok now.
But then what if `add2 = 0000 0011`?
`(add1 + add2) & mask = (1111 0100) & (0000 1110) = 0000 0100`
I'm not saying it is not correct now, but out here we make assumptions about the implementation of `AndIL_is_zero_element_under_mask` that the next person might not be aware of.
Mabe we can adjust the name of `AndIL_is_zero_element_under_mask` somehow, so that the assumption is made explicit?
Hmm, ok now I see why you mentioned the stuff about
`Checks whether expr is neutral wrt addition under mask`
in the description of `AndIL_is_zero_element_under_mask`....
Well ok then maybe we need to revisit my suggestion there to drop that stuff 😅
But it needs to be more clear why it is there, and what it guarantees.
Maybe we can rename the method to:
`AndIL_is_expr_additive_neutral_element_under_mask`
I'll leave this to you to think about. You may have a better idea.
But what I would like to see:
- Good definition(s)
- Proof (if possible formal)
src/hotspot/share/opto/mulnode.cpp line 2122:
> 2120: // Because the AddX operands can come in either
> 2121: // order, we check for both orders.
> 2122: Node* MulNode::AndIL_sum_and_mask(PhaseGVN* phase, BasicType bt) {
Suggestion:
// Pattern:
// (AndX (AddX add1 add2) mask)
//
// Assume:
// (AndX add1 mask) == 0
//
// ... prove why we know that we can return:
// (AndX add2 mask)
Node* MulNode::AndIL_sum_and_mask(PhaseGVN* phase, BasicType bt) {
-------------
PR Review: https://git.openjdk.org/jdk/pull/22856#pullrequestreview-2595070278
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942504257
More information about the hotspot-compiler-dev
mailing list