RFR: 8346664: C2: Optimize mask check with constant offset [v22]

Emanuel Peter epeter at openjdk.org
Thu Feb 13 18:32:17 UTC 2025


On Sat, 8 Feb 2025 18:30:56 GMT, Matthias Ernst <duke at openjdk.org> wrote:

>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>> 
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>> 
>> 
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>> 
>> 
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>> 
>> 
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>> 
>> 
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1"  (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Reword correctness (fixes).

Let me suggest this:

// Checks whether expr is neutral additive element (zero) under mask,
// i.e. whether an expression of the form:
//   (AndX (AddX (expr addend) mask)
//   (expr + addend) & mask
// is equivalent to
//   (AndX addend mask)
//   addend & mask
// for any addend.
// (The X in AndX must be I or L, depending on bt).
//
// We check for the sufficient condition when the lowest set bit in expr is higher than
// the highest set bit in mask, i.e.:
// expr: eeeeee0000000000000
// mask: 000000mmmmmmmmmmmmm
//             <--w bits--->
// We do not test for other cases.
//
// Correctness:
//   Given "expr" with at least "w" trailing zeros,
//   let "mod = 2^w", "suffix_mask = mod - 1"
//
//   Since "mask" only has bits set where "suffix_mask" does, we have:
//     mask = suffix_mask & mask     (SUFFIX_MASK)
//
//   And since expr only has bits set above w, and suffix_mask only below:
//     expr & suffix_mask == 0     (NO_BIT_OVERLAP)
//
//   From unsigned modular arithmetic (with unsigned modulo %), and since mod is
//   a power of 2, and we are computing in a ring of powers of 2, we know that
//     (x + y) % mod         = (x % mod         + y) % mod
//     (x + y) & suffix_mask = (x & suffix_mask + y) & suffix_mask       (MOD_ARITH)
//
//   We can now prove the equality:
//     (expr               + addend)               & mask
//   = (expr               + addend) & suffix_mask & mask    (SUFFIX_MASK)
//   = (expr & suffix_mask + addend) & suffix_mask & mask    (MOD_ARITH)
//   = (0                  + addend) & suffix_mask & mask    (NO_BIT_OVERLAP)
//   =                       addend                & mask    (SUFFIX_MASK)
//
// Hence, an expr with at least w trailing zeros is a neutral additive element under any mask with bit width w.

I somewhat agree with @merykitty that `%` can be misleading, but that's why I mention unsigned modulo.
My `MOD_ARITH` step is a little hand-wavy, but also short. @merykitty does it a bit more thoroughly.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22856#issuecomment-2657418539


More information about the hotspot-compiler-dev mailing list