RFR: 8346664: C2: Optimize mask check with constant offset [v5]

Matthias Ernst duke at openjdk.org
Fri Jan 24 15:17:51 UTC 2025


On Thu, 23 Jan 2025 14:28:31 GMT, Matthias Ernst <duke at openjdk.org> wrote:

>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>> 
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>> 
>> 
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>> 
>> 
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>> 
>> 
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>> 
>> 
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1"  (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fully randomized

src/hotspot/share/opto/mulnode.cpp line 2112:

> 2110: 
> 2111:   jint zeros = AndIL_min_trailing_zeros(phase, expr, bt);
> 2112:   return zeros > 0 && ((((jlong)1) << zeros) > mask_t->hi_as_long() && mask_t->lo_as_long() >= 0);

FWIW I came up with another way to formulate this. I think it makes even more clear how this is all about "expr is shifted to be completely left of the mask", i.e. we're comparing right-most value bit position to left-most mask bit position, all other bits are irrelevant.

This could also inform how to bias the random generator. Let me know what you think.

Suggestion:

  jint expr_trailing_zeros = AndIL_min_trailing_zeros(phase, expr, bt);
  if (expr_trailing_zeros == 0) return false;

  const TypeInteger* mask_t = phase->type(mask)->isa_integer(bt);
  if (mask_t == nullptr || mask_t->lo_as_long() <= 0) return false; // mask = 0 handled in MulNode::Value
  jint mask_bit_width = BitsPerLong - count_leading_zeros(mask_t->hi_as_long());

  return expr_trailing_zeros >= mask_bit_width;

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1928848314


More information about the hotspot-compiler-dev mailing list