RFR: 8346664: C2: Optimize mask check with constant offset [v17]
Emanuel Peter
epeter at openjdk.org
Wed Feb 5 08:19:19 UTC 2025
On Sun, 2 Feb 2025 21:36:03 GMT, Matthias Ernst <duke at openjdk.org> wrote:
>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>>
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>>
>>
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>>
>>
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>>
>>
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>>
>>
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
>
> jlong, not long
I have a few more coming, but need to hop off the train real quick ;)
src/hotspot/share/opto/mulnode.cpp line 2059:
> 2057:
> 2058: // Returns a lower bound on the number of trailing zeros in expr, or -1 if the number
> 2059: // cannot be determined.
Why not just return `0` if we cannot determine it? That would still be a correct lower bound, right?
src/hotspot/share/opto/mulnode.cpp line 2081:
> 2079: const TypeInt* rhs_t = phase->type(expr->in(2))->isa_int();
> 2080: if (rhs_t == nullptr || !rhs_t->is_con()) {
> 2081: return -1;
Suggestion:
// Pattern: expr = (x << shift)
if (expr->Opcode() == Op_LShift(bt)) {
const TypeInt* shift_t = phase->type(expr->in(2))->isa_int();
if (shift_t == nullptr || !shift_t->is_con()) {
return -1;
src/hotspot/share/opto/mulnode.cpp line 2083:
> 2081: return -1;
> 2082: }
> 2083: return rhs_t->get_con() & (type2aelembytes(bt) * BitsPerByte - 1);
Suggestion:
// We need to truncate the shift, as it may not have been canonicalized yet.
// T_INT: 0..31 -> shift_mask = 4 * 8 - 1 = 31
// T_LONG: 0..63 -> shift_mask = 8 * 8 - 1 = 63
jint shift_mask = type2aelembytes(bt) * BitsPerByte - 1;
return shift_t->get_con() & shift_mask;
src/hotspot/share/opto/mulnode.cpp line 2102:
> 2100: // (AndL (ConL (_ << #N)) #M)
> 2101: // The M and N values must satisfy ((-1 << N) & M) == 0.
> 2102: static bool AndIL_is_zero_element_under_mask(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt) {
Suggestion:
// Checks whether expr is neutral element (zero) under mask. We have:
// (AndX expr mask)
// The X in AndX must be I or L, depending on bt.
//
// We split the bits of expr into MSB and LSB, where LSB represents
// all trailing zeros of expr:
// MSB LSB
// xxxxxx 0000000000
//
// We check if the mask has no one bits in the corresponding higher
// bits, i.e. if the number of trailing zeros is larger or equal to the
// bit width of the expr, i.e. if the number of leading zeros for mask
// is greater or equal to the number of bits in MSB:
// 000000 00000yyyyy -> (AndX expr mask) = 0 -> return true
// 0000yy yyyyyyyyyy -> (AndX expr mask) = 0000zz 0000000000 -> return false
//
static bool AndIL_is_zero_element_under_mask(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt) {
src/hotspot/share/opto/mulnode.cpp line 2104:
> 2102: static bool AndIL_is_zero_element_under_mask(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt) {
> 2103: jint expr_trailing_zeros = AndIL_min_trailing_zeros(phase, expr, bt);
> 2104: if (expr_trailing_zeros < 0) {
It feels a little strange that the number of trailing zeros could be negative...
That's why I would return 0 if we can prove nothing. It is still clear that we can do nothing here if it is zero, so we can just compare `<= 0`.
Or what was the reason for returning `-1`?
src/hotspot/share/opto/mulnode.cpp line 2111:
> 2109: if (mask_t == nullptr || mask_t->lo_as_long() < 0) {
> 2110: return false;
> 2111: }
Suggestion:
// When the mask is negative, it has the most significant bit set.
const TypeInteger* mask_t = phase->type(mask)->isa_integer(bt);
if (mask_t == nullptr || mask_t->lo_as_long() < 0) {
return false;
}
src/hotspot/share/opto/mulnode.cpp line 2113:
> 2111: }
> 2112:
> 2113: jint mask_bit_width = mask_t->hi_as_long() == 0 ? 0 : (BitsPerLong - count_leading_zeros(mask_t->hi_as_long()));
I would just split this into 2 separate things.
Suggestion:
// Is the mask always zero?
if (mask_t->hi_as_long() == 0) {
assert(mask_t->lo_as_long() == 0, "checked earlier");
return true;
}
jint mask_bit_width = BitsPerLong - count_leading_zeros(mask_t->hi_as_long());
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22856#pullrequestreview-2594812563
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942353461
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942358554
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942370863
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942402331
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942405802
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942407646
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1942411441
More information about the hotspot-compiler-dev
mailing list