RFR: 8346664: C2: Optimize mask check with constant offset

Matthias Ernst duke at openjdk.org
Fri Jan 17 14:53:02 UTC 2025


On Mon, 23 Dec 2024 16:22:18 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>> 
>> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>> 
>> 
>> (base + (index + 1) << 8) & 255
>> => MulNode
>> (base + (index << 8 + 256)) & 255
>> => AddNode
>> ((base + index << 8) + 256) & 255
>> 
>> 
>> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>> 
>> 
>> ((base + index << 8) + 256) & 255
>> => MulNode (this PR)
>> (base + index << 8) & 255
>> => MulNode (PR #6697)
>> base & 255 (loop invariant)
>> 
>> 
>> Implementation notes:
>> * I verified that the originating issue "scaled varhandle indexed with i+1"  (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
>> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
>> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
>
> src/hotspot/share/opto/mulnode.cpp line 2056:
> 
>> 2054: 
>> 2055: // Returns a lower bound of the number of trailing zeros in expr.
>> 2056: jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
> 
> This could be a static function, I don't see much value in it being a method in `MulNode`.

Done.

> src/hotspot/share/opto/mulnode.cpp line 2058:
> 
>> 2056: jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
>> 2057:   expr = expr->uncast();
>> 2058:   if (expr == nullptr) {
> 
> This should not be `nullptr`, you can safely remove it.

Done.

> src/hotspot/share/opto/mulnode.cpp line 2068:
> 
>> 2066:   if (type->is_con()) {
>> 2067:     long con = type->get_con_as_long(type->basic_type());
>> 2068:     return con == 0L ? 0 : count_trailing_zeros(con);
> 
> For the sake of consistency, we should return the type width for `con == 0`, you can obtain this by `type2aelementbytes(bt) * 8`

Done.

> src/hotspot/share/opto/mulnode.cpp line 2073:
> 
>> 2071:   if (expr->Opcode() == Op_ConvI2L) {
>> 2072:     expr = expr->in(1);
>> 2073:     if (expr == nullptr) {
> 
> This cannot be `nullptr`, you can safely remove it, the same for `expr->uncast()` below. In general, the only case when the input of a `ConvI2L` (and other nodes) not being an `int` is when it is `top`, which means it is empty. A.k.a unreachable code.

Done for all (also original inputs expr and mask). I don't think I understand under which conditions null/`top` may occur, so pls double-check.

> src/hotspot/share/opto/mulnode.cpp line 2080:
> 
>> 2078:       return 0;
>> 2079:     }
>> 2080:     type = phase->type(expr)->isa_int();
> 
> You are trying to look through a `ConvI2L`, I think for the sake of consistency, you can reassign `bt` to `T_INT` at this point.

Done.

> src/hotspot/share/opto/mulnode.cpp line 2089:
> 
>> 2087:     }
>> 2088:     const TypeInt* rhs_t = phase->type(rhs)->isa_int();
>> 2089:     if (!rhs_t || !rhs_t->is_con()) {
> 
> We are trying to avoid implicit conversion to `bool`, you can use an explicit `rhs_t != nullptr` here.

Done.

> src/hotspot/share/opto/mulnode.cpp line 2092:
> 
>> 2090:       return 0;
>> 2091:     }
>> 2092:     return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1);
> 
> If you reassign `bt`, you can do `type2aelementbytes(bt)`, which IMO is clearer.

Done.

> src/hotspot/share/opto/mulnode.cpp line 2114:
> 
>> 2112: // Because the optimization might work for a non-constant
>> 2113: // mask M, we check for both operand orders.
>> 2114: bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) {
> 
> Actually you cannot conclude that `((x + y) & m) == 0 iff (x & m) == 0` when `(y & m) == 0` because the addition `x + y` can carry some bit into the positions at which `m` is set. Consider this example for illustration:
> 
>     (0b1010 + 0b0010) & 0b0100 == 0b1100 & 0b0100 == 0b0100 != 0
> 
> even when
> 
>     0b1010 & 0b0100 == 0
>     0b0010 & 0b0100 == 0
> 
> The most trivial sufficient condition we are using here is that the lowest bit set of `y` is larger than the highest bit set of `m`. Because then adding `y` into `x` does not carry any bit into the result that is set in `m` but not set in `x`. This method can be a static function, too IMO.

Very good point. Adjusted naming and comments a bit, and added a negative test case.

> test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java line 123:
> 
>> 121:     @IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I })
>> 122:     public static int addShiftMaskInt(int i, int j) {
>> 123:         return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3;
> 
> I would prefer you adding other test cases instead of modifying existing ones.

Done.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896564610
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896567144
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896568234
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896568116
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896568348
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896568417
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896568579
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896569229
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1896576894


More information about the hotspot-compiler-dev mailing list