RFR: 8346664: C2: Optimize mask check with constant offset
Quan Anh Mai
qamai at openjdk.org
Fri Jan 17 14:53:01 UTC 2025
On Sat, 21 Dec 2024 14:08:02 GMT, Matthias Ernst <duke at openjdk.org> wrote:
> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>
> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>
>
> (base + (index + 1) << 8) & 255
> => MulNode
> (base + (index << 8 + 256)) & 255
> => AddNode
> ((base + index << 8) + 256) & 255
>
>
> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>
>
> ((base + index << 8) + 256) & 255
> => MulNode (this PR)
> (base + index << 8) & 255
> => MulNode (PR #6697)
> base & 255 (loop invariant)
>
>
> Implementation notes:
> * I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
LGTM, thanks a lot.
src/hotspot/share/opto/mulnode.cpp line 2056:
> 2054:
> 2055: // Returns a lower bound of the number of trailing zeros in expr.
> 2056: jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
This could be a static function, I don't see much value in it being a method in `MulNode`.
src/hotspot/share/opto/mulnode.cpp line 2058:
> 2056: jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
> 2057: expr = expr->uncast();
> 2058: if (expr == nullptr) {
This should not be `nullptr`, you can safely remove it.
src/hotspot/share/opto/mulnode.cpp line 2068:
> 2066: if (type->is_con()) {
> 2067: long con = type->get_con_as_long(type->basic_type());
> 2068: return con == 0L ? 0 : count_trailing_zeros(con);
For the sake of consistency, we should return the type width for `con == 0`, you can obtain this by `type2aelementbytes(bt) * 8`
src/hotspot/share/opto/mulnode.cpp line 2073:
> 2071: if (expr->Opcode() == Op_ConvI2L) {
> 2072: expr = expr->in(1);
> 2073: if (expr == nullptr) {
This cannot be `nullptr`, you can safely remove it, the same for `expr->uncast()` below. In general, the only case when the input of a `ConvI2L` (and other nodes) not being an `int` is when it is `top`, which means it is empty. A.k.a unreachable code.
src/hotspot/share/opto/mulnode.cpp line 2080:
> 2078: return 0;
> 2079: }
> 2080: type = phase->type(expr)->isa_int();
You are trying to look through a `ConvI2L`, I think for the sake of consistency, you can reassign `bt` to `T_INT` at this point.
src/hotspot/share/opto/mulnode.cpp line 2089:
> 2087: }
> 2088: const TypeInt* rhs_t = phase->type(rhs)->isa_int();
> 2089: if (!rhs_t || !rhs_t->is_con()) {
We are trying to avoid implicit conversion to `bool`, you can use an explicit `rhs_t != nullptr` here.
src/hotspot/share/opto/mulnode.cpp line 2092:
> 2090: return 0;
> 2091: }
> 2092: return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1);
If you reassign `bt`, you can do `type2aelementbytes(bt)`, which IMO is clearer.
src/hotspot/share/opto/mulnode.cpp line 2103:
> 2101: // (AndL (ConvI2L (LShiftI _ #N)) #M) => #0
> 2102: // as well as for constant operands:
> 2103: // (AndI (ConI [+-] _ << #N) #M) => #0
`(AndI (ConI (_ << #N)) #M)`
I think writing like this is clearer, it is confusing talking about signs in bitwise operations. Also, please remove `=> #0` in these.
src/hotspot/share/opto/mulnode.cpp line 2114:
> 2112: // Because the optimization might work for a non-constant
> 2113: // mask M, we check for both operand orders.
> 2114: bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) {
Actually you cannot conclude that `((x + y) & m) == 0 iff (x & m) == 0` when `(y & m) == 0` because the addition `x + y` can carry some bit into the positions at which `m` is set. Consider this example for illustration:
(0b1010 + 0b0010) & 0b0100 == 0b1100 & 0b0100 == 0b0100 != 0
even when
0b1010 & 0b0100 == 0
0b0010 & 0b0100 == 0
The most trivial sufficient condition we are using here is that the lowest bit set of `y` is larger than the highest bit set of `m`. Because then adding `y` into `x` does not carry any bit into the result that is set in `m` but not set in `x`. This method can be a static function, too IMO.
test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java line 123:
> 121: @IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I })
> 122: public static int addShiftMaskInt(int i, int j) {
> 123: return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3;
I would prefer you adding other test cases instead of modifying existing ones.
-------------
Marked as reviewed by qamai (Committer).
PR Review: https://git.openjdk.org/jdk/pull/22856#pullrequestreview-2527419094
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895924146
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895924444
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895925367
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895926684
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895927672
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895928689
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895929425
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1900784977
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895937247
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1895938727
More information about the hotspot-compiler-dev
mailing list