RFR: 8346664: C2: Optimize mask check with constant offset
Emanuel Peter
epeter at openjdk.org
Fri Jan 17 14:53:02 UTC 2025
On Sat, 21 Dec 2024 14:08:02 GMT, Matthias Ernst <duke at openjdk.org> wrote:
> Fixes [JDK-8346664](https://bugs.openjdk.org/browse/JDK-8346664): extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
>
> Such constant values arise in an expression of the following form, for example from `MemorySegmentImpl#isAlignedForElement`:
>
>
> (base + (index + 1) << 8) & 255
> => MulNode
> (base + (index << 8 + 256)) & 255
> => AddNode
> ((base + index << 8) + 256) & 255
>
>
> Currently, `256` is not being recognized as a shifted value. This PR enables further reduction:
>
>
> ((base + index << 8) + 256) & 255
> => MulNode (this PR)
> (base + index << 8) & 255
> => MulNode (PR #6697)
> base & 255 (loop invariant)
>
>
> Implementation notes:
> * I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
> * ~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~
> * ~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~
src/hotspot/share/opto/mulnode.cpp line 674:
> 672:
> 673: // Is expr a neutral element wrt addition under mask?
> 674: static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt);
I would prefer a more expressive function name over a comment here. The comment is a little confusing to me too.
The old name at least talked about shift and mask - is that not relevant any more?
Or you just decide to name it `AndIL_is_zero`, and drop out the comment. Because who knows someone might add other things that check for zero in that method, and then your comment would be out-dated (but probably people would forget to adjust it).
src/hotspot/share/opto/mulnode.cpp line 677:
> 675:
> 676: const Type* AndINode::Value(PhaseGVN* phase) const {
> 677: // patterns similar to (v << 2) & 3
Should this comment be updated now for a more general pattern?
src/hotspot/share/opto/mulnode.cpp line 2063:
> 2061:
> 2062: // Returns a lower bound on the number of trailing zeros in expr.
> 2063: static jint AndIL_min_trailing_zeros(const PhaseGVN* phase, const Node* expr, BasicType bt) {
Is this method restricted to use in AndIL? Because it looks like it is doing something more generic: trying to figure out a lower bound on the trailing zeros of an expression.
If that is the case: Why not put it in `Node::get_trailing_zeros_lower_bound(phase, bt)`, so it can be used elsewhere too?
src/hotspot/share/opto/mulnode.cpp line 2113:
> 2111:
> 2112: jint zeros = AndIL_min_trailing_zeros(phase, expr, bt);
> 2113: return zeros > 0 && ((((jlong)1) << zeros) > mask_t->hi_as_long() && mask_t->lo_as_long() >= 0);
This line indicates that the `mask` could be a variable. You should make sure to add some tests for that in your patterns. You can create a variable in a specific range like this `Math.min(5, Math.max(1, x))`, should get you `x` clamped into the region `1..5`.
test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java line 221:
> 219: }
> 220: }
> 221:
Nice to see that you have some examples here!
I think it would be great to have some more though. The divil hides in the details. In the edge cases usually.
You currently have patterns like this:
`(j + ((i + c1) << c2)) & c3;`
What if you generate the constants `c1, c2, c3` randomly:
`public static final int C1 = random.nextInt()` (or some other random distribution that makes more sense).
Then the compiler will see them as constants (because final), and attempt constant folding.
You can then do result verification: You create a method copy that you restrict to the interpreter, and the other copy can be compiled. Then you test the method with all sorts of random inputs for `i, j`, and verify the results of the two methods (compiled vs interpreted).
Maybe you can add some more patterns as well, just to have a better test coverage.
Does that make sense?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1906619159
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1906620853
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1906634027
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1906615575
PR Review Comment: https://git.openjdk.org/jdk/pull/22856#discussion_r1906599399
More information about the hotspot-compiler-dev
mailing list