RFR: 8276116: C2: optimize long range checks in int counted loops
Roland Westrelin
roland at openjdk.java.net
Fri Nov 26 14:42:24 UTC 2021
Maurizio noticed that some of his Panama microbenchmarks don't
perform better with 8259609 (C2: optimize long range checks in long
counted loops). The reason is that 8259609 optimizes long range checks
in long counted loops but some of his benchmarks include long range
checks in int counted loops:
for (int i = start; i < stop; i += inc) {
    Objects.checkIndex(scale * ((long)i) + offset, length);
}
This change applies the transformation from 8259609 for long counted
loop/long range checks to int counted loop/long range checks. That
includes creating a loop nest and transforming the long range check to
an int range check that's subject to range elimination in the inner
loop.
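
To make the idea concrete, here is a source-level sketch of what a loop nest with an int range check in the inner loop could look like. It is not the code C2 generates and not the logic in the patch: the class and method names are made up for illustration, and it assumes scale > 0, inc > 0, length >= 0, that scale * inc fits in an int, and that scale * i + offset does not overflow a long.

```java
import java.util.Objects;

class LoopNestSketch {
    // Original shape: a long range check inside an int counted loop.
    static long flat(int start, int stop, int inc, int scale, long offset, long length) {
        long sum = 0;
        for (int i = start; i < stop; i += inc) {
            sum += Objects.checkIndex(scale * ((long) i) + offset, length);
        }
        return sum;
    }

    // Hand-written loop nest (illustrative only, see the assumptions above).
    static long nested(int start, int stop, int inc, int scale, long offset, long length) {
        int step = Math.multiplyExact(scale, inc);
        // For 0 <= j < maxIters, step * j stays strictly below Integer.MAX_VALUE.
        int maxIters = Integer.MAX_VALUE / step;
        long sum = 0;
        for (long outer = start; outer < stop; outer += (long) maxIters * inc) {
            long left = (stop - outer + inc - 1) / inc;   // iterations still to run
            int n = (int) Math.min(left, maxIters);       // iterations in this chunk
            long base = scale * outer + offset;           // index at j == 0
            long remaining = length - base;
            // Clamp the long bound to an int bound; 0 forces a failure at j == 0.
            int intLength = (base < 0 || remaining <= 0)
                    ? 0
                    : (int) Math.min(remaining, Integer.MAX_VALUE);
            for (int j = 0; j < n; j++) {
                // Int range check: all operands are ints and step * j cannot overflow.
                sum += base + Objects.checkIndex(step * j, intLength);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long a = flat(0, 10000, 1, 1 << 20, 7L, 1L << 40);
        long b = nested(0, 10000, 1, 1 << 20, 7L, 1L << 40);
        System.out.println(a == b);
    }
}
```

Under those assumptions the inner check fails on exactly the same iteration as the original long check, because the per-chunk bound is recomputed in long arithmetic by the outer loop.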
The reason it's required to create a loop nest is that the long range
check transformation logic depends on no overflow of scale * i for the
range of values that the transformed range check is applied to.
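
As a small illustration of the no-overflow condition, the sketch below checks whether scale * i computed in int arithmetic matches the exact product, and derives an iteration bound under which it always does. Both helpers are hypothetical and only meant to show why bounding the number of inner loop iterations is enough; the second one assumes scale is neither 0 nor Integer.MIN_VALUE.

```java
class ScaleOverflowSketch {
    // True if scale * i computed as an int equals the exact (long) product.
    static boolean fitsInInt(int scale, int i) {
        long exact = (long) scale * i;
        return exact == (int) exact;
    }

    // Hypothetical helper: an iteration count n such that scale * j cannot
    // overflow for any 0 <= j < n (conservative, not necessarily the largest).
    static int maxInnerIterations(int scale) {
        return Integer.MAX_VALUE / Math.abs(scale);
    }

    public static void main(String[] args) {
        int scale = 1 << 20;
        System.out.println(maxInnerIterations(scale));
        System.out.println(fitsInInt(scale, 2046));
        System.out.println(fitsInInt(scale, 2048));
    }
}
```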
As a consequence, this change is mostly refactoring to make the loop
nest creation and range check transformation parameterized by the type
of the transformed loop.
I think this transformation needs to be applied as late as possible
but, in the case of an int counted loop, before pre/main/post loops
are created. I had to move it to IdealLoopTree::iteration_split_impl()
because of that.
There's an alternate shape for a long range check in an int counted
loop that Maurizio insisted needs to be supported:
for (int i = start; i < stop; i += inc) {
    Objects.checkIndex(((long)(scale * i)) + offset, length);
}
scale * i can overflow in that case. This is also supported but as a
corner case of the previous one. The code in
PhaseIdealLoop::transform_long_range_checks() has a comment about
that.
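
To show how the two shapes differ, the sketch below computes the index both ways. The class and method names are made up for illustration. The two agree as long as scale * i fits in an int and diverge once the int multiplication wraps around.

```java
class RangeCheckShapes {
    // First shape: i is widened to long before the multiplication.
    static long widenThenScale(int scale, int i, long offset) {
        return scale * ((long) i) + offset;
    }

    // Alternate shape: the multiplication happens in int arithmetic
    // (and can wrap around) before the result is widened.
    static long scaleThenWiden(int scale, int i, long offset) {
        return ((long) (scale * i)) + offset;
    }

    public static void main(String[] args) {
        int scale = 1 << 20;
        System.out.println(widenThenScale(scale, 100, 0L) == scaleThenWiden(scale, 100, 0L)); // true
        System.out.println(widenThenScale(scale, 2048, 0L));  // 2147483648
        System.out.println(scaleThenWiden(scale, 2048, 0L));  // -2147483648
    }
}
```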
Note also that this transformation works best if loop strip mining is
enabled (the default with G1, ZGC and Shenandoah). The reason is
that it needs a safepoint and, when loop strip mining is enabled, the
outer loop contains one that's always available. A way to have this
work as well for all GCs would be to always construct the loop strip
mining loop nest (whether loop strip mining is enabled or not) and
then only once loop opts are over remove the outer loop when loop
strip mining is disabled. I'm looking for feedback on this.
BTW, something doesn't seem right in IdealLoopTree::iteration_split_impl():
https://github.com/rwestrel/jdk/blob/master/src/hotspot/share/opto/loopTransform.cpp#L3475
should_peel causes transformations to be skipped but peeling is never
applied AFAICT. Does it make sense to anyone?
-------------
Commit messages:
- whitespaces
- test & fix
Changes: https://git.openjdk.java.net/jdk/pull/6576/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6576&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8276116
Stats: 609 lines in 7 files changed: 510 ins; 18 del; 81 mod
Patch: https://git.openjdk.java.net/jdk/pull/6576.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6576/head:pull/6576
PR: https://git.openjdk.java.net/jdk/pull/6576