RFR: 8259609: C2: optimize long range checks in long counted loops [v3]

Tobias Hartmann thartmann at openjdk.java.net
Mon Jan 25 10:09:44 UTC 2021


On Wed, 20 Jan 2021 07:44:12 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> JDK-8255150 makes it possible for java code to explicitly perform a
>> range check on long values. JDK-8223051 provides a transformation of
>> long counted loops into loop nests with an inner int counted
>> loop. With this change I propose transforming long range checks that
>> operate on the iv of a long counted loop into range checks that
>> operate on the iv of the int inner loop once it has been
>> created. Existing range check eliminations can then kick in.
>> 
>> Transformation of range checks is piggy backed on the loop nest
>> creation for 2 reasons:
>> 
>> - pattern matching range checks is easier right before the loop nest
>>   is created
>> 
>> - the number of iterations of the inner loop is adjusted so scale *
>>   inner_iv doesn't overflow
>> 
>> C2 has logic to delay some split if transformations so they don't
>> break the scale * iv + offset pattern. I reused that logic for long
>> range checks and had to relax what's considered a range check because
>> initially a range check from Object.checkIndex() may include a test
>> for range > 0 that needs a round of loop opts to be hoisted. I realize
>> there's some code duplication but I didn't see a way to share logic
>> between IdealLoopTree::may_have_range_check()
>> IdealLoopTree::policy_range_check() that would feel right.
>> 
>> I realize the comment in PhaseIdealLoop::transform_long_range_checks()
>> is scary. FWIW, it's not as complicated as it looks. I found drawing
>> the range covered by the entire long loop and the range covered by the
>> inner loop help see how range checks can be transformed. Then the
>> comment helps make sure all cases are covered and verify the generated
>> code actually covers all of them.
>> 
>> One issue is overflow. I think the fact that inner_iv * scale doesn't
>> overflow helps simplify thing. One possible overflow is that of scale
>> * upper + offset which is handled by forcing all range checks in that
>> case to deoptimize. I don't think other case of overflow needs special
>> handling.
>> 
>> This was tested with a Memory Segment micro benchmark (and patched
>> Memory Segment support to take advantage of the new checkIndex
>> intrinsic, both provided by Maurizio). Range checks in the micro
>> benchmark are properly optimized (and performance increases
>> significantly).
>
> Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - min_jint overflow fix
>  - Revert "assert(static_cast<T1>(result) == thing) fix"
>    
>    This reverts commit e234477df097475d503ea6f94ab6a258132d165e.

I did a first pass over the changes and added some comments but I need more time to review.

src/hotspot/share/opto/loopPredicate.cpp line 611:

> 609: bool IdealLoopTree::is_range_check_if(IfNode *iff, PhaseIdealLoop *phase, BasicType bt, Node *iv, Node *&range,
> 610:                                       Node *&offset,
> 611:                                       jlong &scale) const {

Newline should be removed.

src/hotspot/share/opto/loopPredicate.cpp line 665:

> 663: }
> 664:   return false;
> 665: }

Code is hard to read because the indentation of above lines is broken. Some `*` could also be moved to the type.

src/hotspot/share/opto/loopTransform.cpp line 2584:

> 2582:       if (p_offset != NULL) {
> 2583:         if (*p_scale == min_signed_integer(bt)) {
> 2584:           return false;

I find it suspicious that this edge case needs to be handled here. Could you explain why and add a corresponding comment?

src/hotspot/share/opto/loopnode.cpp line 4192:

> 4190:       // on just their loop-phi's for this pass of loop opts
> 4191:       if (SplitIfBlocks && do_split_ifs) {
> 4192:         if (lpt->may_have_range_check(this)) {

Can be merged with above if.

src/hotspot/share/opto/loopnode.cpp line 1216:

> 1214:         if (loop->is_range_check_if(rc, this, T_LONG, phi, range, offset, scale) &&
> 1215:             loop->is_invariant(range) && loop->is_invariant(offset)) {
> 1216:           if (iters_limit_as_long / ABS(scale * stride_con) > min_iters/* && UseNewCode*/) {

`UseNewCode` should be removed and if should be merged with if above.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2045


More information about the hotspot-compiler-dev mailing list