RFR: 8259609: C2: optimize long range checks in long counted loops [v5]

Roland Westrelin roland at openjdk.java.net
Fri Jul 30 14:18:06 UTC 2021


> JDK-8255150 makes it possible for Java code to explicitly perform a
> range check on long values. JDK-8223051 provides a transformation of
> long counted loops into loop nests with an inner int counted
> loop. With this change I propose transforming long range checks that
> operate on the iv of a long counted loop into range checks that
> operate on the iv of the int inner loop once it has been
> created. Existing range check eliminations can then kick in.
> 
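
A source-level sketch of the idea, for readers who want something concrete.
This is not what C2 emits: the transformation happens on the compiler's IR,
and the class name, method names and the MAX_INNER cap below are made up for
illustration. The first method is the shape being targeted (a long counted
loop whose range check uses scale * i + offset); the second is a hand-written
loop nest in which the check is expressed on the int iv of the inner loop.
It assumes stop - start and outer + MAX_INNER do not overflow.

```java
import java.util.Objects;

final class LoopNestSketch {
    // Hypothetical cap on inner loop iterations, chosen only for this sketch.
    static final int MAX_INNER = 1 << 20;

    // Long counted loop with a long range check on scale * i + offset.
    static long original(long start, long stop, long scale, long offset, long length) {
        long sum = 0;
        for (long i = start; i < stop; i++) {
            sum += Objects.checkIndex(scale * i + offset, length);
        }
        return sum;
    }

    // Same computation written as a loop nest: a long outer loop and an
    // int counted inner loop whose iv (j) feeds the range check.
    static long nested(long start, long stop, long scale, long offset, long length) {
        long sum = 0;
        for (long outer = start; outer < stop; outer += MAX_INNER) {
            int innerLimit = (int) Math.min((long) MAX_INNER, stop - outer);
            long base = scale * outer + offset; // invariant in the inner loop
            for (int j = 0; j < innerLimit; j++) {
                sum += Objects.checkIndex(base + scale * j, length);
            }
        }
        return sum;
    }
}
```

Both methods visit the same indices in the same order, so they return the
same value or throw at the same iteration. Objects.checkIndex(long, long)
requires JDK 16 or later.
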
> Transformation of range checks is piggybacked on the loop nest
> creation for 2 reasons:
> 
> - pattern matching range checks is easier right before the loop nest
>   is created
> 
> - the number of iterations of the inner loop is adjusted so scale *
>   inner_iv doesn't overflow
> 
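
To illustrate the second reason: one way to bound the inner loop's trip count
so that scale * inner_iv stays within int range is to divide the largest int
by |scale|. This is an assumption about how such a bound could be chosen, not
a copy of the patch; the class and method names are hypothetical.

```java
final class InnerIterationBound {
    // Largest n such that scale * j fits in an int for every j in [0, n].
    // Assumes scale != 0. The division is done in long arithmetic so that
    // scale == Integer.MIN_VALUE is handled as well.
    static int maxInnerIterations(int scale) {
        if (scale == 0) {
            throw new IllegalArgumentException("scale must be non-zero");
        }
        return (int) (Integer.MAX_VALUE / Math.abs((long) scale));
    }
}
```

With scale = 4 the bound is 536870911: 4 * 536870911 still fits in an int,
while 4 * 536870912 does not.
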
> C2 has logic to delay some split-if transformations so they don't
> break the scale * iv + offset pattern. I reused that logic for long
> range checks and had to relax what's considered a range check because
> initially a range check from Objects.checkIndex() may include a test
> for range > 0 that needs a round of loop opts to be hoisted. I realize
> there's some code duplication but I didn't see a way to share logic
> between IdealLoopTree::may_have_range_check() and
> IdealLoopTree::policy_range_check() that would feel right.
> 
> I realize the comment in PhaseIdealLoop::transform_long_range_checks()
> is scary. FWIW, it's not as complicated as it looks. I found that drawing
> the range covered by the entire long loop and the range covered by the
> inner loop helps to see how range checks can be transformed. Then the
> comment helps make sure all cases are covered and verify the generated
> code actually covers all of them.
> 
> One issue is overflow. I think the fact that inner_iv * scale doesn't
> overflow helps simplify things. One possible overflow is that of scale
> * upper + offset which is handled by forcing all range checks in that
> case to deoptimize. I don't think other cases of overflow need special
> handling.
> 
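
The overflow condition mentioned above can be stated at the source level as
follows. This is only an illustration of the arithmetic, with a hypothetical
class and method name; C2 performs its check on IR nodes and the patch's
actual test may differ. The method reports whether evaluating
scale * upper + offset in long arithmetic overflows at either step.

```java
final class RangeCheckOverflowSketch {
    // True if the multiplication or the following addition overflows a long.
    static boolean overflows(long scale, long upper, long offset) {
        try {
            Math.addExact(Math.multiplyExact(scale, upper), offset);
            return false;
        } catch (ArithmeticException e) {
            // multiplyExact and addExact throw ArithmeticException on overflow
            return true;
        }
    }
}
```

For example, overflows(2, Long.MAX_VALUE, 0) is true, while
overflows(2, 10, 3) is false.
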
> This was tested with a Memory Segment micro benchmark (and patched
> Memory Segment support to take advantage of the new checkIndex
> intrinsic, both provided by Maurizio). Range checks in the micro
> benchmark are properly optimized (and performance increases
> significantly).

Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:

 - rework
 - Merge branch 'master' into JDK-8259609
 - John's review 1
 - merge with master
 - Tobias' comments
 - Merge branch 'master' into JDK-8259609
 - min_jint overflow fix
 - Revert "assert(static_cast<T1>(result) == thing) fix"
   
   This reverts commit e234477df097475d503ea6f94ab6a258132d165e.
 - Merge branch 'master' into JDK-8259609
 - assert(static_cast<T1>(result) == thing) fix
 - ... and 3 more: https://git.openjdk.java.net/jdk/compare/fb859600...a4c81ceb

-------------

Changes: https://git.openjdk.java.net/jdk/pull/2045/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2045&range=04
  Stats: 867 lines in 12 files changed: 703 ins; 69 del; 95 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2045.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2045/head:pull/2045

PR: https://git.openjdk.java.net/jdk/pull/2045

