RFR: 8342692: C2: long counted loop/long range checks: don't create loop-nest for short running loops [v8]

Roland Westrelin roland at openjdk.org
Mon Jan 13 12:44:25 UTC 2025


> To optimize a long counted loop and long range checks in a long or int
> counted loop, the loop is turned into a loop nest. When the loop has
> few iterations, the overhead of an outer loop whose backedge is
> never taken has a measurable cost. Furthermore, creating the loop
> nest usually causes one iteration of the loop to be peeled so
> predicates can be set up. If the loop is short running, then it's an
> extra iteration that's run with range checks (compared to an int
> counted loop with int range checks).
> 
> This change doesn't create a loop nest when:
> 
> 1- it can be determined statically at loop nest creation time that the
>    loop runs for a short enough number of iterations
>   
> 2- profiling reports that the loop runs for no more than ShortLoopIter
>    iterations (1000 by default).
>   
> For 2-, a guard is added which is implemented as yet another predicate.
> 
> While this change is in principle simple, I ran into a few
> implementation issues:
> 
> - while C2 has a way to compute the number of iterations of an int
>   counted loop, it doesn't have that for long counted loops. The
>   existing logic for int counted loops promotes values to long to
>   avoid overflows. I reworked it so it now works for both long and int
>   counted loops.
> 
> - I added a new deoptimization reason (Reason_short_running_loop) for
>   the new predicate. Given the number of iterations is narrowed down
>   by the predicate, the limit of the loop after transformation is a
>   cast node that's control dependent on the short running loop
>   predicate. Once the counted loop is transformed, range check
>   predicates are likely to be inserted, and they will depend on the
>   limit, so the short running loop predicate has to be the one
>   that's furthest away from the loop entry. It is also possible
>   that the limit before transformation depends on a predicate
>   (TestShortRunningLongCountedLoopPredicatesClone is an example). We
>   can then have new predicates, inserted after the transformation,
>   that depend on the casted limit, which itself depends on old
>   predicates added before the transformation. To solve this circular
>   dependency, parse and assert predicates are cloned between the old
>   predicates and the loop head. The cloned short running loop parse
>   predicate is the one that's used to insert the short running loop
>   predicate.
> 
> - In the case of a long counted loop, the loop is transformed into a
>   regular loop with a new limit and transformed range checks, which
>   is later turned into an int counted loop. The int ...
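
To illustrate the iteration-count point in the description above, here
is a minimal Java sketch of a trip count computed in a wider type so
that limit - init cannot overflow, followed by a comparison against the
ShortLoopIter default of 1000. This is not the C2 code: the class and
method names are hypothetical, BigInteger stands in for whatever
widening the compiler actually uses, and the sketch assumes the loop
variable itself never overflows.

```java
import java.math.BigInteger;

// Hypothetical illustration only; none of these names exist in HotSpot.
class TripCountSketch {
    // Mirrors the ShortLoopIter default (1000) quoted in the description.
    static final long SHORT_LOOP_ITER = 1000;

    // Trip count of `for (long i = init; i < limit; i += stride)` when
    // stride > 0, or `for (long i = init; i > limit; i += stride)` when
    // stride < 0. Assumes i never overflows while the loop runs.
    static BigInteger tripCount(long init, long limit, long stride) {
        if (stride == 0) {
            throw new IllegalArgumentException("stride must be non-zero");
        }
        // Distance to cover, computed in a wider type: with long operands
        // near Long.MIN_VALUE / Long.MAX_VALUE, limit - init would overflow.
        BigInteger span = stride > 0
            ? BigInteger.valueOf(limit).subtract(BigInteger.valueOf(init))
            : BigInteger.valueOf(init).subtract(BigInteger.valueOf(limit));
        if (span.signum() <= 0) {
            return BigInteger.ZERO; // loop body never runs
        }
        BigInteger step = BigInteger.valueOf(stride).abs();
        // ceil(span / step) for positive span and step
        return span.add(step).subtract(BigInteger.ONE).divide(step);
    }

    // "Short running" here means no more than SHORT_LOOP_ITER iterations.
    static boolean isShortRunning(long init, long limit, long stride) {
        return tripCount(init, limit, stride)
            .compareTo(BigInteger.valueOf(SHORT_LOOP_ITER)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(tripCount(0, 10, 3));          // i = 0, 3, 6, 9
        System.out.println(isShortRunning(0, 1000, 1));   // exactly 1000 iterations
        System.out.println(isShortRunning(0, 1001, 1));   // one iteration too many
        System.out.println(tripCount(Long.MIN_VALUE, Long.MAX_VALUE, 1));
    }
}
```

The last call is the case that motivates the wider type: the distance
between Long.MIN_VALUE and Long.MAX_VALUE does not fit in a long, so a
plain limit - init would wrap around.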

Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 29 commits:

 - refactor
 - Merge branch 'master' into JDK-8342692
 - Merge branch 'master' into JDK-8342692
 - Merge branch 'master' into JDK-8342692
 - Merge branch 'master' into JDK-8342692
 - review
 - reviews
 - Update src/hotspot/share/opto/loopTransform.cpp
   
   Co-authored-by: Emanuel Peter <emanuel.peter at oracle.com>
 - Merge branch 'master' into JDK-8342692
 - whitespaces
 - ... and 19 more: https://git.openjdk.org/jdk/compare/3b9732ed...0f137359

-------------

Changes: https://git.openjdk.org/jdk/pull/21630/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21630&range=07
  Stats: 1308 lines in 24 files changed: 1252 ins; 16 del; 40 mod
  Patch: https://git.openjdk.org/jdk/pull/21630.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21630/head:pull/21630

PR: https://git.openjdk.org/jdk/pull/21630

