RFR(S): 8231291: C2: loop opts before EA should maximally unroll loops
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Dec 18 10:02:01 UTC 2019
Hi Roland,
> http://cr.openjdk.java.net/~roland/8231291/webrev.01/
As I understand, the intention of the change you propose is to perform
complete loop unrolling earlier so EA can benefit from it.
Some comments:
src/hotspot/share/opto/loopnode.cpp:
+ if (mode == LoopOptsMaxUnroll) {
+ for (LoopTreeIterator iter(_ltree_root); !iter.done(); iter.next()) {
+ IdealLoopTree* lpt = iter.current();
+ if (lpt->is_innermost() && lpt->_allow_optimizations &&
!lpt->_has_call && lpt->is_counted()) {
+ lpt->compute_trip_count(this);
+ if (!lpt->do_one_iteration_loop(this) &&
+ !lpt->do_remove_empty_loop(this)) {
+ AutoNodeBudget node_budget(this);
+ if (lpt->policy_maximally_unroll(this)) {
+ memset( worklist.adr(), 0, worklist.Size()*sizeof(Node*) );
+ do_maximally_unroll(lpt, worklist);
+ }
+ }
+ }
+ }
It looks like LoopOptsMaxUnroll is a shortened version of
IdealLoopTree::iteration_split/iteration_split_impl().
Have you considered factoring out the common code? Right now, its hard
to correlate the checks for LoopOptsMaxUnroll with iteration_split() and
there's a risk they'll diverge eventually.
Do you need the following steps from the original version?
================================================
// Look for loop-exit tests with my 50/50 guesses from the Parsing stage.
// Replace with a 1-in-10 exit guess.
if (!is_root() && is_loop()) {
adjust_loop_exit_prob(phase);
}
// Compute loop trip count from profile data
compute_profile_trip_cnt(phase);
No use of profiling data since full unrolling is happening anyway?
========================
if (!cl->is_valid_counted_loop()) return true; // Ignore various
kinds of broken loops
========================
// Do nothing special to pre- and post- loops
if (cl->is_pre_loop() || cl->is_post_loop()) return true;
I assume there are no pre-/post-loops exist at that point, so these
checks are redundant. Turn them into asserts?
========================
if (cl->is_normal_loop()) {
if (policy_unswitching(phase)) {
phase->do_unswitching(this, old_new);
return true;
}
if (policy_maximally_unroll(phase)) {
// Here we did some unrolling and peeling. Eventually we will
// completely unroll this loop and it will no longer be a loop.
phase->do_maximally_unroll(this, old_new);
return true;
}
You don't perform loop unswitching at all. So, the order of operations
changes. Do you see any problems with that?
========================
src/hotspot/share/opto/compile.cpp
// Perform escape analysis
if (_do_escape_analysis && ConnectionGraph::has_candidates(this)) {
if (has_loops()) {
// Cleanup graph (remove dead nodes).
TracePhase tp("idealLoop", &timers[_t_idealLoop]);
- PhaseIdealLoop::optimize(igvn, LoopOptsNone);
+ PhaseIdealLoop::optimize(igvn, LoopOptsMaxUnroll);
if (major_progress()) print_method(PHASE_PHASEIDEAL_BEFORE_EA, 2);
if (failing()) return;
}
ConnectionGraph::do_analysis(this, &igvn);
Does it make sense to do more elaborate checks before performing early
full loop unrolling? Like whether candidates are used inside loop bodies?
Best regards,
Vladimir Ivanov
>
> As discussed before:
>
> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-September/035094.html
> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-November/036171.html
>
> Fully unrolling loops early helps EA. The change to cfgnode.cpp is
> required because full unroll sometimes needs peeling which may add a phi
> between a memory access and its AddP, a pattern that EA doesn't
> recognize.
>
> Roland.
>
More information about the hotspot-compiler-dev
mailing list