fix and improvements to commoning of barriers during loop optimizations

Wed Aug 24 11:07:19 UTC 2016

Hi Roland,

this is great!

For the sake of readability I would replace:

+    n->set_req(0, NULL);

with

n->set_req(ShenandoahBarrierNode::Control)

or something similar.

For Shenandoah-specific snippets in shared code, we need to think about
how to eventually convince upstream people ;-)

Other than that, please commit!

Roman

Am Mittwoch, den 24.08.2016, 10:21 +0200 schrieb Roland Westrelin:
> http://cr.openjdk.java.net/~roland/shenandoah/loopopts-commoning-v2/w
> ebrev.00/
> 
> In a previous change, I moved the optimization that replaces one
> barrier
> with a write barrier if they share the same memory from gvn to loop
> optimizations because the initial optimization would not keep the
> memory
> graph consistent and it was easier to fix the memory graph during
> loop
> optimizations:
> 
> http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/93aa873d631e
> 
> It turns out that that change wasn't sufficient and the memory graph
> can
> still become inconsistent (which could lead to a load barrier
> bypassing
> a write barrier in some rare cases):
> 
> - Because the code in PhaseIdealLoop::get_late_ctrl() that finds anti
>   dependences is conservative, a write barrier that has load uses can
> be
>   assigned a control where all paths from the control don't use the
>   barrier's memory projection. To fix that, I had to set the control
> of
>   the write barrier at parse time and I clear it at the end of
>   optimizations so scheduling can freely move the barrier around.
> 
> - When moving a write barrier up in the control graph, some region
> that
>   the barrier now dominates may not have a memory phi so the code
> can't
>   add a use for the barrier's memory projection on those paths. I
> fixed
>   it by adding logic to add phis to those regions. I also added code
> to
>   verify that a write barrier can indeed be moved to a new location
>   without breaking the memory graph.
> 
> I also extended that optimization pass so it finds more candidate
> pair
> of barriers to common. Previously the barriers had to share a memory
> edge so with this for instance:
> 
> if (flag) {
>     a.f = v;
> } else {
>     notinlined();
>     a.f = 0x42;
> }
> 
> 2 write barriers would have been kept. With the new code, a write
> barrier can be moved up if it makes another barrier be optimized out
> as
> long as the write barrier doesn't move in a loop.
> 
> I also added an option, ShenandoahDontIncreaseWBFreq that optimizes
> pair
> of barriers only if they don't cause write barriers to move to a more
> frequent code path. So this:
> 
> if (flag) {
>     a.f = v;
> } else {
>     a.f = 0x42;
> }
> 
> would result in a single barrier but not this:
> 
> if (flag1) {
>     a.f = v;
> } else {
>     if (flag2) {
>         a.f = 0x42;
>     }
> }
> 
> I don't know whether that option is useful or not but I suppose it's
> nice to have to experiment. It's now on by default mostly because
> this
> new code I'm proposing is tricky to get right and having it on
> restrict
> how much commoning is happening. I would like to do some more testing
> before I feel comfortable having it off.
> 
> Roland.
>