RFR(XL): 8224675: Late GC barrier insertion for ZGC

Roland Westrelin rwestrel at redhat.com
Wed Jun 5 08:20:23 UTC 2019


> Ah - I think I get it. You mean like this:
>
> void ZBarrierSetC2::barrier_insertion_phase(PhaseIterGVN& igvn) const {
>    PhaseIdealLoop ideal_loop(igvn, LoopOptsNone);
>
>    // First make sure all loads between call and catch are moved to the
>    // catch block
>    clean_catch_blocks(&ideal_loop);
>
>    // Then expand barriers on all loads
>    insert_load_barriers(&ideal_loop);
>
>    // Handle all Unsafe that need barriers.
>    insert_barriers_on_unsafe(&ideal_loop);
>
>    // Cleanup any modified bits
>    igvn.optimize();
>
>    igvn.C->clear_major_progress();
> }
>
> An excellent idea. Then I can remove the new LoopOptsMode::BarrierInsertion.

I was thinking something like what we do in Shenandoah for barrier
expansion:

PhaseIdealLoop ideal_loop(igvn, LoopOptsShenandoahExpand);

ShenandoahBarrierSetC2::is_gc_specific_loop_opts_pass() returns true for
LoopOptsShenandoahExpand and ShenandoahBarrierSetC2::optimize_loops()
handles LoopOptsShenandoahExpand. So there's nothing Shenandoah-specific
in PhaseIdealLoop::build_and_optimize().
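
To make the shape of that dispatch concrete, here is a minimal
self-contained sketch. It is not the HotSpot source: the *Model types,
the log vector and the simplified signatures are all hypothetical
stand-ins, and the real build_and_optimize() does far more and makes
other decisions based on the mode. The only point is that the generic
driver asks the barrier set and never names a particular GC.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins (assumptions); the real HotSpot types carry far more state.
enum LoopOptsMode { LoopOptsDefault, LoopOptsNone, LoopOptsShenandoahExpand };

struct PhaseIdealLoopModel;  // forward declaration

// Hypothetical model of the BarrierSetC2 hooks named above.
struct BarrierSetC2Model {
  virtual ~BarrierSetC2Model() = default;
  // By default no loop opts mode belongs to the GC.
  virtual bool is_gc_specific_loop_opts_pass(LoopOptsMode) const { return false; }
  // Returns true if the GC handled the pass itself.
  virtual bool optimize_loops(PhaseIdealLoopModel&, LoopOptsMode) const { return false; }
};

struct PhaseIdealLoopModel {
  std::vector<std::string> log;  // records which steps ran, for illustration

  // Generic driver: no GC name appears anywhere in this function.
  void build_and_optimize(const BarrierSetC2Model& bs, LoopOptsMode mode) {
    log.push_back("build loop tree");
    if (bs.is_gc_specific_loop_opts_pass(mode) && bs.optimize_loops(*this, mode)) {
      return;  // the GC did its own work; skip the generic loop opts
    }
    log.push_back("generic loop opts");
  }
};

// The GC-specific knowledge lives entirely in the barrier set subclass.
struct ShenandoahModel : BarrierSetC2Model {
  bool is_gc_specific_loop_opts_pass(LoopOptsMode mode) const override {
    return mode == LoopOptsShenandoahExpand;
  }
  bool optimize_loops(PhaseIdealLoopModel& phase, LoopOptsMode mode) const override {
    if (mode != LoopOptsShenandoahExpand) return false;
    phase.log.push_back("expand Shenandoah barriers");
    return true;
  }
};
```

With this model, running build_and_optimize() with
LoopOptsShenandoahExpand logs the barrier expansion step, while
LoopOptsDefault falls through to the generic loop opts.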

> I've been running some experiments with asserts on the clone code.
>
> 1) There can never be any control flow here - so no phis or such.
>
> 2) Stores have explicit control - and would never be scheduled here either.
>
> 3) Loads - they end up here because they can float. They only matter if 
> there is a use dominated by the catch (after a merge of catch control 
> flow), or uses in more than one catch-proj branch. The only nodes 
> observed being cloned are LoadPNodes with barriers, BoolNodes, and CmpP 
> nodes. It's the same pattern of comparing a pointer. All other loads 
> have their control in the catch-projs.
>
> I will add asserts to the clone in fixup_uses_in_catch to reflect this 
> conclusion and make sure that I catch any change in behavior.

I was thinking of something like:

try {
  non_inlined_call1();
  int v = some_object.object_field.int_field;
  non_inlined_call2(v, v);
} catch (..) {
  int v = some_object.object_field.int_field;
  // some use for v   
}

So there would be a LoadP, a load barrier and a LoadI right after the
call. The LoadI is the first to be cloned. It has 3 uses, so it's cloned
3 times? Which would mean non_inlined_call2 is actually called with:

  SomeObject object = some_object.object_field;
  non_inlined_call2(object.int_field, object.int_field);

The field is reloaded, so that code doesn't have the same effect as the
original above. Or am I missing something?
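
To spell out the concern, here is a small deterministic simulation. It
is only a sketch: Obj, racy_load and the two args_* helpers are
hypothetical names, and the "concurrent store" is faked by bumping the
field after every load rather than by a real second thread. Whether the
LoadI is actually cloned per use is exactly the open question; the
sketch only shows why it would matter if it were.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for some_object.object_field.
struct Obj { int int_field; };

// Models a memory read that races with a writer: every load is
// followed by a simulated store from another thread.
inline int racy_load(Obj& o) {
  int seen = o.int_field;
  o.int_field += 1;  // simulated concurrent store (assumption, not a real race)
  return seen;
}

// Source semantics: one load, value reused for both arguments.
inline std::pair<int, int> args_with_single_load(Obj& o) {
  int v = racy_load(o);
  return {v, v};
}

// If the load were cloned once per use: each argument reloads the field.
inline std::pair<int, int> args_with_cloned_loads(Obj& o) {
  int a = racy_load(o);
  int b = racy_load(o);
  return {a, b};
}
```

Starting from int_field == 10, the single-load version passes (10, 10)
while the cloned-load version passes (10, 11), which the original
source could never produce.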

Roland.

