RFR(XL): 8224675: Late GC barrier insertion for ZGC
Roland Westrelin
rwestrel at redhat.com
Wed Jun 5 08:20:23 UTC 2019
> Ah - I think I get it. You mean like this:
>
> void ZBarrierSetC2::barrier_insertion_phase(PhaseIterGVN& igvn) const {
>   PhaseIdealLoop ideal_loop(igvn, LoopOptsNone);
>
>   // First make sure all loads between call and catch are moved to the
>   // catch block
>   clean_catch_blocks(&ideal_loop);
>
>   // Then expand barriers on all loads
>   insert_load_barriers(&ideal_loop);
>
>   // Handle all Unsafe that need barriers.
>   insert_barriers_on_unsafe(&ideal_loop);
>
>   // Cleanup any modified bits
>   igvn.optimize();
>
>   igvn.C->clear_major_progress();
> }
>
> An excellent idea. Then I can remove the new LoopOptsMode::BarrierInsertion.

I was thinking of something like what we do in Shenandoah for barrier
expansion:

  PhaseIdealLoop ideal_loop(igvn, LoopOptsShenandoahExpand);

ShenandoahBarrierSetC2::is_gc_specific_loop_opts_pass() returns true for
LoopOptsShenandoahExpand and ShenandoahBarrierSetC2::optimize_loops()
handles LoopOptsShenandoahExpand. So there's nothing Shenandoah-specific
in PhaseIdealLoop::build_and_optimize().

> I've been running some experiments with asserts on the clone code.
>
> 1) There can never be any control flow here - so no phis or such.
>
> 2) Stores have explicit control - and would never be scheduled here either.
>
> 3) Loads - they end up here because they can float. They only matter if
> there is a use dominated by the catch (after a merge of catch control
> flow), or uses in more than one catch-proj branch. The only nodes
> observed being cloned are LoadPNodes with barriers, BoolNodes, and CmpP
> nodes. It's the same pattern of comparing a pointer. All other loads have
> their control in the catch-projs.
>
> I will add asserts to the clone in fixup_uses_in_catch to reflect this
> conclusion and make sure that I catch any change in behavior.

I was thinking of something like:

  try {
    non_inlined_call1();
    int v = some_object.object_field.int_field;
    non_inlined_call2(v, v);
  } catch (..) {
    int v = some_object.object_field.int_field;
    // some use for v
  }

So there would be a LoadP, a load barrier and a LoadI right after the
call. The LoadI is the first to be cloned. It has 3 uses, so it's cloned
3 times? Which would mean non_inlined_call2 is actually called with:

  SomeObject object = some_object.object_field;
  non_inlined_call2(object.int_field, object.int_field);

The field is then reloaded for each argument, so that code doesn't have
the same effect as the code above. Or am I missing something?
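
To make the concern concrete, here is a minimal sketch in plain Java of why loading once and loading twice are not equivalent. It says nothing about what C2 actually does when it clones the LoadI; that is the open question. Holder, load(), singleLoad() and clonedLoad() are hypothetical names made up for the illustration, and the concurrent update of the field is simulated deterministically inside load().

```java
import java.util.List;

public class ReloadDemo {
    // Hypothetical stand-in for "object" with an int_field.
    static final class Holder {
        private int intField;

        Holder(int initial) {
            intField = initial;
        }

        // Models one LoadI of int_field. The increment after the read
        // simulates another thread writing the field between two loads
        // (an assumption made only to keep the demo deterministic).
        int load() {
            int v = intField;
            intField++;
            return v;
        }
    }

    // Stand-in for non_inlined_call2: just reports what it was called with.
    static List<Integer> nonInlinedCall2(int a, int b) {
        return List.of(a, b);
    }

    // Source semantics: one load, the same value is passed twice.
    static List<Integer> singleLoad(Holder h) {
        int v = h.load();
        return nonInlinedCall2(v, v);
    }

    // What a load cloned per use would amount to: one load per argument.
    static List<Integer> clonedLoad(Holder h) {
        return nonInlinedCall2(h.load(), h.load());
    }

    public static void main(String[] args) {
        System.out.println(singleLoad(new Holder(7))); // prints [7, 7]
        System.out.println(clonedLoad(new Holder(7))); // prints [7, 8]
    }
}
```

With a single load the callee always sees two equal arguments. With one load per use it can see two different values whenever the field changes in between, which the original program can never observe.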
Roland.