RFR(XL): 8224675: Late GC barrier insertion for ZGC
Nils Eliasson
nils.eliasson at oracle.com
Fri Jun 7 12:45:58 UTC 2019
Hi Erik,
I uploaded webrev with the friend declaration removed.
I also changed back the shape of how the barrier_insertion is called
from Compile.cpp. Then that will call PhaseIdealLoop::optimize and use
it in the same way as Shenandoah does GC specific loop opts.
I also added a clearing of major progress after the barrier insertion.
Otherwise some LoopLimit nodes won't be removed correctly in the
following macro expansion phase.
Thank you!
Regards,
/ Nils
On 2019-06-06 21:06, Erik Österlund wrote:
> Hi Nils,
>
> One final nit... could you please remove the now seemingly unnecessary
> "friend class ZBarrierSetC2" in PhaseIdealLoop? I don't need another
> webrev for that.
>
> This looks good. Great job Nils. This will improve the robustness of
> our barriers.
>
> /Erik
>
> On 2019-06-06 20:03, Nils Eliasson wrote:
>>
>> On 2019-06-05 10:20, Roland Westrelin wrote:
>>>> Ah - I think I get it. You mean like this:
>>>>
>>>> void ZBarrierSetC2::barrier_insertion_phase(PhaseIterGVN& igvn)const {
>>>> PhaseIdealLoop ideal_loop(igvn,LoopOptsNone);
>>>>
>>>> // First make sure all loads between call and catch are moved
>>>> to the
>>>> catch block clean_catch_blocks(&ideal_loop);
>>>>
>>>> // Then expand barriers on all loads
>>>> insert_load_barriers(&ideal_loop);
>>>>
>>>> // Handle all Unsafe that need barriers.
>>>> insert_barriers_on_unsafe(&ideal_loop);
>>>>
>>>> // Cleanup any modified bits igvn.optimize();
>>>>
>>>> igvn.C->clear_major_progress();
>>>> }
>>>>
>>>> An excellent idea. Then I can remove the new
>>>> LoopOptsMode::BarrierInsertion.
>>> I was thinking something like what we do in Shenandoah for barrier
>>> expansion:
>>>
>>> PhaseIdealLoop ideal_loop(igvn, LoopOptsShenandoahExpand);
>>>
>>> ShenandoahBarrierSetC2::is_gc_specific_loop_opts_pass() returns true
>>> for
>>> LoopOptsShenandoahExpand and ShenandoahBarrierSetC2::optimize_loops()
>>> handles LoopOptsShenandoahExpand. So there's nothing shenandoah
>>> specific
>>> in PhaseIdealLoop::build_and_optimize().
>>
>> I followed your example and changed my implementation inline with that.
>>
>>
>>>
>>>> I've been running some experiments with asserts on the clone code.
>>>>
>>>> 1) There can never be any control flow here - so now phis or such.
>>>>
>>>> 2) Stores have explicit control - and would never be scheduled here
>>>> either.
>>>>
>>>> 3) Loads - they end up here because they can float. They only
>>>> matter if
>>>> there is a use dominated by the catch (after a merge of catch control
>>>> flow), or uses in more than one catch-proj branch. The only nodes
>>>> observed being cloned is LoadPNodes with barriers, BoolNodes, and CmpP
>>>> nodes. It's the same pattern of comparing a pointer. All other load
>>>> has
>>>> it's control in the catch-projs.
>>>>
>>>> I will add asserts to the clone in fixup_uses_in_catch to reflect this
>>>> conclusion and make sure that I catch any change in behavior.
>>> I was thinking of something like:
>>>
>>> try {
>>> non_inlined_call1();
>>> int v = some_object.object_field.int_field;
>>> non_inlined_call2(v, v);
>>> } catch (..) {
>>> int v = some_object.object_field.int_field;
>>> // some use for v
>>> }
>>>
>>> So there would be a LoadP, a load barrier and a LoadI right after the
>>> call. The LoadI is the first to be cloned. It has 3 uses, so it's
>>> cloned
>>> 3 times? Which would mean non_inlined_call2 is actually called with:
>>>
>>> SomeObject object = some_object.object_field;
>>> non_inlined_call2(object.int_field, object.int_field);
>>>
>>> the field is reloaded and that code doesn't have the same effect as
>>> above. Or am I missing something?
>>
>> The object_field usually gets a null-check which creates a control
>> flow that anchors the LoadI further down. I managed to find one case
>> in specjbb with a static call followed by a LoadL.
>>
>> I fixed this by calling the call_catch_cleanup recursively on these
>> load, when encountered. They will then be handled in the same way as
>> the LoadPs with a barrier.
>>
>> I also found an issue that I could get duplicate phis in blocks when
>> connecting the loads together. I fixed this by keeping track on which
>> regions has gotten a phi, and caching them.
>>
>> I've also added a number of small fixes after feedback from Erik Ö.
>>
>> New webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.06/
>>
>> Regards,
>>
>> Nils
>>
>>
>>>
>>> Roland.
More information about the hotspot-compiler-dev
mailing list