RFR(XL): 8224675: Late GC barrier insertion for ZGC

Nils Eliasson nils.eliasson at oracle.com
Fri Jun 7 12:45:58 UTC 2019


Hi Erik,

I uploaded webrev with the friend declaration removed.

I also changed back the shape of how the barrier_insertion is called 
from Compile.cpp. Then that will call PhaseIdealLoop::optimize and use 
it in the same way as Shenandoah does GC specific loop opts.

I also added a clearing of major progress after the barrier insertion. 
Otherwise some LoopLimit nodes won't be removed correctly in the 
following macro expansion phase.

Thank you!

Regards,

/ Nils

On 2019-06-06 21:06, Erik Österlund wrote:
> Hi Nils,
>
> One final nit... could you please remove the now seemingly unnecessary 
> "friend class ZBarrierSetC2" in PhaseIdealLoop? I don't need another 
> webrev for that.
>
> This looks good. Great job Nils. This will improve the robustness of 
> our barriers.
>
> /Erik
>
> On 2019-06-06 20:03, Nils Eliasson wrote:
>>
>> On 2019-06-05 10:20, Roland Westrelin wrote:
>>>> Ah - I think I get it. You mean like this:
>>>>
>>>> void ZBarrierSetC2::barrier_insertion_phase(PhaseIterGVN& igvn)const {
>>>>     PhaseIdealLoop ideal_loop(igvn,LoopOptsNone);
>>>>
>>>>     // First make sure all loads between call and catch are moved 
>>>> to the
>>>> catch block clean_catch_blocks(&ideal_loop);
>>>>
>>>>     // Then expand barriers on all loads 
>>>> insert_load_barriers(&ideal_loop);
>>>>
>>>>     // Handle all Unsafe that need barriers. 
>>>> insert_barriers_on_unsafe(&ideal_loop);
>>>>
>>>>     // Cleanup any modified bits igvn.optimize();
>>>>
>>>>     igvn.C->clear_major_progress();
>>>> }
>>>>
>>>> An excellent idea. Then I can remove the new 
>>>> LoopOptsMode::BarrierInsertion.
>>> I was thinking something like what we do in Shenandoah for barrier
>>> expansion:
>>>
>>> PhaseIdealLoop ideal_loop(igvn, LoopOptsShenandoahExpand);
>>>
>>> ShenandoahBarrierSetC2::is_gc_specific_loop_opts_pass() returns true 
>>> for
>>> LoopOptsShenandoahExpand and ShenandoahBarrierSetC2::optimize_loops()
>>> handles LoopOptsShenandoahExpand. So there's nothing shenandoah 
>>> specific
>>> in PhaseIdealLoop::build_and_optimize().
>>
>> I followed your example and changed my implementation inline with that.
>>
>>
>>>
>>>> I've been running some experiments with asserts on the clone code.
>>>>
>>>> 1) There can never be any control flow here - so now phis or such.
>>>>
>>>> 2) Stores have explicit control - and would never be scheduled here 
>>>> either.
>>>>
>>>> 3) Loads - they end up here because they can float. They only 
>>>> matter if
>>>> there is a use dominated by the catch (after a merge of catch control
>>>> flow), or uses in more than one catch-proj branch. The only nodes
>>>> observed being cloned is LoadPNodes with barriers, BoolNodes, and CmpP
>>>> nodes. It's the same pattern of comparing a pointer. All other load 
>>>> has
>>>> it's control in the catch-projs.
>>>>
>>>> I will add asserts to the clone in fixup_uses_in_catch to reflect this
>>>> conclusion and make sure that I catch any change in behavior.
>>> I was thinking of something like:
>>>
>>> try {
>>>    non_inlined_call1();
>>>    int v = some_object.object_field.int_field;
>>>    non_inlined_call2(v, v);
>>> } catch (..) {
>>>    int v = some_object.object_field.int_field;
>>>    // some use for v
>>> }
>>>
>>> So there would be a LoadP, a load barrier and a LoadI right after the
>>> call. The LoadI is the first to be cloned. It has 3 uses, so it's 
>>> cloned
>>> 3 times? Which would mean non_inlined_call2 is actually called with:
>>>
>>>    SomeObject object = some_object.object_field;
>>>    non_inlined_call2(object.int_field, object.int_field);
>>>
>>> the field is reloaded and that code doesn't have the same effect as
>>> above. Or am I missing something?
>>
>> The object_field usually gets a null-check which creates a control 
>> flow that anchors the LoadI further down. I managed to find one case 
>> in specjbb with a static call followed by a LoadL.
>>
>> I fixed this by calling the call_catch_cleanup recursively on these 
>> load, when encountered. They will then be handled in the same way as 
>> the LoadPs with a barrier.
>>
>> I also found an issue that I could get duplicate phis in blocks when 
>> connecting the loads together. I fixed this by keeping track on which 
>> regions has gotten a phi, and caching them.
>>
>> I've also added a number of small fixes after feedback from Erik Ö.
>>
>> New webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.06/
>>
>> Regards,
>>
>> Nils
>>
>>
>>>
>>> Roland.


More information about the hotspot-compiler-dev mailing list