RFR: 8210498: nmethod entry barriers

Tue Oct 9 14:42:03 UTC 2018

On 10/9/18 12:24 AM, Erik Österlund wrote:
> Hi Vladimir,
> 
> Thank you for reviewing this.
> 
> On 2018-10-08 19:57, Vladimir Kozlov wrote:
>> First, all nmethods have verified entry. MachPrologNode is generated by C2 for all nmethods including OSR (we need 
>> stack bang in all cases). OSR does not have *Unverified* entry represented by MachUEPNode. C1 compiled OSR nmethods 
>> are different - you may need your changes for it but I am not sure - ask C1 experts.
>>
>> There could be complications if an OSR nmethods is not executed (after barrier check) due to state of stack on OSR 
>> entry but it is different issue.
> 
> I should have been more precise. OSR nmethods have a prologue for building frames, yes. But OSR entries do not have a 
> verified entry offset, in the sense that in e.g. c1_LIRAssembler.cpp, no CodeOffsets::Verified_Entry entry is set for 
> OSR entries (the lir_osr_entry case). It is set only for lir_std_entry. Similarly for C2, while they both have a 
> prologue, there is no CodeOffsets::Verified_Entry set for OSR nmethods. Read for example opto/compile.cpp:
> 
>      if (is_osr_compilation()) {
>        _code_offsets.set_value(CodeOffsets::Verified_Entry, 0);
>        _code_offsets.set_value(CodeOffsets::OSR_Entry, _first_block_size);
>      } else {
>        _code_offsets.set_value(CodeOffsets::Verified_Entry, _first_block_size);
>        _code_offsets.set_value(CodeOffsets::OSR_Entry, 0);
>      }
> 
> The different treatment of OSR methods for the entry barrier is analogous to the different treatment of OSR methods when 
> making them not entrant. OSR nmethods are made not entrant by detaching them from the Method* (so the runtime won't find 
> them), while non-OSR nmethods are made not entrant by patching a jump to the 
> SharedRuntime::get_handle_wrong_method_stub() into the verified entry. And that handler stub has particular ABI 
> requirements on the call site, that do not match the OSR call site.
> 
> Similarly for nmethod entry barriers, if the callee is an OSR nmethod, it is much easier to check in the runtime before 
> the call if said nmethod is toast or not, than it is to check when the nmethod has entered, and then figure out how to 
> deoptimize from inside of the nmethod.

Good. Note, I did not argued about changes for OSR nmethods, I was surprise by statement. Your explanation cleared it.

> 
>> I am curious why you are using next 2 lines everywhere?
>>
>> +  BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>> +  bs->nmethod_entry_barrier(this);
>>
>> instead of one call:
>>
>> BarrierSet::barrier_set()->nmethod_entry_barrier(this);
> 
> Because we never make straight calls to BarrierSet for inserting barriers. Instead, we use one out of the (currently) 4 
> helpers that have their own responsibility. The BarrierSetAssembler is responsible for inserting barriers into assembly 
> code.

So it is by GC Interface design? Okay.

> 
>> The same question about duplicated code in interpreterRuntime.cpp
> 
> Same answer. ;)
> 
>> Would be nice if you also update MachPrologNode::format() for case when barrier is used. It is used in PrintOptoAssembly.
> 
> Yes, that is a good idea. I will adjust the printing code.

Thanks,
Vladimir

> 
> Thanks,
> /Erik
> 
>> Thanks,
>> Vladimir
>>
>> On 10/8/18 1:19 AM, Erik Österlund wrote:
>>> Hi Per and Roman,
>>>
>>> Thank you for having a look at this.
>>> I went with Roman's suggestion and made this an explicit opt-out instead. But I let ModRef pass in NULL, as none of 
>>> the barrier sets in that category currently use nmethod entry barriers. When they start to do so, that can be easily 
>>> changed.
>>>
>>> Incremental:
>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.00_01/
>>>
>>> Full:
>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.01/
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2018-10-08 08:48, Roman Kennke wrote:
>>>> I agree. I wouldn't mind to keep it simple and pass NULL though. I'm not
>>>> a big fan of default arguments anyway.
>>>>
>>>> Roman
>>>>
>>>>> On 2018-10-05 17:07, Erik Österlund wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In order to implement concurrent class unloading for ZGC, a mechanism
>>>>>> will be required to arm nmethods in a safepoint such that the first
>>>>>> entry into an nmethod after the pause triggers a barrier before
>>>>>> continuing.
>>>>>>
>>>>>> The barrier will be used for:
>>>>>>     * Patching immediate oops, and keeping phantomly reachable oops
>>>>>> alive through the nmethod entry
>>>>>>     * Catching calls from stale IC caches (due to class unloading) into
>>>>>> nmethods that are being unloaded (containing broken oops and
>>>>>> metadata), and lazily clean the IC cache and re-resolve the call.
>>>>>>
>>>>>> This implementation is for x86_64 and is based on the prototyping work
>>>>>> of Rickard Bäckman. So big thanks to Rickard, who will co-author of
>>>>>> this patch.
>>>>>> The way it works is that there is a cmp #offset(r15_thread),
>>>>>> simm32_epoch); je #continuation; inserted into the verified entry of
>>>>>> nmethods that conditionally calls a stub trampolining into the VM.
>>>>>> When the simm32_epoch is not equal to some TLS-local field, then a
>>>>>> slow path is taken to run some barrier inside of the VM.
>>>>>> By changing this TLS-local field in the safepoint, all verified
>>>>>> entries will take the slow path into the VM when run, where the
>>>>>> barrier is eventually disarmed by patching the simm32_epoch value of
>>>>>> the instruction stream to the current globally correct value.
>>>>>>
>>>>>> In terms of abstractions, to utilize this mechanism, you need to
>>>>>> create your BarierSet with an BarrierSetNMethod helper object, just
>>>>>> like the BarrierSetAssembler, BarrierSetC1 and BarrierSetC2. The
>>>>>> BarrierSetNMethod base class handles the barrier itself. It has a
>>>>>> virtual function, nmethod_entry_barrier, which gets called by the slow
>>>>>> paths. So each user of this mechanism inherits from BarrierSetNMethod
>>>>>> and puts its barrier in there.
>>>>>>
>>>>>> The OSR nmethods do not have a verified entry. Therefore, the barrier
>>>>>> is run explicitly in the runtime before calling into OSR nmethods to
>>>>>> achieve the same effect of trapping new calls into nmethods.
>>>>>>
>>>>>> The nmethod entry barrier returns true/false depending on whether the
>>>>>> called nmethod may be entered or not. This allows lazily cleaning
>>>>>> nmethod IC caches during unloading, which invokes the same helper
>>>>>> method that is patched into the verified entry when made not entrant.
>>>>>>
>>>>>> Eventually, I would like to also use this mechanism to remove the
>>>>>> nmethod hotness counter maintenance we today perform in safepoint
>>>>>> cleanup instead. These nmethod entry barriers can provide better
>>>>>> heuristics for whether nmethods are used or not. But that would
>>>>>> require buy-in from maintainers of other platforms. So let's take one
>>>>>> step at a time.
>>>>>>
>>>>>> For the curious reader thinking that this might be an expensive
>>>>>> operation, it is not. This check in the nmethod verified entry is very
>>>>>> predictable and cheap. And because of use of inlining, the checks
>>>>>> become more infrequent. In my evaluation, it has gone completely under
>>>>>> the radar in benchmarking (running this with my current version of
>>>>>> concurrent class unloading in ZGC).
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.00/
>>>>> I reviewed this before Erik sent it out, so my comments have already
>>>>> been addressed. Just found one additional thing, the BarrierSet
>>>>> constructor now looks like this:
>>>>>
>>>>> 96   BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>> 97              BarrierSetC1* barrier_set_c1,
>>>>> 98              BarrierSetC2* barrier_set_c2,
>>>>> 99              const FakeRtti& fake_rtti,
>>>>> 100             BarrierSetNMethod* barrier_set_nmethod = NULL) :
>>>>> 101     _fake_rtti(fake_rtti),
>>>>> 102     _barrier_set_assembler(barrier_set_assembler),
>>>>> 103     _barrier_set_c1(barrier_set_c1),
>>>>> 104     _barrier_set_c2(barrier_set_c2),
>>>>> 105     _barrier_set_nmethod(barrier_set_nmethod) {}
>>>>>
>>>>> I find it odd that the fake_rtti argument comes in the middle of the
>>>>> barrier_set_* arguments. I can see that you don't want GC barrier sets
>>>>> to have to supply NULL if they don't have a nmethod barrier set, but
>>>>> maybe we can have two constructors instead, like:
>>>>>
>>>>> BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>>             BarrierSetC1* barrier_set_c1,
>>>>>             BarrierSetC2* barrier_set_c2,
>>>>>             const FakeRtti& fake_rtti) :
>>>>>      _fake_rtti(fake_rtti),
>>>>>      _barrier_set_assembler(barrier_set_assembler),
>>>>>      _barrier_set_c1(barrier_set_c1),
>>>>>      _barrier_set_c2(barrier_set_c2),
>>>>>      _barrier_set_nmethod(NULL) {}
>>>>>
>>>>> BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>>             BarrierSetC1* barrier_set_c1,
>>>>>             BarrierSetC2* barrier_set_c2,
>>>>>             BarrierSetNMethod* barrier_set_nmethod,
>>>>>             const FakeRtti& fake_rtti,) :
>>>>>      _fake_rtti(fake_rtti),
>>>>>      _barrier_set_assembler(barrier_set_assembler),
>>>>>      _barrier_set_c1(barrier_set_c1),
>>>>>      _barrier_set_c2(barrier_set_c2),
>>>>>      _barrier_set_nmethod(barrier_set_nmethod) {}
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Other than that, this looks good to me.
>>>>>
>>>>> cheers,
>>>>> Per
>>>>>
>>>>>> Bug ID:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210498
>>>>>>
>>>>>> Thanks,
>>>>>> /Erik
>>>
>