RFR: 8210498: nmethod entry barriers

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Oct 9 17:19:09 UTC 2018


Looks good.

Thanks,
Vladimir

On 10/9/18 8:52 AM, Erik Österlund wrote:
> Hi Vladimir,
> 
> Sounds like we are in agreement. Thanks for looking into this change.
> I updated the webrev with the added assembly printing for C2 as requested.
> 
> Incremental:
> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.01_02/
> 
> Full:
> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.02/
> 
> Thanks,
> /Erik
> 
> On 2018-10-09 16:42, Vladimir Kozlov wrote:
>> On 10/9/18 12:24 AM, Erik Österlund wrote:
>>> Hi Vladimir,
>>>
>>> Thank you for reviewing this.
>>>
>>> On 2018-10-08 19:57, Vladimir Kozlov wrote:
>>>> First, all nmethods have verified entry. MachPrologNode is generated by C2 for all nmethods including OSR (we need 
>>>> stack bang in all cases). OSR does not have *Unverified* entry represented by MachUEPNode. C1 compiled OSR nmethods 
>>>> are different - you may need your changes for it but I am not sure - ask C1 experts.
>>>>
>>>> There could be complications if an OSR nmethods is not executed (after barrier check) due to state of stack on OSR 
>>>> entry but it is different issue.
>>>
>>> I should have been more precise. OSR nmethods have a prologue for building frames, yes. But OSR entries do not have a 
>>> verified entry offset, in the sense that in e.g. c1_LIRAssembler.cpp, no CodeOffsets::Verified_Entry entry is set for 
>>> OSR entries (the lir_osr_entry case). It is set only for lir_std_entry. Similarly for C2, while they both have a 
>>> prologue, there is no CodeOffsets::Verified_Entry set for OSR nmethods. Read for example opto/compile.cpp:
>>>
>>>      if (is_osr_compilation()) {
>>>        _code_offsets.set_value(CodeOffsets::Verified_Entry, 0);
>>>        _code_offsets.set_value(CodeOffsets::OSR_Entry, _first_block_size);
>>>      } else {
>>>        _code_offsets.set_value(CodeOffsets::Verified_Entry, _first_block_size);
>>>        _code_offsets.set_value(CodeOffsets::OSR_Entry, 0);
>>>      }
>>>
>>> The different treatment of OSR methods for the entry barrier is analogous to the different treatment of OSR methods 
>>> when making them not entrant. OSR nmethods are made not entrant by detaching them from the Method* (so the runtime 
>>> won't find them), while non-OSR nmethods are made not entrant by patching a jump to the 
>>> SharedRuntime::get_handle_wrong_method_stub() into the verified entry. And that handler stub has particular ABI 
>>> requirements on the call site, that do not match the OSR call site.
>>>
>>> Similarly for nmethod entry barriers, if the callee is an OSR nmethod, it is much easier to check in the runtime 
>>> before the call if said nmethod is toast or not, than it is to check when the nmethod has entered, and then figure 
>>> out how to deoptimize from inside of the nmethod.
>>
>> Good. Note, I did not argued about changes for OSR nmethods, I was surprise by statement. Your explanation cleared it.
>>
>>>
>>>> I am curious why you are using next 2 lines everywhere?
>>>>
>>>> +  BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>>>> +  bs->nmethod_entry_barrier(this);
>>>>
>>>> instead of one call:
>>>>
>>>> BarrierSet::barrier_set()->nmethod_entry_barrier(this);
>>>
>>> Because we never make straight calls to BarrierSet for inserting barriers. Instead, we use one out of the (currently) 
>>> 4 helpers that have their own responsibility. The BarrierSetAssembler is responsible for inserting barriers into 
>>> assembly code.
>>
>> So it is by GC Interface design? Okay.
>>
>>>
>>>> The same question about duplicated code in interpreterRuntime.cpp
>>>
>>> Same answer. ;)
>>>
>>>> Would be nice if you also update MachPrologNode::format() for case when barrier is used. It is used in 
>>>> PrintOptoAssembly.
>>>
>>> Yes, that is a good idea. I will adjust the printing code.
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Thanks,
>>> /Erik
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 10/8/18 1:19 AM, Erik Österlund wrote:
>>>>> Hi Per and Roman,
>>>>>
>>>>> Thank you for having a look at this.
>>>>> I went with Roman's suggestion and made this an explicit opt-out instead. But I let ModRef pass in NULL, as none of 
>>>>> the barrier sets in that category currently use nmethod entry barriers. When they start to do so, that can be 
>>>>> easily changed.
>>>>>
>>>>> Incremental:
>>>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.00_01/
>>>>>
>>>>> Full:
>>>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.01/
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>>
>>>>> On 2018-10-08 08:48, Roman Kennke wrote:
>>>>>> I agree. I wouldn't mind to keep it simple and pass NULL though. I'm not
>>>>>> a big fan of default arguments anyway.
>>>>>>
>>>>>> Roman
>>>>>>
>>>>>>> On 2018-10-05 17:07, Erik Österlund wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> In order to implement concurrent class unloading for ZGC, a mechanism
>>>>>>>> will be required to arm nmethods in a safepoint such that the first
>>>>>>>> entry into an nmethod after the pause triggers a barrier before
>>>>>>>> continuing.
>>>>>>>>
>>>>>>>> The barrier will be used for:
>>>>>>>>     * Patching immediate oops, and keeping phantomly reachable oops
>>>>>>>> alive through the nmethod entry
>>>>>>>>     * Catching calls from stale IC caches (due to class unloading) into
>>>>>>>> nmethods that are being unloaded (containing broken oops and
>>>>>>>> metadata), and lazily clean the IC cache and re-resolve the call.
>>>>>>>>
>>>>>>>> This implementation is for x86_64 and is based on the prototyping work
>>>>>>>> of Rickard Bäckman. So big thanks to Rickard, who will co-author of
>>>>>>>> this patch.
>>>>>>>> The way it works is that there is a cmp #offset(r15_thread),
>>>>>>>> simm32_epoch); je #continuation; inserted into the verified entry of
>>>>>>>> nmethods that conditionally calls a stub trampolining into the VM.
>>>>>>>> When the simm32_epoch is not equal to some TLS-local field, then a
>>>>>>>> slow path is taken to run some barrier inside of the VM.
>>>>>>>> By changing this TLS-local field in the safepoint, all verified
>>>>>>>> entries will take the slow path into the VM when run, where the
>>>>>>>> barrier is eventually disarmed by patching the simm32_epoch value of
>>>>>>>> the instruction stream to the current globally correct value.
>>>>>>>>
>>>>>>>> In terms of abstractions, to utilize this mechanism, you need to
>>>>>>>> create your BarierSet with an BarrierSetNMethod helper object, just
>>>>>>>> like the BarrierSetAssembler, BarrierSetC1 and BarrierSetC2. The
>>>>>>>> BarrierSetNMethod base class handles the barrier itself. It has a
>>>>>>>> virtual function, nmethod_entry_barrier, which gets called by the slow
>>>>>>>> paths. So each user of this mechanism inherits from BarrierSetNMethod
>>>>>>>> and puts its barrier in there.
>>>>>>>>
>>>>>>>> The OSR nmethods do not have a verified entry. Therefore, the barrier
>>>>>>>> is run explicitly in the runtime before calling into OSR nmethods to
>>>>>>>> achieve the same effect of trapping new calls into nmethods.
>>>>>>>>
>>>>>>>> The nmethod entry barrier returns true/false depending on whether the
>>>>>>>> called nmethod may be entered or not. This allows lazily cleaning
>>>>>>>> nmethod IC caches during unloading, which invokes the same helper
>>>>>>>> method that is patched into the verified entry when made not entrant.
>>>>>>>>
>>>>>>>> Eventually, I would like to also use this mechanism to remove the
>>>>>>>> nmethod hotness counter maintenance we today perform in safepoint
>>>>>>>> cleanup instead. These nmethod entry barriers can provide better
>>>>>>>> heuristics for whether nmethods are used or not. But that would
>>>>>>>> require buy-in from maintainers of other platforms. So let's take one
>>>>>>>> step at a time.
>>>>>>>>
>>>>>>>> For the curious reader thinking that this might be an expensive
>>>>>>>> operation, it is not. This check in the nmethod verified entry is very
>>>>>>>> predictable and cheap. And because of use of inlining, the checks
>>>>>>>> become more infrequent. In my evaluation, it has gone completely under
>>>>>>>> the radar in benchmarking (running this with my current version of
>>>>>>>> concurrent class unloading in ZGC).
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~eosterlund/8210498/webrev.00/
>>>>>>> I reviewed this before Erik sent it out, so my comments have already
>>>>>>> been addressed. Just found one additional thing, the BarrierSet
>>>>>>> constructor now looks like this:
>>>>>>>
>>>>>>> 96   BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>>>> 97              BarrierSetC1* barrier_set_c1,
>>>>>>> 98              BarrierSetC2* barrier_set_c2,
>>>>>>> 99              const FakeRtti& fake_rtti,
>>>>>>> 100             BarrierSetNMethod* barrier_set_nmethod = NULL) :
>>>>>>> 101     _fake_rtti(fake_rtti),
>>>>>>> 102     _barrier_set_assembler(barrier_set_assembler),
>>>>>>> 103     _barrier_set_c1(barrier_set_c1),
>>>>>>> 104     _barrier_set_c2(barrier_set_c2),
>>>>>>> 105     _barrier_set_nmethod(barrier_set_nmethod) {}
>>>>>>>
>>>>>>> I find it odd that the fake_rtti argument comes in the middle of the
>>>>>>> barrier_set_* arguments. I can see that you don't want GC barrier sets
>>>>>>> to have to supply NULL if they don't have a nmethod barrier set, but
>>>>>>> maybe we can have two constructors instead, like:
>>>>>>>
>>>>>>> BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>>>>             BarrierSetC1* barrier_set_c1,
>>>>>>>             BarrierSetC2* barrier_set_c2,
>>>>>>>             const FakeRtti& fake_rtti) :
>>>>>>>      _fake_rtti(fake_rtti),
>>>>>>>      _barrier_set_assembler(barrier_set_assembler),
>>>>>>>      _barrier_set_c1(barrier_set_c1),
>>>>>>>      _barrier_set_c2(barrier_set_c2),
>>>>>>>      _barrier_set_nmethod(NULL) {}
>>>>>>>
>>>>>>> BarrierSet(BarrierSetAssembler* barrier_set_assembler,
>>>>>>>             BarrierSetC1* barrier_set_c1,
>>>>>>>             BarrierSetC2* barrier_set_c2,
>>>>>>>             BarrierSetNMethod* barrier_set_nmethod,
>>>>>>>             const FakeRtti& fake_rtti,) :
>>>>>>>      _fake_rtti(fake_rtti),
>>>>>>>      _barrier_set_assembler(barrier_set_assembler),
>>>>>>>      _barrier_set_c1(barrier_set_c1),
>>>>>>>      _barrier_set_c2(barrier_set_c2),
>>>>>>>      _barrier_set_nmethod(barrier_set_nmethod) {}
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> Other than that, this looks good to me.
>>>>>>>
>>>>>>> cheers,
>>>>>>> Per
>>>>>>>
>>>>>>>> Bug ID:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210498
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> /Erik
>>>>>
>>>
> 


More information about the hotspot-dev mailing list