RFR (XS): 8023037 : Race between ciEnv::register_method and nmethod::make_not_entrant_or_zombie
Vladimir Kozlov
vladimir.kozlov at oracle.com
Thu Nov 7 10:00:59 PST 2013
It is better.
About next 2 lines. Where you get method? The method() is used to get pointer to java Method. And is_in_use() is
nmethod's method. In comments you need to say nmethod (and not method):
+ bool is_installed = method->code() == this || // method is in state 'alive' and installed
+ !method->is_in_use(); // method is installed and in state: 'not-entrant', 'zombie', or,
'unloaded'
Second comment too big, you can simple say: // or nmethod is not in 'alive' state.
Thanks,
Vladimir
On 11/7/13 6:50 AM, Vladimir Ivanov wrote:
> Vladimir, thank you for the feedback.
>
> What do you think about the following version?
> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.03/
>
> Best regards,
> Vladimir Ivanov
>
> On 11/7/13 2:52 AM, Vladimir Kozlov wrote:
>> Vladimir,
>>
>> Thank you for finding the real cause.
>>
>> I think you need the comment before nm->verify() in new_nmethod()
>> explaining why we should avoid safepoint as you explained here.
>>
>> I am not comfortable about passing new allow_safepoints parameter
>> through all verify calls because it is the only one case. We still have
>> not installed nmethod at this point. There should be some check
>> (nmethod's flags/state) which we can use to skip CompiledIC_lock.
>>
>> I thought about the need to verify scopes in verify_interrupt_point() in
>> all cases. But the verification code in ScopeDesc::verify() is
>> commented. What? In general it is good to call it anyway. And you don't
>> need IC to get the end_of_call() - I think we can inline what
>> CompiledIC_at() does: nativeCall_at(call_site)->end_of_call().
>>
>> This code calls CompiledIC_at() because it does IC verification which we
>> can skip in our case.
>>
>> Thanks,
>> Vladimir
>>
>> On 11/6/13 6:57 AM, Vladimir Ivanov wrote:
>>> Very good point, Vladimir!
>>>
>>> I should have looked for the bug in a different place :-)
>>>
>>> The problem is triggered by class redefinition and it can be performed
>>> under Compile_lock or during safepoint. So, in ciEnv::register_method
>>> it's not enough to grab Compile_lock. In addition, we need to ensure
>>> that there are no safepoints possible during method installation to
>>> guarantee no dependency is invalidated.
>>>
>>> There's one place in nmethod verification code (see MutexLocker
>>> ml_verify (CompiledIC_lock) in nmethod::verify_interrupt_point) where
>>> safepoint is possible and it causes the failure.
>>>
>>> The bug is present only in non-product builds since nmethod verification
>>> is absent in product bits.
>>>
>>> Previous fix also works, but I decided to go another route and forbid
>>> safepoints at all while holding Compile_lock.
>>>
>>> Updated webrev (will appear once cr.ojn is up):
>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.02/
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 11/6/13 2:36 AM, Vladimir Kozlov wrote:
>>>> You need to be careful to avoid deadlock since there is call:
>>>>
>>>> 1035 old->make_not_entrant();
>>>>
>>>> Any way there is comment in register_method():
>>>>
>>>> 936 // Prevent SystemDictionary::add_to_hierarchy from running
>>>> 937 // and invalidating our dependencies until we install this
>>>> method.
>>>> 938 MutexLocker ml(Compile_lock);
>>>>
>>>> So how it happens that the method is invalidated by dependencies?
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 11/5/13 1:52 PM, Vladimir Ivanov wrote:
>>>>> Ouch, I was misled by it's name. So, I just lowered the likelihood of
>>>>> the failure then.
>>>>>
>>>>> make_not_entrant_or_zombie holds Patching_lock when it patches &
>>>>> unlinks
>>>>> a nmethod.
>>>>>
>>>>> Do you see any problems with using it to guard method installation
>>>>> (method->set_code(method, nm)) in register_method() as well?
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>> On 11/6/13 1:11 AM, Vladimir Kozlov wrote:
>>>>>> Note nmethodLocker is not lock:
>>>>>>
>>>>>> void nmethodLocker::lock_nmethod(nmethod* nm, bool zombie_ok) {
>>>>>> if (nm == NULL) return;
>>>>>> Atomic::inc(&nm->_lock_count);
>>>>>> guarantee(zombie_ok || !nm->is_zombie(), "cannot lock a zombie
>>>>>> method");
>>>>>> }
>>>>>>
>>>>>> The guard is nm->is_locked_by_vm() but neither
>>>>>> make_not_entrant_or_zombie () or register_method() use it.
>>>>>>
>>>>>> So how nmethodLocker helps here?
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>> On 11/5/13 12:42 PM, Vladimir Ivanov wrote:
>>>>>>> Thanks! Missed that one. Fixed.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>> On 11/5/13 11:25 PM, Vladimir Kozlov wrote:
>>>>>>>> Should be
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01/
>>>>>>>>
>>>>>>>> What about this place:
>>>>>>>>
>>>>>>>> src/cpu/sparc/vm/sharedRuntime_sparc.cpp: if (StressNonEntrant) {
>>>>>>>>
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/5/13 11:10 AM, Vladimir Ivanov wrote:
>>>>>>>>> Updated fix:
>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>
>>>>>>>>> Removed broken StressNonEntrant code.
>>>>>>>>> Updated comments.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>> On 11/5/13 3:39 PM, Vladimir Ivanov wrote:
>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>>
>>>>>>>>>> There's a race between compiler thread during method registration
>>>>>>>>>> and
>>>>>>>>>> sweeper: sweeper can invalidate a nmethod which hasn't been
>>>>>>>>>> installed
>>>>>>>>>> yet.
>>>>>>>>>>
>>>>>>>>>> Consider the following scenario:
>>>>>>>>>> ciEnv::register_method:
>>>>>>>>>> - new nmethod(...)
>>>>>>>>>>
>>>>>>>>>> sweeper:
>>>>>>>>>> - invalidates newly allocated nmethod and patches VEP to call
>>>>>>>>>> handle_wrong_method
>>>>>>>>>> - tries to unlink it, but method()->from_compiled_entry() !=
>>>>>>>>>> verified_entry_point() since nmethod hasn't been installed yet
>>>>>>>>>>
>>>>>>>>>> ciEnv::register_method:
>>>>>>>>>> - installs already invalidated nmethod
>>>>>>>>>>
>>>>>>>>>> Calling corresponding Java method will lead to infinite loop:
>>>>>>>>>> VEP of
>>>>>>>>>> the
>>>>>>>>>> compiled method calls handle_wrong_method and call site resolution
>>>>>>>>>> returns the very same compiled method.
>>>>>>>>>>
>>>>>>>>>> The fix is to grab a lock after nmethod is allocated in the code
>>>>>>>>>> cache
>>>>>>>>>> and check that it hasn't been already invalidated by the sweeper
>>>>>>>>>> before
>>>>>>>>>> proceeding. Sweeper already synchronizes on a nmethod before
>>>>>>>>>> invalidating it.
>>>>>>>>>>
>>>>>>>>>> Testing: failing test w/ diagnostic output.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Vladimir Ivanov
More information about the hotspot-compiler-dev
mailing list