RFR (XS): 8023037 : Race between ciEnv::register_method and nmethod::make_not_entrant_or_zombie
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Thu Nov 7 13:24:18 PST 2013
Thank you, Vladimir.
I need 1 more Reviewer to look at it. Any volunteers? :-)
Best regards,
Vladimir Ivanov
On 11/8/13 12:24 AM, Vladimir Kozlov wrote:
> Good.
>
> Vladimir
>
> On 11/7/13 12:04 PM, Vladimir Ivanov wrote:
>> Oops, sorry about that... Fixed.
>>
>> This time I built it and ran the test to be 100% sure :-)
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 11/7/13 11:26 PM, Vladimir Kozlov wrote:
>>> No, should be method() and 'this' and add parenthesis:
>>>
>>> + bool is_installed = (method()->code() == this) // nmethod is in
>>> state 'alive' and installed
>>> + || !this->is_in_use(); // nmethod is
>>> installed, but not in 'alive' state
>>>
>>> Vladimir K
>>>
>>> On 11/7/13 10:35 AM, Vladimir Ivanov wrote:
>>>> Sorry, blindly copy-pasted the code.
>>>> Thanks for catching that.
>>>> Fixed (same link, hit refresh).
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> On 11/7/13 10:00 PM, Vladimir Kozlov wrote:
>>>>> It is better.
>>>>>
>>>>> About next 2 lines. Where you get method? The method() is used to get
>>>>> pointer to java Method. And is_in_use() is nmethod's method. In
>>>>> comments
>>>>> you need to say nmethod (and not method):
>>>>>
>>>>> + bool is_installed = method->code() == this || // method is in
>>>>> state
>>>>> 'alive' and installed
>>>>> + !method->is_in_use(); // method is
>>>>> installed
>>>>> and in state: 'not-entrant', 'zombie', or, 'unloaded'
>>>>>
>>>>> Second comment too big, you can simple say: // or nmethod is not in
>>>>> 'alive' state.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 11/7/13 6:50 AM, Vladimir Ivanov wrote:
>>>>>> Vladimir, thank you for the feedback.
>>>>>>
>>>>>> What do you think about the following version?
>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.03/
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> On 11/7/13 2:52 AM, Vladimir Kozlov wrote:
>>>>>>> Vladimir,
>>>>>>>
>>>>>>> Thank you for finding the real cause.
>>>>>>>
>>>>>>> I think you need the comment before nm->verify() in new_nmethod()
>>>>>>> explaining why we should avoid safepoint as you explained here.
>>>>>>>
>>>>>>> I am not comfortable about passing new allow_safepoints parameter
>>>>>>> through all verify calls because it is the only one case. We still
>>>>>>> have
>>>>>>> not installed nmethod at this point. There should be some check
>>>>>>> (nmethod's flags/state) which we can use to skip CompiledIC_lock.
>>>>>>>
>>>>>>> I thought about the need to verify scopes in
>>>>>>> verify_interrupt_point() in
>>>>>>> all cases. But the verification code in ScopeDesc::verify() is
>>>>>>> commented. What? In general it is good to call it anyway. And you
>>>>>>> don't
>>>>>>> need IC to get the end_of_call() - I think we can inline what
>>>>>>> CompiledIC_at() does: nativeCall_at(call_site)->end_of_call().
>>>>>>>
>>>>>>> This code calls CompiledIC_at() because it does IC verification
>>>>>>> which we
>>>>>>> can skip in our case.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 11/6/13 6:57 AM, Vladimir Ivanov wrote:
>>>>>>>> Very good point, Vladimir!
>>>>>>>>
>>>>>>>> I should have looked for the bug in a different place :-)
>>>>>>>>
>>>>>>>> The problem is triggered by class redefinition and it can be
>>>>>>>> performed
>>>>>>>> under Compile_lock or during safepoint. So, in
>>>>>>>> ciEnv::register_method
>>>>>>>> it's not enough to grab Compile_lock. In addition, we need to
>>>>>>>> ensure
>>>>>>>> that there are no safepoints possible during method installation to
>>>>>>>> guarantee no dependency is invalidated.
>>>>>>>>
>>>>>>>> There's one place in nmethod verification code (see MutexLocker
>>>>>>>> ml_verify (CompiledIC_lock) in nmethod::verify_interrupt_point)
>>>>>>>> where
>>>>>>>> safepoint is possible and it causes the failure.
>>>>>>>>
>>>>>>>> The bug is present only in non-product builds since nmethod
>>>>>>>> verification
>>>>>>>> is absent in product bits.
>>>>>>>>
>>>>>>>> Previous fix also works, but I decided to go another route and
>>>>>>>> forbid
>>>>>>>> safepoints at all while holding Compile_lock.
>>>>>>>>
>>>>>>>> Updated webrev (will appear once cr.ojn is up):
>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.02/
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vladimir Ivanov
>>>>>>>>
>>>>>>>> On 11/6/13 2:36 AM, Vladimir Kozlov wrote:
>>>>>>>>> You need to be careful to avoid deadlock since there is call:
>>>>>>>>>
>>>>>>>>> 1035 old->make_not_entrant();
>>>>>>>>>
>>>>>>>>> Any way there is comment in register_method():
>>>>>>>>>
>>>>>>>>> 936 // Prevent SystemDictionary::add_to_hierarchy from
>>>>>>>>> running
>>>>>>>>> 937 // and invalidating our dependencies until we install
>>>>>>>>> this
>>>>>>>>> method.
>>>>>>>>> 938 MutexLocker ml(Compile_lock);
>>>>>>>>>
>>>>>>>>> So how it happens that the method is invalidated by dependencies?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>> On 11/5/13 1:52 PM, Vladimir Ivanov wrote:
>>>>>>>>>> Ouch, I was misled by it's name. So, I just lowered the
>>>>>>>>>> likelihood of
>>>>>>>>>> the failure then.
>>>>>>>>>>
>>>>>>>>>> make_not_entrant_or_zombie holds Patching_lock when it patches &
>>>>>>>>>> unlinks
>>>>>>>>>> a nmethod.
>>>>>>>>>>
>>>>>>>>>> Do you see any problems with using it to guard method
>>>>>>>>>> installation
>>>>>>>>>> (method->set_code(method, nm)) in register_method() as well?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>
>>>>>>>>>> On 11/6/13 1:11 AM, Vladimir Kozlov wrote:
>>>>>>>>>>> Note nmethodLocker is not lock:
>>>>>>>>>>>
>>>>>>>>>>> void nmethodLocker::lock_nmethod(nmethod* nm, bool zombie_ok) {
>>>>>>>>>>> if (nm == NULL) return;
>>>>>>>>>>> Atomic::inc(&nm->_lock_count);
>>>>>>>>>>> guarantee(zombie_ok || !nm->is_zombie(), "cannot lock a
>>>>>>>>>>> zombie
>>>>>>>>>>> method");
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> The guard is nm->is_locked_by_vm() but neither
>>>>>>>>>>> make_not_entrant_or_zombie () or register_method() use it.
>>>>>>>>>>>
>>>>>>>>>>> So how nmethodLocker helps here?
>>>>>>>>>>>
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 11/5/13 12:42 PM, Vladimir Ivanov wrote:
>>>>>>>>>>>> Thanks! Missed that one. Fixed.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/5/13 11:25 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>>> Should be
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> What about this place:
>>>>>>>>>>>>>
>>>>>>>>>>>>> src/cpu/sparc/vm/sharedRuntime_sparc.cpp: if
>>>>>>>>>>>>> (StressNonEntrant) {
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/5/13 11:10 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>>>> Updated fix:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Removed broken StressNonEntrant code.
>>>>>>>>>>>>>> Updated comments.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/5/13 3:39 PM, Vladimir Ivanov wrote:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There's a race between compiler thread during method
>>>>>>>>>>>>>>> registration
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> sweeper: sweeper can invalidate a nmethod which hasn't been
>>>>>>>>>>>>>>> installed
>>>>>>>>>>>>>>> yet.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Consider the following scenario:
>>>>>>>>>>>>>>> ciEnv::register_method:
>>>>>>>>>>>>>>> - new nmethod(...)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> sweeper:
>>>>>>>>>>>>>>> - invalidates newly allocated nmethod and patches
>>>>>>>>>>>>>>> VEP to
>>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>> handle_wrong_method
>>>>>>>>>>>>>>> - tries to unlink it, but
>>>>>>>>>>>>>>> method()->from_compiled_entry() !=
>>>>>>>>>>>>>>> verified_entry_point() since nmethod hasn't been installed
>>>>>>>>>>>>>>> yet
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ciEnv::register_method:
>>>>>>>>>>>>>>> - installs already invalidated nmethod
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Calling corresponding Java method will lead to infinite
>>>>>>>>>>>>>>> loop:
>>>>>>>>>>>>>>> VEP of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> compiled method calls handle_wrong_method and call site
>>>>>>>>>>>>>>> resolution
>>>>>>>>>>>>>>> returns the very same compiled method.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The fix is to grab a lock after nmethod is allocated in the
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>> cache
>>>>>>>>>>>>>>> and check that it hasn't been already invalidated by the
>>>>>>>>>>>>>>> sweeper
>>>>>>>>>>>>>>> before
>>>>>>>>>>>>>>> proceeding. Sweeper already synchronizes on a nmethod before
>>>>>>>>>>>>>>> invalidating it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Testing: failing test w/ diagnostic output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Vladimir Ivanov
More information about the hotspot-compiler-dev
mailing list