RFR (XS): 8023037 : Race between ciEnv::register_method and nmethod::make_not_entrant_or_zombie

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Nov 7 12:24:15 PST 2013


Good.

Vladimir

On 11/7/13 12:04 PM, Vladimir Ivanov wrote:
> Oops, sorry about that... Fixed.
>
> This time I built it and ran the test to be 100% sure :-)
>
> Best regards,
> Vladimir Ivanov
>
> On 11/7/13 11:26 PM, Vladimir Kozlov wrote:
>> No, should be method() and 'this' and add parenthesis:
>>
>> +   bool is_installed = (method()->code() == this) // nmethod is in
>> state 'alive' and installed
>> +                       || !this->is_in_use();     // nmethod is
>> installed, but not in 'alive' state
>>
>> Vladimir K
>>
>> On 11/7/13 10:35 AM, Vladimir Ivanov wrote:
>>> Sorry, blindly copy-pasted the code.
>>> Thanks for catching that.
>>> Fixed (same link, hit refresh).
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 11/7/13 10:00 PM, Vladimir Kozlov wrote:
>>>> It is better.
>>>>
>>>> About next 2 lines. Where you get method? The method() is used to get
>>>> pointer to java Method. And is_in_use() is nmethod's method. In
>>>> comments
>>>> you need to say nmethod (and not method):
>>>>
>>>> +   bool is_installed = method->code() == this || // method is in state
>>>> 'alive' and installed
>>>> +                       !method->is_in_use();     // method is
>>>> installed
>>>> and in state: 'not-entrant', 'zombie', or, 'unloaded'
>>>>
>>>> Second comment too big, you can simple say: // or nmethod is not in
>>>> 'alive' state.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 11/7/13 6:50 AM, Vladimir Ivanov wrote:
>>>>> Vladimir, thank you for the feedback.
>>>>>
>>>>> What do you think about the following version?
>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.03/
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>> On 11/7/13 2:52 AM, Vladimir Kozlov wrote:
>>>>>> Vladimir,
>>>>>>
>>>>>> Thank you for finding the real cause.
>>>>>>
>>>>>> I think you need the comment before nm->verify() in new_nmethod()
>>>>>> explaining why we should avoid safepoint as you explained here.
>>>>>>
>>>>>> I am not comfortable about passing new allow_safepoints parameter
>>>>>> through all verify calls because it is the only one case. We still
>>>>>> have
>>>>>> not installed nmethod at this point. There should be some check
>>>>>> (nmethod's flags/state) which we can use to skip CompiledIC_lock.
>>>>>>
>>>>>> I thought about the need to verify scopes in
>>>>>> verify_interrupt_point() in
>>>>>> all cases. But the verification code in ScopeDesc::verify() is
>>>>>> commented. What? In general it is good to call it anyway. And you
>>>>>> don't
>>>>>> need IC to get the end_of_call() - I think we can inline what
>>>>>> CompiledIC_at() does: nativeCall_at(call_site)->end_of_call().
>>>>>>
>>>>>> This code calls CompiledIC_at() because it does IC verification
>>>>>> which we
>>>>>> can skip in our case.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 11/6/13 6:57 AM, Vladimir Ivanov wrote:
>>>>>>> Very good point, Vladimir!
>>>>>>>
>>>>>>> I should have looked for the bug in a different place :-)
>>>>>>>
>>>>>>> The problem is triggered by class redefinition and it can be
>>>>>>> performed
>>>>>>> under Compile_lock or during safepoint. So, in
>>>>>>> ciEnv::register_method
>>>>>>> it's not enough to grab Compile_lock. In addition, we need to ensure
>>>>>>> that there are no safepoints possible during method installation to
>>>>>>> guarantee no dependency is invalidated.
>>>>>>>
>>>>>>> There's one place in nmethod verification code (see MutexLocker
>>>>>>> ml_verify (CompiledIC_lock) in nmethod::verify_interrupt_point)
>>>>>>> where
>>>>>>> safepoint is possible and it causes the failure.
>>>>>>>
>>>>>>> The bug is present only in non-product builds since nmethod
>>>>>>> verification
>>>>>>> is absent in product bits.
>>>>>>>
>>>>>>> Previous fix also works, but I decided to go another route and
>>>>>>> forbid
>>>>>>> safepoints at all while holding Compile_lock.
>>>>>>>
>>>>>>> Updated webrev (will appear once cr.ojn is up):
>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.02/
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>> On 11/6/13 2:36 AM, Vladimir Kozlov wrote:
>>>>>>>> You need to be careful to avoid deadlock since there is call:
>>>>>>>>
>>>>>>>> 1035               old->make_not_entrant();
>>>>>>>>
>>>>>>>> Any way there is comment in register_method():
>>>>>>>>
>>>>>>>>   936     // Prevent SystemDictionary::add_to_hierarchy from
>>>>>>>> running
>>>>>>>>   937     // and invalidating our dependencies until we install
>>>>>>>> this
>>>>>>>> method.
>>>>>>>>   938     MutexLocker ml(Compile_lock);
>>>>>>>>
>>>>>>>> So how it happens that the method is invalidated by dependencies?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/5/13 1:52 PM, Vladimir Ivanov wrote:
>>>>>>>>> Ouch, I was misled by it's name. So, I just lowered the
>>>>>>>>> likelihood of
>>>>>>>>> the failure then.
>>>>>>>>>
>>>>>>>>> make_not_entrant_or_zombie holds Patching_lock when it patches &
>>>>>>>>> unlinks
>>>>>>>>> a nmethod.
>>>>>>>>>
>>>>>>>>> Do you see any problems with using it to guard method installation
>>>>>>>>> (method->set_code(method, nm)) in register_method() as well?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>> On 11/6/13 1:11 AM, Vladimir Kozlov wrote:
>>>>>>>>>> Note nmethodLocker is not lock:
>>>>>>>>>>
>>>>>>>>>> void nmethodLocker::lock_nmethod(nmethod* nm, bool zombie_ok) {
>>>>>>>>>>    if (nm == NULL)  return;
>>>>>>>>>>    Atomic::inc(&nm->_lock_count);
>>>>>>>>>>    guarantee(zombie_ok || !nm->is_zombie(), "cannot lock a zombie
>>>>>>>>>> method");
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> The guard is nm->is_locked_by_vm() but neither
>>>>>>>>>> make_not_entrant_or_zombie () or register_method() use it.
>>>>>>>>>>
>>>>>>>>>> So how nmethodLocker helps here?
>>>>>>>>>>
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 11/5/13 12:42 PM, Vladimir Ivanov wrote:
>>>>>>>>>>> Thanks! Missed that one. Fixed.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>
>>>>>>>>>>> On 11/5/13 11:25 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>> Should be
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> What about this place:
>>>>>>>>>>>>
>>>>>>>>>>>> src/cpu/sparc/vm/sharedRuntime_sparc.cpp:  if
>>>>>>>>>>>> (StressNonEntrant) {
>>>>>>>>>>>>
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/5/13 11:10 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>>> Updated fix:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Removed broken StressNonEntrant code.
>>>>>>>>>>>>> Updated comments.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/5/13 3:39 PM, Vladimir Ivanov wrote:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023037/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There's a race between compiler thread during method
>>>>>>>>>>>>>> registration
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> sweeper: sweeper can invalidate a nmethod which hasn't been
>>>>>>>>>>>>>> installed
>>>>>>>>>>>>>> yet.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Consider the following scenario:
>>>>>>>>>>>>>>    ciEnv::register_method:
>>>>>>>>>>>>>>      - new nmethod(...)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    sweeper:
>>>>>>>>>>>>>>      - invalidates newly allocated nmethod and patches VEP to
>>>>>>>>>>>>>> call
>>>>>>>>>>>>>> handle_wrong_method
>>>>>>>>>>>>>>      - tries to unlink it, but
>>>>>>>>>>>>>> method()->from_compiled_entry() !=
>>>>>>>>>>>>>> verified_entry_point() since nmethod hasn't been installed
>>>>>>>>>>>>>> yet
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    ciEnv::register_method:
>>>>>>>>>>>>>>      - installs already invalidated nmethod
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Calling corresponding Java method will lead to infinite loop:
>>>>>>>>>>>>>> VEP of
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> compiled method calls handle_wrong_method and call site
>>>>>>>>>>>>>> resolution
>>>>>>>>>>>>>> returns the very same compiled method.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The fix is to grab a lock after nmethod is allocated in the
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>> cache
>>>>>>>>>>>>>> and check that it hasn't been already invalidated by the
>>>>>>>>>>>>>> sweeper
>>>>>>>>>>>>>> before
>>>>>>>>>>>>>> proceeding. Sweeper already synchronizes on a nmethod before
>>>>>>>>>>>>>> invalidating it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Testing: failing test w/ diagnostic output.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Vladimir Ivanov


More information about the hotspot-compiler-dev mailing list