RFR (S) 5103339: Strengthen NoSafepointVerifier

Wed Aug 14 00:47:47 UTC 2019

On 8/7/19 7:18 PM, David Holmes wrote:
> Hi Coleen,
>
> One follow-up:
>
> >> That said, are we certain that the xxx_or_null functions cannot
> >> safepoint? They won't try to load a class but surely they could
> >> potentially safepoint for other reasons?
> >
> > I don't see a safepoint in this code, or have found one while running
> > the tests.  Hopefully the NSV would have found it!
>
> My question is really whether these functions are semantically 
> guaranteed to never safepoint? 

No, these functions _do_ safepoint taking out a lock.
> Is that a property that these functions expose to their callers? I 
> don't recall seeing anything to that affect, so it seems to me this is 
> just how the implementation happens to be, but is not guaranteed to be 
> so. If there was a NSV in the call path leading to this then I would 
> have to ask the person who put in the NSV on what basis they think 
> this mass of code should not ever hit a safepoint.
There isn't a property of these or any functions that guarantee they 
won't safepoint.  Some people have claimed having a TRAPS argumnent is a 
good indicator, but many functions without TRAPS may take out a 
MutexLocker which can safepoint.  I think it's safe to assume most 
functions safepoint and taking out a NSV must be careful and call into 
code that is contained.  The change is that NSV should assert if that 
condition is not met.

The fact that this code conditionally safepoints makes it less obvious, 
but it's still worth checking possible safepoints here.

In the or_null case, there actually isn't a caller with a NSV.   In a 
future patch, I made calling this code with a MutexLocker with 
no_safepoint_check_flag and allow_vm_block = true imply a NSV.  The 
caller must use the array_klass_or_null() function or it will safepoint.

I made the changes you suggested.

Incremental: http://cr.openjdk.java.net/~coleenp/2019/5103339.02.incr/webrev
Full:  http://cr.openjdk.java.net/~coleenp/2019/5103339.02/webrev

Thanks,
Coleen

>
> Cheers,
> David
>
> On 8/08/2019 1:49 am, coleen.phillimore at oracle.com wrote:
>>
>> Thanks David,
>>
>> On 8/6/19 10:36 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 7/08/2019 8:43 am, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Thanks David for reading this.
>>>>
>>>> On 8/5/19 9:47 PM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Could please update the bug explaining why it was re-opened and 
>>>>> what the intended enhancement is. It was closed in 2005 as not 
>>>>> necessary because of the unhandled-oop checking.
>>>>
>>>> Yes, I added a comment.
>>>
>>> Thanks!
>>>
>>>>>
>>>>> On 6/08/2019 2:00 am, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add NSV check at possible safepoint transition or places 
>>>>>> that could take out locks. Consolidate with clearing unhandled oops.
>>>>>>
>>>>>> This change checks for NoSafepointVerifier no_safepoint_counts at 
>>>>>> possible safepoints.  The starting set is at transitions, and in the 
>>>>>
>>>>> Wouldn't it be better placed in the actual Safepoint checking 
>>>>> methods rather than callers of those methods?
>>>>
>>>> See my reply to Robbin.  We want to make the checks in places that 
>>>> conditionally may not safepoint poll, so that timing doesn't 
>>>> prevent finding when we've violated the NoSafepointVerifier check.
>>>
>>> Okay.
>>>
>>>>>
>>>>>> "else" clauses where CHECK_UNHANDLED_OOPS were clearing unhandled 
>>>>>> oops. Some of these were removed because they weren't places with 
>>>>>> possible safepoints, so were wrong.
>>>>>>
>>>>>> The unhandled oops clearing and no_safepoint counter check are 
>>>>>> now done in the same function.  MemAllocator -> 
>>>>>> check_for_valid_allocation_state calls 
>>>>>> check_for_valid_safepoint_state which calls 
>>>>>> check_possible_safepoint.
>>>>>>
>>>>>> Calls to check_possible_safepoint are in DEBUG_ONLY when 
>>>>>> Thread::current() is called.
>>>>>>
>>>>>> I had to remove it from the else clause in JvmtiThreadState 
>>>>>> because it's called from a place that cannot safepoint (see 
>>>>>> vtableStubs.cpp).
>>>>>
>>>>> Can you elaborate on that further please. It's not obvious, 
>>>>> without following call chains, when any code may or may not hit a 
>>>>> safepoint check.
>>>>
>>>> No, it isn't.   There's quite a large call stack here, but the 
>>>> CompiledICLocker has an embedded NoSafepointVerifier, so here's the 
>>>> call stack:
>>>>
>>>> V  [libjvm.so+0x16c5d05] Thread::check_possible_safepoint()+0x55
>>>> V  [libjvm.so+0x10fb15c] 
>>>> JvmtiExport::post_dynamic_code_generated_while_holding_locks(char 
>>>> const*, unsigned char*, unsigned char*)+0x4c
>>>> V  [libjvm.so+0x17c8fac]  VtableStubs::find_stub(bool, int)+0x2ac
>>>> V  [libjvm.so+0x9dd18e] CompiledIC::set_to_megamorphic(CallInfo*, 
>>>> Bytecodes::Code, bool&, Thread*)+0x9e
>>>> V  [libjvm.so+0x1562bb4] 
>>>> SharedRuntime::handle_ic_miss_helper_internal(Handle, 
>>>> CompiledMethod*, frame const&, methodHandle, Bytecodes::Code, 
>>>> CallInfo&, bool&, Thread*)+0x434
>>>> V  [libjvm.so+0x156d202] 
>>>> SharedRuntime::handle_ic_miss_helper(JavaThread*, Thread*)+0x372
>>>> V  [libjvm.so+0x156d7a4] 
>>>> SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x184
>>>>
>>>> The function handle_ic_miss_helper_internal has a CompiledICLocker, 
>>>> which has an embedded NoSafepointVerifier.
>>>>
>>>> This seems like a lot of code to be protected with a 
>>>> NoSafepointVerifier, but I assume that at least the find_stub code 
>>>> shouldn't get cleaned up by a safepoint, so would require it.
>>>
>>> Okay so because you added the call to jvmti_thread_state() in the 
>>> vtableStubs code you have to remove the check from the else-clause 
>>> in jvmti_thread_state(), because it could fail with the new call path.
>>
>> Yes, but I'm not happy with this so I've reinstated the check in the 
>> 'else' clause and changed 
>> post_dynamic_code_generated_while_holding_locks() to simply get the 
>> jvmti_state directly and not call JvmtiThreadState::state_for().  It 
>> already had an assert that it wasn't null.
>>
>> I'm retesting this and will have an 02 plus incremental next week.
>>
>>>> I also ran tests with an assert that the jvmti_thread_state() != 
>>>> NULL, just to see if it's ever NULL and it wasn't.  I would have 
>>>> liked to keep the else.
>>>>>
>>>>>> os.cpp ResourceMark needed for debugging.
>>>>>
>>>>> Interesting catch - I would have expected the RM to be higher up 
>>>>> the call if needed. How did you detect this?
>>>>
>>>> Debugging, but I don't remember which thing I was debugging though.
>>>>>
>>>>>> Tested with tier1 on all Oracle platforms, and tier 1-3 on 
>>>>>> linux-x64-debug.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/5103339.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-5103339
>>>>>
>>>>> src/hotspot/share/oops/objArrayKlass.cpp
>>>>> src/hotspot/share/oops/typeArrayKlass.cpp
>>>>>
>>>>> Unclear why the check is predicated on !or_null ??
>>>>
>>>> If the parameter passed or_null is true, then the function never 
>>>> safepoints. I added this comment.
>>>>
>>>>      // If passed or_null, this function will not try to safepoint.
>>>>      if (!or_null) THREAD->check_possible_safepoint();
>>>
>>> Sorry I'm confused by the placement. We want to call 
>>> check_possible_safepoint() on code paths that can safepoint, so that 
>>> we ensure there is no NSV higher up the call chain when we take that 
>>> path. So notwithstanding where the check_unhandled_oops was, instead of
>>>
>>>  352   } else {
>>>  353     if (!or_null) THREAD->check_possible_safepoint();
>>>  354   }
>>>  355
>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>  357   if (or_null) {
>>>  358     return ak->array_klass_or_null(n);
>>>  359   }
>>>  360   return ak->array_klass(n, THREAD);
>>>
>>> shouldn't/couldn't we just have:
>>>
>>>  352   }
>>>  353
>>>  354
>>>  355
>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>  357   if (or_null) {
>>>  358     return ak->array_klass_or_null(n);
>>>  359   }
>>>        THREAD->check_possible_safepoint();
>>>  360   return ak->array_klass(n, THREAD);
>>>
>>> ?
>>
>> I couldn't decide if this is better or worse than what I had because 
>> looking at this, one might question why this call is here, and it 
>> checks the safepoint twice (already in the MutexLocker path).  It's a 
>> few less lines of code.  okay, sure I can change it.  I didn't find a 
>> better refactoring.
>>
>>> That said, are we certain that the xxx_or_null functions cannot 
>>> safepoint? They won't try to load a class but surely they could 
>>> potentially safepoint for other reasons?
>>
>> I don't see a safepoint in this code, or have found one while running 
>> the tests.  Hopefully the NSV would have found it!
>>
>> Coleen
>>
>>>
>>>> This is also called with NSV with or_null true.   I can't remember 
>>>> where, I think it was the heap walker.
>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/code/vtableStubs.cpp
>>>>>
>>>>> // cause a safepoint in this code that has NSV.
>>>>>
>>>>> Unclear what "code" has the NSV ??
>>>>
>>>> See above.  I can change the comment to:
>>>>
>>>> 237 // all locks. Only post this event if a new state is not 
>>>> required. Creating a new state would
>>>> 238 // cause a safepoint and the caller of this code has a 
>>>> NoSafepointVerifier.
>>>
>>> Sounds good!
>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>
>>>>> As mentioned above why not put the check inside 
>>>>> SafepointMechanism::block_if_requested instead?
>>>>
>>>> The other callers of block_if_requested already have a 
>>>> check_possible_safepoint call. The only caller that made sense to 
>>>> add was transition from or to vm.
>>>
>>> Got it.
>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>
>>>>> The Mutex constructor cleanup is incidental but okay. (Does make 
>>>>> me wonder about the whole Monitor construction process though ... 
>>>>> ClearMonitor doesn't need to be protected any more.)
>>>>
>>>> This leaked into this change.  I'll take it out and post a 
>>>> different RFR for it.  The no-arg constructor Monitor isn't needed 
>>>> either so ClearMonitor can be deleted.
>>>
>>> Okay.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>
>>>>> // NSV checking
>>>>> void check_for_valid_safepoint_state(bool potential_vm_operation) 
>>>>> NOT_DEBUG_RETURN;
>>>>> void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>>>
>>>>> Please expand NSV. Does the comment apply to both methods or only 
>>>>> the first?
>>>>
>>>> Both methods.  I'll change it to:
>>>>
>>>>    // These functions check conditions on a JavaThread before 
>>>> possibly going to a safepoint,
>>>>    // including NoSafepointVerifier.
>>>>    void check_for_valid_safepoint_state(bool 
>>>> potential_vm_operation) NOT_DEBUG_RETURN;
>>>>    void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>
>>> Okay on the new comment.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>
>>>> Thanks!
>>>> Coleen
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>
>>