RFR (S) 5103339: Strengthen NoSafepointVerifier

Wed Aug 7 17:52:22 UTC 2019

On 8/7/19 11:55 AM, Daniel D. Daugherty wrote:
> Query below...
>
> On 8/7/19 11:49 AM, coleen.phillimore at oracle.com wrote:
>>
>> Thanks David,
>>
>> On 8/6/19 10:36 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 7/08/2019 8:43 am, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Thanks David for reading this.
>>>>
>>>> On 8/5/19 9:47 PM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Could please update the bug explaining why it was re-opened and 
>>>>> what the intended enhancement is. It was closed in 2005 as not 
>>>>> necessary because of the unhandled-oop checking.
>>>>
>>>> Yes, I added a comment.
>>>
>>> Thanks!
>>>
>>>>>
>>>>> On 6/08/2019 2:00 am, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add NSV check at possible safepoint transition or places 
>>>>>> that could take out locks. Consolidate with clearing unhandled oops.
>>>>>>
>>>>>> This change checks for NoSafepointVerifier no_safepoint_counts at 
>>>>>> possible safepoints.  The starting set is at transitions, and in the 
>>>>>
>>>>> Wouldn't it be better placed in the actual Safepoint checking 
>>>>> methods rather than callers of those methods?
>>>>
>>>> See my reply to Robbin.  We want to make the checks in places that 
>>>> conditionally may not safepoint poll, so that timing doesn't 
>>>> prevent finding when we've violated the NoSafepointVerifier check.
>>>
>>> Okay.
>>>
>>>>>
>>>>>> "else" clauses where CHECK_UNHANDLED_OOPS were clearing unhandled 
>>>>>> oops. Some of these were removed because they weren't places with 
>>>>>> possible safepoints, so were wrong.
>>>>>>
>>>>>> The unhandled oops clearing and no_safepoint counter check are 
>>>>>> now done in the same function.  MemAllocator -> 
>>>>>> check_for_valid_allocation_state calls 
>>>>>> check_for_valid_safepoint_state which calls 
>>>>>> check_possible_safepoint.
>>>>>>
>>>>>> Calls to check_possible_safepoint are in DEBUG_ONLY when 
>>>>>> Thread::current() is called.
>>>>>>
>>>>>> I had to remove it from the else clause in JvmtiThreadState 
>>>>>> because it's called from a place that cannot safepoint (see 
>>>>>> vtableStubs.cpp).
>>>>>
>>>>> Can you elaborate on that further please. It's not obvious, 
>>>>> without following call chains, when any code may or may not hit a 
>>>>> safepoint check.
>>>>
>>>> No, it isn't.   There's quite a large call stack here, but the 
>>>> CompiledICLocker has an embedded NoSafepointVerifier, so here's the 
>>>> call stack:
>>>>
>>>> V  [libjvm.so+0x16c5d05] Thread::check_possible_safepoint()+0x55
>>>> V  [libjvm.so+0x10fb15c] 
>>>> JvmtiExport::post_dynamic_code_generated_while_holding_locks(char 
>>>> const*, unsigned char*, unsigned char*)+0x4c
>>>> V  [libjvm.so+0x17c8fac]  VtableStubs::find_stub(bool, int)+0x2ac
>>>> V  [libjvm.so+0x9dd18e] CompiledIC::set_to_megamorphic(CallInfo*, 
>>>> Bytecodes::Code, bool&, Thread*)+0x9e
>>>> V  [libjvm.so+0x1562bb4] 
>>>> SharedRuntime::handle_ic_miss_helper_internal(Handle, 
>>>> CompiledMethod*, frame const&, methodHandle, Bytecodes::Code, 
>>>> CallInfo&, bool&, Thread*)+0x434
>>>> V  [libjvm.so+0x156d202] 
>>>> SharedRuntime::handle_ic_miss_helper(JavaThread*, Thread*)+0x372
>>>> V  [libjvm.so+0x156d7a4] 
>>>> SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x184
>>>>
>>>> The function handle_ic_miss_helper_internal has a CompiledICLocker, 
>>>> which has an embedded NoSafepointVerifier.
>>>>
>>>> This seems like a lot of code to be protected with a 
>>>> NoSafepointVerifier, but I assume that at least the find_stub code 
>>>> shouldn't get cleaned up by a safepoint, so would require it.
>>>
>>> Okay so because you added the call to jvmti_thread_state() in the 
>>> vtableStubs code you have to remove the check from the else-clause 
>>> in jvmti_thread_state(), because it could fail with the new call path.
>>
>> Yes, but I'm not happy with this so I've reinstated the check in the 
>> 'else' clause and changed 
>> post_dynamic_code_generated_while_holding_locks() to simply get the 
>> jvmti_state directly and not call JvmtiThreadState::state_for().  It 
>> already had an assert that it wasn't null.
>
> Not calling JvmtiThreadState::state_for() caught my eye. There's a bit
> of a protocol involved with safely getting the JvmtiThreadState, properly
> caching it, and dealing with it going away.
>
> What's the reason for not using JvmtiThreadState::state_for()?

This is what JvmtiThreadState::state_for() does:

inline JvmtiThreadState* JvmtiThreadState::state_for(JavaThread *thread) {
   JvmtiThreadState *state = thread->jvmti_thread_state();
   if (state == NULL) {
     MutexLocker mu(JvmtiThreadState_lock);
     // check again with the lock held
     state = state_for_while_locked(thread);
   } else {
     // If the state was be null, then this could have safepointed.
     thread->check_possible_safepoint();
   }
   return state;
}

If the state returned is non-null it just returns it.  There's special 
code for the state being NULL by checking whether the thread is 
exiting.   When called from VtableStubs::find_stub, it calls 
post_dynamic_code_generated_while_holding_locks.

I actually copied a comment from several jvmtiExport.cpp functions in this:

// post a DynamicCodeGenerated event while holding locks in the VM.
void JvmtiExport::post_dynamic_code_generated_while_holding_locks(const 
char* name,
address code_begin, address code_end)
{
   // register the stub with the current dynamic code event collector
   // Can not take safepoint here so can not use state_for to get
   // jvmti thread state.
   JvmtiThreadState* state = JavaThread::current()->jvmti_thread_state();
   // state can only be NULL if the current thread is exiting which
   // should not happen since we're trying to post an event
...

Coleen

>
> Dan
>
>>
>> I'm retesting this and will have an 02 plus incremental next week.
>>
>>>> I also ran tests with an assert that the jvmti_thread_state() != 
>>>> NULL, just to see if it's ever NULL and it wasn't.  I would have 
>>>> liked to keep the else.
>>>>>
>>>>>> os.cpp ResourceMark needed for debugging.
>>>>>
>>>>> Interesting catch - I would have expected the RM to be higher up 
>>>>> the call if needed. How did you detect this?
>>>>
>>>> Debugging, but I don't remember which thing I was debugging though.
>>>>>
>>>>>> Tested with tier1 on all Oracle platforms, and tier 1-3 on 
>>>>>> linux-x64-debug.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/5103339.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-5103339
>>>>>
>>>>> src/hotspot/share/oops/objArrayKlass.cpp
>>>>> src/hotspot/share/oops/typeArrayKlass.cpp
>>>>>
>>>>> Unclear why the check is predicated on !or_null ??
>>>>
>>>> If the parameter passed or_null is true, then the function never 
>>>> safepoints. I added this comment.
>>>>
>>>>      // If passed or_null, this function will not try to safepoint.
>>>>      if (!or_null) THREAD->check_possible_safepoint();
>>>
>>> Sorry I'm confused by the placement. We want to call 
>>> check_possible_safepoint() on code paths that can safepoint, so that 
>>> we ensure there is no NSV higher up the call chain when we take that 
>>> path. So notwithstanding where the check_unhandled_oops was, instead of
>>>
>>>  352   } else {
>>>  353     if (!or_null) THREAD->check_possible_safepoint();
>>>  354   }
>>>  355
>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>  357   if (or_null) {
>>>  358     return ak->array_klass_or_null(n);
>>>  359   }
>>>  360   return ak->array_klass(n, THREAD);
>>>
>>> shouldn't/couldn't we just have:
>>>
>>>  352   }
>>>  353
>>>  354
>>>  355
>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>  357   if (or_null) {
>>>  358     return ak->array_klass_or_null(n);
>>>  359   }
>>>        THREAD->check_possible_safepoint();
>>>  360   return ak->array_klass(n, THREAD);
>>>
>>> ?
>>
>> I couldn't decide if this is better or worse than what I had because 
>> looking at this, one might question why this call is here, and it 
>> checks the safepoint twice (already in the MutexLocker path).  It's a 
>> few less lines of code.  okay, sure I can change it.  I didn't find a 
>> better refactoring.
>>
>>> That said, are we certain that the xxx_or_null functions cannot 
>>> safepoint? They won't try to load a class but surely they could 
>>> potentially safepoint for other reasons?
>>
>> I don't see a safepoint in this code, or have found one while running 
>> the tests.  Hopefully the NSV would have found it!
>>
>> Coleen
>>
>>>
>>>> This is also called with NSV with or_null true.   I can't remember 
>>>> where, I think it was the heap walker.
>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/code/vtableStubs.cpp
>>>>>
>>>>> // cause a safepoint in this code that has NSV.
>>>>>
>>>>> Unclear what "code" has the NSV ??
>>>>
>>>> See above.  I can change the comment to:
>>>>
>>>> 237 // all locks. Only post this event if a new state is not 
>>>> required. Creating a new state would
>>>> 238 // cause a safepoint and the caller of this code has a 
>>>> NoSafepointVerifier.
>>>
>>> Sounds good!
>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>
>>>>> As mentioned above why not put the check inside 
>>>>> SafepointMechanism::block_if_requested instead?
>>>>
>>>> The other callers of block_if_requested already have a 
>>>> check_possible_safepoint call. The only caller that made sense to 
>>>> add was transition from or to vm.
>>>
>>> Got it.
>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>
>>>>> The Mutex constructor cleanup is incidental but okay. (Does make 
>>>>> me wonder about the whole Monitor construction process though ... 
>>>>> ClearMonitor doesn't need to be protected any more.)
>>>>
>>>> This leaked into this change.  I'll take it out and post a 
>>>> different RFR for it.  The no-arg constructor Monitor isn't needed 
>>>> either so ClearMonitor can be deleted.
>>>
>>> Okay.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>
>>>>> // NSV checking
>>>>> void check_for_valid_safepoint_state(bool potential_vm_operation) 
>>>>> NOT_DEBUG_RETURN;
>>>>> void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>>>
>>>>> Please expand NSV. Does the comment apply to both methods or only 
>>>>> the first?
>>>>
>>>> Both methods.  I'll change it to:
>>>>
>>>>    // These functions check conditions on a JavaThread before 
>>>> possibly going to a safepoint,
>>>>    // including NoSafepointVerifier.
>>>>    void check_for_valid_safepoint_state(bool 
>>>> potential_vm_operation) NOT_DEBUG_RETURN;
>>>>    void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>
>>> Okay on the new comment.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>
>>>> Thanks!
>>>> Coleen
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>
>>
>