RFR (S) 5103339: Strengthen NoSafepointVerifier

Wed Aug 7 15:55:17 UTC 2019

Query below...

On 8/7/19 11:49 AM, coleen.phillimore at oracle.com wrote:
>
> Thanks David,
>
> On 8/6/19 10:36 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 7/08/2019 8:43 am, coleen.phillimore at oracle.com wrote:
>>>
>>> Thanks David for reading this.
>>>
>>> On 8/5/19 9:47 PM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> Could please update the bug explaining why it was re-opened and 
>>>> what the intended enhancement is. It was closed in 2005 as not 
>>>> necessary because of the unhandled-oop checking.
>>>
>>> Yes, I added a comment.
>>
>> Thanks!
>>
>>>>
>>>> On 6/08/2019 2:00 am, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Add NSV check at possible safepoint transition or places 
>>>>> that could take out locks. Consolidate with clearing unhandled oops.
>>>>>
>>>>> This change checks for NoSafepointVerifier no_safepoint_counts at 
>>>>> possible safepoints.  The starting set is at transitions, and in the 
>>>>
>>>> Wouldn't it be better placed in the actual Safepoint checking 
>>>> methods rather than callers of those methods?
>>>
>>> See my reply to Robbin.  We want to make the checks in places that 
>>> conditionally may not safepoint poll, so that timing doesn't prevent 
>>> finding when we've violated the NoSafepointVerifier check.
>>
>> Okay.
>>
>>>>
>>>>> "else" clauses where CHECK_UNHANDLED_OOPS were clearing unhandled 
>>>>> oops. Some of these were removed because they weren't places with 
>>>>> possible safepoints, so were wrong.
>>>>>
>>>>> The unhandled oops clearing and no_safepoint counter check are now 
>>>>> done in the same function.  MemAllocator -> 
>>>>> check_for_valid_allocation_state calls 
>>>>> check_for_valid_safepoint_state which calls check_possible_safepoint.
>>>>>
>>>>> Calls to check_possible_safepoint are in DEBUG_ONLY when 
>>>>> Thread::current() is called.
>>>>>
>>>>> I had to remove it from the else clause in JvmtiThreadState 
>>>>> because it's called from a place that cannot safepoint (see 
>>>>> vtableStubs.cpp).
>>>>
>>>> Can you elaborate on that further please. It's not obvious, without 
>>>> following call chains, when any code may or may not hit a safepoint 
>>>> check.
>>>
>>> No, it isn't.   There's quite a large call stack here, but the 
>>> CompiledICLocker has an embedded NoSafepointVerifier, so here's the 
>>> call stack:
>>>
>>> V  [libjvm.so+0x16c5d05] Thread::check_possible_safepoint()+0x55
>>> V  [libjvm.so+0x10fb15c] 
>>> JvmtiExport::post_dynamic_code_generated_while_holding_locks(char 
>>> const*, unsigned char*, unsigned char*)+0x4c
>>> V  [libjvm.so+0x17c8fac]  VtableStubs::find_stub(bool, int)+0x2ac
>>> V  [libjvm.so+0x9dd18e] CompiledIC::set_to_megamorphic(CallInfo*, 
>>> Bytecodes::Code, bool&, Thread*)+0x9e
>>> V  [libjvm.so+0x1562bb4] 
>>> SharedRuntime::handle_ic_miss_helper_internal(Handle, 
>>> CompiledMethod*, frame const&, methodHandle, Bytecodes::Code, 
>>> CallInfo&, bool&, Thread*)+0x434
>>> V  [libjvm.so+0x156d202] 
>>> SharedRuntime::handle_ic_miss_helper(JavaThread*, Thread*)+0x372
>>> V  [libjvm.so+0x156d7a4] 
>>> SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x184
>>>
>>> The function handle_ic_miss_helper_internal has a CompiledICLocker, 
>>> which has an embedded NoSafepointVerifier.
>>>
>>> This seems like a lot of code to be protected with a 
>>> NoSafepointVerifier, but I assume that at least the find_stub code 
>>> shouldn't get cleaned up by a safepoint, so would require it.
>>
>> Okay so because you added the call to jvmti_thread_state() in the 
>> vtableStubs code you have to remove the check from the else-clause in 
>> jvmti_thread_state(), because it could fail with the new call path.
>
> Yes, but I'm not happy with this so I've reinstated the check in the 
> 'else' clause and changed 
> post_dynamic_code_generated_while_holding_locks() to simply get the 
> jvmti_state directly and not call JvmtiThreadState::state_for().  It 
> already had an assert that it wasn't null.

Not calling JvmtiThreadState::state_for() caught my eye. There's a bit
of a protocol involved with safely getting the JvmtiThreadState, properly
caching it, and dealing with it going away.

What's the reason for not using JvmtiThreadState::state_for()?

Dan

>
> I'm retesting this and will have an 02 plus incremental next week.
>
>>> I also ran tests with an assert that the jvmti_thread_state() != 
>>> NULL, just to see if it's ever NULL and it wasn't.  I would have 
>>> liked to keep the else.
>>>>
>>>>> os.cpp ResourceMark needed for debugging.
>>>>
>>>> Interesting catch - I would have expected the RM to be higher up 
>>>> the call if needed. How did you detect this?
>>>
>>> Debugging, but I don't remember which thing I was debugging though.
>>>>
>>>>> Tested with tier1 on all Oracle platforms, and tier 1-3 on 
>>>>> linux-x64-debug.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2019/5103339.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-5103339
>>>>
>>>> src/hotspot/share/oops/objArrayKlass.cpp
>>>> src/hotspot/share/oops/typeArrayKlass.cpp
>>>>
>>>> Unclear why the check is predicated on !or_null ??
>>>
>>> If the parameter passed or_null is true, then the function never 
>>> safepoints. I added this comment.
>>>
>>>      // If passed or_null, this function will not try to safepoint.
>>>      if (!or_null) THREAD->check_possible_safepoint();
>>
>> Sorry I'm confused by the placement. We want to call 
>> check_possible_safepoint() on code paths that can safepoint, so that 
>> we ensure there is no NSV higher up the call chain when we take that 
>> path. So notwithstanding where the check_unhandled_oops was, instead of
>>
>>  352   } else {
>>  353     if (!or_null) THREAD->check_possible_safepoint();
>>  354   }
>>  355
>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>  357   if (or_null) {
>>  358     return ak->array_klass_or_null(n);
>>  359   }
>>  360   return ak->array_klass(n, THREAD);
>>
>> shouldn't/couldn't we just have:
>>
>>  352   }
>>  353
>>  354
>>  355
>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>  357   if (or_null) {
>>  358     return ak->array_klass_or_null(n);
>>  359   }
>>        THREAD->check_possible_safepoint();
>>  360   return ak->array_klass(n, THREAD);
>>
>> ?
>
> I couldn't decide if this is better or worse than what I had because 
> looking at this, one might question why this call is here, and it 
> checks the safepoint twice (already in the MutexLocker path).  It's a 
> few less lines of code.  okay, sure I can change it.  I didn't find a 
> better refactoring.
>
>> That said, are we certain that the xxx_or_null functions cannot 
>> safepoint? They won't try to load a class but surely they could 
>> potentially safepoint for other reasons?
>
> I don't see a safepoint in this code, or have found one while running 
> the tests.  Hopefully the NSV would have found it!
>
> Coleen
>
>>
>>> This is also called with NSV with or_null true.   I can't remember 
>>> where, I think it was the heap walker.
>>>
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/code/vtableStubs.cpp
>>>>
>>>> // cause a safepoint in this code that has NSV.
>>>>
>>>> Unclear what "code" has the NSV ??
>>>
>>> See above.  I can change the comment to:
>>>
>>> 237 // all locks. Only post this event if a new state is not 
>>> required. Creating a new state would
>>> 238 // cause a safepoint and the caller of this code has a 
>>> NoSafepointVerifier.
>>
>> Sounds good!
>>
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>
>>>> As mentioned above why not put the check inside 
>>>> SafepointMechanism::block_if_requested instead?
>>>
>>> The other callers of block_if_requested already have a 
>>> check_possible_safepoint call. The only caller that made sense to 
>>> add was transition from or to vm.
>>
>> Got it.
>>
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/mutex.cpp
>>>>
>>>> The Mutex constructor cleanup is incidental but okay. (Does make me 
>>>> wonder about the whole Monitor construction process though ... 
>>>> ClearMonitor doesn't need to be protected any more.)
>>>
>>> This leaked into this change.  I'll take it out and post a different 
>>> RFR for it.  The no-arg constructor Monitor isn't needed either so 
>>> ClearMonitor can be deleted.
>>
>> Okay.
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/runtime/thread.hpp
>>>>
>>>> // NSV checking
>>>> void check_for_valid_safepoint_state(bool potential_vm_operation) 
>>>> NOT_DEBUG_RETURN;
>>>> void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>>
>>>> Please expand NSV. Does the comment apply to both methods or only 
>>>> the first?
>>>
>>> Both methods.  I'll change it to:
>>>
>>>    // These functions check conditions on a JavaThread before 
>>> possibly going to a safepoint,
>>>    // including NoSafepointVerifier.
>>>    void check_for_valid_safepoint_state(bool potential_vm_operation) 
>>> NOT_DEBUG_RETURN;
>>>    void check_possible_safepoint() NOT_DEBUG_RETURN;
>>
>> Okay on the new comment.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> Thanks!
>>> Coleen
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
>