RFR (S) 5103339: Strengthen NoSafepointVerifier

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Wed Aug 14 14:35:49 UTC 2019



On 8/14/19 8:52 AM, David Holmes wrote:
> Hi Coleen,
>
> On 14/08/2019 10:47 am, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 8/7/19 7:18 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> One follow-up:
>>>
>>> >> That said, are we certain that the xxx_or_null functions cannot
>>> >> safepoint? They won't try to load a class but surely they could
>>> >> potentially safepoint for other reasons?
>>> >
>>> > I don't see a safepoint in this code, or have found one while running
>>> > the tests.  Hopefully the NSV would have found it!
>>>
>>> My question is really whether these functions are semantically 
>>> guaranteed to never safepoint? 
>>
>> No, these functions _do_ safepoint taking out a lock.
>
> I think we're getting our wires crossed somewhere. The code you have 
> is this:
>
>     if (or_null) {
>       return ak->array_klass_or_null(n);
>     }
> +   THREAD->check_possible_safepoint();
>     return ak->array_klass(n, THREAD);
>
> so we expect the array_klass() call to safepoint and hence we check 
> that is allowed before taking that path. We don't check the path that 
> calls array_klass_or_null() which suggests we don't expect it to take 
> a safepoint. My point is that AFAIK there is nothing that guarantees 
> array_klass_or_null() will not perform a safepoint check, so anyone 
> assuming it doesn't may be mistaken. Which suggests we should be 
> calling check_possible_safepoint() unconditionally so that both paths 
> are covered.

Hi,  I think there's some confusion here.  The function 
array_klass_or_null() calls array_klass_impl(or_null == true), and that 
function's callers expect it not to safepoint.   I have another patch 
that I can elaborate in more details.

The array_klass_impl(or_null == false) which is called by array_klass() 
can safepoint when taking out a mutex in the 'if' condition above:

  329   // lock-free read needs acquire semantics
  330   if (higher_dimension_acquire() == NULL) {
  331     if (or_null) return NULL;
  332
  333     ResourceMark rm;
  334     JavaThread *jt = (JavaThread *)THREAD;
  335     {
  336       // Ensure atomic creation of higher dimensions
  337       MutexLocker mu(MultiArray_lock, THREAD);



So this function will safepoint if higher_dimension is NULL and or_null 
is false.   We want to have this unconditionally check for safepoints 
when the or_null is false, but higher_dimension is not null, essentially 
unconditionally when array_klass() is called. The MutexLocker checks 
possible safepoint inside, so we needed this in the 'else' clause.   Or 
in the location below.

This is somewhat tortured logic because this function is to be called in 
different contexts.
>
>>> Is that a property that these functions expose to their callers? I 
>>> don't recall seeing anything to that affect, so it seems to me this 
>>> is just how the implementation happens to be, but is not guaranteed 
>>> to be so. If there was a NSV in the call path leading to this then I 
>>> would have to ask the person who put in the NSV on what basis they 
>>> think this mass of code should not ever hit a safepoint.
>> There isn't a property of these or any functions that guarantee they 
>> won't safepoint.  Some people have claimed having a TRAPS argumnent 
>> is a good indicator, but many functions without TRAPS may take out a 
>> MutexLocker which can safepoint.  I think it's safe to assume most 
>> functions safepoint and taking out a NSV must be careful and call 
>> into code that is contained.  The change is that NSV should assert if 
>> that condition is not met.
>>
>> The fact that this code conditionally safepoints makes it less 
>> obvious, but it's still worth checking possible safepoints here.
>>
>> In the or_null case, there actually isn't a caller with a NSV. In a 
>> future patch, I made calling this code with a MutexLocker with 
>> no_safepoint_check_flag and allow_vm_block = true imply a NSV.  The 
>> caller must use the array_klass_or_null() function or it will safepoint.
>
>> I made the changes you suggested.
>
> And changes others suggested :)
>
>> Incremental: 
>> http://cr.openjdk.java.net/~coleenp/2019/5103339.02.incr/webrev
>> Full: http://cr.openjdk.java.net/~coleenp/2019/5103339.02/webrev
>
> This all seems fine to me. With regards to the discussion above, what 
> you have done is replace the old call to clear_unhandled_oops with the 
> new call to check_possible_safepoint under the same conditions. I'm 
> querying whether those original conditions were actually too strict, 
> but that is in a sense outside the scope of this change.

The check unhandled oops case might have been too strict but was 
probably benign because the caller in the or_null == true case didn't 
have any unhandled oops that were subsequently dereferenced.   Ie 
unhandled oops shouldn't have been checked there but we didn't know it.

Thanks for the review and discussions.
Coleen

>
> Thanks,
> David
>
>> Thanks,
>> Coleen
>>
>>>
>>> Cheers,
>>> David
>>>
>>> On 8/08/2019 1:49 am, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Thanks David,
>>>>
>>>> On 8/6/19 10:36 PM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 7/08/2019 8:43 am, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Thanks David for reading this.
>>>>>>
>>>>>> On 8/5/19 9:47 PM, David Holmes wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> Could please update the bug explaining why it was re-opened and 
>>>>>>> what the intended enhancement is. It was closed in 2005 as not 
>>>>>>> necessary because of the unhandled-oop checking.
>>>>>>
>>>>>> Yes, I added a comment.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>>>
>>>>>>> On 6/08/2019 2:00 am, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Add NSV check at possible safepoint transition or 
>>>>>>>> places that could take out locks. Consolidate with clearing 
>>>>>>>> unhandled oops.
>>>>>>>>
>>>>>>>> This change checks for NoSafepointVerifier no_safepoint_counts 
>>>>>>>> at possible safepoints.  The starting set is at transitions, 
>>>>>>>> and in the 
>>>>>>>
>>>>>>> Wouldn't it be better placed in the actual Safepoint checking 
>>>>>>> methods rather than callers of those methods?
>>>>>>
>>>>>> See my reply to Robbin.  We want to make the checks in places 
>>>>>> that conditionally may not safepoint poll, so that timing doesn't 
>>>>>> prevent finding when we've violated the NoSafepointVerifier check.
>>>>>
>>>>> Okay.
>>>>>
>>>>>>>
>>>>>>>> "else" clauses where CHECK_UNHANDLED_OOPS were clearing 
>>>>>>>> unhandled oops. Some of these were removed because they weren't 
>>>>>>>> places with possible safepoints, so were wrong.
>>>>>>>>
>>>>>>>> The unhandled oops clearing and no_safepoint counter check are 
>>>>>>>> now done in the same function. MemAllocator -> 
>>>>>>>> check_for_valid_allocation_state calls 
>>>>>>>> check_for_valid_safepoint_state which calls 
>>>>>>>> check_possible_safepoint.
>>>>>>>>
>>>>>>>> Calls to check_possible_safepoint are in DEBUG_ONLY when 
>>>>>>>> Thread::current() is called.
>>>>>>>>
>>>>>>>> I had to remove it from the else clause in JvmtiThreadState 
>>>>>>>> because it's called from a place that cannot safepoint (see 
>>>>>>>> vtableStubs.cpp).
>>>>>>>
>>>>>>> Can you elaborate on that further please. It's not obvious, 
>>>>>>> without following call chains, when any code may or may not hit 
>>>>>>> a safepoint check.
>>>>>>
>>>>>> No, it isn't.   There's quite a large call stack here, but the 
>>>>>> CompiledICLocker has an embedded NoSafepointVerifier, so here's 
>>>>>> the call stack:
>>>>>>
>>>>>> V  [libjvm.so+0x16c5d05] Thread::check_possible_safepoint()+0x55
>>>>>> V  [libjvm.so+0x10fb15c] 
>>>>>> JvmtiExport::post_dynamic_code_generated_while_holding_locks(char 
>>>>>> const*, unsigned char*, unsigned char*)+0x4c
>>>>>> V  [libjvm.so+0x17c8fac]  VtableStubs::find_stub(bool, int)+0x2ac
>>>>>> V  [libjvm.so+0x9dd18e] CompiledIC::set_to_megamorphic(CallInfo*, 
>>>>>> Bytecodes::Code, bool&, Thread*)+0x9e
>>>>>> V  [libjvm.so+0x1562bb4] 
>>>>>> SharedRuntime::handle_ic_miss_helper_internal(Handle, 
>>>>>> CompiledMethod*, frame const&, methodHandle, Bytecodes::Code, 
>>>>>> CallInfo&, bool&, Thread*)+0x434
>>>>>> V  [libjvm.so+0x156d202] 
>>>>>> SharedRuntime::handle_ic_miss_helper(JavaThread*, Thread*)+0x372
>>>>>> V  [libjvm.so+0x156d7a4] 
>>>>>> SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x184
>>>>>>
>>>>>> The function handle_ic_miss_helper_internal has a 
>>>>>> CompiledICLocker, which has an embedded NoSafepointVerifier.
>>>>>>
>>>>>> This seems like a lot of code to be protected with a 
>>>>>> NoSafepointVerifier, but I assume that at least the find_stub 
>>>>>> code shouldn't get cleaned up by a safepoint, so would require it.
>>>>>
>>>>> Okay so because you added the call to jvmti_thread_state() in the 
>>>>> vtableStubs code you have to remove the check from the else-clause 
>>>>> in jvmti_thread_state(), because it could fail with the new call 
>>>>> path.
>>>>
>>>> Yes, but I'm not happy with this so I've reinstated the check in 
>>>> the 'else' clause and changed 
>>>> post_dynamic_code_generated_while_holding_locks() to simply get the 
>>>> jvmti_state directly and not call JvmtiThreadState::state_for().  
>>>> It already had an assert that it wasn't null.
>>>>
>>>> I'm retesting this and will have an 02 plus incremental next week.
>>>>
>>>>>> I also ran tests with an assert that the jvmti_thread_state() != 
>>>>>> NULL, just to see if it's ever NULL and it wasn't.  I would have 
>>>>>> liked to keep the else.
>>>>>>>
>>>>>>>> os.cpp ResourceMark needed for debugging.
>>>>>>>
>>>>>>> Interesting catch - I would have expected the RM to be higher up 
>>>>>>> the call if needed. How did you detect this?
>>>>>>
>>>>>> Debugging, but I don't remember which thing I was debugging though.
>>>>>>>
>>>>>>>> Tested with tier1 on all Oracle platforms, and tier 1-3 on 
>>>>>>>> linux-x64-debug.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/5103339.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-5103339
>>>>>>>
>>>>>>> src/hotspot/share/oops/objArrayKlass.cpp
>>>>>>> src/hotspot/share/oops/typeArrayKlass.cpp
>>>>>>>
>>>>>>> Unclear why the check is predicated on !or_null ??
>>>>>>
>>>>>> If the parameter passed or_null is true, then the function never 
>>>>>> safepoints. I added this comment.
>>>>>>
>>>>>>      // If passed or_null, this function will not try to safepoint.
>>>>>>      if (!or_null) THREAD->check_possible_safepoint();
>>>>>
>>>>> Sorry I'm confused by the placement. We want to call 
>>>>> check_possible_safepoint() on code paths that can safepoint, so 
>>>>> that we ensure there is no NSV higher up the call chain when we 
>>>>> take that path. So notwithstanding where the check_unhandled_oops 
>>>>> was, instead of
>>>>>
>>>>>  352   } else {
>>>>>  353     if (!or_null) THREAD->check_possible_safepoint();
>>>>>  354   }
>>>>>  355
>>>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>>>  357   if (or_null) {
>>>>>  358     return ak->array_klass_or_null(n);
>>>>>  359   }
>>>>>  360   return ak->array_klass(n, THREAD);
>>>>>
>>>>> shouldn't/couldn't we just have:
>>>>>
>>>>>  352   }
>>>>>  353
>>>>>  354
>>>>>  355
>>>>>  356   ObjArrayKlass *ak = ObjArrayKlass::cast(higher_dimension());
>>>>>  357   if (or_null) {
>>>>>  358     return ak->array_klass_or_null(n);
>>>>>  359   }
>>>>>        THREAD->check_possible_safepoint();
>>>>>  360   return ak->array_klass(n, THREAD);
>>>>>
>>>>> ?
>>>>
>>>> I couldn't decide if this is better or worse than what I had 
>>>> because looking at this, one might question why this call is here, 
>>>> and it checks the safepoint twice (already in the MutexLocker 
>>>> path).  It's a few less lines of code.  okay, sure I can change 
>>>> it.  I didn't find a better refactoring.
>>>>
>>>>> That said, are we certain that the xxx_or_null functions cannot 
>>>>> safepoint? They won't try to load a class but surely they could 
>>>>> potentially safepoint for other reasons?
>>>>
>>>> I don't see a safepoint in this code, or have found one while 
>>>> running the tests.  Hopefully the NSV would have found it!
>>>>
>>>> Coleen
>>>>
>>>>>
>>>>>> This is also called with NSV with or_null true.   I can't 
>>>>>> remember where, I think it was the heap walker.
>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/code/vtableStubs.cpp
>>>>>>>
>>>>>>> // cause a safepoint in this code that has NSV.
>>>>>>>
>>>>>>> Unclear what "code" has the NSV ??
>>>>>>
>>>>>> See above.  I can change the comment to:
>>>>>>
>>>>>> 237 // all locks. Only post this event if a new state is not 
>>>>>> required. Creating a new state would
>>>>>> 238 // cause a safepoint and the caller of this code has a 
>>>>>> NoSafepointVerifier.
>>>>>
>>>>> Sounds good!
>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>>>
>>>>>>> As mentioned above why not put the check inside 
>>>>>>> SafepointMechanism::block_if_requested instead?
>>>>>>
>>>>>> The other callers of block_if_requested already have a 
>>>>>> check_possible_safepoint call. The only caller that made sense to 
>>>>>> add was transition from or to vm.
>>>>>
>>>>> Got it.
>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>>>
>>>>>>> The Mutex constructor cleanup is incidental but okay. (Does make 
>>>>>>> me wonder about the whole Monitor construction process though 
>>>>>>> ... ClearMonitor doesn't need to be protected any more.)
>>>>>>
>>>>>> This leaked into this change.  I'll take it out and post a 
>>>>>> different RFR for it.  The no-arg constructor Monitor isn't 
>>>>>> needed either so ClearMonitor can be deleted.
>>>>>
>>>>> Okay.
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>>>
>>>>>>> // NSV checking
>>>>>>> void check_for_valid_safepoint_state(bool 
>>>>>>> potential_vm_operation) NOT_DEBUG_RETURN;
>>>>>>> void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>>>>>
>>>>>>> Please expand NSV. Does the comment apply to both methods or 
>>>>>>> only the first?
>>>>>>
>>>>>> Both methods.  I'll change it to:
>>>>>>
>>>>>>    // These functions check conditions on a JavaThread before 
>>>>>> possibly going to a safepoint,
>>>>>>    // including NoSafepointVerifier.
>>>>>>    void check_for_valid_safepoint_state(bool 
>>>>>> potential_vm_operation) NOT_DEBUG_RETURN;
>>>>>>    void check_possible_safepoint() NOT_DEBUG_RETURN;
>>>>>
>>>>> Okay on the new comment.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Coleen
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>
>>



More information about the hotspot-runtime-dev mailing list