RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks

Fri Jan 24 22:42:21 UTC 2020

On 25/01/2020 4:18 am, Daniel D. Daugherty wrote:
> A follow up on audit_and_print_stats()...
> 
> On 1/24/20 11:48 AM, Daniel D. Daugherty wrote:
>> On 1/23/20 9:09 PM, David Holmes wrote:
> 
> <snip>
> 
>>
>>>
>>>>
>>>>> ---
>>>>>
>>>>> 2189     // With fine grained locks on the monitor lists, it is 
>>>>> possible for
>>>>> 2190     // log_monitor_list_counts() to return a value that 
>>>>> doesn't match
>>>>> 2191     // LVars.population. So far a higher value has been seen 
>>>>> in testing
>>>>> 2192     // so something is being double counted by 
>>>>> log_monitor_list_counts().
>>>>>
>>>>> AFAICS this code is always executed at a safepoint so I don't 
>>>>> understand how the counts could be out-of-sync. Ditto for the 
>>>>> comments here:
>>>>>
>>>>> 2315     // With fine grained locks on LVars.free_list, it is 
>>>>> possible for an
>>>>> 2316     // ObjectMonitor to be prepended to LVars.free_list after 
>>>>> we started
>>>>> 2317     // calculating chk_om_free_count so LVars.free_count may not
>>>>> 2318     // match anymore.
>>>>>
>>>>> and here:
>>>>>
>>>>> 2346     // With fine grained locks on the monitor lists, it is 
>>>>> possible for
>>>>> 2347     // an exiting JavaThread to put its in-use ObjectMonitors 
>>>>> on the
>>>>> 2348     // global in-use list after chk_om_in_use_count is 
>>>>> calculated above.
>>>>>
>>>>> ???
>>>>
>>>> This header comment was intended to cover that rationale:
>>>>
>>>>      L2152: // This function can be called at a safepoint or it can 
>>>> be called when
>>>>      L2153: // we are trying to exit the VM. When we are trying to 
>>>> exit the VM, the
>>>>      L2154: // list walker functions can run in parallel with the 
>>>> other list
>>>>      L2155: // operations so spin-locking is used for safety.
>>>>      L2156: //
>>>>      L2157: // Calls to this function can be added in various places 
>>>> as a debugging
>>>>      L2158: // aid; pass 'true' for the 'on_exit' parameter to have 
>>>> in-use monitor
>>>>      L2159: // details logged at the Info level and 'false' for the 
>>>> 'on_exit'
>>>>      L2160: // parameter to have in-use monitor details logged at 
>>>> the Trace level.
>>>>      L2161: //
>>>>      L2162: void ObjectSynchronizer::audit_and_print_stats(bool 
>>>> on_exit) {
>>>>
>>>> so no the audit_and_print_stats() function and the various chk_*()
>>>> functions can be called at a non-safepoint.
>>>
>>> When we are exiting the VM this is called from exit_globals, which 
>>> does execute at a safepoint - either:
>>>
>>> - from VM_Exit::do_it() (a safepoint VM op); or
>>> - from Threads::destroy_vm() after the VMThread has been terminated, 
>>> which brings the VM to a safepoint first.
>>
>> Hmmm... I have this assert here:
>>
>> 2162 void ObjectSynchronizer::audit_and_print_stats(bool on_exit) {
>> 2163   assert(on_exit || SafepointSynchronize::is_at_safepoint(), 
>> "invariant");
>>
>> and I remember adding the "on_exit" conditional so the assert wouldn't
>> fire. I'll investigate more...
>>
>> Also, audit_and_print_stats() calls can be added as a debugging
>> aide when chasing ObjectMonitor list issues. The way that it is
>> written, it doesn't require a safepoint (when on_exit == true).
>>
>> I'll make the minor edits mentioned above before I send out the
>> webrev for the next round of review.
>>
>> Thanks again for being such a thorough reviewer.
>>
>> Dan
> 
> audit_and_print_stats() was added to the Async Monitor Deflation
> project by this changeset:
> 
> changeset:   34:64accf02d468
> user:        dcubed
> date:        Fri Dec 21 13:52:56 2018 -0500
> summary:     dcubed.monitor_deflate_conc.23, 
> diff.txt.dcubed.monitor_deflate_conc.23 - Add support for 
> ObjectSynchronizer::audit_and_print_stats(), called by exit_globals() at 
> Info level and by finish_deflate_idle_monitors() at Debug level; add 
> is_busy_async() for use by audit_and_print_stats(); restore a 
> guarantee(take->object() == NULL) check in omAlloc() now that 
> AsyncDeflateIdleMonitors cleanup of _object field is finished.
> 
> 
> Looking at my notes for that week, I was chasing tests that were failing
> due to ObjectMonitors that were keeping objects pinned. The tests were
> expecting that enough System.gc() calls would cause an unreferenced
> and collectible object to be GC'ed. This was back when we were having
> the JavaThread do its own deflation...
> 
> It looks like I was using audit_and_print_stats(true) calls to probe
> the VM at different points trying to determine what was keeping the
> object pinned and why... So I was using audit_and_print_stats() at
> non-safepoint locations for debugging purposes...

Okay. If you want this ability that is fine then.

> Dan
> 
> P.S.
> 
> In src/hotspot/share/compiler/compileBroker.cpp:
> CompileBroker::handle_full_code_cache() calls exit_globals()
> when the ExitOnFullCodeCache is true in non-PRODUCT bits.
> I don't think we're at a safepoint for that call... :-)
> I don't see any tests that enable that flag and the default
> is false so I couldn't have run into it in my testing...

Yikes! That's not the way to exit the VM :) But we probably don't really 
care. That termination path seems mid-way between a graceful exit and an 
abort. With no termination safepoint it can be quite likely to run into 
shutdown crashes anyway.

Cheers,
David