RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Jan 24 18:18:42 UTC 2020


A follow up on audit_and_print_stats()...

On 1/24/20 11:48 AM, Daniel D. Daugherty wrote:
> On 1/23/20 9:09 PM, David Holmes wrote:

<snip>

>
>>
>>>
>>>> ---
>>>>
>>>> 2189     // With fine grained locks on the monitor lists, it is 
>>>> possible for
>>>> 2190     // log_monitor_list_counts() to return a value that 
>>>> doesn't match
>>>> 2191     // LVars.population. So far a higher value has been seen 
>>>> in testing
>>>> 2192     // so something is being double counted by 
>>>> log_monitor_list_counts().
>>>>
>>>> AFAICS this code is always executed at a safepoint so I don't 
>>>> understand how the counts could be out-of-sync. Ditto for the 
>>>> comments here:
>>>>
>>>> 2315     // With fine grained locks on LVars.free_list, it is 
>>>> possible for an
>>>> 2316     // ObjectMonitor to be prepended to LVars.free_list after 
>>>> we started
>>>> 2317     // calculating chk_om_free_count so LVars.free_count may not
>>>> 2318     // match anymore.
>>>>
>>>> and here:
>>>>
>>>> 2346     // With fine grained locks on the monitor lists, it is 
>>>> possible for
>>>> 2347     // an exiting JavaThread to put its in-use ObjectMonitors 
>>>> on the
>>>> 2348     // global in-use list after chk_om_in_use_count is 
>>>> calculated above.
>>>>
>>>> ???
>>>
>>> This header comment was intended to cover that rationale:
>>>
>>>      L2152: // This function can be called at a safepoint or it can 
>>> be called when
>>>      L2153: // we are trying to exit the VM. When we are trying to 
>>> exit the VM, the
>>>      L2154: // list walker functions can run in parallel with the 
>>> other list
>>>      L2155: // operations so spin-locking is used for safety.
>>>      L2156: //
>>>      L2157: // Calls to this function can be added in various places 
>>> as a debugging
>>>      L2158: // aid; pass 'true' for the 'on_exit' parameter to have 
>>> in-use monitor
>>>      L2159: // details logged at the Info level and 'false' for the 
>>> 'on_exit'
>>>      L2160: // parameter to have in-use monitor details logged at 
>>> the Trace level.
>>>      L2161: //
>>>      L2162: void ObjectSynchronizer::audit_and_print_stats(bool 
>>> on_exit) {
>>>
>>> so no the audit_and_print_stats() function and the various chk_*()
>>> functions can be called at a non-safepoint.
>>
>> When we are exiting the VM this is called from exit_globals, which 
>> does execute at a safepoint - either:
>>
>> - from VM_Exit::do_it() (a safepoint VM op); or
>> - from Threads::destroy_vm() after the VMThread has been terminated, 
>> which brings the VM to a safepoint first.
>
> Hmmm... I have this assert here:
>
> 2162 void ObjectSynchronizer::audit_and_print_stats(bool on_exit) {
> 2163   assert(on_exit || SafepointSynchronize::is_at_safepoint(), 
> "invariant");
>
> and I remember adding the "on_exit" conditional so the assert wouldn't
> fire. I'll investigate more...
>
> Also, audit_and_print_stats() calls can be added as a debugging
> aide when chasing ObjectMonitor list issues. The way that it is
> written, it doesn't require a safepoint (when on_exit == true).
>
> I'll make the minor edits mentioned above before I send out the
> webrev for the next round of review.
>
> Thanks again for being such a thorough reviewer.
>
> Dan

audit_and_print_stats() was added to the Async Monitor Deflation
project by this changeset:

changeset:   34:64accf02d468
user:        dcubed
date:        Fri Dec 21 13:52:56 2018 -0500
summary:     dcubed.monitor_deflate_conc.23, 
diff.txt.dcubed.monitor_deflate_conc.23 - Add support for 
ObjectSynchronizer::audit_and_print_stats(), called by exit_globals() at 
Info level and by finish_deflate_idle_monitors() at Debug level; add 
is_busy_async() for use by audit_and_print_stats(); restore a 
guarantee(take->object() == NULL) check in omAlloc() now that 
AsyncDeflateIdleMonitors cleanup of _object field is finished.


Looking at my notes for that week, I was chasing tests that were failing
due to ObjectMonitors that were keeping objects pinned. The tests were
expecting that enough System.gc() calls would cause an unreferenced
and collectible object to be GC'ed. This was back when we were having
the JavaThread do its own deflation...

It looks like I was using audit_and_print_stats(true) calls to probe
the VM at different points trying to determine what was keeping the
object pinned and why... So I was using audit_and_print_stats() at
non-safepoint locations for debugging purposes...

Dan

P.S.

In src/hotspot/share/compiler/compileBroker.cpp:
CompileBroker::handle_full_code_cache() calls exit_globals()
when the ExitOnFullCodeCache is true in non-PRODUCT bits.
I don't think we're at a safepoint for that call... :-)
I don't see any tests that enable that flag and the default
is false so I couldn't have run into it in my testing...


More information about the hotspot-runtime-dev mailing list