RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jan 24 22:57:47 UTC 2020
On 1/24/20 5:42 PM, David Holmes wrote:
> On 25/01/2020 4:18 am, Daniel D. Daugherty wrote:
>> A follow up on audit_and_print_stats()...
>>
>> On 1/24/20 11:48 AM, Daniel D. Daugherty wrote:
>>> On 1/23/20 9:09 PM, David Holmes wrote:
>>
>> <snip>
>>
>>>
>>>>
>>>>>
>>>>>> ---
>>>>>>
>>>>>> 2189 // With fine grained locks on the monitor lists, it is
>>>>>> possible for
>>>>>> 2190 // log_monitor_list_counts() to return a value that
>>>>>> doesn't match
>>>>>> 2191 // LVars.population. So far a higher value has been seen
>>>>>> in testing
>>>>>> 2192 // so something is being double counted by
>>>>>> log_monitor_list_counts().
>>>>>>
>>>>>> AFAICS this code is always executed at a safepoint so I don't
>>>>>> understand how the counts could be out-of-sync. Ditto for the
>>>>>> comments here:
>>>>>>
>>>>>> 2315 // With fine grained locks on LVars.free_list, it is
>>>>>> possible for an
>>>>>> 2316 // ObjectMonitor to be prepended to LVars.free_list
>>>>>> after we started
>>>>>> 2317 // calculating chk_om_free_count so LVars.free_count may
>>>>>> not
>>>>>> 2318 // match anymore.
>>>>>>
>>>>>> and here:
>>>>>>
>>>>>> 2346 // With fine grained locks on the monitor lists, it is
>>>>>> possible for
>>>>>> 2347 // an exiting JavaThread to put its in-use
>>>>>> ObjectMonitors on the
>>>>>> 2348 // global in-use list after chk_om_in_use_count is
>>>>>> calculated above.
>>>>>>
>>>>>> ???
>>>>>
>>>>> This header comment was intended to cover that rationale:
>>>>>
>>>>> L2152: // This function can be called at a safepoint or it
>>>>> can be called when
>>>>> L2153: // we are trying to exit the VM. When we are trying to
>>>>> exit the VM, the
>>>>> L2154: // list walker functions can run in parallel with the
>>>>> other list
>>>>> L2155: // operations so spin-locking is used for safety.
>>>>> L2156: //
>>>>> L2157: // Calls to this function can be added in various
>>>>> places as a debugging
>>>>> L2158: // aid; pass 'true' for the 'on_exit' parameter to
>>>>> have in-use monitor
>>>>> L2159: // details logged at the Info level and 'false' for
>>>>> the 'on_exit'
>>>>> L2160: // parameter to have in-use monitor details logged at
>>>>> the Trace level.
>>>>> L2161: //
>>>>> L2162: void ObjectSynchronizer::audit_and_print_stats(bool
>>>>> on_exit) {
>>>>>
>>>>> so no the audit_and_print_stats() function and the various chk_*()
>>>>> functions can be called at a non-safepoint.
>>>>
>>>> When we are exiting the VM this is called from exit_globals, which
>>>> does execute at a safepoint - either:
>>>>
>>>> - from VM_Exit::do_it() (a safepoint VM op); or
>>>> - from Threads::destroy_vm() after the VMThread has been
>>>> terminated, which brings the VM to a safepoint first.
>>>
>>> Hmmm... I have this assert here:
>>>
>>> 2162 void ObjectSynchronizer::audit_and_print_stats(bool on_exit) {
>>> 2163 assert(on_exit || SafepointSynchronize::is_at_safepoint(),
>>> "invariant");
>>>
>>> and I remember adding the "on_exit" conditional so the assert wouldn't
>>> fire. I'll investigate more...
>>>
>>> Also, audit_and_print_stats() calls can be added as a debugging
>>> aide when chasing ObjectMonitor list issues. The way that it is
>>> written, it doesn't require a safepoint (when on_exit == true).
>>>
>>> I'll make the minor edits mentioned above before I send out the
>>> webrev for the next round of review.
>>>
>>> Thanks again for being such a thorough reviewer.
>>>
>>> Dan
>>
>> audit_and_print_stats() was added to the Async Monitor Deflation
>> project by this changeset:
>>
>> changeset: 34:64accf02d468
>> user: dcubed
>> date: Fri Dec 21 13:52:56 2018 -0500
>> summary: dcubed.monitor_deflate_conc.23,
>> diff.txt.dcubed.monitor_deflate_conc.23 - Add support for
>> ObjectSynchronizer::audit_and_print_stats(), called by exit_globals()
>> at Info level and by finish_deflate_idle_monitors() at Debug level;
>> add is_busy_async() for use by audit_and_print_stats(); restore a
>> guarantee(take->object() == NULL) check in omAlloc() now that
>> AsyncDeflateIdleMonitors cleanup of _object field is finished.
>>
>>
>> Looking at my notes for that week, I was chasing tests that were failing
>> due to ObjectMonitors that were keeping objects pinned. The tests were
>> expecting that enough System.gc() calls would cause an unreferenced
>> and collectible object to be GC'ed. This was back when we were having
>> the JavaThread do its own deflation...
>>
>> It looks like I was using audit_and_print_stats(true) calls to probe
>> the VM at different points trying to determine what was keeping the
>> object pinned and why... So I was using audit_and_print_stats() at
>> non-safepoint locations for debugging purposes...
>
> Okay. If you want this ability that is fine then.
Thanks! (It has been useful...)
>
>> Dan
>>
>> P.S.
>>
>> In src/hotspot/share/compiler/compileBroker.cpp:
>> CompileBroker::handle_full_code_cache() calls exit_globals()
>> when the ExitOnFullCodeCache is true in non-PRODUCT bits.
>> I don't think we're at a safepoint for that call... :-)
>> I don't see any tests that enable that flag and the default
>> is false so I couldn't have run into it in my testing...
>
> Yikes! That's not the way to exit the VM :) But we probably don't
> really care. That termination path seems mid-way between a graceful
> exit and an abort. With no termination safepoint it can be quite
> likely to run into shutdown crashes anyway.
Does seem to be a scary piece of code... it was added by this changeset:
$ hg log -r 71fd601907dc hotspot/src/share/vm/compiler/compileBroker.cpp
changeset: 4750:71fd601907dc
user: kvn
date: Fri Jan 29 09:27:22 2010 -0800
summary: 4360113: Evict nmethods when code cache gets full
so almost 10 years ago... :-) Dunno if anyone still uses this
option (ExitOnFullCodeCache)...
Dan
>
> Cheers,
> David
More information about the hotspot-runtime-dev
mailing list