RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks
David Holmes
david.holmes at oracle.com
Mon Jan 6 06:37:36 UTC 2020
Hi Dan,
Thanks for this Xmas Eve gift, but I'm afraid it had to wait till after
the holidays :) Happy New Year!
Thanks as always for the extensive writeups on all this to help explain
the details. I have only a few minor comments:
src/hotspot/share/runtime/synchronizer.cpp
166 // Note: the om parameter may or may not have been marked originally.
AFAICS there is a single caller of mark_om_ptr and the OM is unmarked.
Can you clarify (or point to wiki section) why you may be marking an
already marked OM*? To me that implies some kind of nested marking,
which would then make unlocking problematic.
--
175 ObjectMonitor* next =
(ObjectMonitor*)((intptr_t)Atomic::load(&om->_next_om) & ~OM_LOCK_BIT);
Couldn't that be:
ObjectMonitor* next = unmarked_next(om);
?
---
211 // The list head changed so we have to retry.
May I suggest for clarity:
// The list head changed before we could lock it,
// so we have to retry.
---
1155 // used with with block linkage _next_om fields).
typo: with with
---
1164 if (MonitorUsedDeflationThreshold > 0) {
1165 int monitors_used = Atomic::load(&LVars.population) -
Atomic::load(&LVars.free_count);
1166 int monitor_usage = (monitors_used * 100LL) /
Atomic::load(&LVars.population);
This may not matter, but given the fact that the counters are not
updated atomically wrt. the lists, we may be using two different values
of "population" in that calculation. Or is that not possible because
this is executed at a safepoint?
---
1401 // We use the simpler lock-mid-as-we-go protocol to prevent races
"simpler" than what? Somewhere (in the code) we are missing a simple
description of the chained-locking/hand-over-hand-locking pattern that
is being used to allow safe traversal. The comment block at L2275
touches on it but I think this needs to be clearly explained in the meta
comments for the whole OM lifecycle process. Perhaps some list
management notes immediately after the comment block at L1233-L1237 ?
---
1525 // s is locked so there must be a racing walker thread ahead
1526 // of us so we'll give it a chance to finish.
1527 while (is_locked(s)) {
1528 os::naked_short_sleep(1);
1529 }
So this implies that any walking code is safepoint-check free (or more
generally non-blocking). Is that asserted where possible?
---
1947 OrderAccess::storestore(); // flush inits for worker
threads
A storestore barrier doesn't flush anything, it simple maintains an
ordering between stores astride the barrier. If you needed to ensure
visibility of these updates to the GC worker threads then this would not
suffice. But any communication with the GC worker threads (i.e. telling
them there is work to be done) requires explicit synchronization anyway,
so this barrier should not be needed as it should be subsumed by that
existing mechanism.
---
2047 // The working list is local so no need for a memory barrier.
Unclear what kind of barrier you thought you might need. Did you mean
"no need for locking/synchronization"?
---
2189 // With fine grained locks on the monitor lists, it is possible for
2190 // log_monitor_list_counts() to return a value that doesn't match
2191 // LVars.population. So far a higher value has been seen in testing
2192 // so something is being double counted by
log_monitor_list_counts().
AFAICS this code is always executed at a safepoint so I don't understand
how the counts could be out-of-sync. Ditto for the comments here:
2315 // With fine grained locks on LVars.free_list, it is possible
for an
2316 // ObjectMonitor to be prepended to LVars.free_list after we
started
2317 // calculating chk_om_free_count so LVars.free_count may not
2318 // match anymore.
and here:
2346 // With fine grained locks on the monitor lists, it is possible for
2347 // an exiting JavaThread to put its in-use ObjectMonitors on the
2348 // global in-use list after chk_om_in_use_count is calculated
above.
???
---
That's it. :)
Thanks,
David
On 24/12/2019 7:57 am, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm extracting another standalone fix from the Async Monitor Deflation
> project (JDK-8153224) and sending it out for review (and testing)
> separately.
>
> JDK-8235795 replace monitor list mux{Acquire,Release}(&gListLock)
> with spin locks
> https://bugs.openjdk.java.net/browse/JDK-8235795
>
> Here's the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8235795-webrev/0-for-jdk15/
>
> Folks that have reviewed JDK-8153224 will recognize these changes as
> a subset of the monitor list changes from the Async Monitor Deflation
> project. It's a subset because the Async Monitor Deflation project
> needs additional spin locking due to the async deflation work.
>
> The OpenJDK wiki for Async Monitor Deflation has several sections
> dedicated to the Spin-Lock Monitor List Management changes. This
> link will get you to the first section:
>
> Spin-Lock Monitor List Management In Theory
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInTheory
>
>
> The remaining monitor list sections are:
>
> Background: ObjectMonitor Movement Between the Lists
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Background:ObjectMonitorMovementBetweentheLists
>
>
> Spin-Lock Monitor List Management In Reality
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInReality
>
>
> Using The New Spin-Lock Monitor List Functions
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-UsingTheNewSpin-LockMonitorListFunctions
>
>
> Of course, the OpenJDK wiki content is specific to the Async Monitor
> Deflation project, but this extract is a very close subset.
>
> These changes have been tested in various Mach5 Tier[1-7] runs.
> I'm also doing SPECjbb2015 runs.
>
> Thanks, in advance, for comments, questions or suggestions.
>
> Dan
More information about the hotspot-runtime-dev
mailing list