RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jan 31 23:48:37 UTC 2020
Hi Coleen,
Thanks for the review!
Replies embedded below (as usual)...
On 1/30/20 8:22 PM, coleen.phillimore at oracle.com wrote:
>
> Dan,
>
> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/objectMonitor.hpp.udiff.html
>
>
> There should be three functions in ObjectMonitor like:
>
> ObjectMonitor* next_om() const { Atomic::load(&_next_om); }
> void set_next_om(ObjectMonitor* value) { Atomic::store(&_next_om,
> value); }
> and one for the CAS.
I like it! It makes sense to do this since I'm touch all the uses
of _next_om.
So here's what I added to
src/hotspot/share/runtime/objectMonitor.inline.hpp:
+inline ObjectMonitor* ObjectMonitor::next_om() const {
+ return Atomic::load(&_next_om);
+}
+
+// Simply set _next_om field to new_value.
+inline void ObjectMonitor::set_next_om(ObjectMonitor* new_value) {
+ Atomic::store(&_next_om, new_value);
+}
+
+// Try to set _next_om field to new_value if the current value matches
+// old_value. Otherwise, does not change the _next_om field. Returns
+// the prior value of the _next_om field.
+inline ObjectMonitor* ObjectMonitor::try_set_next_om(ObjectMonitor*
old_value, ObjectMonitor* new_value) {
+ return Atomic::cmpxchg(&_next_om, old_value, new_value);
+}
>
> And _next_om should be private. That way all accesses to this field
> don't need checking (ie. eyeball checking).
Adding the accessors and setters functions allowed _next_om
to become private. One less crufty piece!
>
> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/synchronizer.cpp.sdiff.html
>
>
> 129 ObjectMonitor* free_list;
>
>
> These fields should be prepended with _ ie. _free_list to follow the
> coding standard.
Done. It feels a little iffy since these globals are only fields
in a struct because we want padding... but I renamed them anyway.
> I would rename LVars to something like _om_globals like
> _om_globals._free_list . I think LVars is an odd name without any
> meaning.
In the existing code we have:
struct SharedGlobals {
static SharedGlobals GVars;
so when I added padding to the global list variables I
followed the existing style:
struct ListGlobals {
static ListGlobals LVars;
I think I'm going to go with:
struct ObjectMonitorListGlobals {
static ObjectMonitorListGlobals om_list_globals;
Note no leading '_' on the global variable.
> 230 // Set the next field in an ObjectMonitor to the specified value.
> 231 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
> 232 Atomic::store(&om->_next_om, value);
> 233 }
> 234
>
> the callers of this can just call om->set_next_om(value);
Done and removed the static function.
> I almost think you don't need om in the names and have the functions
> be next() and set_next()
The field name is _next_om so my preference is for next_om()
and set_next_om()...
> + if (!try_om_lock(tail)) {
> + continue; // failed to lock tail so try it all again
> + }
>
>
> Why would 'tail' be locked here?
First some context:
L238: static void prepend_list_to_common(ObjectMonitor* list,
ObjectMonitor* tail,
L239: int count, ObjectMonitor**
list_p,
L240: int* count_p) {
<snip>
L243: // Prepend list to *list_p.
L244: if (!try_om_lock(tail)) {
L245: continue; // failed to lock tail so try it all again
L246: }
L247: set_next(tail, cur); // tail now points to cur (and
unlocks tail)
prepend_list_to_common() is called by:
- prepend_block_to_lists()
- The 'list' and 'tail' belong to a newly allocated list of
ObjectMonitors so no race here.
- prepend_list_to_global_free_list()
- called by om_flush()
- The 'list' and 'tail' were just detached from a per-thread
free list that might have a list walker on it.
- can race with a debugging audit_and_print_stats() call
that is added to debug monitor subsystem problems
- can race with an audit_and_print_stats() call from
an exit_globals() call that doesn't happen at a safepoint
like the one we discovered with the ExitOnFullCodeCache
option...
- called by deflate_idle_monitors() at a safepoint
- The 'list' and 'tail' were just detached from the global
in-use list that might have a list walker on it.
- can race with a debugging audit_and_print_stats() call
that is added to debug monitor subsystem problems
- deflate_thread_local_monitors() at a safepoint
- The 'list' and 'tail' were just detached from a per-thread
in-use list that might have a list walker on it.
- can race with a debugging audit_and_print_stats() call
that is added to debug monitor subsystem problems
- prepend_list_to_global_in_use_list()
- called by om_flush()
- The 'list' and 'tail' were just detached from a per-thread
in-use list that might have a list walker on it.
- can race with a debugging audit_and_print_stats() call
that is added to debug monitor subsystem problems
- can race with an audit_and_print_stats() call from
an exit_globals() call that doesn't happen at a safepoint
like the one we discovered with the ExitOnFullCodeCache
option...
Without this call:
L244: if (!try_om_lock(tail)) {
this call:
L247: set_next(tail, cur); // tail now points to cur
can result in stomping on the lock bit from a racing
audit_and_print_stats() call and when that thread goes
to unlock the ObjectMonitor, we hit:
L199: guarantee(((intptr_t)next & OM_LOCK_BIT) == OM_LOCK_BIT,
"next=" INTPTR_FORMAT
L200: " must have OM_LOCK_BIT=%x set.", p2i(next),
OM_LOCK_BIT);
This particular race was so much fun to chase down... :-)
> 'list' is some safe sublist that doesn't change or isn't iterated on,
> while being prepended to the list_p, right?
No. See my analysis of the callers above.
> I was trying to understand the callers, but a comment would be helpful
> about what "list" is and where it's coming from.
The 'list' and 'tail' can belong to several different lists so
mentioning them in a comment would be a bit much...
I've changed that code to:
if (!try_om_lock(tail)) {
// Failed to lock tail due to a list walker so try it all again.
continue;
}
Please let me know if that helps (and is sufficient).
> + prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1],
> _BLOCKSIZE - 1,
>
>
> Commentary: this is just awful. (This leads to more questions but
> not about this patch).
That prepend_list_to_common() call is in the new
prepend_block_to_lists() function. Here's the original
block of code that prepend_block_to_lists() replaced:
1112 // Acquire the gListLock to manipulate g_block_list and
g_free_list.
1113 // An Oyama-Taura-Yonezawa scheme might be more efficient.
1114 Thread::muxAcquire(&gListLock, "om_alloc(2)");
1115 g_om_population += _BLOCKSIZE-1;
1116 g_om_free_count += _BLOCKSIZE-1;
1117
1118 // Add the new block to the list of extant blocks (g_block_list).
1119 // The very first ObjectMonitor in a block is reserved and
dedicated.
1120 // It serves as blocklist "next" linkage.
1121 temp[0]._next_om = g_block_list;
1122 // There are lock-free uses of g_block_list so make sure that
1123 // the previous stores happen before we update g_block_list.
1124 Atomic::release_store(&g_block_list, temp);
1125
1126 // Add the new string of ObjectMonitors to the global free list
1127 temp[_BLOCKSIZE - 1]._next_om = g_free_list;
1128 g_free_list = temp + 1;
1129 Thread::muxRelease(&gListLock);
So here's the complete "awful" call:
L291: // Second we handle LVars.free_list:
L292: prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1],
_BLOCKSIZE - 1,
L293: &LVars.free_list, &LVars.free_count);
and that call replaces these lines of original code:
1116 g_om_free_count += _BLOCKSIZE-1;
1126 // Add the new string of ObjectMonitors to the global free list
1127 temp[_BLOCKSIZE - 1]._next_om = g_free_list;
1128 g_free_list = temp + 1;
Part of the awfulness is the fact that we allocate ObjectMonitors
in blocks that are chained together on the g_block_list, but that's
a completely different discussion.
> + if (is_MonitorBound_exceeded(Atomic::load(&LVars.population) -
> Atomic::load(&LVars.free_count))) {
>
>
> Since these are globals, add them to your is_MonitorBound_exceeded
> functions.
The MonitorBound option is going away in JDK15:
JDK-8230940 Obsolete MonitorBound
https://bugs.openjdk.java.net/browse/JDK-8230940
so I'm not really worried about the is_MonitorBound_exceeded()
function. Please let me know if you are okay with that.
> I just hit a big block of code in om_release to look at but have to
> go, but here's the first part.
Thanks for part one. I'll keep my eyes open for more review comments.
Dan
>
> Coleen
>
> On 1/29/20 1:14 PM, Daniel D. Daugherty wrote:
>> Ping! Still looking for a second reviewer on this changeset...
>>
>> Dan
>>
>>
>> On 1/27/20 3:43 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I'm looking for a second reviewer on this thread. I've gone ahead and
>>> made changes based on David H's comments on CR0.
>>>
>>> JDK-8235795 replace monitor list
>>> mux{Acquire,Release}(&gListLock) with spin locks
>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>
>>> Copyright years will be updated when the patches are rebased to JDK15.
>>>
>>> Here's the incremental webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.inc/
>>>
>>> Here's the full webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/
>>>
>>> Here's what changed between CR0 and CR1:
>>>
>>> - refactor common code
>>> - refactor atomic load of LVars.population in
>>> monitors_used_above_threshold
>>> - simplify list walking in ObjectSynchronizer::om_release() so we
>>> lock fewer ObjectMonitors
>>> - remove unnecessary locking from
>>> ObjectSynchronizer::deflate_monitor_list()
>>> - add NoSafepointVerifier helpers to main list management functions
>>> - remove unnecessary storestore()
>>> - remove unnecessary comments
>>> - clarify/fix comments.
>>>
>>> These changes have been tested in a Mach5 Tier[1-3] run with no
>>> regressions. They have also been merged with 8235931 and 8236035 and
>>> included in a Mach5 Tier[1-8] run with no known regressions (so far
>>> since Tier8 is not quite finished).
>>>
>>> I did a SPECjbb2015 run on these bits with a jdk-14+32 baseline and
>>> 25 runs:
>>>
>>> criticalJOPS 0.25% (Non-significant)
>>> 66754.32 66923.08
>>> ± 1209.80 ± 1585.09
>>> p = 0.674
>>>
>>> maxJOPS -1.12% (Non-significant)
>>> 90965.80 89948.80
>>> ± 1788.39 ± 1989.22
>>> p = 0.063
>>>
>>> I did a SPECjbb2015 run on the merge of 8235931, 8236035, and 8235795
>>> with a jdk-14+32 baseline and 25 runs:
>>>
>>> criticalJOPS 0.37% (Non-significant)
>>> 66754.32 67003.92
>>> ± 1209.80 ± 1662.01
>>> p = 0.547
>>>
>>> maxJOPS -0.23% (Non-significant)
>>> 90965.80 90754.00
>>> ± 1788.39 ± 1851.64
>>> p = 0.683
>>>
>>> All of these results were flagged as "Non-significant" by the perf
>>> testing system. Looks like "p" values are still too high.
>>>
>>> Thanks, in advance, for comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 12/23/19 4:57 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I'm extracting another standalone fix from the Async Monitor Deflation
>>>> project (JDK-8153224) and sending it out for review (and testing)
>>>> separately.
>>>>
>>>> JDK-8235795 replace monitor list
>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/0-for-jdk15/
>>>>
>>>> Folks that have reviewed JDK-8153224 will recognize these changes as
>>>> a subset of the monitor list changes from the Async Monitor Deflation
>>>> project. It's a subset because the Async Monitor Deflation project
>>>> needs additional spin locking due to the async deflation work.
>>>>
>>>> The OpenJDK wiki for Async Monitor Deflation has several sections
>>>> dedicated to the Spin-Lock Monitor List Management changes. This
>>>> link will get you to the first section:
>>>>
>>>> Spin-Lock Monitor List Management In Theory
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInTheory
>>>>
>>>>
>>>> The remaining monitor list sections are:
>>>>
>>>> Background: ObjectMonitor Movement Between the Lists
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Background:ObjectMonitorMovementBetweentheLists
>>>>
>>>>
>>>> Spin-Lock Monitor List Management In Reality
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInReality
>>>>
>>>>
>>>> Using The New Spin-Lock Monitor List Functions
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-UsingTheNewSpin-LockMonitorListFunctions
>>>>
>>>>
>>>> Of course, the OpenJDK wiki content is specific to the Async Monitor
>>>> Deflation project, but this extract is a very close subset.
>>>>
>>>> These changes have been tested in various Mach5 Tier[1-7] runs.
>>>> I'm also doing SPECjbb2015 runs.
>>>>
>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list