RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Mon Feb 3 19:07:14 UTC 2020
Hi Dan,
On 1/31/20 6:48 PM, Daniel D. Daugherty wrote:
> Hi Coleen,
>
> Thanks for the review!
>
> Replies embedded below (as usual)...
>
>
> On 1/30/20 8:22 PM, coleen.phillimore at oracle.com wrote:
>>
>> Dan,
>>
>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/objectMonitor.hpp.udiff.html
>>
>>
>> There should be three functions in ObjectMonitor like:
>>
>> ObjectMonitor* next_om() const { Atomic::load(&_next_om); }
>> void set_next_om(ObjectMonitor* value) { Atomic::store(&_next_om,
>> value); }
>> and one for the CAS.
>
> I like it! It makes sense to do this since I'm touch all the uses
> of _next_om.
>
> So here's what I added to
> src/hotspot/share/runtime/objectMonitor.inline.hpp:
>
> +inline ObjectMonitor* ObjectMonitor::next_om() const {
> + return Atomic::load(&_next_om);
> +}
> +
> +// Simply set _next_om field to new_value.
> +inline void ObjectMonitor::set_next_om(ObjectMonitor* new_value) {
> + Atomic::store(&_next_om, new_value);
> +}
> +
> +// Try to set _next_om field to new_value if the current value matches
> +// old_value. Otherwise, does not change the _next_om field. Returns
> +// the prior value of the _next_om field.
> +inline ObjectMonitor* ObjectMonitor::try_set_next_om(ObjectMonitor*
> old_value, ObjectMonitor* new_value) {
> + return Atomic::cmpxchg(&_next_om, old_value, new_value);
> +}
>
>
Great!
>>
>> And _next_om should be private. That way all accesses to this field
>> don't need checking (ie. eyeball checking).
>
> Adding the accessors and setters functions allowed _next_om
> to become private. One less crufty piece!
>
>
>>
>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/synchronizer.cpp.sdiff.html
>>
>>
>> 129 ObjectMonitor* free_list;
>>
>>
>> These fields should be prepended with _ ie. _free_list to follow the
>> coding standard.
>
> Done. It feels a little iffy since these globals are only fields
> in a struct because we want padding... but I renamed them anyway.
>
>
>> I would rename LVars to something like _om_globals like
>> _om_globals._free_list . I think LVars is an odd name without any
>> meaning.
>
> In the existing code we have:
>
> struct SharedGlobals {
> static SharedGlobals GVars;
>
> so when I added padding to the global list variables I
> followed the existing style:
>
> struct ListGlobals {
> static ListGlobals LVars;
>
> I think I'm going to go with:
>
> struct ObjectMonitorListGlobals {
> static ObjectMonitorListGlobals om_list_globals;
>
> Note no leading '_' on the global variable.
>
This seems fine. I noticed SVars later, but that can be changed also later.
>
>> 230 // Set the next field in an ObjectMonitor to the specified value.
>> 231 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
>> 232 Atomic::store(&om->_next_om, value);
>> 233 }
>> 234
>>
>> the callers of this can just call om->set_next_om(value);
>
> Done and removed the static function.
>
>
>> I almost think you don't need om in the names and have the functions
>> be next() and set_next()
>
> The field name is _next_om so my preference is for next_om()
> and set_next_om()...
>
That's good.
>
>> + if (!try_om_lock(tail)) {
>> + continue; // failed to lock tail so try it all again
>> + }
>>
>>
>> Why would 'tail' be locked here?
>
> First some context:
>
> L238: static void prepend_list_to_common(ObjectMonitor* list,
> ObjectMonitor* tail,
> L239: int count,
> ObjectMonitor** list_p,
> L240: int* count_p) {
> <snip>
> L243: // Prepend list to *list_p.
> L244: if (!try_om_lock(tail)) {
> L245: continue; // failed to lock tail so try it all again
> L246: }
> L247: set_next(tail, cur); // tail now points to cur (and
> unlocks tail)
>
> prepend_list_to_common() is called by:
>
> - prepend_block_to_lists()
> - The 'list' and 'tail' belong to a newly allocated list of
> ObjectMonitors so no race here.
This was the only one I'd read when I asked ...
> - prepend_list_to_global_free_list()
> - called by om_flush()
> - The 'list' and 'tail' were just detached from a per-thread
> free list that might have a list walker on it.
> - can race with a debugging audit_and_print_stats() call
> that is added to debug monitor subsystem problems
> - can race with an audit_and_print_stats() call from
> an exit_globals() call that doesn't happen at a safepoint
> like the one we discovered with the ExitOnFullCodeCache
> option...
> - called by deflate_idle_monitors() at a safepoint
> - The 'list' and 'tail' were just detached from the global
> in-use list that might have a list walker on it.
> - can race with a debugging audit_and_print_stats() call
> that is added to debug monitor subsystem problems
> - deflate_thread_local_monitors() at a safepoint
> - The 'list' and 'tail' were just detached from a per-thread
> in-use list that might have a list walker on it.
> - can race with a debugging audit_and_print_stats() call
> that is added to debug monitor subsystem problems
> - prepend_list_to_global_in_use_list()
> - called by om_flush()
> - The 'list' and 'tail' were just detached from a per-thread
> in-use list that might have a list walker on it.
> - can race with a debugging audit_and_print_stats() call
> that is added to debug monitor subsystem problems
> - can race with an audit_and_print_stats() call from
> an exit_globals() call that doesn't happen at a safepoint
> like the one we discovered with the ExitOnFullCodeCache
> option...
>
> Without this call:
>
> L244: if (!try_om_lock(tail)) {
>
> this call:
>
> L247: set_next(tail, cur); // tail now points to cur
>
> can result in stomping on the lock bit from a racing
> audit_and_print_stats() call and when that thread goes
> to unlock the ObjectMonitor, we hit:
>
> L199: guarantee(((intptr_t)next & OM_LOCK_BIT) == OM_LOCK_BIT,
> "next=" INTPTR_FORMAT
> L200: " must have OM_LOCK_BIT=%x set.", p2i(next),
> OM_LOCK_BIT);
>
> This particular race was so much fun to chase down... :-)
>
Okay, I see this could be a problem.
>
>> 'list' is some safe sublist that doesn't change or isn't iterated on,
>> while being prepended to the list_p, right?
>
> No. See my analysis of the callers above.
Okay, I'd only read through the om_alloc() caller
>
>
>> I was trying to understand the callers, but a comment would be
>> helpful about what "list" is and where it's coming from.
>
> The 'list' and 'tail' can belong to several different lists so
> mentioning them in a comment would be a bit much...
>
> I've changed that code to:
>
> if (!try_om_lock(tail)) {
> // Failed to lock tail due to a list walker so try it all again.
> continue;
> }
>
> Please let me know if that helps (and is sufficient).
>
>
>> + prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1],
>> _BLOCKSIZE - 1,
>>
>>
>> Commentary: this is just awful. (This leads to more questions but
>> not about this patch).
>
> That prepend_list_to_common() call is in the new
> prepend_block_to_lists() function. Here's the original
> block of code that prepend_block_to_lists() replaced:
>
> 1112 // Acquire the gListLock to manipulate g_block_list and
> g_free_list.
> 1113 // An Oyama-Taura-Yonezawa scheme might be more efficient.
> 1114 Thread::muxAcquire(&gListLock, "om_alloc(2)");
> 1115 g_om_population += _BLOCKSIZE-1;
> 1116 g_om_free_count += _BLOCKSIZE-1;
> 1117
> 1118 // Add the new block to the list of extant blocks
> (g_block_list).
> 1119 // The very first ObjectMonitor in a block is reserved and
> dedicated.
> 1120 // It serves as blocklist "next" linkage.
> 1121 temp[0]._next_om = g_block_list;
> 1122 // There are lock-free uses of g_block_list so make sure that
> 1123 // the previous stores happen before we update g_block_list.
> 1124 Atomic::release_store(&g_block_list, temp);
> 1125
> 1126 // Add the new string of ObjectMonitors to the global free list
> 1127 temp[_BLOCKSIZE - 1]._next_om = g_free_list;
> 1128 g_free_list = temp + 1;
> 1129 Thread::muxRelease(&gListLock);
>
> So here's the complete "awful" call:
>
> L291: // Second we handle LVars.free_list:
> L292: prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1],
> _BLOCKSIZE - 1,
> L293: &LVars.free_list, &LVars.free_count);
>
> and that call replaces these lines of original code:
>
> 1116 g_om_free_count += _BLOCKSIZE-1;
> 1126 // Add the new string of ObjectMonitors to the global free list
> 1127 temp[_BLOCKSIZE - 1]._next_om = g_free_list;
> 1128 g_free_list = temp + 1;
>
>
> Part of the awfulness is the fact that we allocate ObjectMonitors
> in blocks that are chained together on the g_block_list, but that's
> a completely different discussion.
>
Yes it is a different discussion. It took a while to understand this
and you've lied to the type system about these blocks.
>
>> + if (is_MonitorBound_exceeded(Atomic::load(&LVars.population) -
>> Atomic::load(&LVars.free_count))) {
>>
>>
>> Since these are globals, add them to your is_MonitorBound_exceeded
>> functions.
>
> The MonitorBound option is going away in JDK15:
>
> JDK-8230940 Obsolete MonitorBound
> https://bugs.openjdk.java.net/browse/JDK-8230940
>
> so I'm not really worried about the is_MonitorBound_exceeded()
> function. Please let me know if you are okay with that.
>
Sure, great.
>
>> I just hit a big block of code in om_release to look at but have to
>> go, but here's the first part.
>
> Thanks for part one. I'll keep my eyes open for more review comments.
Thanks,
Coleen
>
> Dan
>
>
>>
>> Coleen
>>
>> On 1/29/20 1:14 PM, Daniel D. Daugherty wrote:
>>> Ping! Still looking for a second reviewer on this changeset...
>>>
>>> Dan
>>>
>>>
>>> On 1/27/20 3:43 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I'm looking for a second reviewer on this thread. I've gone ahead and
>>>> made changes based on David H's comments on CR0.
>>>>
>>>> JDK-8235795 replace monitor list
>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>
>>>> Copyright years will be updated when the patches are rebased to JDK15.
>>>>
>>>> Here's the incremental webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.inc/
>>>>
>>>> Here's the full webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/
>>>>
>>>> Here's what changed between CR0 and CR1:
>>>>
>>>> - refactor common code
>>>> - refactor atomic load of LVars.population in
>>>> monitors_used_above_threshold
>>>> - simplify list walking in ObjectSynchronizer::om_release() so we
>>>> lock fewer ObjectMonitors
>>>> - remove unnecessary locking from
>>>> ObjectSynchronizer::deflate_monitor_list()
>>>> - add NoSafepointVerifier helpers to main list management functions
>>>> - remove unnecessary storestore()
>>>> - remove unnecessary comments
>>>> - clarify/fix comments.
>>>>
>>>> These changes have been tested in a Mach5 Tier[1-3] run with no
>>>> regressions. They have also been merged with 8235931 and 8236035 and
>>>> included in a Mach5 Tier[1-8] run with no known regressions (so far
>>>> since Tier8 is not quite finished).
>>>>
>>>> I did a SPECjbb2015 run on these bits with a jdk-14+32 baseline and
>>>> 25 runs:
>>>>
>>>> criticalJOPS 0.25% (Non-significant)
>>>> 66754.32 66923.08
>>>> ± 1209.80 ± 1585.09
>>>> p = 0.674
>>>>
>>>> maxJOPS -1.12% (Non-significant)
>>>> 90965.80 89948.80
>>>> ± 1788.39 ± 1989.22
>>>> p = 0.063
>>>>
>>>> I did a SPECjbb2015 run on the merge of 8235931, 8236035, and 8235795
>>>> with a jdk-14+32 baseline and 25 runs:
>>>>
>>>> criticalJOPS 0.37% (Non-significant)
>>>> 66754.32 67003.92
>>>> ± 1209.80 ± 1662.01
>>>> p = 0.547
>>>>
>>>> maxJOPS -0.23% (Non-significant)
>>>> 90965.80 90754.00
>>>> ± 1788.39 ± 1851.64
>>>> p = 0.683
>>>>
>>>> All of these results were flagged as "Non-significant" by the perf
>>>> testing system. Looks like "p" values are still too high.
>>>>
>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 12/23/19 4:57 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I'm extracting another standalone fix from the Async Monitor
>>>>> Deflation
>>>>> project (JDK-8153224) and sending it out for review (and testing)
>>>>> separately.
>>>>>
>>>>> JDK-8235795 replace monitor list
>>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>>
>>>>> Here's the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/0-for-jdk15/
>>>>>
>>>>> Folks that have reviewed JDK-8153224 will recognize these changes as
>>>>> a subset of the monitor list changes from the Async Monitor Deflation
>>>>> project. It's a subset because the Async Monitor Deflation project
>>>>> needs additional spin locking due to the async deflation work.
>>>>>
>>>>> The OpenJDK wiki for Async Monitor Deflation has several sections
>>>>> dedicated to the Spin-Lock Monitor List Management changes. This
>>>>> link will get you to the first section:
>>>>>
>>>>> Spin-Lock Monitor List Management In Theory
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInTheory
>>>>>
>>>>>
>>>>> The remaining monitor list sections are:
>>>>>
>>>>> Background: ObjectMonitor Movement Between the Lists
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Background:ObjectMonitorMovementBetweentheLists
>>>>>
>>>>>
>>>>> Spin-Lock Monitor List Management In Reality
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInReality
>>>>>
>>>>>
>>>>> Using The New Spin-Lock Monitor List Functions
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-UsingTheNewSpin-LockMonitorListFunctions
>>>>>
>>>>>
>>>>> Of course, the OpenJDK wiki content is specific to the Async Monitor
>>>>> Deflation project, but this extract is a very close subset.
>>>>>
>>>>> These changes have been tested in various Mach5 Tier[1-7] runs.
>>>>> I'm also doing SPECjbb2015 runs.
>>>>>
>>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list