RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Mon Feb 3 19:07:14 UTC 2020


Hi Dan,

On 1/31/20 6:48 PM, Daniel D. Daugherty wrote:
> Hi Coleen,
>
> Thanks for the review!
>
> Replies embedded below (as usual)...
>
>
> On 1/30/20 8:22 PM, coleen.phillimore at oracle.com wrote:
>>
>> Dan,
>>
>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/objectMonitor.hpp.udiff.html 
>>
>>
>> There should be three functions in ObjectMonitor like:
>>
>> ObjectMonitor* next_om() const { Atomic::load(&_next_om); }
>> void set_next_om(ObjectMonitor* value) { Atomic::store(&_next_om, 
>> value); }
>> and one for the CAS.
>
> I like it! It makes sense to do this since I'm touch all the uses
> of _next_om.
>
> So here's what I added to 
> src/hotspot/share/runtime/objectMonitor.inline.hpp:
>
> +inline ObjectMonitor* ObjectMonitor::next_om() const {
> +  return Atomic::load(&_next_om);
> +}
> +
> +// Simply set _next_om field to new_value.
> +inline void ObjectMonitor::set_next_om(ObjectMonitor* new_value) {
> +  Atomic::store(&_next_om, new_value);
> +}
> +
> +// Try to set _next_om field to new_value if the current value matches
> +// old_value. Otherwise, does not change the _next_om field. Returns
> +// the prior value of the _next_om field.
> +inline ObjectMonitor* ObjectMonitor::try_set_next_om(ObjectMonitor* 
> old_value, ObjectMonitor* new_value) {
> +  return Atomic::cmpxchg(&_next_om, old_value, new_value);
> +}
>
>

Great!

>>
>> And _next_om should be private.  That way all accesses to this field 
>> don't need checking (ie. eyeball checking).
>
> Adding the accessors and setters functions allowed _next_om
> to become private. One less crufty piece!
>
>
>>
>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/synchronizer.cpp.sdiff.html 
>>
>>
>> 129 ObjectMonitor* free_list;
>>
>>
>> These fields should be prepended with _ ie. _free_list to follow the 
>> coding standard.
>
> Done. It feels a little iffy since these globals are only fields
> in a struct because we want padding... but I renamed them anyway.
>
>
>> I would rename LVars to something like _om_globals like 
>> _om_globals._free_list .  I think LVars is an odd name without any 
>> meaning.
>
> In the existing code we have:
>
>     struct SharedGlobals {
>     static SharedGlobals GVars;
>
> so when I added padding to the global list variables I
> followed the existing style:
>
>     struct ListGlobals {
>     static ListGlobals LVars;
>
> I think I'm going to go with:
>
>     struct ObjectMonitorListGlobals {
>     static ObjectMonitorListGlobals om_list_globals;
>
> Note no leading '_' on the global variable.
>

This seems fine.  I noticed SVars later, but that can be changed also later.
>
>> 230 // Set the next field in an ObjectMonitor to the specified value.
>> 231 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
>> 232 Atomic::store(&om->_next_om, value);
>> 233 }
>> 234
>>
>> the callers of this can just call om->set_next_om(value);
>
> Done and removed the static function.
>
>
>> I almost think you don't need om in the names and have the functions 
>> be next() and set_next()
>
> The field name is _next_om so my preference is for next_om()
> and set_next_om()...
>

That's good.
>
>> + if (!try_om_lock(tail)) {
>> + continue; // failed to lock tail so try it all again
>> + }
>>
>>
>> Why would 'tail' be locked here?
>
> First some context:
>
>     L238: static void prepend_list_to_common(ObjectMonitor* list, 
> ObjectMonitor* tail,
>     L239:                                    int count, 
> ObjectMonitor** list_p,
>     L240:                                    int* count_p) {
> <snip>
>     L243:     // Prepend list to *list_p.
>     L244:     if (!try_om_lock(tail)) {
>     L245:       continue;  // failed to lock tail so try it all again
>     L246:     }
>     L247:     set_next(tail, cur);  // tail now points to cur (and 
> unlocks tail)
>
> prepend_list_to_common() is called by:
>
>   - prepend_block_to_lists()
>     - The 'list' and 'tail' belong to a newly allocated list of
>       ObjectMonitors so no race here.

This was the only one I'd read when I asked ...
>   - prepend_list_to_global_free_list()
>     - called by om_flush()
>       - The 'list' and 'tail' were just detached from a per-thread
>         free list that might have a list walker on it.
>       - can race with a debugging audit_and_print_stats() call
>         that is added to debug monitor subsystem problems
>       - can race with an audit_and_print_stats() call from
>         an exit_globals() call that doesn't happen at a safepoint
>         like the one we discovered with the ExitOnFullCodeCache
>         option...
>     - called by deflate_idle_monitors() at a safepoint
>       - The 'list' and 'tail' were just detached from the global
>         in-use list that might have a list walker on it.
>       - can race with a debugging audit_and_print_stats() call
>         that is added to debug monitor subsystem problems
>     - deflate_thread_local_monitors() at a safepoint
>       - The 'list' and 'tail' were just detached from a per-thread
>         in-use list that might have a list walker on it.
>       - can race with a debugging audit_and_print_stats() call
>         that is added to debug monitor subsystem problems
>   - prepend_list_to_global_in_use_list()
>     - called by om_flush()
>       - The 'list' and 'tail' were just detached from a per-thread
>         in-use list that might have a list walker on it.
>       - can race with a debugging audit_and_print_stats() call
>         that is added to debug monitor subsystem problems
>       - can race with an audit_and_print_stats() call from
>         an exit_globals() call that doesn't happen at a safepoint
>         like the one we discovered with the ExitOnFullCodeCache
>         option...
>
> Without this call:
>
>     L244:     if (!try_om_lock(tail)) {
>
> this call:
>
>     L247:     set_next(tail, cur);  // tail now points to cur
>
> can result in stomping on the lock bit from a racing
> audit_and_print_stats() call and when that thread goes
> to unlock the ObjectMonitor, we hit:
>
>     L199:   guarantee(((intptr_t)next & OM_LOCK_BIT) == OM_LOCK_BIT, 
> "next=" INTPTR_FORMAT
>     L200:             " must have OM_LOCK_BIT=%x set.", p2i(next), 
> OM_LOCK_BIT);
>
> This particular race was so much fun to chase down... :-)
>

Okay, I see this could be a problem.

>
>> 'list' is some safe sublist that doesn't change or isn't iterated on, 
>> while being prepended to the list_p, right?
>
> No. See my analysis of the callers above.

Okay, I'd only read through the om_alloc() caller
>
>
>> I was trying to understand the callers, but a comment would be 
>> helpful about what "list" is and where it's coming from.
>
> The 'list' and 'tail' can belong to several different lists so
> mentioning them in a comment would be a bit much...
>
> I've changed that code to:
>
>     if (!try_om_lock(tail)) {
>       // Failed to lock tail due to a list walker so try it all again.
>       continue;
>     }
>
> Please let me know if that helps (and is sufficient).
>
>
>> + prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1], 
>> _BLOCKSIZE - 1,
>>
>>
>> Commentary: this is just awful.    (This leads to more questions but 
>> not about this patch).
>
> That prepend_list_to_common() call is in the new
> prepend_block_to_lists() function. Here's the original
> block of code that prepend_block_to_lists() replaced:
>
> 1112     // Acquire the gListLock to manipulate g_block_list and 
> g_free_list.
> 1113     // An Oyama-Taura-Yonezawa scheme might be more efficient.
> 1114     Thread::muxAcquire(&gListLock, "om_alloc(2)");
> 1115     g_om_population += _BLOCKSIZE-1;
> 1116     g_om_free_count += _BLOCKSIZE-1;
> 1117
> 1118     // Add the new block to the list of extant blocks 
> (g_block_list).
> 1119     // The very first ObjectMonitor in a block is reserved and 
> dedicated.
> 1120     // It serves as blocklist "next" linkage.
> 1121     temp[0]._next_om = g_block_list;
> 1122     // There are lock-free uses of g_block_list so make sure that
> 1123     // the previous stores happen before we update g_block_list.
> 1124     Atomic::release_store(&g_block_list, temp);
> 1125
> 1126     // Add the new string of ObjectMonitors to the global free list
> 1127     temp[_BLOCKSIZE - 1]._next_om = g_free_list;
> 1128     g_free_list = temp + 1;
> 1129     Thread::muxRelease(&gListLock);
>
> So here's the complete "awful" call:
>
>  L291:   // Second we handle LVars.free_list:
>  L292:   prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1], 
> _BLOCKSIZE - 1,
>  L293:                          &LVars.free_list, &LVars.free_count);
>
> and that call replaces these lines of original code:
>
> 1116     g_om_free_count += _BLOCKSIZE-1;
> 1126     // Add the new string of ObjectMonitors to the global free list
> 1127     temp[_BLOCKSIZE - 1]._next_om = g_free_list;
> 1128     g_free_list = temp + 1;
>
>
> Part of the awfulness is the fact that we allocate ObjectMonitors
> in blocks that are chained together on the g_block_list, but that's
> a completely different discussion.
>

Yes it is a different discussion.   It took a while to understand this 
and you've lied to the type system about these blocks.
>
>> + if (is_MonitorBound_exceeded(Atomic::load(&LVars.population) - 
>> Atomic::load(&LVars.free_count))) {
>>
>>
>> Since these are globals, add them to your is_MonitorBound_exceeded 
>> functions.
>
> The MonitorBound option is going away in JDK15:
>
>     JDK-8230940 Obsolete MonitorBound
>     https://bugs.openjdk.java.net/browse/JDK-8230940
>
> so I'm not really worried about the is_MonitorBound_exceeded()
> function. Please let me know if you are okay with that.
>

Sure, great.
>
>> I just hit a big block of code in om_release to look at but have to 
>> go, but here's the first part.
>
> Thanks for part one. I'll keep my eyes open for more review comments.

Thanks,
Coleen

>
> Dan
>
>
>>
>> Coleen
>>
>> On 1/29/20 1:14 PM, Daniel D. Daugherty wrote:
>>> Ping! Still looking for a second reviewer on this changeset...
>>>
>>> Dan
>>>
>>>
>>> On 1/27/20 3:43 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I'm looking for a second reviewer on this thread. I've gone ahead and
>>>> made changes based on David H's comments on CR0.
>>>>
>>>>     JDK-8235795 replace monitor list 
>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>
>>>> Copyright years will be updated when the patches are rebased to JDK15.
>>>>
>>>> Here's the incremental webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.inc/
>>>>
>>>> Here's the full webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/
>>>>
>>>> Here's what changed between CR0 and CR1:
>>>>
>>>>   - refactor common code
>>>>   - refactor atomic load of LVars.population in 
>>>> monitors_used_above_threshold
>>>>   - simplify list walking in ObjectSynchronizer::om_release() so we 
>>>> lock fewer ObjectMonitors
>>>>   - remove unnecessary locking from 
>>>> ObjectSynchronizer::deflate_monitor_list()
>>>>   - add NoSafepointVerifier helpers to main list management functions
>>>>   - remove unnecessary storestore()
>>>>   - remove unnecessary comments
>>>>   - clarify/fix comments.
>>>>
>>>> These changes have been tested in a Mach5 Tier[1-3] run with no
>>>> regressions. They have also been merged with 8235931 and 8236035 and
>>>> included in a Mach5 Tier[1-8] run with no known regressions (so far
>>>> since Tier8 is not quite finished).
>>>>
>>>> I did a SPECjbb2015 run on these bits with a jdk-14+32 baseline and 
>>>> 25 runs:
>>>>
>>>>     criticalJOPS                 0.25%  (Non-significant)
>>>>                    66754.32   66923.08
>>>>                   ± 1209.80  ± 1585.09
>>>>                              p = 0.674
>>>>
>>>>     maxJOPS                     -1.12%  (Non-significant)
>>>>                    90965.80   89948.80
>>>>                   ± 1788.39  ± 1989.22
>>>>                              p = 0.063
>>>>
>>>> I did a SPECjbb2015 run on the merge of 8235931, 8236035, and 8235795
>>>> with a jdk-14+32 baseline and 25 runs:
>>>>
>>>>     criticalJOPS                 0.37%  (Non-significant)
>>>>                    66754.32   67003.92
>>>>                   ± 1209.80  ± 1662.01
>>>>                              p = 0.547
>>>>
>>>>     maxJOPS                     -0.23%  (Non-significant)
>>>>                    90965.80   90754.00
>>>>                   ± 1788.39  ± 1851.64
>>>>                              p = 0.683
>>>>
>>>> All of these results were flagged as "Non-significant" by the perf
>>>> testing system. Looks like "p" values are still too high.
>>>>
>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 12/23/19 4:57 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I'm extracting another standalone fix from the Async Monitor 
>>>>> Deflation
>>>>> project (JDK-8153224) and sending it out for review (and testing)
>>>>> separately.
>>>>>
>>>>>     JDK-8235795 replace monitor list 
>>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>>
>>>>> Here's the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/0-for-jdk15/
>>>>>
>>>>> Folks that have reviewed JDK-8153224 will recognize these changes as
>>>>> a subset of the monitor list changes from the Async Monitor Deflation
>>>>> project. It's a subset because the Async Monitor Deflation project
>>>>> needs additional spin locking due to the async deflation work.
>>>>>
>>>>> The OpenJDK wiki for Async Monitor Deflation has several sections
>>>>> dedicated to the Spin-Lock Monitor List Management changes. This
>>>>> link will get you to the first section:
>>>>>
>>>>> Spin-Lock Monitor List Management In Theory
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInTheory 
>>>>>
>>>>>
>>>>> The remaining monitor list sections are:
>>>>>
>>>>> Background: ObjectMonitor Movement Between the Lists
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Background:ObjectMonitorMovementBetweentheLists 
>>>>>
>>>>>
>>>>> Spin-Lock Monitor List Management In Reality
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInReality 
>>>>>
>>>>>
>>>>> Using The New Spin-Lock Monitor List Functions
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-UsingTheNewSpin-LockMonitorListFunctions 
>>>>>
>>>>>
>>>>> Of course, the OpenJDK wiki content is specific to the Async Monitor
>>>>> Deflation project, but this extract is a very close subset.
>>>>>
>>>>> These changes have been tested in various Mach5 Tier[1-7] runs.
>>>>> I'm also doing SPECjbb2015 runs.
>>>>>
>>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list