RFR(L): 8235795: replace monitor list mux{Acquire,Release}(&gListLock) with spin locks

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Jan 31 23:48:37 UTC 2020


Hi Coleen,

Thanks for the review!

Replies embedded below (as usual)...


On 1/30/20 8:22 PM, coleen.phillimore at oracle.com wrote:
>
> Dan,
>
> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/objectMonitor.hpp.udiff.html 
>
>
> There should be three functions in ObjectMonitor like:
>
> ObjectMonitor* next_om() const { Atomic::load(&_next_om); }
> void set_next_om(ObjectMonitor* value) { Atomic::store(&_next_om, 
> value); }
> and one for the CAS.

I like it! It makes sense to do this since I'm touch all the uses
of _next_om.

So here's what I added to 
src/hotspot/share/runtime/objectMonitor.inline.hpp:

+inline ObjectMonitor* ObjectMonitor::next_om() const {
+  return Atomic::load(&_next_om);
+}
+
+// Simply set _next_om field to new_value.
+inline void ObjectMonitor::set_next_om(ObjectMonitor* new_value) {
+  Atomic::store(&_next_om, new_value);
+}
+
+// Try to set _next_om field to new_value if the current value matches
+// old_value. Otherwise, does not change the _next_om field. Returns
+// the prior value of the _next_om field.
+inline ObjectMonitor* ObjectMonitor::try_set_next_om(ObjectMonitor* 
old_value, ObjectMonitor* new_value) {
+  return Atomic::cmpxchg(&_next_om, old_value, new_value);
+}


>
> And _next_om should be private.  That way all accesses to this field 
> don't need checking (ie. eyeball checking).

Adding the accessors and setters functions allowed _next_om
to become private. One less crufty piece!


>
> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/src/hotspot/share/runtime/synchronizer.cpp.sdiff.html 
>
>
> 129 ObjectMonitor* free_list;
>
>
> These fields should be prepended with _ ie. _free_list to follow the 
> coding standard.

Done. It feels a little iffy since these globals are only fields
in a struct because we want padding... but I renamed them anyway.


> I would rename LVars to something like _om_globals like 
> _om_globals._free_list .  I think LVars is an odd name without any 
> meaning.

In the existing code we have:

     struct SharedGlobals {
     static SharedGlobals GVars;

so when I added padding to the global list variables I
followed the existing style:

     struct ListGlobals {
     static ListGlobals LVars;

I think I'm going to go with:

     struct ObjectMonitorListGlobals {
     static ObjectMonitorListGlobals om_list_globals;

Note no leading '_' on the global variable.


> 230 // Set the next field in an ObjectMonitor to the specified value.
> 231 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
> 232 Atomic::store(&om->_next_om, value);
> 233 }
> 234
>
> the callers of this can just call om->set_next_om(value);

Done and removed the static function.


> I almost think you don't need om in the names and have the functions 
> be next() and set_next()

The field name is _next_om so my preference is for next_om()
and set_next_om()...


> + if (!try_om_lock(tail)) {
> + continue; // failed to lock tail so try it all again
> + }
>
>
> Why would 'tail' be locked here?

First some context:

     L238: static void prepend_list_to_common(ObjectMonitor* list, 
ObjectMonitor* tail,
     L239:                                    int count, ObjectMonitor** 
list_p,
     L240:                                    int* count_p) {
<snip>
     L243:     // Prepend list to *list_p.
     L244:     if (!try_om_lock(tail)) {
     L245:       continue;  // failed to lock tail so try it all again
     L246:     }
     L247:     set_next(tail, cur);  // tail now points to cur (and 
unlocks tail)

prepend_list_to_common() is called by:

   - prepend_block_to_lists()
     - The 'list' and 'tail' belong to a newly allocated list of
       ObjectMonitors so no race here.
   - prepend_list_to_global_free_list()
     - called by om_flush()
       - The 'list' and 'tail' were just detached from a per-thread
         free list that might have a list walker on it.
       - can race with a debugging audit_and_print_stats() call
         that is added to debug monitor subsystem problems
       - can race with an audit_and_print_stats() call from
         an exit_globals() call that doesn't happen at a safepoint
         like the one we discovered with the ExitOnFullCodeCache
         option...
     - called by deflate_idle_monitors() at a safepoint
       - The 'list' and 'tail' were just detached from the global
         in-use list that might have a list walker on it.
       - can race with a debugging audit_and_print_stats() call
         that is added to debug monitor subsystem problems
     - deflate_thread_local_monitors() at a safepoint
       - The 'list' and 'tail' were just detached from a per-thread
         in-use list that might have a list walker on it.
       - can race with a debugging audit_and_print_stats() call
         that is added to debug monitor subsystem problems
   - prepend_list_to_global_in_use_list()
     - called by om_flush()
       - The 'list' and 'tail' were just detached from a per-thread
         in-use list that might have a list walker on it.
       - can race with a debugging audit_and_print_stats() call
         that is added to debug monitor subsystem problems
       - can race with an audit_and_print_stats() call from
         an exit_globals() call that doesn't happen at a safepoint
         like the one we discovered with the ExitOnFullCodeCache
         option...

Without this call:

     L244:     if (!try_om_lock(tail)) {

this call:

     L247:     set_next(tail, cur);  // tail now points to cur

can result in stomping on the lock bit from a racing
audit_and_print_stats() call and when that thread goes
to unlock the ObjectMonitor, we hit:

     L199:   guarantee(((intptr_t)next & OM_LOCK_BIT) == OM_LOCK_BIT, 
"next=" INTPTR_FORMAT
     L200:             " must have OM_LOCK_BIT=%x set.", p2i(next), 
OM_LOCK_BIT);

This particular race was so much fun to chase down... :-)


> 'list' is some safe sublist that doesn't change or isn't iterated on, 
> while being prepended to the list_p, right?

No. See my analysis of the callers above.


> I was trying to understand the callers, but a comment would be helpful 
> about what "list" is and where it's coming from.

The 'list' and 'tail' can belong to several different lists so
mentioning them in a comment would be a bit much...

I've changed that code to:

     if (!try_om_lock(tail)) {
       // Failed to lock tail due to a list walker so try it all again.
       continue;
     }

Please let me know if that helps (and is sufficient).


> + prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1], 
> _BLOCKSIZE - 1,
>
>
> Commentary: this is just awful.    (This leads to more questions but 
> not about this patch).

That prepend_list_to_common() call is in the new
prepend_block_to_lists() function. Here's the original
block of code that prepend_block_to_lists() replaced:

1112     // Acquire the gListLock to manipulate g_block_list and 
g_free_list.
1113     // An Oyama-Taura-Yonezawa scheme might be more efficient.
1114     Thread::muxAcquire(&gListLock, "om_alloc(2)");
1115     g_om_population += _BLOCKSIZE-1;
1116     g_om_free_count += _BLOCKSIZE-1;
1117
1118     // Add the new block to the list of extant blocks (g_block_list).
1119     // The very first ObjectMonitor in a block is reserved and 
dedicated.
1120     // It serves as blocklist "next" linkage.
1121     temp[0]._next_om = g_block_list;
1122     // There are lock-free uses of g_block_list so make sure that
1123     // the previous stores happen before we update g_block_list.
1124     Atomic::release_store(&g_block_list, temp);
1125
1126     // Add the new string of ObjectMonitors to the global free list
1127     temp[_BLOCKSIZE - 1]._next_om = g_free_list;
1128     g_free_list = temp + 1;
1129     Thread::muxRelease(&gListLock);

So here's the complete "awful" call:

  L291:   // Second we handle LVars.free_list:
  L292:   prepend_list_to_common(new_blk + 1, &new_blk[_BLOCKSIZE - 1], 
_BLOCKSIZE - 1,
  L293:                          &LVars.free_list, &LVars.free_count);

and that call replaces these lines of original code:

1116     g_om_free_count += _BLOCKSIZE-1;
1126     // Add the new string of ObjectMonitors to the global free list
1127     temp[_BLOCKSIZE - 1]._next_om = g_free_list;
1128     g_free_list = temp + 1;


Part of the awfulness is the fact that we allocate ObjectMonitors
in blocks that are chained together on the g_block_list, but that's
a completely different discussion.


> + if (is_MonitorBound_exceeded(Atomic::load(&LVars.population) - 
> Atomic::load(&LVars.free_count))) {
>
>
> Since these are globals, add them to your is_MonitorBound_exceeded 
> functions.

The MonitorBound option is going away in JDK15:

     JDK-8230940 Obsolete MonitorBound
     https://bugs.openjdk.java.net/browse/JDK-8230940

so I'm not really worried about the is_MonitorBound_exceeded()
function. Please let me know if you are okay with that.


> I just hit a big block of code in om_release to look at but have to 
> go, but here's the first part.

Thanks for part one. I'll keep my eyes open for more review comments.

Dan


>
> Coleen
>
> On 1/29/20 1:14 PM, Daniel D. Daugherty wrote:
>> Ping! Still looking for a second reviewer on this changeset...
>>
>> Dan
>>
>>
>> On 1/27/20 3:43 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I'm looking for a second reviewer on this thread. I've gone ahead and
>>> made changes based on David H's comments on CR0.
>>>
>>>     JDK-8235795 replace monitor list 
>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
>>>
>>> Copyright years will be updated when the patches are rebased to JDK15.
>>>
>>> Here's the incremental webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.inc/
>>>
>>> Here's the full webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/1-for-jdk15.full/
>>>
>>> Here's what changed between CR0 and CR1:
>>>
>>>   - refactor common code
>>>   - refactor atomic load of LVars.population in 
>>> monitors_used_above_threshold
>>>   - simplify list walking in ObjectSynchronizer::om_release() so we 
>>> lock fewer ObjectMonitors
>>>   - remove unnecessary locking from 
>>> ObjectSynchronizer::deflate_monitor_list()
>>>   - add NoSafepointVerifier helpers to main list management functions
>>>   - remove unnecessary storestore()
>>>   - remove unnecessary comments
>>>   - clarify/fix comments.
>>>
>>> These changes have been tested in a Mach5 Tier[1-3] run with no
>>> regressions. They have also been merged with 8235931 and 8236035 and
>>> included in a Mach5 Tier[1-8] run with no known regressions (so far
>>> since Tier8 is not quite finished).
>>>
>>> I did a SPECjbb2015 run on these bits with a jdk-14+32 baseline and 
>>> 25 runs:
>>>
>>>     criticalJOPS                 0.25%  (Non-significant)
>>>                    66754.32   66923.08
>>>                   ± 1209.80  ± 1585.09
>>>                              p = 0.674
>>>
>>>     maxJOPS                     -1.12%  (Non-significant)
>>>                    90965.80   89948.80
>>>                   ± 1788.39  ± 1989.22
>>>                              p = 0.063
>>>
>>> I did a SPECjbb2015 run on the merge of 8235931, 8236035, and 8235795
>>> with a jdk-14+32 baseline and 25 runs:
>>>
>>>     criticalJOPS                 0.37%  (Non-significant)
>>>                    66754.32   67003.92
>>>                   ± 1209.80  ± 1662.01
>>>                              p = 0.547
>>>
>>>     maxJOPS                     -0.23%  (Non-significant)
>>>                    90965.80   90754.00
>>>                   ± 1788.39  ± 1851.64
>>>                              p = 0.683
>>>
>>> All of these results were flagged as "Non-significant" by the perf
>>> testing system. Looks like "p" values are still too high.
>>>
>>> Thanks, in advance, for comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 12/23/19 4:57 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I'm extracting another standalone fix from the Async Monitor Deflation
>>>> project (JDK-8153224) and sending it out for review (and testing)
>>>> separately.
>>>>
>>>>     JDK-8235795 replace monitor list 
>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8235795-webrev/0-for-jdk15/
>>>>
>>>> Folks that have reviewed JDK-8153224 will recognize these changes as
>>>> a subset of the monitor list changes from the Async Monitor Deflation
>>>> project. It's a subset because the Async Monitor Deflation project
>>>> needs additional spin locking due to the async deflation work.
>>>>
>>>> The OpenJDK wiki for Async Monitor Deflation has several sections
>>>> dedicated to the Spin-Lock Monitor List Management changes. This
>>>> link will get you to the first section:
>>>>
>>>> Spin-Lock Monitor List Management In Theory
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInTheory 
>>>>
>>>>
>>>> The remaining monitor list sections are:
>>>>
>>>> Background: ObjectMonitor Movement Between the Lists
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Background:ObjectMonitorMovementBetweentheLists 
>>>>
>>>>
>>>> Spin-Lock Monitor List Management In Reality
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-Spin-LockMonitorListManagementInReality 
>>>>
>>>>
>>>> Using The New Spin-Lock Monitor List Functions
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-UsingTheNewSpin-LockMonitorListFunctions 
>>>>
>>>>
>>>> Of course, the OpenJDK wiki content is specific to the Async Monitor
>>>> Deflation project, but this extract is a very close subset.
>>>>
>>>> These changes have been tested in various Mach5 Tier[1-7] runs.
>>>> I'm also doing SPECjbb2015 runs.
>>>>
>>>> Thanks, in advance, for comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list