RFR(L) 8153224 Monitor deflation prolong safepoints (CR8/v2.08/11-for-jdk14)
Robbin Ehn
robbin.ehn at oracle.com
Fri Nov 8 13:35:31 UTC 2019
Hi Dan,
Thanks for looking into this, some comments on v8:
##################
src/hotspot/cpu/sparc/globalDefinitions_sparc.hpp
src/hotspot/cpu/x86/globalDefinitions_x86.hpp
src/hotspot/share/logging/logTag.hpp
src/hotspot/share/oops/markWord.hpp
src/hotspot/share/runtime/basicLock.cpp
src/hotspot/share/runtime/safepoint.cpp
src/hotspot/share/runtime/serviceThread.cpp
src/hotspot/share/runtime/sharedRuntime.cpp
src/hotspot/share/runtime/synchronizer.hpp
src/hotspot/share/runtime/vmOperations.cpp
src/hotspot/share/runtime/vmOperations.hpp
src/hotspot/share/runtime/vmStructs.cpp
src/hotspot/share/runtime/vmThread.cpp
test/hotspot/gtest/oops/test_markWord.cpp
No comments.
##################
I don't see the benefit of having the -HandshakeAfterDeflateIdleMonitors code paths.
Removing that option would mean these files can be reverted:
src/hotspot/cpu/aarch64/globals_aarch64.hpp
src/hotspot/cpu/arm/globals_arm.hpp
src/hotspot/cpu/ppc/globals_ppc.hpp
src/hotspot/cpu/s390/globals_s390.hpp
src/hotspot/cpu/sparc/globals_sparc.hpp
src/hotspot/cpu/x86/globals_x86.hpp
src/hotspot/cpu/x86/macroAssembler_x86.cpp
src/hotspot/cpu/x86/macroAssembler_x86.hpp
src/hotspot/cpu/zero/globals_zero.hpp
And one less option here:
src/hotspot/share/runtime/globals.hpp
##################
src/hotspot/share/prims/jvm.cpp
Unclear if this is a good idea.
##################
src/hotspot/share/prims/whitebox.cpp
This would assume the test expects the right thing, but that is not obvious.
##################
src/hotspot/share/prims/jvmtiEnvBase.cpp
The current pending and waiting monitor is only changed by the JavaThread itself.
It only sets it after _contentions is increased.
It clears it before _contentions is decreased.
We rely on being at a safepoint or on the thread being suspended, so the monitor can't be
deflated since _contentions is > 0.
In addition, the thread has already increased the ref count and can't decrease it
(since it is at a safepoint or suspended).
##################
src/hotspot/share/runtime/objectMonitor.cpp
###1
You have several of these (and in other files):
242 jint l_ref_count = ref_count();
243 ADIM_guarantee(l_ref_count > 0, "must be positive: l_ref_count=%d,
ref_count=%d", l_ref_count, ref_count());
Please use Atomic::load() in ref_count().
This code depends on ref_count being volatile; otherwise the compiler may
only do one load.
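To make the point concrete, here is a minimal standalone sketch. It uses std::atomic as a stand-in for HotSpot's Atomic::load(), and MonitorSketch and check_ref_count are hypothetical names, not code from the webrev. The idea is only that each call to ref_count() is an explicit atomic load rather than something that depends on a volatile qualifier.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for ObjectMonitor; only the ref count is modeled.
struct MonitorSketch {
  std::atomic<int> _ref_count{0};

  // Each call is an explicit atomic load (std::atomic plays the role of
  // Atomic::load() here) instead of depending on a volatile qualifier.
  int ref_count() const {
    return _ref_count.load(std::memory_order_relaxed);
  }
};

// Mirrors the reviewed pattern: take a snapshot for the check, then read
// the field again so a failure message can show both values.
inline bool check_ref_count(const MonitorSketch& m, int* snapshot, int* reread) {
  *snapshot = m.ref_count();
  *reread = m.ref_count();
  return *snapshot > 0;
}
```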
###2
307 // Prevent deflation. See ObjectSynchronizer::deflate_monitor(),
...
311 Atomic::add(1, &_contentions);
In ObjectSynchronizer::deflate_monitor, if you checked the ref count instead of
_contentions, we could remove _contentions.
Since all waiters also hold a ref count, it looks like we don't need _waiters either.
In ObjectSynchronizer::deflate_monitor:
if (mid->_contentions != 0 || mid->_waiters != 0) {
Why not just do:
if (mid->ref_count()) {
?
##################
src/hotspot/share/runtime/objectMonitor.hpp
###1
252 intptr_t is_busy() const {
253 // TODO-FIXME: assert _owner == null implies _recursions = 0
254 // We do not include _ref_count in the is_busy() check because
255 // _ref_count is for indicating that the ObjectMonitor* is in
256 // use which is orthogonal to whether the ObjectMonitor itself
257 // is in use for a locking operation.
But in the non-debug code we always check:
+ if (mid->is_busy() || mid->ref_count() != 0) {
So it seems like you should have a method that includes the ref count.
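Roughly what I have in mind, as a standalone sketch. The field set is simplified, std::atomic stands in for the HotSpot accessors, and is_busy_or_referenced() is a hypothetical name for the combined check:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified model of the ObjectMonitor fields involved.
struct MonitorSketch {
  std::atomic<int> _contentions{0};
  std::atomic<int> _waiters{0};
  std::atomic<void*> _owner{nullptr};
  std::atomic<int> _ref_count{0};

  // Locking-related state only, like the is_busy() quoted above.
  bool is_busy() const {
    return _contentions.load(std::memory_order_relaxed) != 0 ||
           _waiters.load(std::memory_order_relaxed) != 0 ||
           _owner.load(std::memory_order_relaxed) != nullptr;
  }

  // Hypothetical combined check so callers do not have to repeat
  // "is_busy() || ref_count() != 0" at every call site.
  bool is_busy_or_referenced() const {
    return is_busy() || _ref_count.load(std::memory_order_relaxed) != 0;
  }
};
```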
##################
src/hotspot/share/runtime/objectMonitor.inline.hpp
Use Atomic::load for ref count.
##################
src/hotspot/share/runtime/synchronizer.cpp
###1
139 static volatile int g_om_free_count = 0; // # on g_free_list
140 static volatile int g_om_in_use_count = 0; // # on g_om_in_use_list
141 static volatile int g_om_population = 0; // # Extant -- in circulation
142 static volatile int g_om_wait_count = 0; // # on g_wait_list
No padding here, aren't they more contended than the fields in the OM?
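For illustration only, padding could look something like the sketch below. The 64-byte cache line size and the names PaddedCounter and GlobalCountsSketch are assumptions made for this example; HotSpot has its own padding helpers and cache line constant, which this sketch does not use.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache line size for this sketch only.
constexpr std::size_t kCacheLineSize = 64;

// Each counter gets its own cache line, so updates to one counter do not
// invalidate the line holding another.
struct alignas(kCacheLineSize) PaddedCounter {
  std::atomic<int> value{0};
};

// Hypothetical grouping of the four global list counters.
struct GlobalCountsSketch {
  PaddedCounter free_count;    // # on the free list
  PaddedCounter in_use_count;  // # on the in-use list
  PaddedCounter population;    // # in circulation
  PaddedCounter wait_count;    // # on the wait list
};
```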
###2
151 static bool is_next_marked(ObjectMonitor* om) {
It is only used in ObjectSynchronizer::om_flush.
Here you fetch an OM and read its next field; this does not need load-acquire (LA)
semantics on supported platforms.
This would only need Atomic::load.
###3
191 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
In no place do you need store-release (SR) semantics. In the only places where it would make a difference:
345 OrderAccess::storestore();
346 set_next(cur, next); // Unmark the previous list head.
and
1714 OrderAccess::storestore();
1715 set_next(in_use_list, next);
You have a storestore already!
This code reads as:
OrderAccess::storestore();
OrderAccess::loadstore();
OrderAccess::storestore();
om->_next_om = value
So it should be an Atomic::store.
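A standalone sketch of the shape I mean. It uses std::atomic in place of HotSpot's Atomic/OrderAccess, and NodeSketch and publish_next are hypothetical names. The release fence stands in for OrderAccess::storestore(); it is at least as strong, since it also orders earlier loads before the later store.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an ObjectMonitor list node.
struct NodeSketch {
  std::atomic<NodeSketch*> _next_om{nullptr};
};

// Plain atomic store with no ordering of its own; stand-in for Atomic::store().
inline void set_next(NodeSketch* om, NodeSketch* value) {
  om->_next_om.store(value, std::memory_order_relaxed);
}

// The caller issues the barrier exactly once, then does the plain store.
inline void publish_next(NodeSketch* cur, NodeSketch* next) {
  std::atomic_thread_fence(std::memory_order_release);  // stand-in for storestore()
  set_next(cur, next);
}
```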
###4
198 static bool mark_list_head(ObjectMonitor* volatile * list_p
Since the mark is an embedded spinlock, I think the terminology should be
changed (the fact that the spinlock lives inside the next pointer should be abstracted away).
E.g. mark_next_loop would just be lock.
The load of the list heads should use Atomic::load.
It also seems a bit weird to return next from the locking method.
The output parameter can just be returned instead, returning NULL if the list head is NULL.
E.g.
static ObjectMonitor* get_list_head_locked(ObjectMonitor* volatile * list_p) {
  while (true) {
    ObjectMonitor* mid = Atomic::load(list_p);
    if (mid == NULL) {
      return NULL;  // The list is empty.
    }
    if (try_lock(mid)) {
      if (Atomic::load(list_p) != mid) {
        // The list head changed so we have to retry.
        unlock(mid);
      } else {
        return mid;
      }
    }
    // Yield ?
  }
}
With collateral changes.
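For anyone who wants to play with the idea outside the VM, here is a self-contained model of it. Everything in it is an assumption made for the sketch: std::atomic replaces HotSpot's Atomic, NodeSketch replaces ObjectMonitor, the lock is the low bit of the next field, and try_lock/unlock are hypothetical helpers that hide that detail.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical node: the next pointer is stored as an integer whose low
// bit doubles as the embedded spinlock.
struct NodeSketch {
  std::atomic<std::uintptr_t> _next_om{0};
};

constexpr std::uintptr_t kLockBit = 1;

// Succeeds only if the lock bit was clear at the time of the exchange.
inline bool try_lock(NodeSketch* om) {
  std::uintptr_t next = om->_next_om.load(std::memory_order_relaxed) & ~kLockBit;
  return om->_next_om.compare_exchange_strong(next, next | kLockBit,
                                              std::memory_order_acq_rel);
}

// Clears the lock bit and keeps the next value. Only the lock holder calls this.
inline void unlock(NodeSketch* om) {
  std::uintptr_t next = om->_next_om.load(std::memory_order_relaxed);
  om->_next_om.store(next & ~kLockBit, std::memory_order_release);
}

// Returns the locked list head, or nullptr if the list is empty.
inline NodeSketch* get_list_head_locked(std::atomic<NodeSketch*>* list_p) {
  while (true) {
    NodeSketch* mid = list_p->load(std::memory_order_acquire);
    if (mid == nullptr) {
      return nullptr;  // The list is empty.
    }
    if (try_lock(mid)) {
      if (list_p->load(std::memory_order_acquire) == mid) {
        return mid;
      }
      unlock(mid);  // The list head changed so we have to retry.
    }
    std::this_thread::yield();
  }
}
```

The caller gets back either NULL or a head it has locked, and never has to know where the lock bit lives.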
###5
220 static ObjectMonitor* unmarked_next(ObjectMonitor* om)
Atomic::store is what is needed.
###6
333 static void prepend_to_common(
345 OrderAccess::storestore();
346 set_next(cur, next); // Unmark the previous list head.
Double storestore. (fixed by changing set_next to Atomic::store)
###7
375 static ObjectMonitor* take_from_start_of_common(ObjectMonitor* volatile *
list_p,
Triple storestore here.
386 Atomic::dec(count_p);
387 // mark_list_head() used cmpxchg() above, switching list head can be lazier:
388 OrderAccess::storestore();
389 // Unmark take, but leave the next value for any lagging list
390 // walkers. It will get cleaned up when take is prepended to
391 // the in-use list:
392 set_next(take, next);
393 return take;
Reads:
count_p--
OrderAccess::loadstore();
OrderAccess::storestore();
OrderAccess::storestore();
OrderAccess::loadstore();
OrderAccess::storestore();
take->_next_om = next;
Fixed by changing set_next to Atomic::store and removing the
OrderAccess::storestore();
###8
ObjectSynchronizer::om_release(
1591 if (m == mid) {
1592 // We found 'm' on the per-thread in-use list so try to extract it.
1593 if (cur_mid_in_use == NULL) {
1594 // mid is the list head and it is marked. Switch the list head
1595 // to next which unmarks the list head, but leaves mid marked:
1596 self->om_in_use_list = next;
1597 // mark_list_head() used cmpxchg() above, switching list head can
be lazier:
1598 OrderAccess::storestore();
1599 } else {
1600 // mid and cur_mid_in_use are marked. Switch cur_mid_in_use's
1601 // next field to next which unmarks cur_mid_in_use, but leaves
1602 // mid marked:
1603 OrderAccess::release_store(&cur_mid_in_use->_next_om, next);
1604 }
1605 extracted = true;
1606 Atomic::dec(&self->om_in_use_count);
1607 // Unmark mid, but leave the next value for any lagging list
1608 // walkers. It will get cleaned up when mid is prepended to
1609 // the thread's free list:
1610 set_next(mid, next);
1611 break;
1612 }
This does not look correct. Before taking this branch we have done a cmpxchg in
mark_list_head or mark_next_loop.
This is how it reads:
OrderAccess::storestore(); // from previous cmpxchg
OrderAccess::loadstore(); // from previous cmpxchg
1591 if (m == mid) {
1593 if (cur_mid_in_use == NULL) {
1596 self->om_in_use_list = next;
1598 OrderAccess::storestore();
1599 } else {
OrderAccess::storestore();
OrderAccess::loadstore();
1603 cur_mid_in_use->_next_om = next;
1604 }
1605 extracted = true;
OrderAccess::storestore();
OrderAccess::fence(); // storestore|storeload|loadstore|loadload
self->om_in_use_count--; // Atomic::dec
OrderAccess::storestore();
OrderAccess::loadstore();
OrderAccess::storestore();
OrderAccess::loadstore();
mid->_next_om = next; // Atomic::store
1611 break;
1612 }
extracted is a local variable, so you do not need any OrderAccess before it is set.
Fixed by changing set_next to Atomic::store, removing the
OrderAccess::storestore() and changing OrderAccess::release_store to
Atomic::store().
###9
1653 void ObjectSynchronizer::om_flush(Thread* self) {
1714 OrderAccess::storestore();
1715 set_next(in_use_list, next);
Fixed by changing set_next to Atomic::store.
###10
1737 self->om_free_list = NULL;
1738 OrderAccess::storestore(); // Lazier memory is okay for list walkers.
prepend_list_to_g_free_list/prepend_list_to_g_om_in_use_list do a cmpxchg as the
first thing, so there is no need for this storestore.
###11
1797 void ObjectSynchronizer::inflate(ObjectMonitorHandle* omh_p, Thread* self,
1938 // Once ObjectMonitor is configured and the object is associated
1939 // with the ObjectMonitor, it is safe to allow async deflation:
1940 assert(m->is_new(), "freshly allocated monitor must be new");
1941 m->set_allocation_state(ObjectMonitor::Old);
So we use ref count, contentions, waiters, owner and allocation state to keep the OM
alive in different scenarios.
There is no way for me to keep track of that. I don't see why you would need
more than owner and ref count.
If you allocate the OM with ref count 1, you can remove _allocation_state and
just decrease the ref count here instead.
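As a standalone sketch of that idea (std::atomic in place of the HotSpot primitives; MonitorSketch, finish_inflation and may_deflate are hypothetical names): the monitor is born with a ref count of 1 held by the inflating thread, and dropping that count at the end of inflation is what makes it eligible for async deflation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified monitor: no allocation state field at all.
struct MonitorSketch {
  std::atomic<int> _ref_count{1};  // starts at 1, held by the inflating thread
  void* _object = nullptr;
};

// Called once the monitor is configured and associated with the object.
// Dropping the initial ref count replaces the "set state to Old" step.
inline void finish_inflation(MonitorSketch* m, void* obj) {
  m->_object = obj;
  m->_ref_count.fetch_sub(1, std::memory_order_release);
}

// A deflater only has to look at the ref count (the owner check is omitted here).
inline bool may_deflate(const MonitorSketch& m) {
  return m._ref_count.load(std::memory_order_acquire) == 0;
}
```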
###12
2079 bool ObjectSynchronizer::deflate_monitor
2112 if (AsyncDeflateIdleMonitors) {
2113 // clear() expects the owner field to be NULL and we won't race
2114 // with the simple C2 ObjectMonitor
The macro assembler code is not just executed by C2, so this comment is a bit
misleading (there are some more of these as well).
###13
2306 int ObjectSynchronizer::deflate_monitor_list(
Same issue as ObjectSynchronizer::om_release.
Fixed by changing set_next to Atomic::store, removing the
OrderAccess::storestore() and changing OrderAccess::release_store to
Atomic::store().
###14
2474 if (SafepointSynchronize::is_synchronizing() &&
This is the wrong method to call; it should be
SafepointMechanism::should_block(Thread* thread);
###15
2578 void ObjectSynchronizer::deflate_idle_monitors_using_JT() {
2616 g_wait_list = NULL;
2617 OrderAccess::storestore(); // Lazier memory sync is okay for list walkers.
I don't see that g_wait_list is ever read concurrently.
Is it not accessed either by the ServiceThread outside a safepoint or by the VMThread inside
a safepoint?
It looks like g_wait_list can just be a local in:
void ObjectSynchronizer::deflate_idle_monitors_using_JT()
(disregarding the debug code that might read it in a safepoint)
###16
2722 assert(SafepointSynchronize::is_synchronizing(), "sanity check");
This is the wrong method to call; it should be
SafepointMechanism::should_block(Thread* thread);
##################
src/hotspot/share/runtime/vframe.cpp
We are at a safepoint, on the current thread, or in a handshake, so the current pending and
waiting monitor is already stable.
##################
src/hotspot/share/services/threadService.cpp
These changes are only needed for the -HandshakeAfterDeflateIdleMonitors path.
##################
test/jdk/java/rmi/server/UnicastRemoteObject/unexportObject/UnexportLeak.java
Note: if the OM had a weak reference to the object instead, this would not be needed.
Thanks, Robbin
On 11/4/19 10:03 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have made changes to the Async Monitor Deflation code in response to
> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David H., Robbin
> and Erik O. for their comments!
>
> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the Async Monitor
> Deflation project needs to push before Nov. 12, 2019 in order to allow
> for sufficient bake time for such a big change. Nov. 12 is _next_ Tuesday
> so we have 8 days from today to finish this code review cycle and push
> this code for JDK14.
>
> Carsten and Roman! Time for you guys to chime in again on the code reviews.
>
> I have attached the change list from CR7 to CR8 instead of putting it in
> the body of this email. I've also added a link to the CR7-to-CR8-changes
> file to the webrevs so it should be easy to find.
>
> Main bug URL:
>
> JDK-8153224 Monitor deflation prolong safepoints
> https://bugs.openjdk.java.net/browse/JDK-8153224
>
> The project is currently baselined on jdk-14+21.
>
> Here's the full webrev URL for those folks that want to see all of the
> current Async Monitor Deflation code in one go (v2.08 full):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full
>
> Some folks might want to see just what has changed since the last review
> cycle so here's a webrev for that (v2.08 inc):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/
>
> The OpenJDK wiki did not need any changes for this round:
>
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>
> The jdk-14+21 based v2.08 version of the patch has been thru Mach5 tier[1-8]
> testing on Oracle's usual set of platforms. It has also been through my usual
> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
> my lab. Some testing is still running, but so far there are no new regressions.
>
> I have not yet done a SPECjbb2015 round on the CR8/v2.08/11-for-jdk14 bits.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>
>
> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> The Async Monitor Deflation project is reaching the end game. I have no
>> changes planned for the project at this time so all that is left is code
>> review and any changes that result from those reviews.
>>
>> Carsten and Roman! Time for you guys to chime in again on the code reviews.
>>
>> I have attached the list of fixes from CR6 to CR7 instead of putting it
>> in the main body of this email.
>>
>> Main bug URL:
>>
>> JDK-8153224 Monitor deflation prolong safepoints
>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>
>> The project is currently baselined on jdk-14+19.
>>
>> Here's the full webrev URL for those folks that want to see all of the
>> current Async Monitor Deflation code in one go (v2.07 full):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full
>>
>> Some folks might want to see just what has changed since the last review
>> cycle so here's a webrev for that (v2.07 inc):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/
>>
>> The OpenJDK wiki has been updated to match the CR7/v2.07/10-for-jdk14 changes:
>>
>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>
>> The jdk-14+18 based v2.07 version of the patch has been thru Mach5 tier[1-8]
>> testing on Oracle's usual set of platforms. It has also been through my usual
>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
>> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
>> my lab.
>>
>> The jdk-14+19 based v2.07 version of the patch has been thru Mach5 tier[1-3]
>> test on Oracle's usual set of platforms. Mach5 tier[4-8] are in process.
>>
>> I did another round of SPECjbb2015 testing in Oracle's Aurora Performance lab
>> using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>
>> - "base" is jdk-14+18
>> - "v2.07" is the latest version and includes C2 inc_om_ref_count() support
>> on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors option
>> - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>> - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors specified
>>
>>        hbIR           hbIR
>>   (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>   ---------------  ---------  --------  -------------  -------
>>          34282.00   30635.90  28831.30       20969.20  3841.30  base
>>          34282.00   30973.00  29345.80       21025.20  3964.10  v2.07
>>          34282.00   31105.60  29174.30       21074.00  3931.30  v2.07_handshake
>>          34282.00   30789.70  27151.60       19839.10  3850.20  v2.07_off
>>
>> - The Aurora Perf comparison tool reports:
>>
>> Comparison max-jOPS critical-jOPS
>> ---------------------- -------------------- --------------------
>> base vs 2.07 +1.78% (s, p=0.000) +0.27% (ns, p=0.790)
>> base vs 2.07_handshake +1.19% (s, p=0.007) +0.58% (ns, p=0.536)
>> base vs 2.07_off -5.83% (ns, p=0.394) -5.39% (ns, p=0.347)
>>
>> (s) - significant (ns) - not-significant
>>
>> - For historical comparison, the Aurora Perf comparison tool
>> reported for v2.06 with a baseline of jdk-13+31:
>>
>> Comparison max-jOPS critical-jOPS
>> ---------------------- -------------------- --------------------
>> base vs 2.06 -0.32% (ns, p=0.345) +0.71% (ns, p=0.646)
>> base vs 2.06_off +0.49% (ns, p=0.292) -1.21% (ns, p=0.481)
>>
>> (s) - significant (ns) - not-significant
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>>
>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> The Async Monitor Deflation project has rebased to JDK14 so it's time
>>> for our first code review in that new context!!
>>>
>>> I've been focused on changing the monitor list management code to be
>>> lock-free in order to make SPECjbb2015 happier. Of course with a change
>>> like that, it takes a while to chase down all the new and wonderful
>>> races. At this point, I have the code back to the same stability that
>>> I had with CR5/v2.05/8-for-jdk13.
>>>
>>> To lay the ground work for this round of review, I pushed the following
>>> two fixes to jdk/jdk earlier today:
>>>
>>> JDK-8230184 rename, whitespace, indent and comments changes in preparation
>>> for lock free Monitor lists
>>> https://bugs.openjdk.java.net/browse/JDK-8230184
>>>
>>> JDK-8230317 serviceability/sa/ClhsdbPrintStatics.java fails after 8230184
>>> https://bugs.openjdk.java.net/browse/JDK-8230317
>>>
>>> I have attached the list of fixes from CR5 to CR6 instead of putting
>>> in the main body of this email.
>>>
>>> Main bug URL:
>>>
>>> JDK-8153224 Monitor deflation prolong safepoints
>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>
>>> The project is currently baselined on jdk-14+11 plus the fixes for
>>> JDK-8230184 and JDK-8230317.
>>>
>>> Here's the full webrev URL for those folks that want to see all of the
>>> current Async Monitor Deflation code in one go (v2.06 full):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.full/
>>>
>>>
>>> The primary focus of this review cycle is on the lock-free Monitor List
>>> management changes so here's a webrev for just that patch (v2.06c):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06c.inc/
>>>
>>> The secondary focus of this review cycle is on the bug fixes that have
>>> been made since CR5/v2.05/8-for-jdk13 so here's a webrev for just that
>>> patch (v2.06b):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06b.inc/
>>>
>>> The third and final bucket for this review cycle is the rename, whitespace,
>>> indent and comments changes made in preparation for lock free Monitor list
>>> management. Almost all of that was extracted into JDK-8230184 for the
>>> baseline so this bucket now has just a few comment changes relative to
>>> CR5/v2.05/8-for-jdk13. Here's a webrev for the remainder (v2.06a):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06a.inc/
>>>
>>>
>>> Some folks might want to see just what has changed since the last review
>>> cycle so here's a webrev for that (v2.06 inc):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.inc/
>>>
>>>
>>> Last, but not least, some folks might want to see the code before the
>>> addition of lock-free Monitor List management so here's a webrev for
>>> that (v2.00 -> v2.05):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.05.inc/
>>>
>>> The OpenJDK wiki will need minor updates to match the CR6 changes:
>>>
>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>
>>> but that should only be changes to describe per-thread list async monitor
>>> deflation being done by the ServiceThread.
>>>
>>> (I did update the OpenJDK wiki for the CR5 changes back on 2019.08.14)
>>>
>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>> Oracle's usual set of platforms. It has also been through my usual set
>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>
>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora Performance lab
>>> using their tuned SPECjbb2015 Linux-X64 G1 configs. This was using
>>> this patch baselined on jdk-13+31 (for stability):
>>>
>>>        hbIR           hbIR
>>>   (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>>   ---------------  ---------  --------  -------------  -------
>>>          34282.00   28837.20  27905.20       19817.40  3658.10  base
>>>          34965.70   29798.80  27814.90       19959.00  3514.60  v2.06d
>>>          34282.00   29100.70  28042.50       19577.00  3701.90  v2.06d_off
>>>          34282.00   29218.50  27562.80       19397.30  3657.60  v2.06d_ocache
>>>          34965.70   29838.30  26512.40       19170.60  3569.90  v2.05
>>>          34282.00   28926.10  27734.00       19835.10  3588.40  v2.05_off
>>>
>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors specified and
>>> the "ocache" config is with 128 byte cache line sizes instead of 64 byte
>>> cache line sizes. "v2.06d" is the last set of changes that I made before
>>> those changes were distributed into the "v2.06a", "v2.06b" and "v2.06c"
>>> buckets for this review cycle.
>>>
>>>
>>> Thanks, in advance, for any questions, comments or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I've been focused on chasing down and fixing the test failures
>>>> that only pop up rarely. So this round is primarily fixes for races
>>>> with a few additional fixes that came from Karen's review of CR4.
>>>> Thanks Karen!
>>>>
>>>> I have attached the list of fixes from CR4 to CR5 instead of putting
>>>> in the main body of this email.
>>>>
>>>> Main bug URL:
>>>>
>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> The project is currently baselined on jdk-13+29. This will likely be
>>>> the last JDK13 baseline for this project and I'll roll to the JDK14
>>>> (jdk/jdk) repo soon...
>>>>
>>>> Here's the full webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/
>>>>
>>>> Here's the incremental webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/
>>>>
>>>> I have not yet checked the OpenJDK wiki to see if it needs any updates
>>>> to match the CR5 changes:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> (I did update the OpenJDK wiki for the CR4 changes back on 2019.06.26)
>>>>
>>>> This version of the patch has been thru Mach5 tier[1-3] testing on
>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>>>> Mach5 tier[78] will follow. I'll kick off the usual stress testing
>>>> on Linux-X64, macOSX and Solaris-X64 as those machines become available.
>>>> Since I haven't made any performance changes in this round, I'll only
>>>> be running SPECjbb2015 to gather the latest monitorinflation logs.
>>>>
>>>> Next up:
>>>>
>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>> Linux-X64 and we've determined that some of that comes from
>>>> contention on the gListLock. So I'm going to investigate removing
>>>> the gListLock. Yes, another lock free set of changes is coming!
>>>> - Of course, going lock free often causes new races and new failures
>>>> so that's a good reason for making those changes isolated in their
>>>> own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>> - I finally have a potential fix for the Win* failure with
>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>> but I haven't run it through Mach5 yet so it'll be in the next round.
>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm seeing some
>>>> monitor related failures there. I suspect that I need to go take a
>>>> look at the C2 RTM macro assembler code and look for things that might
>>>> conflict with Async Monitor Deflation. If you're interested in that kind
>>>> of issue, then see the macroAssembler_x86.cpp sanity check that I
>>>> added in this round!
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have a fix for an issue that came up during performance testing.
>>>>> Many thanks to Robbin for diagnosing the issue in his SPECjbb2015
>>>>> experiments.
>>>>>
>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>> verbose due to the complexity of the issue, but the changes
>>>>> themselves are not that big.
>>>>>
>>>>> Functional:
>>>>> - Change SafepointSynchronize::is_cleanup_needed() from calling
>>>>> ObjectSynchronizer::is_cleanup_needed() to calling
>>>>> ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>> - is_safepoint_deflation_needed() returns the result of
>>>>> monitors_used_above_threshold() for safepoint based
>>>>> monitor deflation (!AsyncDeflateIdleMonitors).
>>>>> - For AsyncDeflateIdleMonitors, it only returns true if
>>>>> there is a special deflation request, e.g., System.gc()
>>>>> - This solves a bug where there are a bunch of Cleanup
>>>>> safepoints that simply request async deflation which
>>>>> keeps the async JavaThreads from making progress on
>>>>> their async deflation work.
>>>>> - Add AsyncDeflationInterval diagnostic option. Description:
>>>>> Async deflate idle monitors every so many milliseconds when
>>>>> MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>> - Replace ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>> ObjectSynchronizer::is_async_deflation_needed():
>>>>> - is_async_deflation_needed() returns true when
>>>>> is_async_cleanup_requested() is true or when
>>>>> monitors_used_above_threshold() is true (but no more often than
>>>>> AsyncDeflationInterval).
>>>>> - if AsyncDeflateIdleMonitors Service_lock->wait() now waits for
>>>>> at most GuaranteedSafepointInterval millis:
>>>>> - This allows is_async_deflation_needed() to be checked at
>>>>> the same interval as GuaranteedSafepointInterval.
>>>>> (default is 1000 millis/1 second)
>>>>> - Once is_async_deflation_needed() has returned true, it
>>>>> generally cannot return true for AsyncDeflationInterval.
>>>>> This is to prevent async deflation from swamping the
>>>>> ServiceThread.
>>>>> - The ServiceThread still handles async deflation of the global
>>>>> in-use list and now it also marks JavaThreads for async deflation
>>>>> of their in-use lists.
>>>>> - The ServiceThread will check for async deflation work every
>>>>> GuaranteedSafepointInterval.
>>>>> - A safepoint can still cause the ServiceThread to check for
>>>>> async deflation work via is_async_deflation_requested.
>>>>> - Refactor code from ObjectSynchronizer::is_cleanup_needed() into
>>>>> monitors_used_above_threshold() and remove is_cleanup_needed().
>>>>> - In addition to System.gc(), the VM_Exit VM op and the final
>>>>> VMThread safepoint now set the is_special_deflation_requested
>>>>> flag to reduce the in-use monitor population that is reported by
>>>>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
>>>>>
>>>>> Test update:
>>>>> - test/hotspot/gtest/oops/test_markOop.cpp is updated to work with
>>>>> AsyncDeflateIdleMonitors.
>>>>>
>>>>> Collateral:
>>>>> - Add/clarify/update some logging messages.
>>>>>
>>>>> Cleanup:
>>>>> - Updated comments based on Karen's code review.
>>>>> - Change 'special cleanup' -> 'special deflation' and
>>>>> 'async cleanup' -> 'async deflation'.
>>>>> - comment and function name changes
>>>>> - Clarify MonitorUsedDeflationThreshold description;
>>>>>
>>>>>
>>>>> Main bug URL:
>>>>>
>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>
>>>>> The project is currently baselined on jdk-13+22.
>>>>>
>>>>> Here's the full webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.full/
>>>>>
>>>>> Here's the incremental webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.inc/
>>>>>
>>>>> I have not updated the OpenJDK wiki to reflect the CR4 changes:
>>>>>
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>
>>>>> The wiki doesn't say a whole lot about the async deflation invocation
>>>>> mechanism so I have to figure out how to add that content.
>>>>>
>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run is
>>>>> running now. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>> are running on Linux-X64, MacOSX and Solaris-X64. I still have to run
>>>>> my stress kit on Linux-X64. I still have to run the SPECjbb2015
>>>>> baseline and CR4 runs on Linux-X64, MacOSX and Solaris-X64.
>>>>>
>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>> On 5/6/19 11:52 AM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I had some discussions with Karen about a race that was in the
>>>>>> ObjectMonitor::enter() code in CR2/v2.02/5-for-jdk13. This race was
>>>>>> theoretical and I had no test failures due to it. The fix is pretty
>>>>>> simple: remove the special case code for async deflation in the
>>>>>> ObjectMonitor::enter() function and rely solely on the ref_count
>>>>>> for ObjectMonitor::enter() protection.
>>>>>>
>>>>>> During those discussions Karen also floated the idea of using the
>>>>>> ref_count field instead of the contentions field for the Async
>>>>>> Monitor Deflation protocol. I decided to go ahead and code up that
>>>>>> change and I have run it through the usual stress and Mach5 testing
>>>>>> with no issues. It's also known as v2.03 (for those with the
>>>>>> patches) and as webrev/6-for-jdk13 (for those with webrev URLs).
>>>>>> Sorry for all the names...
>>>>>>
>>>>>> Main bug URL:
>>>>>>
>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>
>>>>>> The project is currently baselined on jdk-13+18.
>>>>>>
>>>>>> Here's the full webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.full/
>>>>>>
>>>>>> Here's the incremental webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.inc/
>>>>>>
>>>>>> I have also updated the OpenJDK wiki to reflect the CR3 changes:
>>>>>>
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>
>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run had
>>>>>> no issues. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>>> had no failures on Linux-X64; MacOSX fastdebug and slowdebug and
>>>>>> Solaris-X64 release had the usual "Too large time diff" complaints.
>>>>>> 12 hour Inflate2 runs on product, fastdebug and slowdebug bits on
>>>>>> Linux-X64, MacOSX and Solaris-X64 had no failures. My Linux-X64
>>>>>> stress kit is running right now.
>>>>>>
>>>>>> I've done the SPECjbb2015 baseline and CR3 runs. I need to gather
>>>>>> the results and analyze them.
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have a small but important bug fix for the Async Monitor Deflation
>>>>>>> project ready to go. It's also known as v2.02 (for those with the
>>>>>>> patches) and as webrev/5-for-jdk13 (for those with webrev URLs). Sorry
>>>>>>> for all the names...
>>>>>>>
>>>>>>> JDK-8222295 was pushed to jdk/jdk two days ago so that baseline patch
>>>>>>> is out of our hair.
>>>>>>>
>>>>>>> Main bug URL:
>>>>>>>
>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>
>>>>>>> The project is currently baselined on jdk-13+17.
>>>>>>>
>>>>>>> Here's the full webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/
>>>>>>>
>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/
>>>>>>>
>>>>>>> I still have to update the OpenJDK wiki to reflect the CR2 changes:
>>>>>>>
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>
>>>>>>> This version of the patch has been thru Mach5 tier[1-6] testing on
>>>>>>> Oracle's usual set of platforms. Mach5 tier[7-8] is running now.
>>>>>>> My stress kit is running on Solaris-X64 now. Kitchensink8H is running
>>>>>>> now on product, fastdebug, and slowdebug bits on Linux-X64, MacOSX
>>>>>>> and Solaris-X64. 12 hour Inflate2 runs are running now on product,
>>>>>>> fastdebug and slowdebug bits on Linux-X64, MacOSX and Solaris-X64.
>>>>>>> I'll start my stress kit on Linux-X64 sometime on Sunday (after
>>>>>>> my jdk-13+18 stress run is done).
>>>>>>>
>>>>>>> I'll do SPECjbb2015 baseline and CR2 runs after all the stress
>>>>>>> testing is done.
>>>>>>>
>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I finally have CR1 for the Async Monitor Deflation project ready to
>>>>>>>> go. It's also known as v2.01 (for those with the patches) and as
>>>>>>>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the
>>>>>>>> names...
>>>>>>>>
>>>>>>>> Main bug URL:
>>>>>>>>
>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>
>>>>>>>> Baseline bug fixes URL:
>>>>>>>>
>>>>>>>> JDK-8222295 more baseline cleanups from Async Monitor Deflation project
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295
>>>>>>>>
>>>>>>>> The project is currently baselined on jdk-13+15.
>>>>>>>>
>>>>>>>> Here's the webrev for the latest baseline changes (JDK-8222295):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295
>>>>>>>>
>>>>>>>> Here's the full webrev URL (JDK-8153224 only):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/
>>>>>>>>
>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/
>>>>>>>>
>>>>>>>> So I'm looking for reviews for both JDK-8222295 and the latest version
>>>>>>>> of JDK-8153224...
>>>>>>>>
>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR changes:
>>>>>>>>
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>
>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] testing on
>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>>>>>>>> Mach5 tier[7-8] will be run later today. My stress kit on Solaris-X64
>>>>>>>> is running now. Linux-X64 stress testing will start on Sunday. I'm
>>>>>>>> planning to do Kitchensink runs, SPECjbb2015 runs and my monitor
>>>>>>>> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>
>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>>>>>>>
>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>
>>>>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>>>>
>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>
>>>>>>>>> Here's the webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>>>>>>>
>>>>>>>>> Here's a link to Carsten's original webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>>>>>>>
>>>>>>>>> Earlier versions of this patch have been through several rounds of
>>>>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>>>>>>>> Roman for their preliminary code review comments. A very special
>>>>>>>>> thanks to Robbin and Roman for building and testing the patch in
>>>>>>>>> their own environments (including SPECjbb2015).
>>>>>>>>>
>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>>>>> Oracle's usual set of platforms. Earlier versions have been run
>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>>>>>>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>>>>>>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>>>>>>>> fastdebug and slowdebug).
>>>>>>>>>
>>>>>>>>> All of the testing done on earlier versions will be redone on the
>>>>>>>>> latest version of the patch.
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>> P.S.
>>>>>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>>>>>>>> to characterize/analyze this failure for more than a week now. At
>>>>>>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>>>>>>> an existing bug. However, I plan to have a better handle on that
>>>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev mailing list