RFR(L) 8153224 Monitor deflation prolong safepoints (CR10/v2.10/13-for-jdk15)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Feb 28 20:53:20 UTC 2020
>>> Thanks for all the hard work you put into this Dan!
>>
>> Thanks! It has been an adventure.
>
> I can imagine. :) Thanks for all the great work ironing out the
details of this headache inducing monster hair ball!
I like to delude myself into thinking that all the subtasks of
JDK-8153224 that I've already pushed have reduced the hair ball!
:-)
Dan
On 2/28/20 11:35 AM, Erik Österlund wrote:
> Hi Dan,
>
> On 2020-02-28 16:57, Daniel D. Daugherty wrote:
>> Hi Erik!
>>
>> Thanks for the fast review!!
>>
>> Replies embedded below...
>>
>>
>> On 2/27/20 11:14 AM, Erik Österlund wrote:
>>> Hi Dan,
>>>
>>> This looks great overall.
>>
>> Thanks!
>>
>>
>>> I have a few minor comments, each one manifesting as a patch you can
>>> apply.
>>
>> Thanks for doing this review "patch style". I've missed that from the
>> Thread-SMR days... :-)
>
> :)
>
>>
>>> #1 When we found the handshaking mechanism to provide promising
>>> numbers, we used a deflation interval of 1 second,
>>> the same number used for "guaranteed safepoint interval". I would
>>> recommend using that number too. Like this:
>>>
>>> http://cr.openjdk.java.net/~eosterlund/dan_monitors/webrev.00/
>>
>> IIRC, that testing was with ZGC and not with the default G1 or am I
>> misremembering?
>
> Correct!
>
>> Background for explaining how AsyncDeflationInterval works:
>>
>> More to the point, AsyncDeflationInterval is intended to be lower
>> boundary
>> to prevent swamping the ServiceThread. It means that we will async
>> deflate
>> no more frequently than AsyncDeflationInterval when there is NOT an
>> explicit
>> request for async deflation:
>>
>> src/hotspot/share/runtime/synchronizer.cpp: is_async_deflation_needed():
>>
>> L1286 if (is_async_deflation_requested()) {
>> L1287 // Async deflation request.
>> L1288 return true;
>> L1289 }
>> L1290 if (AsyncDeflationInterval > 0 &&
>> L1291 time_since_last_async_deflation_ms() >
>> AsyncDeflationInterval &&
>> L1292 monitors_used_above_threshold()) {
>> L1293 // It's been longer than our specified deflate interval and
>> there
>> L1294 // are too many monitors in use. We don't deflate more
>> frequently
>> L1295 // than AsyncDeflationInterval (unless
>> is_async_deflation_requested)
>> L1296 // in order to not swamp the ServiceThread.
>> L1297 _last_async_deflation_time_ns = os::javaTimeNanos();
>> L1298 return true;
>> L1299 }
>>
>> L1286-8: Honor explicit request for async deflation (no throttling).
>>
>> L1290-8: Honor MonitorUsedDeflationThreshold being exceeded if
>> AsyncDeflationInterval is set and it has been long enough
>> since the last async deflation cycle.
>>
>> The OpenJDK wiki has a whole section that talks about monitor deflation
>> invocation details:
>>
>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-MonitorDeflationInvocationDetails
>>
>>
>>
>> So when we haven't exceeded MonitorUsedDeflationThreshold, the
>> ServiceThread will wake up every GuaranteedSafepointInterval to
>> check for periodic tasks. If is_async_deflation_needed() returns
>> true, then we'll do an async deflation. So at most, we'll do an
>> async deflation every GuaranteedSafepointInterval and
>> AsyncDeflationInterval won't come into play as the lower bound.
>>
>>
>> When we have exceeded MonitorUsedDeflationThreshold, when the
>> ServiceThread wakes up to execute periodic tasks, then we'll do
>> an async deflation because is_async_deflation_needed() has returned
>> returned true. When the ServiceThread has finished handling all the
>> periodic tasks, it will loop around and check again for triggers.
>> Even if MonitorUsedDeflationThreshold is still exceeded,
>> is_async_deflation_needed() will only return true for the latest
>> trigger check if the last async deflation timestamp is longer ago
>> than AsyncDeflationInterval. So at most we'll do an async deflation
>> every AsyncDeflationInterval.
>>
>>
>> So... now everyone should be caught up on how it is supposed to work.
>> :-)
>>
>> My idea behind AsyncDeflationInterval is that while we recognize
>> that we are beyond MonitorUsedDeflationThreshold and we want to
>> cleanup, we don't want to swamp the ServiceThread by cleaning up
>> all the time. Having AsyncDeflationInterval be 1/4 of a second
>> while GuaranteedSafepointInterval is one second seemed like a
>> reasonable compromise.
>
> Thank you for the explanation.
>
>> We could make AsyncDeflationInterval == GuaranteedSafepointInterval
>> and that would make sure that is_async_deflation_needed() only
>> returns true once per ServiceThread wakeup. Of course that's
>> assuming the ServiceThread isn't woken up for other work.
>>
>> When I kick off my SPECjbb2015 experiments, I'll include a config
>> where -XX:AsyncDeflationInterval=1000 is specified and I'll keep
>> you posted on the results.
>
> Thank you. Perhaps 250 ms is an okay default anyway.I will leave it up
> to you.
>
>>
>>> #2 When deflating a monitor, there are 5 different scenarios in
>>> which we bail out. I found it easier to read
>>> the code with an "early exit" style, where the if statements
>>> describe the failure mode, perform some cleanup and
>>> exit, instead of nested if statements describing the success case,
>>> with shared bailing code at the end. Like this:
>>>
>>> http://cr.openjdk.java.net/~eosterlund/dan_monitors/webrev.01/
>>>
>>> This one is only stylistic, so I am okay if you don't like that style.
>>
>> I like it! Code motion to the extreme. I matched each moved section from
>> its old location to its new location. The only changes were the
>> necessary
>> logic inversions. There's even one typo that has been there for a very
>> long time and it is kept intact. :-) (I'll likely fix it when I import
>> the patch.)
>>
>> Update: Normally I also like the early-exit style so I tried to figure
>> out why I used this nested if-statements style. What I figured out by
>> going thru my notes is that my deflate_monitor_using_JT() was created
>> from Carsten's ObjectMonitor::try_disable_monitor() which uses the
>> nested if-statement style.
>
> Ah. :)
> Glad you liked it!
>
>>
>>> #3 Seems to me the following code looks like it is protecting
>>> against conditions that are no longer important with handshakes:
>>>
>>> http://cr.openjdk.java.net/~eosterlund/dan_monitors/webrev.02/
>>>
>>> Am I right? Looks from the comments like you suspected one of these
>>> was redundant too.
>>
>> Let's take these one at a time.
>>
>> Old code from your first change:
>>
>> L438: if (object() != obj) {
>> L439: // ObjectMonitor's object ref no longer refers to the
>> target object
>> L440: // so the object's header has already been restored.
>> L441: return;
>> L442: }
>>
>> This one guards against touching obj's header when we detect a mismatch
>> between ObjectMonitor::_object and the 'obj'. While I _think_ that this
>> can only happen without handshakes after deflation, I'm not 100% sure.
>> I can see changing it into:
>>
>> ADIM_guarantee(object() == obj, "object=" INTPTR_FORMAT
>> " must equal obj=" INTPTR_FORMAT, p2i(object()),
>> p2i(obj));
>
> That sounds reasonable. My understanding is that once the object has
> been set once, the object disassociation
> happens first when it is recycled, which happens after the handshake.
> Therefore, that condition should be impossible.
>
>> Old code from your second change:
>>
>> L445: if (dmw.value() == 0) {
>> L446: // ObjectMonitor's header/dmw has been cleared so the
>> ObjectMonitor
>> L447: // has been deflated and taken off the global free list.
>> L448: return;
>> L449: }
>>
>> This one guards against touching obj's header when we detect that the
>> ObjectMonitor's header/dmw has been cleared. The comment is correct as
>> that happens here:
>>
>> src/hotspot/share/runtime/synchronizer.cpp: om_alloc()
>>
>> L1460: // 2: try to allocate from the global
>> om_list_globals._free_list
>> <snip>
>> L1473: if (AsyncDeflateIdleMonitors) {
>> L1474: // We allowed 3 field values to linger during async
>> deflation.
>> L1475: // Clear or restore them as appropriate.
>> L1476: take->set_header(markWord::zero());
>>
>> This means that the ObjectMonitor has traveled from its in-use list
>> to the global wait list (handshake here), to the global free list.
>>
>> In this particular case, I don't think I even need to replace the
>> old code with an assert or a guarantee. This check is below the
>> block you removed:
>>
>> L452: ADIM_guarantee(dmw.is_neutral(), "must be neutral: dmw="
>> INTPTR_FORMAT, dmw.value());
>>
>> will catch a NULL value quite nicely. Ref:
>>
>> src/hotspot/share/oops/markWord.hpp
>> static const uintptr_t unlocked_value = 1;
>> <snip>
>> bool is_neutral() const { return (mask_bits(value(),
>> biased_lock_mask_in_place) == unlocked_value); }
>>
>
> Sounds good!
>
>> Old code from your third change:
>>
>> L2097: // The ObjectMonitor could have been deflated and reused for
>> L2098: // another object before we bumped the ref_count so make sure
>> L2099: // our object still refers to this ObjectMonitor.
>> L2100: // Note: With handshakes after deflation is this race even
>> L2101: // possible anymore?
>> L2102: const markWord tmp = object->mark();
>> L2103: if (!tmp.has_monitor() || tmp.monitor() != om_ptr) {
>> L2104: // Async deflation and reuse won the race so we have to
>> retry.
>> L2105: // Skip object header restoration since that's already
>> done.
>> L2106: om_ptr->dec_ref_count();
>> L2107: return false;
>> L2108: }
>>
>> This one guards ObjectMonitorHandle::save_om_ptr() callers from an
>> object
>> that doesn't refer to the ObjectMonitor* that we think it refers to.
>> While I _think_ that this can only happen without handshakes after
>> deflation, I'm not 100% sure. I can see changing it into:
>>
>> const markWord tmp = object->mark();
>> ADIM_guarantee(tmp.has_monitor() && tmp.monitor() == om_ptr,
>> "object=" INTPTR_FORMAT ", header=" INTPTR_FORMAT
>> ": object must have a header that refers to om_ptr="
>> INTPTR_FORMAT, p2i(object), tmp.value(),
>> p2i(om_ptr));
>
> That sounds reasonable. My reasoning is that if deflation started
> after the ref count, and we have gotten this far through
> the owner and ref count checks, then the object has not been deflated.
> The other case to consider is that deflation happens
> *before* the ref count is incremented. But if that was the case, the
> monitor we have a pointer to would need a handshake
> before it can re-associate. Since we know we have not replied to such
> a handshake, it could not have re-associated. Therefore
> we know that at the time of the load, the monitor was associated to
> its object. And it can't have re-associated since then. It
> can have gotten deflated, but then the other checks would have caught
> that. Therefore, this check is checking a condition
> that can no longer happen.
>
>>
>>> I also wonder if the special deflation requests that dig into the
>>> System.gc() calls really ought to be there. Is
>>> this a case of tests assuming things about deflation happening at
>>> System.gc for the sake of deflation,
>>
>> If a test calls System.gc() expecting some specific not reachable object
>> to be GC'ed, then this code helps. When threads deflated their own
>> ObjectMonitors, this was definitely needed. Now that the ServiceThread
>> does the deflation, there is less of a problem. But with this code in
>> place, System.gc() makes deflation happen more quickly so I expect fewer
>> intermittent failures with these tests that assume too much about what
>> System.gc() does.
>>
>>
>>> or for the
>>> sake of letting objects die that are only referenced from
>>> ObjectMonitors?
>>
>> That's the best reason. If you call System.gc() or the equivalent,
>> you're
>> asking for objects to die. Deflating idle ObjectMonitors helps that
>> happen.
>>
>>
>>> If the latter, then perhaps that can be
>>> solved by making these references weak. But that seems like it ought
>>> to be a follow-up patch that introduces a weak
>>> oop storage for these things, and removes the special cleanup code
>>> that might not be needed any more.
>>
>> I know you have a patch for adding regular oop storage for the _object
>> field in the ObjectMonitor and we definitely want to look at that as a
>> separate change. As for switching to 'weak references', I have to dig up
>> the things that David H and I thought of during this project that would
>> be complicated and/or broken by use of weak refs.
>>
>>
>>> I can volunteer.
>>
>> Thanks! As usual, I'm more than happy to take code from you, run with
>> it, shepherd it through my stress testing, and eventually push it :-)
>
> Awesome. We have a deal then! It looks from your explanation like
> making these things weak would remove the need for
> special deflation requests by System.gc. This is definitely something
> that I think is desirable, so we can purge GC safepoints
> from per-thread processing, and I will gladly have a look at that for
> a follow-up patch. Any information you might have about
> why weak references are problematic would be great to have. But we can
> chat off-list about that.
>
>>
>>> The reason I am wondering is because I am currently removing all
>>> per-thread overheads from our GC safepoints.
>>>
>>> If you agree with #1 and #3 (optionally #2), and I guessed right
>>> what the special request logic is for, then I
>>> don't need another webrev.
>>
>> I'll do some testing with #1 as part of my current test cycle
>> on the v2.10 bits... will keep you posted.
>>
>> I like #2 (with one typo fix).
>>
>> I like #3 with the above mentioned changes.
>>
>> These changes will get queued up with requests from the other
>> reviewers so
>> I'll likely have to roll a new webrev anyway for tracking.
>
> Awesome, thanks!
>
>>
>>> Thanks for all the hard work you put into this Dan!
>>
>> Thanks! It has been an adventure.
>
> I can imagine. :) Thanks for all the great work ironing out the
> details of this headache inducing monster hair ball!
>
> /Erik
>
>> Dan
>>
>>
>>>
>>> /Erik
>>>
>>> On 2020-02-26 23:22, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have made changes to the Async Monitor Deflation code in response to
>>>> the CR9/v2.09/12-for-jdk14 code review cycle. Thanks to Robbin and
>>>> Erik O.
>>>> for their comments in this round!
>>>>
>>>> With the extraction and push of {8235931,8236035,8235795} to JDK15,
>>>> the
>>>> Async Monitor Deflation code is back to "just" async deflation
>>>> changes!
>>>>
>>>> I have attached the change list from CR9 to CR10 instead of putting
>>>> it in
>>>> the body of this email. I've also added a link to the
>>>> CR9-to-CR10-changes
>>>> file to the webrevs so it should be easy to find.
>>>>
>>>> Main bug URL:
>>>>
>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> The project is currently baselined on jdk-15+11.
>>>>
>>>> Here's the full webrev URL for those folks that want to see all of the
>>>> current Async Monitor Deflation code in one go (v2.10 full):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.full/
>>>>
>>>>
>>>> Some folks might want to see just what has changed since the last
>>>> review
>>>> cycle so here's a webrev for that (v2.10 inc):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.inc/
>>>>
>>>>
>>>> Since we backed out the HandshakeAfterDeflateIdleMonitors option
>>>> and the
>>>> C2 ref_count changes and updated the copyright years, the "inc"
>>>> webrev has
>>>> a bit more noise in it than usual. Sorry about that!
>>>>
>>>> The OpenJDK wiki has been updated for this round of changes:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> The jdk-15+11 based v2.10 version of the patch has been thru Mach5
>>>> tier[1-7]
>>>> testing on Oracle's usual set of platforms. Mach5 tier8 is still
>>>> running.
>>>> I'm running the v2.10 patch through my usual set of stress testing on
>>>> Linux-X64 and macOSX.
>>>>
>>>> I'm planning to do a SPECjbb2015 round on the
>>>> CR10/v2.20/13-for-jdk15 bits.
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 2/4/20 9:41 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> This project is no longer targeted to JDK14 so this is NOT an
>>>>> urgent code
>>>>> review request.
>>>>>
>>>>> I've extracted the following three fixes from the Async Monitor
>>>>> Deflation
>>>>> project code:
>>>>>
>>>>> JDK-8235931 add OM_CACHE_LINE_SIZE and use smaller size on
>>>>> SPARCv9 and X64
>>>>> https://bugs.openjdk.java.net/browse/JDK-8235931
>>>>>
>>>>> JDK-8236035 refactor ObjectMonitor::set_owner() and _owner
>>>>> field setting
>>>>> https://bugs.openjdk.java.net/browse/JDK-8236035
>>>>>
>>>>> JDK-8235795 replace monitor list
>>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>>
>>>>> Each of these has been reviewed separately and will be pushed to
>>>>> JDK15
>>>>> in the near future (possibly by the end of this week). Of course,
>>>>> there
>>>>> were improvements during these review cycles and the purpose of this
>>>>> e-mail is to provided updated webrevs for this fix
>>>>> (CR9/v2.09/12-for-jdk14)
>>>>> within the revised context provided by {8235931, 8236035, 8235795}.
>>>>>
>>>>> Main bug URL:
>>>>>
>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>
>>>>> The project is currently baselined on jdk-14+34.
>>>>>
>>>>> Here's the full webrev URL for those folks that want to see all of
>>>>> the
>>>>> current Async Monitor Deflation code along with {8235931, 8236035,
>>>>> 8235795}
>>>>> in one go (v2.09b full):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.full/
>>>>>
>>>>>
>>>>> Compare the open.patch file in 12-for-jdk14.v2.09.full and
>>>>> 12-for-jdk14.v2.09b.full
>>>>> using your favorite file comparison/merge tool to see how Async
>>>>> Monitor Deflation
>>>>> evolved due to {8235931, 8236035, 8235795}.
>>>>>
>>>>> Some folks might want to see just the Async Monitor Deflation code
>>>>> on top of
>>>>> {8235931, 8236035, 8235795} so here's a webrev for that (v2.09b inc):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.inc/
>>>>>
>>>>>
>>>>> These webrevs have gone thru several Mach5 Tier[1-8] runs along with
>>>>> my usual stress testing and SPECjbb2015 testing and there aren't any
>>>>> surprises relative to CR9/v2.09/12-for-jdk14.
>>>>>
>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have made changes to the Async Monitor Deflation code in
>>>>>> response to
>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. Thanks to David H.,
>>>>>> Robbin
>>>>>> and Erik O. for their comments!
>>>>>>
>>>>>> This project is no longer targeted to JDK14 so this is NOT an
>>>>>> urgent code
>>>>>> review request. The primary purpose of this webrev is simply to
>>>>>> close the
>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to let folks see how
>>>>>> I resolved
>>>>>> the code review comments from that round.
>>>>>>
>>>>>> Most of the comments in the CR8/v2.08/11-for-jdk14 code review
>>>>>> cycle were
>>>>>> on the monitor list changes so I'm going to take a look at
>>>>>> extracting those
>>>>>> changes into a standalone patch. Switching from
>>>>>> Thread::muxAcquire(&gListLock)
>>>>>> and Thread::muxRelease(&gListLock) to finer grained internal spin
>>>>>> locks needs
>>>>>> to be thoroughly reviewed and the best way to do that is
>>>>>> separately from the
>>>>>> Async Monitor Deflation changes. Thanks to Coleen for suggesting
>>>>>> doing this
>>>>>> extraction earlier.
>>>>>>
>>>>>> I have attached the change list from CR8 to CR9 instead of
>>>>>> putting it in
>>>>>> the body of this email. I've also added a link to the
>>>>>> CR8-to-CR9-changes
>>>>>> file to the webrevs so it should be easy to find.
>>>>>>
>>>>>> Main bug URL:
>>>>>>
>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>
>>>>>> The project is currently baselined on jdk-14+26.
>>>>>>
>>>>>> Here's the full webrev URL for those folks that want to see all
>>>>>> of the
>>>>>> current Async Monitor Deflation code in one go (v2.09 full):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09.full/
>>>>>>
>>>>>>
>>>>>> Some folks might want to see just what has changed since the last
>>>>>> review
>>>>>> cycle so here's a webrev for that (v2.09 inc):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09.inc/
>>>>>>
>>>>>>
>>>>>> The OpenJDK wiki has NOT yet been updated for this round of changes:
>>>>>>
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>
>>>>>>
>>>>>> The jdk-14+26 based v2.09 version of the patch has been thru
>>>>>> Mach5 tier[1-7]
>>>>>> testing on Oracle's usual set of platforms. Mach5 tier8 is still
>>>>>> running.
>>>>>> A slightly older version of the v2.09 patch has also been through
>>>>>> my usual
>>>>>> set of stress testing on Linux-X64 and macOSX with the addition
>>>>>> of Robbin's
>>>>>> "MoCrazy 1024" test running in parallel on Linux-X64 with the
>>>>>> other tests in
>>>>>> my lab. The "MoCrazy 1024" has been going for > 5 days and 6700+
>>>>>> iterations
>>>>>> without any failures.
>>>>>>
>>>>>> I'm planning to do a SPECjbb2015 round on the
>>>>>> CR9/v2.09/12-for-jdk14 bits.
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have made changes to the Async Monitor Deflation code in
>>>>>>> response to
>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David
>>>>>>> H., Robbin
>>>>>>> and Erik O. for their comments!
>>>>>>>
>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the
>>>>>>> Async Monitor
>>>>>>> Deflation project needs to push before Nov. 12, 2019 in order to
>>>>>>> allow
>>>>>>> for sufficient bake time for such a big change. Nov. 12 is
>>>>>>> _next_ Tuesday
>>>>>>> so we have 8 days from today to finish this code review cycle
>>>>>>> and push
>>>>>>> this code for JDK14.
>>>>>>>
>>>>>>> Carsten and Roman! Time for you guys to chime in again on the
>>>>>>> code reviews.
>>>>>>>
>>>>>>> I have attached the change list from CR7 to CR8 instead of
>>>>>>> putting it in
>>>>>>> the body of this email. I've also added a link to the
>>>>>>> CR7-to-CR8-changes
>>>>>>> file to the webrevs so it should be easy to find.
>>>>>>>
>>>>>>> Main bug URL:
>>>>>>>
>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>
>>>>>>> The project is currently baselined on jdk-14+21.
>>>>>>>
>>>>>>> Here's the full webrev URL for those folks that want to see all
>>>>>>> of the
>>>>>>> current Async Monitor Deflation code in one go (v2.08 full):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full
>>>>>>>
>>>>>>>
>>>>>>> Some folks might want to see just what has changed since the
>>>>>>> last review
>>>>>>> cycle so here's a webrev for that (v2.08 inc):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/
>>>>>>>
>>>>>>>
>>>>>>> The OpenJDK wiki did not need any changes for this round:
>>>>>>>
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>
>>>>>>>
>>>>>>> The jdk-14+21 based v2.08 version of the patch has been thru
>>>>>>> Mach5 tier[1-8]
>>>>>>> testing on Oracle's usual set of platforms. It has also been
>>>>>>> through my usual
>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with
>>>>>>> the addition
>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the
>>>>>>> other tests in
>>>>>>> my lab. Some testing is still running, but so far there are no
>>>>>>> new regressions.
>>>>>>>
>>>>>>> I have not yet done a SPECjbb2015 round on the
>>>>>>> CR8/v2.08/11-for-jdk14 bits.
>>>>>>>
>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> The Async Monitor Deflation project is reaching the end game. I
>>>>>>>> have no
>>>>>>>> changes planned for the project at this time so all that is
>>>>>>>> left is code
>>>>>>>> review and any changes that results from those reviews.
>>>>>>>>
>>>>>>>> Carsten and Roman! Time for you guys to chime in again on the
>>>>>>>> code reviews.
>>>>>>>>
>>>>>>>> I have attached the list of fixes from CR6 to CR7 instead of
>>>>>>>> putting it
>>>>>>>> in the main body of this email.
>>>>>>>>
>>>>>>>> Main bug URL:
>>>>>>>>
>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>
>>>>>>>> The project is currently baselined on jdk-14+19.
>>>>>>>>
>>>>>>>> Here's the full webrev URL for those folks that want to see all
>>>>>>>> of the
>>>>>>>> current Async Monitor Deflation code in one go (v2.07 full):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full
>>>>>>>>
>>>>>>>>
>>>>>>>> Some folks might want to see just what has changed since the
>>>>>>>> last review
>>>>>>>> cycle so here's a webrev for that (v2.07 inc):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/
>>>>>>>>
>>>>>>>>
>>>>>>>> The OpenJDK wiki has been updated to match the
>>>>>>>> CR7/v2.07/10-for-jdk14 changes:
>>>>>>>>
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>
>>>>>>>>
>>>>>>>> The jdk-14+18 based v2.07 version of the patch has been thru
>>>>>>>> Mach5 tier[1-8]
>>>>>>>> testing on Oracle's usual set of platforms. It has also been
>>>>>>>> through my usual
>>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with
>>>>>>>> the addition
>>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the
>>>>>>>> other tests in
>>>>>>>> my lab.
>>>>>>>>
>>>>>>>> The jdk-14+19 based v2.07 version of the patch has been thru
>>>>>>>> Mach5 tier[1-3]
>>>>>>>> test on Oracle's usual set of platforms. Mach5 tier[4-8] are in
>>>>>>>> process.
>>>>>>>>
>>>>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora
>>>>>>>> Performance lab
>>>>>>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>>>>>>>
>>>>>>>> - "base" is jdk-14+18
>>>>>>>> - "v2.07" is the latest version and includes C2
>>>>>>>> inc_om_ref_count() support
>>>>>>>> on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors
>>>>>>>> option
>>>>>>>> - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>>>>>>>> - "handshake" is with
>>>>>>>> -XX:+HandshakeAfterDeflateIdleMonitors specified
>>>>>>>>
>>>>>>>> hbIR hbIR
>>>>>>>> (max attempted) (settled) max-jOPS critical-jOPS runtime
>>>>>>>> --------------- --------- -------- ------------- -------
>>>>>>>> 34282.00 30635.90 28831.30 20969.20 3841.30 base
>>>>>>>> 34282.00 30973.00 29345.80 21025.20 3964.10 v2.07
>>>>>>>> 34282.00 31105.60 29174.30 21074.00 3931.30
>>>>>>>> v2.07_handshake
>>>>>>>> 34282.00 30789.70 27151.60 19839.10 3850.20 v2.07_off
>>>>>>>>
>>>>>>>> - The Aurora Perf comparison tool reports:
>>>>>>>>
>>>>>>>> Comparison max-jOPS critical-jOPS
>>>>>>>> ---------------------- --------------------
>>>>>>>> --------------------
>>>>>>>> base vs 2.07 +1.78% (s, p=0.000) +0.27% (ns,
>>>>>>>> p=0.790)
>>>>>>>> base vs 2.07_handshake +1.19% (s, p=0.007) +0.58% (ns,
>>>>>>>> p=0.536)
>>>>>>>> base vs 2.07_off -5.83% (ns, p=0.394) -5.39%
>>>>>>>> (ns, p=0.347)
>>>>>>>>
>>>>>>>> (s) - significant (ns) - not-significant
>>>>>>>>
>>>>>>>> - For historical comparison, the Aurora Perf comparision tool
>>>>>>>> reported for v2.06 with a baseline of jdk-13+31:
>>>>>>>>
>>>>>>>> Comparison max-jOPS critical-jOPS
>>>>>>>> ---------------------- --------------------
>>>>>>>> --------------------
>>>>>>>> base vs 2.06 -0.32% (ns, p=0.345) +0.71%
>>>>>>>> (ns, p=0.646)
>>>>>>>> base vs 2.06_off +0.49% (ns, p=0.292) -1.21%
>>>>>>>> (ns, p=0.481)
>>>>>>>>
>>>>>>>> (s) - significant (ns) - not-significant
>>>>>>>>
>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> The Async Monitor Deflation project has rebased to JDK14 so
>>>>>>>>> it's time
>>>>>>>>> for our first code review in that new context!!
>>>>>>>>>
>>>>>>>>> I've been focused on changing the monitor list management code
>>>>>>>>> to be
>>>>>>>>> lock-free in order to make SPECjbb2015 happier. Of course with
>>>>>>>>> a change
>>>>>>>>> like that, it takes a while to chase down all the new and
>>>>>>>>> wonderful
>>>>>>>>> races. At this point, I have the code back to the same
>>>>>>>>> stability that
>>>>>>>>> I had with CR5/v2.05/8-for-jdk13.
>>>>>>>>>
>>>>>>>>> To lay the ground work for this round of review, I pushed the
>>>>>>>>> following
>>>>>>>>> two fixes to jdk/jdk earlier today:
>>>>>>>>>
>>>>>>>>> JDK-8230184 rename, whitespace, indent and comments
>>>>>>>>> changes in preparation
>>>>>>>>> for lock free Monitor lists
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230184
>>>>>>>>>
>>>>>>>>> JDK-8230317 serviceability/sa/ClhsdbPrintStatics.java
>>>>>>>>> fails after 8230184
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230317
>>>>>>>>>
>>>>>>>>> I have attached the list of fixes from CR5 to CR6 instead of
>>>>>>>>> putting
>>>>>>>>> in the main body of this email.
>>>>>>>>>
>>>>>>>>> Main bug URL:
>>>>>>>>>
>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>
>>>>>>>>> The project is currently baselined on jdk-14+11 plus the fixes
>>>>>>>>> for
>>>>>>>>> JDK-8230184 and JDK-8230317.
>>>>>>>>>
>>>>>>>>> Here's the full webrev URL for those folks that want to see
>>>>>>>>> all of the
>>>>>>>>> current Async Monitor Deflation code in one go (v2.06 full):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.full/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The primary focus of this review cycle is on the lock-free
>>>>>>>>> Monitor List
>>>>>>>>> management changes so here's a webrev for just that patch
>>>>>>>>> (v2.06c):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06c.inc/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The secondary focus of this review cycle is on the bug fixes
>>>>>>>>> that have
>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so here's a webrev for
>>>>>>>>> just that
>>>>>>>>> patch (v2.06b):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06b.inc/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The third and final bucket for this review cycle is the
>>>>>>>>> rename, whitespace,
>>>>>>>>> indent and comments changes made in preparation for lock free
>>>>>>>>> Monitor list
>>>>>>>>> management. Almost all of that was extracted into JDK-8230184
>>>>>>>>> for the
>>>>>>>>> baseline so this bucket now has just a few comment changes
>>>>>>>>> relative to
>>>>>>>>> CR5/v2.05/8-for-jdk13. Here's a webrev for the remainder
>>>>>>>>> (v2.06a):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06a.inc/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Some folks might want to see just what has changed since the
>>>>>>>>> last review
>>>>>>>>> cycle so here's a webrev for that (v2.06 inc):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.inc/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Last, but not least, some folks might want to see the code
>>>>>>>>> before the
>>>>>>>>> addition of lock-free Monitor List management so here's a
>>>>>>>>> webrev for
>>>>>>>>> that (v2.00 -> v2.05):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.05.inc/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The OpenJDK wiki will need minor updates to match the CR6
>>>>>>>>> changes:
>>>>>>>>>
>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> but that should only be changes to describe per-thread list
>>>>>>>>> async monitor
>>>>>>>>> deflation being done by the ServiceThread.
>>>>>>>>>
>>>>>>>>> (I did update the OpenJDK wiki for the CR5 changes back on
>>>>>>>>> 2019.08.14)
>>>>>>>>>
>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8]
>>>>>>>>> testing on
>>>>>>>>> Oracle's usual set of platforms. It has also been through my
>>>>>>>>> usual set
>>>>>>>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>>>>>>>
>>>>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora
>>>>>>>>> Performance lab
>>>>>>>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs. This
>>>>>>>>> was using
>>>>>>>>> this patch baselined on jdk-13+31 (for stability):
>>>>>>>>>
>>>>>>>>> hbIR hbIR
>>>>>>>>> (max attempted) (settled) max-jOPS critical-jOPS runtime
>>>>>>>>> --------------- --------- -------- ------------- -------
>>>>>>>>> 34282.00 28837.20 27905.20 19817.40 3658.10 base
>>>>>>>>> 34965.70 29798.80 27814.90 19959.00 3514.60 v2.06d
>>>>>>>>> 34282.00 29100.70 28042.50 19577.00 3701.90
>>>>>>>>> v2.06d_off
>>>>>>>>> 34282.00 29218.50 27562.80 19397.30 3657.60
>>>>>>>>> v2.06d_ocache
>>>>>>>>> 34965.70 29838.30 26512.40 19170.60 3569.90 v2.05
>>>>>>>>> 34282.00 28926.10 27734.00 19835.10 3588.40
>>>>>>>>> v2.05_off
>>>>>>>>>
>>>>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors
>>>>>>>>> specified and
>>>>>>>>> the "ocache" config is with 128 byte cache line sizes instead
>>>>>>>>> of 64 byte
>>>>>>>>> cache lines sizes. "v2.06d" is the last set of changes that I
>>>>>>>>> made before
>>>>>>>>> those changes were distributed into the "v2.06a", "v2.06b" and
>>>>>>>>> "v2.06c"
>>>>>>>>> buckets for this review recycle.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> I've been focused on chasing down and fixing the rare test
>>>>>>>>>> failures
>>>>>>>>>> that only pop up rarely. So this round is primarily fixes for
>>>>>>>>>> races
>>>>>>>>>> with a few additional fixes that came from Karen's review of
>>>>>>>>>> CR4.
>>>>>>>>>> Thanks Karen!
>>>>>>>>>>
>>>>>>>>>> I have attached the list of fixes from CR4 to CR5 instead of
>>>>>>>>>> putting
>>>>>>>>>> in the main body of this email.
>>>>>>>>>>
>>>>>>>>>> Main bug URL:
>>>>>>>>>>
>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>
>>>>>>>>>> The project is currently baselined on jdk-13+29. This will
>>>>>>>>>> likely be
>>>>>>>>>> the last JDK13 baseline for this project and I'll roll to the
>>>>>>>>>> JDK14
>>>>>>>>>> (jdk/jdk) repo soon...
>>>>>>>>>>
>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I have not yet checked the OpenJDK wiki to see if it needs
>>>>>>>>>> any updates
>>>>>>>>>> to match the CR5 changes:
>>>>>>>>>>
>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 changes back on
>>>>>>>>>> 2019.06.26)
>>>>>>>>>>
>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3]
>>>>>>>>>> testing on
>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running
>>>>>>>>>> now and
>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off the usual stress
>>>>>>>>>> testing
>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as those machines become
>>>>>>>>>> available.
>>>>>>>>>> Since I haven't made any performance changes in this round,
>>>>>>>>>> I'll only
>>>>>>>>>> be running SPECjbb2015 to gather the latest monitorinflation
>>>>>>>>>> logs.
>>>>>>>>>>
>>>>>>>>>> Next up:
>>>>>>>>>>
>>>>>>>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>>>>>>>> Linux-X64 and we've determined that some of that comes from
>>>>>>>>>> contention on the gListLock. So I'm going to investigate
>>>>>>>>>> removing
>>>>>>>>>> the gListLock. Yes, another lock free set of changes is
>>>>>>>>>> coming!
>>>>>>>>>> - Of course, going lock free often causes new races and new
>>>>>>>>>> failures
>>>>>>>>>> so that's a good reason for make those changes isolated in
>>>>>>>>>> their
>>>>>>>>>> own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>>>>>>>> - I finally have a potential fix for the Win* failure with
>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>> but I haven't run it through Mach5 yet so it'll be in the
>>>>>>>>>> next round.
>>>>>>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm
>>>>>>>>>> seeing some
>>>>>>>>>> monitor related failures there. I suspect that I need to go
>>>>>>>>>> take a
>>>>>>>>>> look at the C2 RTM macro assembler code and look for things
>>>>>>>>>> that might
>>>>>>>>>> conflict if Async Monitor Deflation. If you're interested
>>>>>>>>>> in that kind
>>>>>>>>>> of issue, then see the macroAssembler_x86.cpp sanity check
>>>>>>>>>> that I
>>>>>>>>>> added in this round!
>>>>>>>>>>
>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> I have a fix for an issue that came up during performance
>>>>>>>>>>> testing.
>>>>>>>>>>> Many thanks to Robbin for diagnosing the issue in his
>>>>>>>>>>> SPECjbb2015
>>>>>>>>>>> experiments.
>>>>>>>>>>>
>>>>>>>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>>>>>>>> verbose due to the complexity of the issue, but the changes
>>>>>>>>>>> themselves are not that big.
>>>>>>>>>>>
>>>>>>>>>>> Functional:
>>>>>>>>>>> - Change SafepointSynchronize::is_cleanup_needed() from
>>>>>>>>>>> calling
>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() to calling
>>>>>>>>>>> ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>>>>>>>> - is_safepoint_deflation_needed() returns the result of
>>>>>>>>>>> monitors_used_above_threshold() for safepoint based
>>>>>>>>>>> monitor deflation (!AsyncDeflateIdleMonitors).
>>>>>>>>>>> - For AsyncDeflateIdleMonitors, it only returns true if
>>>>>>>>>>> there is a special deflation request, e.g., System.gc()
>>>>>>>>>>> - This solves a bug where there are a bunch of Cleanup
>>>>>>>>>>> safepoints that simply request async deflation which
>>>>>>>>>>> keeps the async JavaThreads from making progress on
>>>>>>>>>>> their async deflation work.
>>>>>>>>>>> - Add AsyncDeflationInterval diagnostic option. Description:
>>>>>>>>>>> Async deflate idle monitors every so many milliseconds
>>>>>>>>>>> when
>>>>>>>>>>> MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>>>>>>>> - Replace
>>>>>>>>>>> ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>>>>>>>> ObjectSynchronizer::is_async_deflation_needed():
>>>>>>>>>>> - is_async_deflation_needed() returns true when
>>>>>>>>>>> is_async_cleanup_requested() is true or when
>>>>>>>>>>> monitors_used_above_threshold() is true (but no more
>>>>>>>>>>> often than
>>>>>>>>>>> AsyncDeflationInterval).
>>>>>>>>>>> - if AsyncDeflateIdleMonitors Service_lock->wait() now
>>>>>>>>>>> waits for
>>>>>>>>>>> at most GuaranteedSafepointInterval millis:
>>>>>>>>>>> - This allows is_async_deflation_needed() to be
>>>>>>>>>>> checked at
>>>>>>>>>>> the same interval as GuaranteedSafepointInterval.
>>>>>>>>>>> (default is 1000 millis/1 second)
>>>>>>>>>>> - Once is_async_deflation_needed() has returned true, it
>>>>>>>>>>> generally cannot return true for
>>>>>>>>>>> AsyncDeflationInterval.
>>>>>>>>>>> This is to prevent async deflation from swamping the
>>>>>>>>>>> ServiceThread.
>>>>>>>>>>> - The ServiceThread still handles async deflation of the
>>>>>>>>>>> global
>>>>>>>>>>> in-use list and now it also marks JavaThreads for async
>>>>>>>>>>> deflation
>>>>>>>>>>> of their in-use lists.
>>>>>>>>>>> - The ServiceThread will check for async deflation work
>>>>>>>>>>> every
>>>>>>>>>>> GuaranteedSafepointInterval.
>>>>>>>>>>> - A safepoint can still cause the ServiceThread to check
>>>>>>>>>>> for
>>>>>>>>>>> async deflation work via is_async_deflation_requested.
>>>>>>>>>>> - Refactor code from
>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() into
>>>>>>>>>>> monitors_used_above_threshold() and remove
>>>>>>>>>>> is_cleanup_needed().
>>>>>>>>>>> - In addition to System.gc(), the VM_Exit VM op and the final
>>>>>>>>>>> VMThread safepoint now set the
>>>>>>>>>>> is_special_deflation_requested
>>>>>>>>>>> flag to reduce the in-use monitor population that is
>>>>>>>>>>> reported by
>>>>>>>>>>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
>>>>>>>>>>>
>>>>>>>>>>> Test update:
>>>>>>>>>>> - test/hotspot/gtest/oops/test_markOop.cpp is updated to
>>>>>>>>>>> work with
>>>>>>>>>>> AsyncDeflateIdleMonitors.
>>>>>>>>>>>
>>>>>>>>>>> Collateral:
>>>>>>>>>>> - Add/clarify/update some logging messages.
>>>>>>>>>>>
>>>>>>>>>>> Cleanup:
>>>>>>>>>>> - Updated comments based on Karen's code review.
>>>>>>>>>>> - Change 'special cleanup' -> 'special deflation' and
>>>>>>>>>>> 'async cleanup' -> 'async deflation'.
>>>>>>>>>>> - comment and function name changes
>>>>>>>>>>> - Clarify MonitorUsedDeflationThreshold description;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>
>>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>
>>>>>>>>>>> The project is currently baselined on jdk-13+22.
>>>>>>>>>>>
>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.full/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.inc/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have not updated the OpenJDK wiki to reflect the CR4 changes:
>>>>>>>>>>>
>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The wiki doesn't say a whole lot about the async deflation
>>>>>>>>>>> invocation
>>>>>>>>>>> mechanism so I have to figure out how to add that content.
>>>>>>>>>>>
>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8]
>>>>>>>>>>> testing on
>>>>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit
>>>>>>>>>>> run is
>>>>>>>>>>> running now. Kitchensink8H on product, fastdebug, and
>>>>>>>>>>> slowdebug bits
>>>>>>>>>>> are running on Linux-X64, MacOSX and Solaris-X64. I still
>>>>>>>>>>> have to run
>>>>>>>>>>> my stress kit on Linux-X64. I still have to run the SPECjbb2015
>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>
>>>>>>>>>>>> I had some discussions with Karen about a race that was in the
>>>>>>>>>>>> ObjectMonitor::enter() code in CR2/v2.02/5-for-jdk13. This
>>>>>>>>>>>> race was
>>>>>>>>>>>> theoretical and I had no test failures due to it. The fix
>>>>>>>>>>>> is pretty
>>>>>>>>>>>> simple: remove the special case code for async deflation in
>>>>>>>>>>>> the
>>>>>>>>>>>> ObjectMonitor::enter() function and rely solely on the
>>>>>>>>>>>> ref_count
>>>>>>>>>>>> for ObjectMonitor::enter() protection.
>>>>>>>>>>>>
>>>>>>>>>>>> During those discussions Karen also floated the idea of
>>>>>>>>>>>> using the
>>>>>>>>>>>> ref_count field instead of the contentions field for the Async
>>>>>>>>>>>> Monitor Deflation protocol. I decided to go ahead and code
>>>>>>>>>>>> up that
>>>>>>>>>>>> change and I have run it through the usual stress and Mach5
>>>>>>>>>>>> testing
>>>>>>>>>>>> with no issues. It's also known as v2.03 (for those for
>>>>>>>>>>>> with the
>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for those with webrev
>>>>>>>>>>>> URLs).
>>>>>>>>>>>> Sorry for all the names...
>>>>>>>>>>>>
>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>
>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>
>>>>>>>>>>>> The project is currently baselined on jdk-13+18.
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.full/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.inc/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I have also updated the OpenJDK wiki to reflect the CR3
>>>>>>>>>>>> changes:
>>>>>>>>>>>>
>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8]
>>>>>>>>>>>> testing on
>>>>>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit
>>>>>>>>>>>> run had
>>>>>>>>>>>> no issues. Kitchensink8H on product, fastdebug, and
>>>>>>>>>>>> slowdebug bits
>>>>>>>>>>>> had no failures on Linux-X64; MacOSX fastdebug and
>>>>>>>>>>>> slowdebug and
>>>>>>>>>>>> Solaris-X64 release had the usual "Too large time diff"
>>>>>>>>>>>> complaints.
>>>>>>>>>>>> 12 hour Inflate2 runs on product, fastdebug and slowdebug
>>>>>>>>>>>> bits on
>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no failures. My
>>>>>>>>>>>> Linux-X64
>>>>>>>>>>>> stress kit is running right now.
>>>>>>>>>>>>
>>>>>>>>>>>> I've done the SPECjbb2015 baseline and CR3 runs. I need to
>>>>>>>>>>>> gather
>>>>>>>>>>>> the results and analyze them.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, in advance, for any questions, comments or
>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a small but important bug fix for the Async Monitor
>>>>>>>>>>>>> Deflation
>>>>>>>>>>>>> project ready to go. It's also known as v2.02 (for those
>>>>>>>>>>>>> for with the
>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for those with webrev
>>>>>>>>>>>>> URLs). Sorry
>>>>>>>>>>>>> for all the names...
>>>>>>>>>>>>>
>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two days ago so that
>>>>>>>>>>>>> baseline patch
>>>>>>>>>>>>> is out of our hair.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>
>>>>>>>>>>>>> The project is currently baselined on jdk-13+17.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR2
>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-6]
>>>>>>>>>>>>> testing on
>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[7-8] is
>>>>>>>>>>>>> running now.
>>>>>>>>>>>>> My stress kit is running on Solaris-X64 now. Kitchensink8H
>>>>>>>>>>>>> is running
>>>>>>>>>>>>> now on product, fastdebug, and slowdebug bits on
>>>>>>>>>>>>> Linux-X64, MacOSX
>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs are running now on
>>>>>>>>>>>>> product,
>>>>>>>>>>>>> fastdebug and slowdebug bits on Linux-X64, MacOSX and
>>>>>>>>>>>>> Solaris-X64.
>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 sometime on
>>>>>>>>>>>>> Sunday (after
>>>>>>>>>>>>> my jdk-13+18 stress run is done).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 runs after all the
>>>>>>>>>>>>> stress
>>>>>>>>>>>>> testing is done.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or
>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I finally have CR1 for the Async Monitor Deflation
>>>>>>>>>>>>>> project ready to
>>>>>>>>>>>>>> go. It's also known as v2.01 (for those for with the
>>>>>>>>>>>>>> patches) and as
>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry
>>>>>>>>>>>>>> for all the
>>>>>>>>>>>>>> names...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Baseline bug fixes URL:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JDK-8222295 more baseline cleanups from Async Monitor
>>>>>>>>>>>>>> Deflation project
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The project is currently baselined on jdk-13+15.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the webrev for the latest baseline changes
>>>>>>>>>>>>>> (JDK-8222295):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 only):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I'm looking for reviews for both JDK-8222295 and the
>>>>>>>>>>>>>> latest version
>>>>>>>>>>>>>> of JDK-8153224...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR
>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3]
>>>>>>>>>>>>>> testing on
>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is
>>>>>>>>>>>>>> running now and
>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. My stress kit on
>>>>>>>>>>>>>> Solaris-X64
>>>>>>>>>>>>>> is running now. Linux-X64 stress testing will start on
>>>>>>>>>>>>>> Sunday. I'm
>>>>>>>>>>>>>> planning to do Kitchensink runs, SPECjbb2015 runs and my
>>>>>>>>>>>>>> monitor
>>>>>>>>>>>>>> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or
>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread for my port of
>>>>>>>>>>>>>>> Carsten's work on:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's a link to Carsten's original webrev:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Earlier versions of this patch have been through several
>>>>>>>>>>>>>>> rounds of
>>>>>>>>>>>>>>> preliminary review. Many thanks to Carsten, Coleen,
>>>>>>>>>>>>>>> Robbin, and
>>>>>>>>>>>>>>> Roman for their preliminary code review comments. A very
>>>>>>>>>>>>>>> special
>>>>>>>>>>>>>>> thanks to Robbin and Roman for building and testing the
>>>>>>>>>>>>>>> patch in
>>>>>>>>>>>>>>> their own environments (including specJBB2015).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8]
>>>>>>>>>>>>>>> testing on
>>>>>>>>>>>>>>> Oracle's usual set of platforms. Earlier versions have
>>>>>>>>>>>>>>> been run
>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64
>>>>>>>>>>>>>>> servers
>>>>>>>>>>>>>>> (product, fastdebug, slowdebug).Earlier versions have
>>>>>>>>>>>>>>> run Kitchensink
>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64
>>>>>>>>>>>>>>> (product, fastdebug
>>>>>>>>>>>>>>> and slowdebug). Earlier versions have run my monitor
>>>>>>>>>>>>>>> inflation stress
>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64
>>>>>>>>>>>>>>> (product,
>>>>>>>>>>>>>>> fastdebug and slowdebug).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All of the testing done on earlier versions will be
>>>>>>>>>>>>>>> redone on the
>>>>>>>>>>>>>>> latest version of the patch.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or
>>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> P.S.
>>>>>>>>>>>>>>> One subtest in
>>>>>>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've
>>>>>>>>>>>>>>> been trying
>>>>>>>>>>>>>>> to characterize/analyze this failure for more than a
>>>>>>>>>>>>>>> week now. At
>>>>>>>>>>>>>>> this point I'm convinced that Async Monitor Deflation is
>>>>>>>>>>>>>>> aggravating
>>>>>>>>>>>>>>> an existing bug. However, I plan to have a better handle
>>>>>>>>>>>>>>> on that
>>>>>>>>>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list