RFR(L) 8153224 Monitor deflation prolong safepoints (CR8/v2.08/11-for-jdk14)

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Dec 20 23:30:56 UTC 2019


On 12/20/19 6:12 AM, Robbin Ehn wrote:
> Hi Dan,
>
> Sorry for taking some time, truncated:
>
> On 2019-12-10 22:10, Daniel D. Daugherty wrote:
>> So java/lang/management/ThreadInfo gives a Java program access to
>> ThreadService::get_current_contended_monitor() for any target thread
>> without going to a safepoint:
>
> Yes you are correct.
> I wonder if we should instead have current waiting object.
> Since we are actually interested in the object not the monitor.

The code uses ThreadService::get_current_contended_monitor() to get
to the object and then saves a handle for the object. The risky part is
getting the contended monitor for a thread that is not the current
thread, not suspended and not at a safepoint.


>
>>>
>>> In no place you need SR,
>>
>> I think you meant "RS" here since we were using release_store() here.
>
> The terminology often seen:
> load with acquire semantics/loads acquiring => la
> store with release semantics/stores releasing => sr
>
> But since 'release' relates to preceding stores and loads,
> you also see 'release store'.

Thanks for the clarification. I was going off our function name.
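
For readers following the naming discussion, here is a minimal sketch of
the two orderings in standard C++ (std::atomic), not HotSpot's own
release_store()/load_acquire() functions. The names payload, ready,
publish() and try_consume() are hypothetical and exist only for this
illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical example data; none of these names come from HotSpot.
static int payload = 0;                  // plain, non-atomic data
static std::atomic<bool> ready{false};   // publication flag

// "Store with release semantics": the write to payload that precedes
// the store cannot be reordered after the store to ready.
void publish(int value) {
  payload = value;
  ready.store(true, std::memory_order_release);
}

// "Load with acquire semantics": if the load observes true, the read
// of payload that follows it sees the value written before the
// matching release store.
bool try_consume(int* out) {
  if (!ready.load(std::memory_order_acquire)) {
    return false;  // nothing published yet
  }
  *out = payload;
  return true;
}
```

In real use publish() and try_consume() would run on different threads;
the release/acquire pair is what makes the plain read of payload safe
once the flag has been seen.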


>>
>> so if COMPILER2 is not built, then the code that I'm talking about
>> is not in the system...
>>
>
> Ok, thanks for that clarification
>>
>> Ohhhhhh.... by using is_synchronizing() all those call sites will
>> do or not do whatever is being controlled based on whether a
>> safepoint is beginning, but they won't check for a local_poll()...
>>
>> I think that means that a local_poll() request will take longer to
>> be honored right?
>
> Yes. And there can also be a safepoint behind the handshake, which can
> be delayed (unlikely).
>
>>> I don't see that g_wait_list is ever simultaneously read.
>>> Either it is accessed by serviceThread outside a safepoint or by
>>> VMThread inside a safepoint?
>>>
>>> It looks like g_wait_list can just be a local in:
>>> void ObjectSynchronizer::deflate_idle_monitors_using_JT()
>>>
>>> (disregarding the debug code that might read it in a safepoint)
>>
>> g_wait_list is checked by 
>> ObjectSynchronizer::chk_global_wait_list_and_count()
>> which is called by ObjectSynchronizer::audit_and_print_stats().
>>
>> audit_and_print_stats() is called from three places:
>
> But this is what I call 'debug' code.

Right, I'm not arguing that you call it debug code. It still has to
work and do the right thing. Remember you called me on the fact that
I wasn't using locking in my list walkers in one of your code reviews.
You pointed out that my list walker could jump from one list to
another blindly... so now the list walkers do locking... as of v2.09
(IIRC)...

 > (disregarding the debug code that might read it in a safepoint)

What I was pointing out here was that list walking (debug) code can
read it at a non-safepoint.
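
As an illustration of why a list walker needs locking, here is a minimal
sketch of one common technique, hand-over-hand locking, in standard C++.
This is an assumption for illustration only: it is not necessarily how
the v2.09 list walkers lock, and Node and count_nodes() are hypothetical
names, not HotSpot types.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical list node; HotSpot's ObjectMonitor lists look different.
struct Node {
  std::mutex lock;
  Node* next = nullptr;
};

// Walk the list while always holding the lock of the node being
// visited. The next node is locked before the current one is released,
// so a node cannot be unlinked and moved to another list in the gap
// between reading 'next' and visiting it (assuming writers take the
// same per-node locks).
std::size_t count_nodes(Node* head) {
  std::size_t count = 0;
  Node* cur = head;
  if (cur != nullptr) {
    cur->lock.lock();
  }
  while (cur != nullptr) {
    ++count;
    Node* next = cur->next;
    if (next != nullptr) {
      next->lock.lock();   // take the next lock first
    }
    cur->lock.unlock();    // then let go of the current node
    cur = next;
  }
  return count;
}
```

The sketch assumes the head pointer itself is stable for the duration of
the walk; a real implementation would also have to protect the head.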


>
>> So g_wait_list can be read simultaneously at a non-safepoint at VM exit
>> time. Since I don't want auditing failures to crop up at VM exit time,
>> g_wait_list needs to remain where it is. And yes, some of my stress runs
>> have shown crashes at VM exit time due to auditing failures when things
>> are kept "in sync"...
>
> Ok
>> I think the code in javaVFrame::print_lock_info_on() is very careful about
>> its call to thread()->current_pending_monitor(&omh) so if we do decide to
>> allow for passing NULL to current_pending_monitor(), then this would be a
>> spot where I think it would be safe. However, we have been trying very hard
>> to do things in code to make things safe rather than relying on comments.
>>
>> Resolving this comment relies on how we decide to resolve the long thread
>> for src/hotspot/share/prims/jvmtiEnvBase.cpp above.
>>
>> Update: I've gone through and added a comment at the decl for the
>> ObjectMonitorHandles associated with the calls to:
>>
>>      Thread::current_waiting_monitor(&omh)
>>      Thread::current_pending_monitor(&omh)
>>
>> Hopefully, it makes things more clear.
>
> Good
>
>>
>>
>> I think I've replied to all of the comments. Please let me know if I
>> missed anything... Thanks again for the crawl thru code review!
>
> Nope, thanks!

Again, thanks for the crawl thru code review.

Dan

>
> /Robbin
>
>>
>> Dan
>>
>>
>>>
>>> Thanks, Robbin
>>>
>>>
>>> On 11/4/19 10:03 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have made changes to the Async Monitor Deflation code in response to
>>>> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David H., Robbin
>>>> and Erik O. for their comments!
>>>>
>>>> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the Async Monitor
>>>> Deflation project needs to push before Nov. 12, 2019 in order to allow
>>>> for sufficient bake time for such a big change. Nov. 12 is _next_ Tuesday
>>>> so we have 8 days from today to finish this code review cycle and push
>>>> this code for JDK14.
>>>>
>>>> Carsten and Roman! Time for you guys to chime in again on the code reviews.
>>>>
>>>> I have attached the change list from CR7 to CR8 instead of putting it in
>>>> the body of this email. I've also added a link to the CR7-to-CR8-changes
>>>> file to the webrevs so it should be easy to find.
>>>>
>>>> Main bug URL:
>>>>
>>>>      JDK-8153224 Monitor deflation prolong safepoints
>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> The project is currently baselined on jdk-14+21.
>>>>
>>>> Here's the full webrev URL for those folks that want to see all of the
>>>> current Async Monitor Deflation code in one go (v2.08 full):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full 
>>>>
>>>>
>>>> Some folks might want to see just what has changed since the last review
>>>> cycle so here's a webrev for that (v2.08 inc):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/ 
>>>>
>>>>
>>>> The OpenJDK wiki did not need any changes for this round:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> The jdk-14+21 based v2.08 version of the patch has been thru Mach5 tier[1-8]
>>>> testing on Oracle's usual set of platforms. It has also been through my usual
>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
>>>> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
>>>> my lab. Some testing is still running, but so far there are no new regressions.
>>>>
>>>> I have not yet done a SPECjbb2015 round on the CR8/v2.08/11-for-jdk14 bits.
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> The Async Monitor Deflation project is reaching the end game. I have no
>>>>> changes planned for the project at this time so all that is left is code
>>>>> review and any changes that result from those reviews.
>>>>>
>>>>> Carsten and Roman! Time for you guys to chime in again on the code reviews.
>>>>>
>>>>> I have attached the list of fixes from CR6 to CR7 instead of putting it
>>>>> in the main body of this email.
>>>>>
>>>>> Main bug URL:
>>>>>
>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>
>>>>> The project is currently baselined on jdk-14+19.
>>>>>
>>>>> Here's the full webrev URL for those folks that want to see all of the
>>>>> current Async Monitor Deflation code in one go (v2.07 full):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full 
>>>>>
>>>>>
>>>>> Some folks might want to see just what has changed since the last review
>>>>> cycle so here's a webrev for that (v2.07 inc):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/ 
>>>>>
>>>>>
>>>>> The OpenJDK wiki has been updated to match the 
>>>>> CR7/v2.07/10-for-jdk14 changes:
>>>>>
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>
>>>>> The jdk-14+18 based v2.07 version of the patch has been thru Mach5 tier[1-8]
>>>>> testing on Oracle's usual set of platforms. It has also been through my usual
>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
>>>>> my lab.
>>>>>
>>>>> The jdk-14+19 based v2.07 version of the patch has been thru Mach5 tier[1-3]
>>>>> testing on Oracle's usual set of platforms. Mach5 tier[4-8] are in process.
>>>>>
>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora Performance lab
>>>>> using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>>>>
>>>>>     - "base" is jdk-14+18
>>>>>     - "v2.07" is the latest version and includes C2 inc_om_ref_count() support
>>>>>       on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors option
>>>>>     - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>>>>>     - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors specified
>>>>>
>>>>>          hbIR           hbIR
>>>>>     (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>>>>     ---------------  ---------  --------  -------------  -------
>>>>>            34282.00   30635.90  28831.30       20969.20  3841.30  base
>>>>>            34282.00   30973.00  29345.80       21025.20  3964.10  v2.07
>>>>>            34282.00   31105.60  29174.30       21074.00  3931.30  v2.07_handshake
>>>>>            34282.00   30789.70  27151.60       19839.10  3850.20  v2.07_off
>>>>>
>>>>>     - The Aurora Perf comparison tool reports:
>>>>>
>>>>>         Comparison              max-jOPS              critical-jOPS
>>>>>         ----------------------  --------------------  --------------------
>>>>>         base vs 2.07            +1.78% (s, p=0.000)   +0.27% (ns, p=0.790)
>>>>>         base vs 2.07_handshake  +1.19% (s, p=0.007)   +0.58% (ns, p=0.536)
>>>>>         base vs 2.07_off        -5.83% (ns, p=0.394)  -5.39% (ns, p=0.347)
>>>>>
>>>>>         (s) - significant  (ns) - not-significant
>>>>>
>>>>>     - For historical comparison, the Aurora Perf comparison tool
>>>>>         reported for v2.06 with a baseline of jdk-13+31:
>>>>>
>>>>>         Comparison              max-jOPS              critical-jOPS
>>>>>         ----------------------  --------------------  --------------------
>>>>>         base vs 2.06            -0.32% (ns, p=0.345)  +0.71% (ns, p=0.646)
>>>>>         base vs 2.06_off        +0.49% (ns, p=0.292)  -1.21% (ns, p=0.481)
>>>>>
>>>>>         (s) - significant  (ns) - not-significant
>>>>>
>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> The Async Monitor Deflation project has rebased to JDK14 so it's time
>>>>>> for our first code review in that new context!!
>>>>>>
>>>>>> I've been focused on changing the monitor list management code to be
>>>>>> lock-free in order to make SPECjbb2015 happier. Of course with a change
>>>>>> like that, it takes a while to chase down all the new and wonderful
>>>>>> races. At this point, I have the code back to the same stability that
>>>>>> I had with CR5/v2.05/8-for-jdk13.
>>>>>>
>>>>>> To lay the ground work for this round of review, I pushed the following
>>>>>> two fixes to jdk/jdk earlier today:
>>>>>>
>>>>>>     JDK-8230184 rename, whitespace, indent and comments changes in preparation
>>>>>>                 for lock free Monitor lists
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230184
>>>>>>
>>>>>>     JDK-8230317 serviceability/sa/ClhsdbPrintStatics.java fails after 8230184
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230317
>>>>>>
>>>>>> I have attached the list of fixes from CR5 to CR6 instead of putting it
>>>>>> in the main body of this email.
>>>>>>
>>>>>> Main bug URL:
>>>>>>
>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>
>>>>>> The project is currently baselined on jdk-14+11 plus the fixes for
>>>>>> JDK-8230184 and JDK-8230317.
>>>>>>
>>>>>> Here's the full webrev URL for those folks that want to see all 
>>>>>> of the
>>>>>> current Async Monitor Deflation code in one go (v2.06 full):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.full/ 
>>>>>>
>>>>>>
>>>>>>
>>>>>> The primary focus of this review cycle is on the lock-free 
>>>>>> Monitor List
>>>>>> management changes so here's a webrev for just that patch (v2.06c):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06c.inc/ 
>>>>>>
>>>>>>
>>>>>> The secondary focus of this review cycle is on the bug fixes that 
>>>>>> have
>>>>>> been made since CR5/v2.05/8-for-jdk13 so here's a webrev for just 
>>>>>> that
>>>>>> patch (v2.06b):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06b.inc/ 
>>>>>>
>>>>>>
>>>>>> The third and final bucket for this review cycle is the rename, 
>>>>>> whitespace,
>>>>>> indent and comments changes made in preparation for lock free 
>>>>>> Monitor list
>>>>>> management. Almost all of that was extracted into JDK-8230184 for 
>>>>>> the
>>>>>> baseline so this bucket now has just a few comment changes 
>>>>>> relative to
>>>>>> CR5/v2.05/8-for-jdk13. Here's a webrev for the remainder (v2.06a):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06a.inc/ 
>>>>>>
>>>>>>
>>>>>>
>>>>>> Some folks might want to see just what has changed since the last 
>>>>>> review
>>>>>> cycle so here's a webrev for that (v2.06 inc):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.inc/ 
>>>>>>
>>>>>>
>>>>>>
>>>>>> Last, but not least, some folks might want to see the code before 
>>>>>> the
>>>>>> addition of lock-free Monitor List management so here's a webrev for
>>>>>> that (v2.00 -> v2.05):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.05.inc/ 
>>>>>>
>>>>>>
>>>>>> The OpenJDK wiki will need minor updates to match the CR6 changes:
>>>>>>
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>
>>>>>>
>>>>>> but that should only be changes to describe per-thread list async 
>>>>>> monitor
>>>>>> deflation being done by the ServiceThread.
>>>>>>
>>>>>> (I did update the OpenJDK wiki for the CR5 changes back on 
>>>>>> 2019.08.14)
>>>>>>
>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>> Oracle's usual set of platforms. It has also been through my 
>>>>>> usual set
>>>>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>>>>
>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora Performance lab
>>>>>> using their tuned SPECjbb2015 Linux-X64 G1 configs. This was using
>>>>>> this patch baselined on jdk-13+31 (for stability):
>>>>>>
>>>>>>           hbIR           hbIR
>>>>>>      (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>>>>>      ---------------  ---------  --------  -------------  -------
>>>>>>             34282.00   28837.20  27905.20       19817.40  3658.10  base
>>>>>>             34965.70   29798.80  27814.90       19959.00  3514.60  v2.06d
>>>>>>             34282.00   29100.70  28042.50       19577.00  3701.90  v2.06d_off
>>>>>>             34282.00   29218.50  27562.80       19397.30  3657.60  v2.06d_ocache
>>>>>>             34965.70   29838.30  26512.40       19170.60  3569.90  v2.05
>>>>>>             34282.00   28926.10  27734.00       19835.10  3588.40  v2.05_off
>>>>>>
>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors specified and
>>>>>> the "ocache" config is with 128 byte cache line sizes instead of 64 byte
>>>>>> cache line sizes. "v2.06d" is the last set of changes that I made before
>>>>>> those changes were distributed into the "v2.06a", "v2.06b" and "v2.06c"
>>>>>> buckets for this review cycle.
>>>>>>
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I've been focused on chasing down and fixing the test failures
>>>>>>> that only pop up rarely. So this round is primarily fixes for races
>>>>>>> with a few additional fixes that came from Karen's review of CR4.
>>>>>>> Thanks Karen!
>>>>>>>
>>>>>>> I have attached the list of fixes from CR4 to CR5 instead of putting it
>>>>>>> in the main body of this email.
>>>>>>>
>>>>>>> Main bug URL:
>>>>>>>
>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>
>>>>>>> The project is currently baselined on jdk-13+29. This will 
>>>>>>> likely be
>>>>>>> the last JDK13 baseline for this project and I'll roll to the JDK14
>>>>>>> (jdk/jdk) repo soon...
>>>>>>>
>>>>>>> Here's the full webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/
>>>>>>>
>>>>>>> Here's the incremental webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/
>>>>>>>
>>>>>>> I have not yet checked the OpenJDK wiki to see if it needs any 
>>>>>>> updates
>>>>>>> to match the CR5 changes:
>>>>>>>
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>
>>>>>>>
>>>>>>> (I did update the OpenJDK wiki for the CR4 changes back on 
>>>>>>> 2019.06.26)
>>>>>>>
>>>>>>> This version of the patch has been thru Mach5 tier[1-3] testing on
>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>>>>>>> Mach5 tier[78] will follow. I'll kick off the usual stress testing
>>>>>>> on Linux-X64, macOSX and Solaris-X64 as those machines become 
>>>>>>> available.
>>>>>>> Since I haven't made any performance changes in this round, I'll 
>>>>>>> only
>>>>>>> be running SPECjbb2015 to gather the latest monitorinflation logs.
>>>>>>>
>>>>>>> Next up:
>>>>>>>
>>>>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>>>>>   Linux-X64 and we've determined that some of that comes from
>>>>>>>   contention on the gListLock. So I'm going to investigate removing
>>>>>>>   the gListLock. Yes, another lock free set of changes is coming!
>>>>>>> - Of course, going lock free often causes new races and new failures
>>>>>>>   so that's a good reason for making those changes isolated in their
>>>>>>>   own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>>>>> - I finally have a potential fix for the Win* failure with
>>>>>>>   gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>   but I haven't run it through Mach5 yet so it'll be in the next round.
>>>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm seeing some
>>>>>>>   monitor related failures there. I suspect that I need to go take a
>>>>>>>   look at the C2 RTM macro assembler code and look for things that might
>>>>>>>   conflict with Async Monitor Deflation. If you're interested in that kind
>>>>>>>   of issue, then see the macroAssembler_x86.cpp sanity check that I
>>>>>>>   added in this round!
>>>>>>>
>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a fix for an issue that came up during performance testing.
>>>>>>>> Many thanks to Robbin for diagnosing the issue in his SPECjbb2015
>>>>>>>> experiments.
>>>>>>>>
>>>>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>>>>> verbose due to the complexity of the issue, but the changes
>>>>>>>> themselves are not that big.
>>>>>>>>
>>>>>>>> Functional:
>>>>>>>>   - Change SafepointSynchronize::is_cleanup_needed() from calling
>>>>>>>>     ObjectSynchronizer::is_cleanup_needed() to calling
>>>>>>>>     ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>>>>>     - is_safepoint_deflation_needed() returns the result of
>>>>>>>>       monitors_used_above_threshold() for safepoint based
>>>>>>>>       monitor deflation (!AsyncDeflateIdleMonitors).
>>>>>>>>     - For AsyncDeflateIdleMonitors, it only returns true if
>>>>>>>>       there is a special deflation request, e.g., System.gc()
>>>>>>>>       - This solves a bug where there are a bunch of Cleanup
>>>>>>>>         safepoints that simply request async deflation which
>>>>>>>>         keeps the async JavaThreads from making progress on
>>>>>>>>         their async deflation work.
>>>>>>>>   - Add AsyncDeflationInterval diagnostic option. Description:
>>>>>>>>       Async deflate idle monitors every so many milliseconds when
>>>>>>>>       MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>>>>>   - Replace ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>>>>>     ObjectSynchronizer::is_async_deflation_needed():
>>>>>>>>     - is_async_deflation_needed() returns true when
>>>>>>>>       is_async_cleanup_requested() is true or when
>>>>>>>>       monitors_used_above_threshold() is true (but no more often than
>>>>>>>>       AsyncDeflationInterval).
>>>>>>>>     - if AsyncDeflateIdleMonitors, Service_lock->wait() now waits for
>>>>>>>>       at most GuaranteedSafepointInterval millis:
>>>>>>>>       - This allows is_async_deflation_needed() to be checked at
>>>>>>>>         the same interval as GuaranteedSafepointInterval.
>>>>>>>>         (default is 1000 millis/1 second)
>>>>>>>>       - Once is_async_deflation_needed() has returned true, it
>>>>>>>>         generally cannot return true again for AsyncDeflationInterval
>>>>>>>>         millis. This is to prevent async deflation from swamping the
>>>>>>>>         ServiceThread.
>>>>>>>>   - The ServiceThread still handles async deflation of the global
>>>>>>>>     in-use list and now it also marks JavaThreads for async deflation
>>>>>>>>     of their in-use lists.
>>>>>>>>     - The ServiceThread will check for async deflation work every
>>>>>>>>       GuaranteedSafepointInterval.
>>>>>>>>     - A safepoint can still cause the ServiceThread to check for
>>>>>>>>       async deflation work via is_async_deflation_requested.
>>>>>>>>   - Refactor code from ObjectSynchronizer::is_cleanup_needed() into
>>>>>>>>     monitors_used_above_threshold() and remove is_cleanup_needed().
>>>>>>>>   - In addition to System.gc(), the VM_Exit VM op and the final
>>>>>>>>     VMThread safepoint now set the is_special_deflation_requested
>>>>>>>>     flag to reduce the in-use monitor population that is reported by
>>>>>>>>     ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
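
The throttling rule in the CR4 change list above (an explicit request
always wins; the threshold check fires at most once per
AsyncDeflationInterval; 0 turns it off) can be sketched roughly as
follows. This is a standalone illustration in standard C++, not the
patch's code: DeflationThrottle and its members are hypothetical names,
and whether an explicit request should also reset the interval timer is
an assumption (here it does not).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the decision made by
// is_async_deflation_needed(); not HotSpot code.
struct DeflationThrottle {
  std::int64_t interval_ms;          // role of AsyncDeflationInterval (0 = off)
  std::int64_t last_trigger_ms = 0;  // time of the last threshold trigger
  bool has_triggered = false;        // false until the first trigger

  explicit DeflationThrottle(std::int64_t interval) : interval_ms(interval) {}

  // requested:       an explicit async deflation request is pending
  // above_threshold: in-use monitor count exceeds the threshold
  // now_ms:          current time in milliseconds
  bool is_needed(bool requested, bool above_threshold, std::int64_t now_ms) {
    if (requested) {
      return true;                   // explicit requests are never throttled
    }
    if (interval_ms <= 0 || !above_threshold) {
      return false;                  // threshold path is off or not exceeded
    }
    if (has_triggered && now_ms - last_trigger_ms < interval_ms) {
      return false;                  // triggered too recently
    }
    last_trigger_ms = now_ms;
    has_triggered = true;
    return true;
  }
};
```

With an interval of 250 ms, a threshold trigger at time 0 suppresses
further threshold triggers until time 250, while an explicit request in
between still returns true.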
>>>>>>>>
>>>>>>>> Test update:
>>>>>>>>   - test/hotspot/gtest/oops/test_markOop.cpp is updated to work with
>>>>>>>>     AsyncDeflateIdleMonitors.
>>>>>>>>
>>>>>>>> Collateral:
>>>>>>>>   - Add/clarify/update some logging messages.
>>>>>>>>
>>>>>>>> Cleanup:
>>>>>>>>   - Updated comments based on Karen's code review.
>>>>>>>>   - Change 'special cleanup' -> 'special deflation' and
>>>>>>>>     'async cleanup' -> 'async deflation'.
>>>>>>>>     - comment and function name changes
>>>>>>>>   - Clarify MonitorUsedDeflationThreshold description;
>>>>>>>>
>>>>>>>>
>>>>>>>> Main bug URL:
>>>>>>>>
>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>
>>>>>>>> The project is currently baselined on jdk-13+22.
>>>>>>>>
>>>>>>>> Here's the full webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.full/ 
>>>>>>>>
>>>>>>>>
>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.inc/
>>>>>>>>
>>>>>>>> I have not updated the OpenJDK wiki to reflect the CR4 changes:
>>>>>>>>
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>
>>>>>>>>
>>>>>>>> The wiki doesn't say a whole lot about the async deflation 
>>>>>>>> invocation
>>>>>>>> mechanism so I have to figure out how to add that content.
>>>>>>>>
>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run is
>>>>>>>> running now. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>>>>> are running on Linux-X64, MacOSX and Solaris-X64. I still have to run
>>>>>>>> my stress kit on Linux-X64. I still have to run the SPECjbb2015
>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>
>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>> On 5/6/19 11:52 AM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> I had some discussions with Karen about a race that was in the
>>>>>>>>> ObjectMonitor::enter() code in CR2/v2.02/5-for-jdk13. This race was
>>>>>>>>> theoretical and I had no test failures due to it. The fix is pretty
>>>>>>>>> simple: remove the special case code for async deflation in the
>>>>>>>>> ObjectMonitor::enter() function and rely solely on the ref_count
>>>>>>>>> for ObjectMonitor::enter() protection.
>>>>>>>>>
>>>>>>>>> During those discussions Karen also floated the idea of using the
>>>>>>>>> ref_count field instead of the contentions field for the Async
>>>>>>>>> Monitor Deflation protocol. I decided to go ahead and code up that
>>>>>>>>> change and I have run it through the usual stress and Mach5 testing
>>>>>>>>> with no issues. It's also known as v2.03 (for those with the
>>>>>>>>> patches) and as webrev/6-for-jdk13 (for those with webrev URLs).
>>>>>>>>> Sorry for all the names...
>>>>>>>>>
>>>>>>>>> Main bug URL:
>>>>>>>>>
>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>
>>>>>>>>> The project is currently baselined on jdk-13+18.
>>>>>>>>>
>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.full/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.inc/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have also updated the OpenJDK wiki to reflect the CR3 changes:
>>>>>>>>>
>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run had
>>>>>>>>> no issues. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>>>>>> had no failures on Linux-X64; MacOSX fastdebug and slowdebug and
>>>>>>>>> Solaris-X64 release had the usual "Too large time diff" complaints.
>>>>>>>>> 12 hour Inflate2 runs on product, fastdebug and slowdebug bits on
>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no failures. My Linux-X64
>>>>>>>>> stress kit is running right now.
>>>>>>>>>
>>>>>>>>> I've done the SPECjbb2015 baseline and CR3 runs. I need to gather
>>>>>>>>> the results and analyze them.
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> I have a small but important bug fix for the Async Monitor Deflation
>>>>>>>>>> project ready to go. It's also known as v2.02 (for those with the
>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for those with webrev URLs). Sorry
>>>>>>>>>> for all the names...
>>>>>>>>>>
>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two days ago so that baseline patch
>>>>>>>>>> is out of our hair.
>>>>>>>>>>
>>>>>>>>>> Main bug URL:
>>>>>>>>>>
>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>
>>>>>>>>>> The project is currently baselined on jdk-13+17.
>>>>>>>>>>
>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR2 
>>>>>>>>>> changes:
>>>>>>>>>>
>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-6] testing on
>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[7-8] is running now.
>>>>>>>>>> My stress kit is running on Solaris-X64 now. Kitchensink8H is running
>>>>>>>>>> now on product, fastdebug, and slowdebug bits on Linux-X64, MacOSX
>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs are running now on product,
>>>>>>>>>> fastdebug and slowdebug bits on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>> I'll start my stress kit on Linux-X64 sometime on Sunday (after
>>>>>>>>>> my jdk-13+18 stress run is done).
>>>>>>>>>>
>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 runs after all the stress
>>>>>>>>>> testing is done.
>>>>>>>>>>
>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> I finally have CR1 for the Async Monitor Deflation project ready to
>>>>>>>>>>> go. It's also known as v2.01 (for those with the patches) and as
>>>>>>>>>>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the
>>>>>>>>>>> names...
>>>>>>>>>>>
>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>
>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>
>>>>>>>>>>> Baseline bug fixes URL:
>>>>>>>>>>>
>>>>>>>>>>>     JDK-8222295 more baseline cleanups from Async Monitor 
>>>>>>>>>>> Deflation project
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295
>>>>>>>>>>>
>>>>>>>>>>> The project is currently baselined on jdk-13+15.
>>>>>>>>>>>
>>>>>>>>>>> Here's the webrev for the latest baseline changes 
>>>>>>>>>>> (JDK-8222295):
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 only):
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So I'm looking for reviews for both JDK-8222295 and the 
>>>>>>>>>>> latest version
>>>>>>>>>>> of JDK-8153224...
>>>>>>>>>>>
>>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR 
>>>>>>>>>>> changes:
>>>>>>>>>>>
>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] 
>>>>>>>>>>> testing on
>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running 
>>>>>>>>>>> now and
>>>>>>>>>>> Mach5 tier[78] will be run later today. My stress kit on 
>>>>>>>>>>> Solaris-X64
>>>>>>>>>>> is running now. Linux-X64 stress testing will start on 
>>>>>>>>>>> Sunday. I'm
>>>>>>>>>>> planning to do Kitchensink runs, SPECjbb2015 runs and my 
>>>>>>>>>>> monitor
>>>>>>>>>>> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>
>>>>>>>>>>>> Welcome to the OpenJDK review thread for my port of 
>>>>>>>>>>>> Carsten's work on:
>>>>>>>>>>>>
>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>
>>>>>>>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>>>>>>>
>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>>>>>>>>>>
>>>>>>>>>>>> Here's a link to Carsten's original webrev:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>>>>>>>>>>
>>>>>>>>>>>> Earlier versions of this patch have been through several rounds of
>>>>>>>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>>>>>>>>>>> Roman for their preliminary code review comments. A very special
>>>>>>>>>>>> thanks to Robbin and Roman for building and testing the patch in
>>>>>>>>>>>> their own environments (including SPECjbb2015).
>>>>>>>>>>>>
>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>>>>>>>> Oracle's usual set of platforms. Earlier versions have been run
>>>>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>>>>>>>>>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>>>>>>>>>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>>>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>>>>>>>>>>> fastdebug and slowdebug).
>>>>>>>>>>>>
>>>>>>>>>>>> All of the testing done on earlier versions will be redone 
>>>>>>>>>>>> on the
>>>>>>>>>>>> latest version of the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>> P.S.
>>>>>>>>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>>>>>>>>>>> to characterize/analyze this failure for more than a week now. At
>>>>>>>>>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>>>>>>>>>> an existing bug. However, I plan to have a better handle on that
>>>>>>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>


