Increased safepoint cleanup tasks in Java 11 (vs 8)

Thu Jun 20 13:03:13 UTC 2019

Hey Dan,

On Wed, Jun 19, 2019 at 4:44 PM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> Vitaly,
>
> I filed the following bug:
>
>     JDK-8226416 MonitorUsedDeflationThreshold can caused repeated cleanup
> safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8226416
>
> for this issue. I've created a test that reproduces the bug
> in my JDK13 environment and _should_ also recreate the problem
> in JDK11u... I have not tried it on JDK11u myself...
>
Thanks! You mentioned the following in the bug report:

> at this point gMonitorFreeCount has reached 0 because all of the free
> global monitors have been allocated to per-thread lists.

How do free global monitors get allocated to per-thread lists? In your demo
program, there's only a single (Java/user) thread, so what are the
"per-thread lists" you're referring to? I didn't quite understand why, with
33 monitor inflations, the gMonitorFreeCount dropped by a lot more.  It
must be this per-thread list allocation, but perhaps you can explain a bit
how that works.

>
> Dan
>
Thanks,

Vitaly

>
>
> On 6/18/19 4:09 PM, Vitaly Davidovich wrote:
>
> Hey Dan,
>
> On Tue, Jun 18, 2019 at 3:44 PM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
>
>> Sorry for the delay in getting back to this query. I've been focused
>> on other tasks...
>>
> No worries, I understand and appreciate your time and help.
>
>>
>> On 6/12/19 12:46 PM, Vitaly Davidovich wrote:
>>
>>
>>
>> On Thu, Jun 6, 2019 at 10:29 AM Daniel D. Daugherty <
>> daniel.daugherty at oracle.com> wrote:
>>
>>> On 6/6/19 10:08 AM, Vitaly Davidovich wrote:
>>>
>>>
>>>
>>> On Wed, Jun 5, 2019 at 10:13 PM Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 5, 2019 at 7:37 PM Vitaly Davidovich <vitalyd at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 5, 2019 at 7:10 PM Daniel D. Daugherty <
>>>>> daniel.daugherty at oracle.com> wrote:
>>>>>
>>>>>> On 6/5/19 7:03 PM, Vitaly Davidovich wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 5, 2019 at 6:55 PM David Holmes <david.holmes at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On 6/06/2019 2:31 am, Vitaly Davidovich wrote:
>>>>>>> > On Wed, Jun 5, 2019 at 12:15 PM Daniel D. Daugherty <
>>>>>>> > daniel.daugherty at oracle.com> wrote:
>>>>>>> >
>>>>>>> >> On 6/5/19 12:04 PM, Vitaly Davidovich wrote:
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jun 5, 2019 at 11:44 AM Daniel D. Daugherty <
>>>>>>> >> daniel.daugherty at oracle.com> wrote:
>>>>>>> >>
>>>>>>> >>> On 6/5/19 11:37 AM, Vitaly Davidovich wrote:
>>>>>>> >>>> On Wed, Jun 5, 2019 at 8:58 AM Vitaly Davidovich <
>>>>>>> vitalyd at gmail.com>
>>>>>>> >>> wrote:
>>>>>>> >>>>
>>>>>>> >>>>> Hey Aleksey,
>>>>>>> >>>>>
>>>>>>> >>>>> On Wed, Jun 5, 2019 at 8:52 AM Aleksey Shipilev <
>>>>>>> shade at redhat.com>
>>>>>>> >>> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>>> On 6/5/19 2:33 PM, Vitaly Davidovich wrote:
>>>>>>> >>>>>>> Does anyone have ideas on pinpointing this further?
>>>>>>> >>>>>> I have the educated guess:
>>>>>>> >>>>>>     https://bugs.openjdk.java.net/browse/JDK-8181859
>>>>>>> >>>>>>
>>>>>>> >>>>>> Which means you can try if
>>>>>>> -XX:MonitorUsedDeflationThreshold=0 reduces
>>>>>>> >>>>>> the number of "guaranteed"
>>>>>>> >>>>>> safepoints.
>>>>>>> >>>>>>
>>>>>>> >>>>> Hmm, the application doesn't use JVM monitors (in any
>>>>>>> meaningful
>>>>>>> >>>>> capacity).  But interesting JBS entry - I will try it out,
>>>>>>> thanks! The
>>>>>>> >>>>> description of safepoint clean up in there is/was consistent
>>>>>>> with my
>>>>>>> >>>>> understanding as well.  On Java 8, we'd see the "no vm
>>>>>>> operation"
>>>>>>> >>> safepoint
>>>>>>> >>>>> operations, and I always assumed it was IC buffer cleaning
>>>>>>> (even had a
>>>>>>> >>> mail
>>>>>>> >>>>> thread on that topic a few years ago, either on this list or
>>>>>>> >>>>> hotspot-compiler).  That tended to stabilize (i.e. they went
>>>>>>> away)
>>>>>>> >>> after
>>>>>>> >>>>> the application reached steady state, whereas I'm seeing
>>>>>>> essentially
>>>>>>> >>>>> non-stop safepoint cleanups on 11.
>>>>>>> >>>>>
>>>>>>> >>>> Ok, adding -XX:MonitorUsedDeflationThreshold=0 seems to
>>>>>>> alleviate the
>>>>>>> >>> issue
>>>>>>> >>>> - # of entered safepoint cleanups looks on par with Java 8,
>>>>>>> although I'm
>>>>>>> >>>> still assessing a bit more.
>>>>>>> >>>
>>>>>>> >>> This observation doesn't really match with your earlier comment:
>>>>>>> >>>
>>>>>>> >>>> Hmm, the application doesn't use JVM monitors (in any meaningful
>>>>>>> >>>> capacity).
>>>>>>> >>>
>>>>>>> >>> The default value for MonitorUsedDeflationThreshold is 90. So a
>>>>>>> cleanup
>>>>>>> >>> safepoint should only be triggered if monitor use exceeds 90% of
>>>>>>> the
>>>>>>> >>> monitor pool.
>>>>>>> >>>
>>>>>>> >> Hmm, ok - I see what you mean:
>>>>>>> >>
>>>>>>> https://github.com/openjdk/jdk11u/blob/d46b56eb81b07b452141c7dcd078d33fc8e49d63/src/hotspot/share/runtime/safepoint.cpp#L604
>>>>>>> >>
>>>>>>> >> So there's either monitor usage that I'm not aware of (or forgot
>>>>>>> about) or
>>>>>>> >> the ratio is > 90 but the absolute # of them is small (which I'd
>>>>>>> consider
>>>>>>> >> "don't use in any meaningful capacity").  Is there a way to print
>>>>>>> out
>>>>>>> >> monitor population so I can see?
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> I've been adding a lot of logging support to the Java Monitor
>>>>>>> subsystem,
>>>>>>> >> but those changes are in JDK13 and not in JDK11. I'm trying to
>>>>>>> remember
>>>>>>> >> what there is in JDK11, but ...
>>>>>>> >>
>>>>>>> > Yeah, I'm having a hard time keeping track of what
>>>>>>> features/changes are in
>>>>>>> > what Java release or update version or upstream/downstream
>>>>>>> backport or ...
>>>>>>> >
>>>>>>> > By the way, do JNI (specifically, Java calling into native) calls
>>>>>>> cause
>>>>>>> > monitor inflation? The app in question does have substantial JNI
>>>>>>> usage.  I
>>>>>>> > guess by not using java monitors extensively, I meant the vanilla
>>>>>>> flavor of
>>>>>>> > Java code with synchronized blocks/methods.
>>>>>>>
>>>>>>> JNI calls do not in themselves impact monitors. However if native
>>>>>>> code
>>>>>>> uses the JNI MonitorEnter then that does trigger inflation.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>
>>>>>> Thanks for clarifying, David.
>>>>>>
>>>>>> Pretty certain that JNI code I have is not calling MonitorEnter, but
>>>>>> will double check.
>>>>>>
>>>>>> So beyond monitor contention via vanilla Java code, are there other
>>>>>> scenarios resulting in inflation? Is there perhaps a list of these that
>>>>>> someone could point to? I can spelunk through hotspot code, but maybe
>>>>>> someone can spare me that exercise :).
>>>>>>
>>>>>>
>>>>>> Java Monitor inflation is caused by:
>>>>>>
>>>>>> - contention on the same monitor between 2 or more threads
>>>>>> - monitor.wait() call
>>>>>> - synchronized (obj) { obj.hashCode(); }
>>>>>>
>>>>>> The last one is specific to the HotSpot VM.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>> Awesome, thanks Dan! In lieu of more precise monitor usage logging in
>>>>> 11, I’ll look at our code for these.
>>>>>
>>>> I’m going to try "monitorinflation=debug" tomorrow:
>>>>
>>>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/runtime/synchronizer.cpp#L1519
>>>>
>>>> Maybe that’ll help shed light on inflations.
>>>>
>>> So it doesn't look like there's much monitor inflation - some early
>>> during startup, but they go away fairly quickly (if monitorinflation=debug
>>> is to be trusted comprehensive).
>>>
>>>
>>> Yes the 'monitorinflation=debug' logging in JDK11u will tell you about
>>> every inflation and deflation of an ObjectMonitor. It is rather verbose
>>> if you have a lot of ObjectMonitor inflation/deflation happening. If it
>>> is not overloading you with information, then monitor deflation is not
>>> your issue.
>>>
>> So have been playing around with 'monitorinflation=debug' in conjunction
>> with -XX:MonitorUsedDeflationThreshold=0.  One (so far) puzzling
>> observation is with the UL flag set, I still see recurring safepoint
>> cleanups but no trace of monitor inflation/deflation events.  Naively, I'd
>> expect to see (at least) inflation events if a subsequent safepoint cleanup
>> is entered for deflations.
>>
>>
>> Are you not seeing _any_ monitor inflation events?
>> Or are you seeing _some_ monitor inflation events at the beginning
>> and then no more after a steady state is reached?
>>
> There are some inflation events early in the process' lifetime, and a
> bunch of safepoints (for other reasons as well, e.g. JIT deopts).  But then
> there are _no_ monitor inflation events in UL, yet safepoint cleanups
> continue.  This is why I asked if you think previous monitor cleanups
> didn't make sufficient progress to get under the
> MonitorUsedDeflationThreshold(90), and thus every subsequent guaranteed
> safepoint poll actually enters cleanup.
>
>>
>>
>> Setting MonitorUsedDeflationThreshold=0 does stop the cleanup storm,
>> however.
>>
>>
>> So that indicates that you've found the right knob to stop this change
>> in behavior. That's good.
>>
> Yeah, that seems to have worked.  My hesitation was/is in it being an
> experimental flag, and running a production process with *any* experimental
> flag enabled.  At a minimum, it raises eyebrows.
>
>>
>>
>> So unless there's some inflation that is not reported via UL, it's not
>> clear why there's this seeming disconnect.
>>
>>
>> These two spots in the code are the only two where we inflate a
>> monitor:
>>
>> open/src/hotspot/share/runtime/synchronizer.cpp:
>>
>>     // CASE: stack-locked
>> <snip>
>>       if (log_is_enabled(Debug, monitorinflation)) {
>>         if (object->is_instance()) {
>>           ResourceMark rm;
>>           log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT "
>> , mark " INTPTR_FORMAT " , type %s",
>>                                       p2i(object), p2i(object->mark()),
>>                                       object->klass()->external_name());
>>         }
>>       }
>>       if (event.should_commit()) {
>>         post_monitor_inflate_event(&event, object, cause);
>>       }
>>       return m;
>>     }
>>
>>     // CASE: neutral
>> <snip>
>>     if (log_is_enabled(Debug, monitorinflation)) {
>>       if (object->is_instance()) {
>>         ResourceMark rm;
>>         log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT " ,
>> mark " INTPTR_FORMAT " , type %s",
>>                                     p2i(object), p2i(object->mark()),
>>                                     object->klass()->external_name());
>>       }
>>     }
>>     if (event.should_commit()) {
>>       post_monitor_inflate_event(&event, object, cause);
>>     }
>>     return m;
>>   }
>>
>> I have changed this code a bit in JDK13. For example, I've
>> removed the "if (object->is_instance())" checks so in JDK11u
>> is possible for there to be inflations related to objects
>> where "object->is_instance() == false" and those won't be
>> reported.
>>
>> In my own testing of the logging improvements in JDK13, I
>> saw very few messages where "object->is_instance() == false"
>> and I saw no ill effects from stripping that if-statement
>> so I deleted it...
>>
> Ok, good to know.
>
>>
>> Dan
>>
>>
>>
>>> Just FYI: in JDK13 that particular set of diagnostics have moved from
>>> 'monitorinflation=debug' to 'monitorinflation=trace'.
>>>
>> Ok, good to know - thanks for the heads up.
>>
>>>
>>> Dan
>>>
>>>
>>> Yet safepoints continue.  Interestingly, the timing reported in
>>> safepoint cleanup logging shows miniscule time spent "deflating idle
>>> monitors" (e.g. 0.0000004 secs = 400 ns), which I take as essentially
>>> overhead of going through that codepath without actually doing much of
>>> anything.  "updating inline caches" is about the same (400-600 ns).
>>>
>>> However, "compilation policy safepoint handler" is tens of micros (e.g.
>>> 30-50us).  Looking at the code,
>>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/runtime/compilationPolicy.cpp#L321,
>>> it appears to be doing method invocation counter decaying.  That does
>>> stackwalks (the application has about 30 threads), so fine - it's not very
>>> cheap.  But, according to
>>> https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/runtime/safepoint.cpp#L602,
>>> decay itself does no gate doing cleanup operations.
>>>
>>> Unfortunately, it doesn't appear that I can trace ICBuffers in product
>>> builds (TraceICBuffer flag isn't present), so can't tell whether there's
>>> some change (vs java 8) in their regard.  Naively, it also seems odd that a
>>> single stub in the ICBuffer would require a cleanup:
>>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/code/icBuffer.cpp#L163
>>> .
>>>
>>> I'll keep investigating.
>>>
>>>>
>>>> Dan, please let me know if you think this will be fruitless :).
>>>>
>>>>>
>>>>> Only remaining input I’d appreciate from the list is whether setting
>>>>> MonitorUsedDeflationThreshold=0 “restores” 8 behavior, or whether I’d run
>>>>> the risk of something unfortunate as a result (again, with respect to 8).
>>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> >
>>>>>>> >>
>>>>>>> >> Dan
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Thanks Dan
>>>>>>> >>
>>>>>>> >>>
>>>>>>> >>> Dan
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>>
>>>>>>> >>>> Of course now the trouble is this is an experimental option, and
>>>>>>> >>> enabling
>>>>>>> >>>> things like that in production raises eyebrows.  Is it safe to
>>>>>>> assume,
>>>>>>> >>>> however, that setting this to 0 essentially behaves like Java
>>>>>>> 8? It
>>>>>>> >>> appears
>>>>>>> >>>> so, based on that JBS Aleksey linked above.  But just double
>>>>>>> checking.
>>>>>>> >>>>
>>>>>>> >>>> If that's the case, then perhaps we can overlook the
>>>>>>> experimental bit
>>>>>>> >>> and
>>>>>>> >>>> wait until this is done outside safepoints, per Robbin's email.
>>>>>>> >>>>
>>>>>>> >>>> Thanks to everyone that responded on this thread! Very helpful.
>>>>>>> >>>>
>>>>>>> >>>>>> --
>>>>>>> >>>>>> Thanks,
>>>>>>> >>>>>> -Aleksey
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>
>>>>>>>
>>>>>> --
>>>>>> Sent from my phone
>>>>>>
>>>>>>
>>>>>> --
>>>>> Sent from my phone
>>>>>
>>>> --
>>>> Sent from my phone
>>>>
>>>
>>>
>>
>