Increased safepoint cleanup tasks in Java 11 (vs 8)

Tue Jun 18 20:09:15 UTC 2019

Hey Dan,

On Tue, Jun 18, 2019 at 3:44 PM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> Sorry for the delay in getting back to this query. I've been focused
> on other tasks...
>
No worries, I understand and appreciate your time and help.

>
> On 6/12/19 12:46 PM, Vitaly Davidovich wrote:
>
>
>
> On Thu, Jun 6, 2019 at 10:29 AM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
>
>> On 6/6/19 10:08 AM, Vitaly Davidovich wrote:
>>
>>
>>
>> On Wed, Jun 5, 2019 at 10:13 PM Vitaly Davidovich <vitalyd at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Jun 5, 2019 at 7:37 PM Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 5, 2019 at 7:10 PM Daniel D. Daugherty <
>>>> daniel.daugherty at oracle.com> wrote:
>>>>
>>>>> On 6/5/19 7:03 PM, Vitaly Davidovich wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 5, 2019 at 6:55 PM David Holmes <david.holmes at oracle.com>
>>>>> wrote:
>>>>>
>>>>>> On 6/06/2019 2:31 am, Vitaly Davidovich wrote:
>>>>>> > On Wed, Jun 5, 2019 at 12:15 PM Daniel D. Daugherty <
>>>>>> > daniel.daugherty at oracle.com> wrote:
>>>>>> >
>>>>>> >> On 6/5/19 12:04 PM, Vitaly Davidovich wrote:
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 5, 2019 at 11:44 AM Daniel D. Daugherty <
>>>>>> >> daniel.daugherty at oracle.com> wrote:
>>>>>> >>
>>>>>> >>> On 6/5/19 11:37 AM, Vitaly Davidovich wrote:
>>>>>> >>>> On Wed, Jun 5, 2019 at 8:58 AM Vitaly Davidovich <
>>>>>> vitalyd at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>>
>>>>>> >>>>> Hey Aleksey,
>>>>>> >>>>>
>>>>>> >>>>> On Wed, Jun 5, 2019 at 8:52 AM Aleksey Shipilev <
>>>>>> shade at redhat.com>
>>>>>> >>> wrote:
>>>>>> >>>>>
>>>>>> >>>>>> On 6/5/19 2:33 PM, Vitaly Davidovich wrote:
>>>>>> >>>>>>> Does anyone have ideas on pinpointing this further?
>>>>>> >>>>>> I have the educated guess:
>>>>>> >>>>>>     https://bugs.openjdk.java.net/browse/JDK-8181859
>>>>>> >>>>>>
>>>>>> >>>>>> Which means you can try if -XX:MonitorUsedDeflationThreshold=0
>>>>>> reduces
>>>>>> >>>>>> the number of "guaranteed"
>>>>>> >>>>>> safepoints.
>>>>>> >>>>>>
>>>>>> >>>>> Hmm, the application doesn't use JVM monitors (in any meaningful
>>>>>> >>>>> capacity).  But interesting JBS entry - I will try it out,
>>>>>> thanks! The
>>>>>> >>>>> description of safepoint clean up in there is/was consistent
>>>>>> with my
>>>>>> >>>>> understanding as well.  On Java 8, we'd see the "no vm
>>>>>> operation"
>>>>>> >>> safepoint
>>>>>> >>>>> operations, and I always assumed it was IC buffer cleaning
>>>>>> (even had a
>>>>>> >>> mail
>>>>>> >>>>> thread on that topic a few years ago, either on this list or
>>>>>> >>>>> hotspot-compiler).  That tended to stabilize (i.e. they went
>>>>>> away)
>>>>>> >>> after
>>>>>> >>>>> the application reached steady state, whereas I'm seeing
>>>>>> essentially
>>>>>> >>>>> non-stop safepoint cleanups on 11.
>>>>>> >>>>>
>>>>>> >>>> Ok, adding -XX:MonitorUsedDeflationThreshold=0 seems to
>>>>>> alleviate the
>>>>>> >>> issue
>>>>>> >>>> - # of entered safepoint cleanups looks on par with Java 8,
>>>>>> although I'm
>>>>>> >>>> still assessing a bit more.
>>>>>> >>>
>>>>>> >>> This observation doesn't really match with your earlier comment:
>>>>>> >>>
>>>>>> >>>> Hmm, the application doesn't use JVM monitors (in any meaningful
>>>>>> >>>> capacity).
>>>>>> >>>
>>>>>> >>> The default value for MonitorUsedDeflationThreshold is 90. So a
>>>>>> cleanup
>>>>>> >>> safepoint should only be triggered if monitor use exceeds 90% of
>>>>>> the
>>>>>> >>> monitor pool.
>>>>>> >>>
>>>>>> >> Hmm, ok - I see what you mean:
>>>>>> >>
>>>>>> https://github.com/openjdk/jdk11u/blob/d46b56eb81b07b452141c7dcd078d33fc8e49d63/src/hotspot/share/runtime/safepoint.cpp#L604
>>>>>> >>
>>>>>> >> So there's either monitor usage that I'm not aware of (or forgot
>>>>>> about) or
>>>>>> >> the ratio is > 90 but the absolute # of them is small (which I'd
>>>>>> consider
>>>>>> >> "don't use in any meaningful capacity").  Is there a way to print
>>>>>> out
>>>>>> >> monitor population so I can see?
>>>>>> >>
>>>>>> >>
>>>>>> >> I've been adding a lot of logging support to the Java Monitor
>>>>>> subsystem,
>>>>>> >> but those changes are in JDK13 and not in JDK11. I'm trying to
>>>>>> remember
>>>>>> >> what there is in JDK11, but ...
>>>>>> >>
>>>>>> > Yeah, I'm having a hard time keeping track of what features/changes
>>>>>> are in
>>>>>> > what Java release or update version or upstream/downstream backport
>>>>>> or ...
>>>>>> >
>>>>>> > By the way, do JNI (specifically, Java calling into native) calls
>>>>>> cause
>>>>>> > monitor inflation? The app in question does have substantial JNI
>>>>>> usage.  I
>>>>>> > guess by not using java monitors extensively, I meant the vanilla
>>>>>> flavor of
>>>>>> > Java code with synchronized blocks/methods.
>>>>>>
>>>>>> JNI calls do not in themselves impact monitors. However if native
>>>>>> code
>>>>>> uses the JNI MonitorEnter then that does trigger inflation.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>
>>>>> Thanks for clarifying, David.
>>>>>
>>>>> Pretty certain that JNI code I have is not calling MonitorEnter, but
>>>>> will double check.
>>>>>
>>>>> So beyond monitor contention via vanilla Java code, are there other
>>>>> scenarios resulting in inflation? Is there perhaps a list of these that
>>>>> someone could point to? I can spelunk through hotspot code, but maybe
>>>>> someone can spare me that exercise :).
>>>>>
>>>>>
>>>>> Java Monitor inflation is caused by:
>>>>>
>>>>> - contention on the same monitor between 2 or more threads
>>>>> - monitor.wait() call
>>>>> - synchronized (obj) { obj.hashCode(); }
>>>>>
>>>>> The last one is specific to the HotSpot VM.
>>>>>
>>>>> Dan
>>>>>
>>>> Awesome, thanks Dan! In lieu of more precise monitor usage logging in
>>>> 11, I’ll look at our code for these.
>>>>
>>> I’m going to try "monitorinflation=debug" tomorrow:
>>>
>>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/runtime/synchronizer.cpp#L1519
>>>
>>> Maybe that’ll help shed light on inflations.
>>>
>> So it doesn't look like there's much monitor inflation - some early
>> during startup, but they go away fairly quickly (if monitorinflation=debug
>> is to be trusted comprehensive).
>>
>>
>> Yes the 'monitorinflation=debug' logging in JDK11u will tell you about
>> every inflation and deflation of an ObjectMonitor. It is rather verbose
>> if you have a lot of ObjectMonitor inflation/deflation happening. If it
>> is not overloading you with information, then monitor deflation is not
>> your issue.
>>
> So have been playing around with 'monitorinflation=debug' in conjunction
> with -XX:MonitorUsedDeflationThreshold=0.  One (so far) puzzling
> observation is with the UL flag set, I still see recurring safepoint
> cleanups but no trace of monitor inflation/deflation events.  Naively, I'd
> expect to see (at least) inflation events if a subsequent safepoint cleanup
> is entered for deflations.
>
>
> Are you not seeing _any_ monitor inflation events?
> Or are you seeing _some_ monitor inflation events at the beginning
> and then no more after a steady state is reached?
>
There are some inflation events early in the process' lifetime, and a bunch
of safepoints (for other reasons as well, e.g. JIT deopts).  But then there
are _no_ monitor inflation events in UL, yet safepoint cleanups continue.
This is why I asked if you think previous monitor cleanups didn't make
sufficient progress to get under the MonitorUsedDeflationThreshold(90), and
thus every subsequent guaranteed safepoint poll actually enters cleanup.

>
>
> Setting MonitorUsedDeflationThreshold=0 does stop the cleanup storm,
> however.
>
>
> So that indicates that you've found the right knob to stop this change
> in behavior. That's good.
>
Yeah, that seems to have worked.  My hesitation was/is in it being an
experimental flag, and running a production process with *any* experimental
flag enabled.  At a minimum, it raises eyebrows.

>
>
> So unless there's some inflation that is not reported via UL, it's not
> clear why there's this seeming disconnect.
>
>
> These two spots in the code are the only two where we inflate a
> monitor:
>
> open/src/hotspot/share/runtime/synchronizer.cpp:
>
>     // CASE: stack-locked
> <snip>
>       if (log_is_enabled(Debug, monitorinflation)) {
>         if (object->is_instance()) {
>           ResourceMark rm;
>           log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT "
> , mark " INTPTR_FORMAT " , type %s",
>                                       p2i(object), p2i(object->mark()),
>                                       object->klass()->external_name());
>         }
>       }
>       if (event.should_commit()) {
>         post_monitor_inflate_event(&event, object, cause);
>       }
>       return m;
>     }
>
>     // CASE: neutral
> <snip>
>     if (log_is_enabled(Debug, monitorinflation)) {
>       if (object->is_instance()) {
>         ResourceMark rm;
>         log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT " ,
> mark " INTPTR_FORMAT " , type %s",
>                                     p2i(object), p2i(object->mark()),
>                                     object->klass()->external_name());
>       }
>     }
>     if (event.should_commit()) {
>       post_monitor_inflate_event(&event, object, cause);
>     }
>     return m;
>   }
>
> I have changed this code a bit in JDK13. For example, I've
> removed the "if (object->is_instance())" checks so in JDK11u
> is possible for there to be inflations related to objects
> where "object->is_instance() == false" and those won't be
> reported.
>
> In my own testing of the logging improvements in JDK13, I
> saw very few messages where "object->is_instance() == false"
> and I saw no ill effects from stripping that if-statement
> so I deleted it...
>
Ok, good to know.

>
> Dan
>
>
>
>> Just FYI: in JDK13 that particular set of diagnostics have moved from
>> 'monitorinflation=debug' to 'monitorinflation=trace'.
>>
> Ok, good to know - thanks for the heads up.
>
>>
>> Dan
>>
>>
>> Yet safepoints continue.  Interestingly, the timing reported in safepoint
>> cleanup logging shows miniscule time spent "deflating idle monitors" (e.g.
>> 0.0000004 secs = 400 ns), which I take as essentially overhead of going
>> through that codepath without actually doing much of anything.  "updating
>> inline caches" is about the same (400-600 ns).
>>
>> However, "compilation policy safepoint handler" is tens of micros (e.g.
>> 30-50us).  Looking at the code,
>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/runtime/compilationPolicy.cpp#L321,
>> it appears to be doing method invocation counter decaying.  That does
>> stackwalks (the application has about 30 threads), so fine - it's not very
>> cheap.  But, according to
>> https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/runtime/safepoint.cpp#L602,
>> decay itself does no gate doing cleanup operations.
>>
>> Unfortunately, it doesn't appear that I can trace ICBuffers in product
>> builds (TraceICBuffer flag isn't present), so can't tell whether there's
>> some change (vs java 8) in their regard.  Naively, it also seems odd that a
>> single stub in the ICBuffer would require a cleanup:
>> https://github.com/openjdk/jdk11u/blob/737d8437886ad97c6ed21a25b9911c10b3886f61/src/hotspot/share/code/icBuffer.cpp#L163
>> .
>>
>> I'll keep investigating.
>>
>>>
>>> Dan, please let me know if you think this will be fruitless :).
>>>
>>>>
>>>> Only remaining input I’d appreciate from the list is whether setting
>>>> MonitorUsedDeflationThreshold=0 “restores” 8 behavior, or whether I’d run
>>>> the risk of something unfortunate as a result (again, with respect to 8).
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>>
>>>>>>
>>>>>> >
>>>>>> >>
>>>>>> >> Dan
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Thanks Dan
>>>>>> >>
>>>>>> >>>
>>>>>> >>> Dan
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>>
>>>>>> >>>> Of course now the trouble is this is an experimental option, and
>>>>>> >>> enabling
>>>>>> >>>> things like that in production raises eyebrows.  Is it safe to
>>>>>> assume,
>>>>>> >>>> however, that setting this to 0 essentially behaves like Java 8?
>>>>>> It
>>>>>> >>> appears
>>>>>> >>>> so, based on that JBS Aleksey linked above.  But just double
>>>>>> checking.
>>>>>> >>>>
>>>>>> >>>> If that's the case, then perhaps we can overlook the
>>>>>> experimental bit
>>>>>> >>> and
>>>>>> >>>> wait until this is done outside safepoints, per Robbin's email.
>>>>>> >>>>
>>>>>> >>>> Thanks to everyone that responded on this thread! Very helpful.
>>>>>> >>>>
>>>>>> >>>>>> --
>>>>>> >>>>>> Thanks,
>>>>>> >>>>>> -Aleksey
>>>>>> >>>>>>
>>>>>> >>>>>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>
>>>>>>
>>>>> --
>>>>> Sent from my phone
>>>>>
>>>>>
>>>>> --
>>>> Sent from my phone
>>>>
>>> --
>>> Sent from my phone
>>>
>>
>>
>