RFR(L) 8153224 Monitor deflation prolong safepoints

Wed Apr 10 05:15:48 UTC 2019

Hi Carsten, Dan,

I'd like to pick up on one topic - a higher-level discussion about the 
timing of the ObjectMonitor lifecycle as they currently are and with 
these changes:

Carsten wrote:
> I think I was thinking about a cycle where a Java object exhibits
> a monitor inflation, then deflation, then inflation, then deflation.
> Each inflation will be with a new monitor. This behavior could
> increase the number of monitors allocated, especially with my
> original patch as I recycled monitors only after a safepoint. Now
> that I think about it again, such a cycle is incredible unlikely as
> it would require repeated contention on the> java object, yet the
> monitors must not be busy when the deflator thread comes by. And this
> scenario has to repeat itself. This all seems pretty unlikely.

So logically every Object has associated with it an ObjectMonitor but if 
we created the ObjectMonitor at the same time as the Object and kept it 
alive while the Object was alive then we would double our memory use (if 
not worse).

So we lazily create ObjectMonitors only when we need them: contention, 
Object.wait() use, hashcode use.

We could then leave the ObjectMonitors around as long as the Objects are 
alive, but again this has implications for memory use.

So we deflate idle ObjectMonitors to reclaim memory (though in practice 
it is more complex and we maintain pools of them to speed up allocation).

If we aggressively deflate as soon as an ObjectMonitor is idle then we 
risk getting into inflate->deflate->inflate cycles. The likelihood may 
be low but if you hit this pathology in your code you will probably be 
unhappy about the effects on performance.

So instead, IIUC, we use some measure of "memory pressure" and only try 
to deflate under certain conditions. But I'm unclear exactly what those 
conditions are today, and whether they change with async monitor 
deflation. Can you enlighten me please?

Thanks,
David

On 10/04/2019 12:25 pm, Carsten Varming wrote:
> Hi Dan,
> 
> On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
> 
>> On 4/5/19 4:59 PM, Karen Kinnear wrote:
>>
>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 <
>> _count
>> with comments that caller ensured _count <= 0
>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
>> — Am I missing something subtle here or should they be the same guarantees?
>>
>>
>> Here's the code in question:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp:
>>
>> void ObjectMonitor::EnterI(TRAPS) {
>> <snip>
>>    if (_owner == DEFLATER_MARKER) {
>>      guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should
>> have been handled by the caller");
>>      // Deflater thread tried to lock this monitor, but it failed to make
>> _count negative and gave up.
>>
>> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) {
>> <snip>
>>      if (_owner == DEFLATER_MARKER) {
>>        guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER &&
>> _count < 0, monitor must not be owned by deflater thread here");
>>
>>
>> Reading these two guarantee() calls always throws me off stride
>> because I would have written them like this:
>>
>>      guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should
>> have been handled by the caller");
>>
>> and
>>
>>        guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER &&
>> _count < 0, monitor must not be owned by deflater thread here");
>>
>> When rewritten like the above, you have:
>>
>>      "_count > 0" ... _count <= 0
>>
>> and:
>>
>>      "_count >= 0" ... "_count < 0"
>>
>> which is easier for my brain to read... okay... enough sidebar...
>>
> 
> He he. I have pretty much eliminated > and >= from my written vocabulary.
> It makes life simpler. Trust me. :)
> 
> 
>> Short answer: No the guarantees should not be the same.
>>
>> Longer answer: EnterI() is called by enter() after enter() has
>> incremented the _count field to indicate the contended state of
>> things. So in EnterI(), "_count > 0" is the right check.
>> ReenterI() is called after wait() has returned (notified or
>> timedout), and the _count field is not used on reentry ops so
>> "_count >= 0" is the right check.
>>
>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes,
>> there are two places in EnterI() that do this):
>>
>>      L501:   if (_owner == DEFLATER_MARKER) {
>>                // The deflation protocol finished the first part (setting
>> _owner),
>>                // but it failed the second part (making _count negative)
>> and bailed.
>>                // Because we're called from enter() we have at least one
>> contention.
>>                guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <=
>> 0 should have been handled by the caller");
>>      L504:     // Try to acquire monitor.
>>      L505:     if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) ==
>> DEFLATER_MARKER) {
>>
>>      L629:     if (_owner == DEFLATER_MARKER) {
>>                  // The deflation protocol finished the first part (setting
>> _owner),
>>                  // but it failed the second part (making _count negative)
>> and bailed.
>>                  // Because we're called from enter() we have at least one
>> contention.
>>                  guarantee(count> 0 , "_owner == DEFLATER_MARKER && _count
>> <= 0 should have been handled by the caller");
>>      L632:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) ==
>> DEFLATER_MARKER) {
>>
>> And I'm going to tweak the ReenterI() code like this:
>>
>>      L759:     if (_owner == DEFLATER_MARKER) {
>>                  // The deflation protocol finished the first part (setting
>> _owner),
>>                  // but it will observe _waiters != 0 and will bail out.
>> Because we're
>>                  // called from wait() we may or may not have any
>> contentions.
>>                  guarantee(count >= 0, "Impossible: _owner ==
>> DEFLATER_MARKER && _count < 0 should have been handled by the caller");
>>      L761:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) ==
>> DEFLATER_MARKER) {
>>
>>
>> You didn't ask this, but it is okay that _count is only used to track
>> contentions in enter()/EnterI() and is not used to track contentions
>> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is
>> used by is_busy() to observe the busy state for an ObjectMonitor that
>> is being wait()'ed for. The _waiters field is decremented after a
>> waiter has returned from ReenterI() so the _owner field takes over
>> answering the is_busy() question...
>>
>>
>> 5. I could use a little help with allocation state transitions,
>> e.g. in deflate_monitor_list_using_JT
>>    you see is_new with object set so you mark it as old so next deflation
>> will check it
>>
>>
>> Here's the code in question:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor**
>> listHeadp,
>>                                                        ObjectMonitor**
>> freeHeadp,
>>                                                        ObjectMonitor**
>> freeTailp,
>>                                                        ObjectMonitor**
>> savedMidInUsep) {
>> <snip>
>>      // Only try to deflate if there is an associated Java object and if
>>      // mid is old (is not newly allocated and is not newly freed).
>>      if (mid->object() != NULL && mid->is_old() &&
>>          deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) {
>>        // Deflation succeeded so update the in-use list.
>> <snip>
>>      } else {
>>        // mid is considered in-use if it does not have an associated
>>        // Java object or mid is not old or deflation did not succeed.
>>        // A mid->is_new() node can be seen here when it is freshly returned
>>        // by omAlloc() (and skips the deflation code path).
>>        // A mid->is_old() node can be seen here when deflation failed.
>>        // A mid->is_free() node can be seen here when a fresh node from
>>        // omAlloc() is released by omRelease() due to losing the race
>>        // in inflate().
>>
>>        if (mid->object() != NULL && mid->is_new()) {
>>          // mid has an associated Java object and has now been seen
>>          // as newly allocated so mark it as "old".
>>          mid->set_allocation_state(ObjectMonitor::Old);
>>        }
>>
>>    - why do you set it to old here rather than in inflate once we set
>> values?
>>
>>
>> Inflation is used in quite a few places. If we marked the
>> ObjectMonitor as "Old" in inflate(), then that would make the
>> ObjectMonitor available for deflation by deflate_monitor_using_JT()
>> earlier:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid,
>>                                                    ObjectMonitor**
>> freeHeadp,
>>                                                    ObjectMonitor**
>> freeTailp) {
>>    assert(AsyncDeflateIdleMonitors, "sanity check");
>>    assert(Thread::current()->is_Java_thread(), "precondition");
>>    // A newly allocated ObjectMonitor should not be seen here so we
>>    // avoid an endless inflate/deflate cycle.
>>    assert(mid->is_old(), "precondition");
>>
>>
>> So the idea behind only deflating ObjectMonitors that have reached
>> allocation state "Old" is to prevent "an endless inflate/deflate cycle".
>> Here's the relevant section from Carsten's JEP:
>>
>> To avoid endless inflation / deflation cycles in the prototype, monitor
>>
>> deflation is only attempted the second time a monitor is seen by the
>>
>> thread marking monitors as deflatable: If the thread (the only thread
>>
>> marking monitors as deflatable; might be service thread or some GC
>>
>> related thread or even a dedicated thread) sees a monitor in state New,
>>
>> then the thread marks the monitor as Old and moves on. So there is
>>
>> little interaction between a thread inflating a lock to a monitor and
>>
>> the deflating thread, the inflating thread just has to make sure the
>>
>> monitor is marked New and this marker is published using appropriate
>>
>> barriers.
>>
>>
>> There isn't an explicit example in the JEP of what Carsten was thinking
>> of with "an endless inflate/deflate cycle". I didn't try to think of
>> such an example for the OpenJDK wiki either. I simple wrote:
>>
> 
> I think I was thinking about a cycle where a Java object exhibits a monitor
> inflation, then deflation, then inflation, then deflation. Each inflation
> will be with a new monitor. This behavior could increase the number of
> monitors allocated, especially with my original patch as I recycled
> monitors only after a safepoint. Now that I think about it again, such a
> cycle is incredible unlikely as it would require repeated contention on the
> java object, yet the monitors must not be busy when the deflator thread
> comes by. And this scenario has to repeat itself. This all seems pretty
> unlikely.
> 
> ObjectMonitor has a new allocation_state field that supports three
>>
>> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied
>>
>> to ObjectMonitors that have reached the 'Old' state. When the Async
>>
>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it
>>
>> is changed to the 'Old' state, but is not deflated. This prevents a
>>
>> newly allocated ObjectMonitor from being immediately deflated which
>>
>> could cause an inflation<->deflation oscillation.
>>
>>
>> So let's think about what might happen if an ObjectMonitor is marked
>> as "Old" in inflate(). Here's an example use of inflate() in the
>> "slow enter" code path:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
>> <snip>
>> base<    inflate(THREAD, obj(),
>> inflate_cause_monitor_enter)->enter(THREAD);
>>
>> new>     ObjectMonitorHandle omh;
>> new>     inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter);
>> new>     do_loop = !omh.om_ptr()->enter(THREAD);
>>
>> In the "base" code, we took the return from inflate() and used it to call
>> ObjectMonitor::enter(). If we never changed that bit of code and inflate()
>> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could
>> async deflate the ObjectMonitor while we were trying to call enter() on
>> it... Boom! So we might think that holding off marking an ObjectMonitor
>> as "Old" can save us... and it can, but not in all cases... :-(
>>
>> It is entirely possible that our call to slow_enter() is made on an
>> ObjectMonitor that's already marked "Old". In that case, our thread
>> (T-enter) calls inflate() which returns the existing ObjectMonitor*
>> and we use it to call enter(). If the thread (T-deflate) calling
>> deflate_monitor_using_JT() does its magic before T-enter sets the
>> owner field or the count field... Boom!
>>
>> The previous paragraph is exactly what motivated the _ref_count field,
>> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle*
>> parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr()
>> which increments the ObjectMonitor's ref_count and then checks for async
>> deflation protocol collisions. If there's a collision, then save_om_ptr()
>> returns false and the caller (inflate() in this case) has to retry. When
>> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot
>> be deflated and is safe until the ObjectMonitorHandle is destroyed.
>>
>> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot
>> deflate the ObjectMonitor in the window after inflate() returns and
>> before T-enter sets the owner field or increments the count field. But
>> you know all that already!
>>
>> So let's bring this back to having inflate() mark the ObjectMonitor as
>> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0,
>> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate().
>> T-deflate cannot deflate it due to ref_count > 0.
>>
>> Here's another crazy thought... inflate() is the only function that
>> calls omAlloc(), and omAlloc() is the only function that sets "New".
>> If we move the setting of "Old" from deflate_monitor_list_using_JT()
>> to inflate(), then the change from "New" -> "Old" never happens
>> outside of the inflate() call so why do we need the allocation state?
>>
>> Small dose of reality: I've found having the allocation state to be
>> very helpful when debugging race related crashes. We could make the
>> allocation state be DEBUG_ONLY, but then what about race debugging of
>> product bits... sigh...
>>
>>
>> 6. Could you get rid of the new goto’s?
>>
>>
>> I believe there is only one left from Carsten's prototype:
>>
> 
> You make it sound like I was throwing gotos around left and right. :) If
> you count continue and break statements, then you might have been right.
> 
> I'll break my response here, so we can return to regular structured
> programming, ;-)
> Carsten
>