RFR(L) 8153224 Monitor deflation prolong safepoints
Karen Kinnear
karen.kinnear at oracle.com
Mon Apr 15 16:17:08 UTC 2019
Dan,
Sorry to be so slow to get back to you,
> On Apr 8, 2019, at 9:04 PM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>
> On 4/5/19 4:59 PM, Karen Kinnear wrote:
>> Dan,
>>
>> Some more minor comments from reading the code:
>
> Thanks for the additional comments. I'm gathering changes for the next
> round of code review (CR1) so these will be resolved in that round...
>
> More below...
>
>
>> 1. Could you add comments to markOop.hpp about
>> the use in the displaced_mark_word of is_marked to prevent any users of is_marked
>> here from needing to have that information saved/restored?
>
> I _think_ I know what you're looking for here... Perhaps this:
>
> src/hotspot/share/oops/markOop.hpp:
> // ObjectMonitor::install_displaced_markword_in_object() uses
> // is_marked() on ObjectMonitor::_header as part of the restoration
> // protocol for an object's header. In this usage, the mark bit is
> // only ever set (and cleared) on the ObjectMonitor::_header field.
> bool is_marked() const {
> return (mask_bits(value(), lock_mask_in_place) == marked_value);
> }
Yes - thank you. Potentially prevents future overlaps.
>
>
>> 2. In objectMonitor.hpp
>> in is_busy you clarify the difference in use between _count (which I think you may be changing
>> to _contended) and _ref_count. Could you possibly also comment where you declare them?
>
> I'll do the rename of _count -> _contentions in a subtask of 8153224
> like the other cleanups of the monitor subsystem.
>
> Here's the comment in question:
>
> src/hotspot/share/runtime/objectMonitor.hpp:
> intptr_t is_busy() const {
> // TODO-FIXME: merge _count and _waiters.
> // TODO-FIXME: assert _owner == null implies _recursions = 0
> // TODO-FIXME: assert _WaitSet != null implies _count > 0
> // We do not include _ref_count in the is_busy() check because
> // _ref_count is for indicating that the ObjectMonitor* is in
> // use which is orthogonal to whether the ObjectMonitor itself
> // is in use for a locking operation.
> return _count|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList);
> }
>
> I don't think this comment clarifies _count vs. _ref_count.
> I added the last four lines of the comment and their purpose is to
> describe why _ref_count isn't used by is_busy(). The TODO-FIXME lines
> need to be revisited since (at least) the third one is wrong.
>
> Here's the existing comment for _ref_count:
>
> volatile jint _ref_count; // ref count for ObjectMonitor*
>
> Here's the existing comment for _count:
>
> volatile jint _count; // reference count to prevent reclamation/deflation
> // at stop-the-world time. See ObjectSynchronizer::deflate_monitor().
> // _count is approximately |_WaitSet| + |_EntryList|
>
> And here's what I proposed to change it to in my reply to your design
> review notes:
>
> volatile jint _contentions; // Number of active contentions in enter(). It is used by is_busy()
> // along with other fields to determine if an ObjectMonitor can be
> // deflated. See ObjectSynchronizer::deflate_monitor().
>
> I think we're good here with the proposed change of comment (and the
> rename) for the _contentions field along with existing comment for
> _ref_count and the existing comment for is_busy(). I may delete the
> third TODO-FIXME line as part of the next cleanup.
Thank you for the rename and comment
>
>
>> 3. clear_using_JT: would it make sense to have an assertion that _owner is either null or DEFLATER_MARKER?
>
> We could add something like:
>
> assert(_owner == NULL ||
> (AsyncDeflateIdleMonitors && _owner == DEFLATER_MARKER),
> "Fatal logic error in ObjectMonitor owner!");
>
> and that will catch any races in async monitor deflation where the
> _owner field is set to a monitor owner value (stack addr or thread*).
> For monitor deflation at a safepoint, the non-NULL _owner field is
> caught in clear() (which calls clear_using_JT()).
That covers my concern.
>
>
>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count
>> with comments that caller ensured _count <= 0
>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
>> — Am I missing something subtle here or should they be the same guarantees?
>
> Here's the code in question:
>
> src/hotspot/share/runtime/objectMonitor.cpp:
>
> void ObjectMonitor::EnterI(TRAPS) {
> <snip>
> if (_owner == DEFLATER_MARKER) {
> guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
> // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up.
>
> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) {
> <snip>
> if (_owner == DEFLATER_MARKER) {
> guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>
>
> Reading these two guarantee() calls always throws me off stride
> because I would have written them like this:
>
> guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>
> and
>
> guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>
> When rewritten like the above, you have:
>
> "_count > 0" ... _count <= 0
>
> and:
>
> "_count >= 0" ... "_count < 0"
>
> which is easier for my brain to read... okay... enough sidebar...
>
> Short answer: No the guarantees should not be the same.
>
> Longer answer: EnterI() is called by enter() after enter() has
> incremented the _count field to indicate the contended state of
> things. So in EnterI(), "_count > 0" is the right check.
> ReenterI() is called after wait() has returned (notified or
> timedout), and the _count field is not used on reentry ops so
> "_count >= 0" is the right check.
Thank you for walking this through.
Wait also can cause a call to enter() -> EnterI(), but in that case _waiters was set while
the monitor was still owned, so we should never get to the logic where EnterI sees
DEFLATER_MARKER.
>
> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes,
> there are two places in EnterI() that do this):
>
> L501: if (_owner == DEFLATER_MARKER) {
> // The deflation protocol finished the first part (setting _owner),
> // but it failed the second part (making _count negative) and bailed.
> // Because we're called from enter() we have at least one contention.
> guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
> L504: // Try to acquire monitor.
> L505: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
> L629: if (_owner == DEFLATER_MARKER) {
> // The deflation protocol finished the first part (setting _owner),
> // but it failed the second part (making _count negative) and bailed.
> // Because we're called from enter() we have at least one contention.
> guarantee(count> 0 , "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
> L632: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
> And I'm going to tweak the ReenterI() code like this:
>
> L759: if (_owner == DEFLATER_MARKER) {
> // The deflation protocol finished the first part (setting _owner),
> // but it will observe _waiters != 0 and will bail out. Because we're
> // called from wait() we may or may not have any contentions.
> guarantee(count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller");
> L761: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
>
> You didn't ask this, but it is okay that _count is only used to track
> contentions in enter()/EnterI() and is not used to track contentions
> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is
> used by is_busy() to observe the busy state for an ObjectMonitor that
> is being wait()'ed for. The _waiters field is decremented after a
> waiter has returned from ReenterI() so the _owner field takes over
> answering the is_busy() question…
Yes - that was my confusion and I had not walked it through carefully enough.
And thank you - the guarantees are easier to read this way.
>
>
>> 5. I could use a little help with allocation state transitions,
>> e.g. in deflate_monitor_list_using_JT
>> you see is_new with object set so you mark it as old so next deflation will check it
>
> Here's the code in question:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>
> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp,
> ObjectMonitor** freeHeadp,
> ObjectMonitor** freeTailp,
> ObjectMonitor** savedMidInUsep) {
> <snip>
> // Only try to deflate if there is an associated Java object and if
> // mid is old (is not newly allocated and is not newly freed).
> if (mid->object() != NULL && mid->is_old() &&
> deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) {
> // Deflation succeeded so update the in-use list.
> <snip>
> } else {
> // mid is considered in-use if it does not have an associated
> // Java object or mid is not old or deflation did not succeed.
> // A mid->is_new() node can be seen here when it is freshly returned
> // by omAlloc() (and skips the deflation code path).
> // A mid->is_old() node can be seen here when deflation failed.
> // A mid->is_free() node can be seen here when a fresh node from
> // omAlloc() is released by omRelease() due to losing the race
> // in inflate().
>
> if (mid->object() != NULL && mid->is_new()) {
> // mid has an associated Java object and has now been seen
> // as newly allocated so mark it as "old".
> mid->set_allocation_state(ObjectMonitor::Old);
> }
>
>> - why do you set it to old here rather than in inflate once we set values?
>
> Inflation is used in quite a few places. If we marked the
> ObjectMonitor as "Old" in inflate(), then that would make the
> ObjectMonitor available for deflation by deflate_monitor_using_JT()
> earlier:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid,
>> ObjectMonitor** freeHeadp,
>> ObjectMonitor** freeTailp) {
>> assert(AsyncDeflateIdleMonitors, "sanity check");
>> assert(Thread::current()->is_Java_thread(), "precondition");
>> // A newly allocated ObjectMonitor should not be seen here so we
>> // avoid an endless inflate/deflate cycle.
>> assert(mid->is_old(), "precondition");
>
> So the idea behind only deflating ObjectMonitors that have reached
> allocation state "Old" is to prevent "an endless inflate/deflate cycle".
> Here's the relevant section from Carsten's JEP:
>
>> To avoid endless inflation / deflation cycles in the prototype, monitor
>> deflation is only attempted the second time a monitor is seen by the
>> thread marking monitors as deflatable: If the thread (the only thread
>> marking monitors as deflatable; might be service thread or some GC
>> related thread or even a dedicated thread) sees a monitor in state New,
>> then the thread marks the monitor as Old and moves on. So there is
>> little interaction between a thread inflating a lock to a monitor and
>> the deflating thread, the inflating thread just has to make sure the
>> monitor is marked New and this marker is published using appropriate
>> barriers.
>
> There isn't an explicit example in the JEP of what Carsten was thinking
> of with "an endless inflate/deflate cycle". I didn't try to think of
> such an example for the OpenJDK wiki either. I simple wrote:
>
>> ObjectMonitor has a new allocation_state field that supports three
>> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied
>> to ObjectMonitors that have reached the 'Old' state. When the Async
>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it
>> is changed to the 'Old' state, but is not deflated. This prevents a
>> newly allocated ObjectMonitor from being immediately deflated which
>> could cause an inflation<->deflation oscillation.
>
> So let's think about what might happen if an ObjectMonitor is marked
> as "Old" in inflate(). Here's an example use of inflate() in the
> "slow enter" code path:
>
> src/hotspot/share/runtime/synchronizer.cpp:
> > void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
> <snip>
> base< inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
>
> new> ObjectMonitorHandle omh;
> new> inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter);
> new> do_loop = !omh.om_ptr()->enter(THREAD);
>
> In the "base" code, we took the return from inflate() and used it to call
> ObjectMonitor::enter(). If we never changed that bit of code and inflate()
> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could
> async deflate the ObjectMonitor while we were trying to call enter() on
> it... Boom! So we might think that holding off marking an ObjectMonitor
> as "Old" can save us... and it can, but not in all cases... :-(
>
> It is entirely possible that our call to slow_enter() is made on an
> ObjectMonitor that's already marked "Old". In that case, our thread
> (T-enter) calls inflate() which returns the existing ObjectMonitor*
> and we use it to call enter(). If the thread (T-deflate) calling
> deflate_monitor_using_JT() does its magic before T-enter sets the
> owner field or the count field... Boom!
>
> The previous paragraph is exactly what motivated the _ref_count field,
> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle*
> parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr()
> which increments the ObjectMonitor's ref_count and then checks for async
> deflation protocol collisions. If there's a collision, then save_om_ptr()
> returns false and the caller (inflate() in this case) has to retry. When
> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot
> be deflated and is safe until the ObjectMonitorHandle is destroyed.
>
> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot
> deflate the ObjectMonitor in the window after inflate() returns and
> before T-enter sets the owner field or increments the count field. But
> you know all that already!
> So let's bring this back to having inflate() mark the ObjectMonitor as
> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0,
> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate().
> T-deflate cannot deflate it due to ref_count > 0.
>
> Here's another crazy thought... inflate() is the only function that
> calls omAlloc(), and omAlloc() is the only function that sets "New".
> If we move the setting of "Old" from deflate_monitor_list_using_JT()
> to inflate(), then the change from "New" -> "Old" never happens
> outside of the inflate() call so why do we need the allocation state?
That was my next question.
>
> Small dose of reality: I've found having the allocation state to be
> very helpful when debugging race related crashes. We could make the
> allocation state be DEBUG_ONLY, but then what about race debugging of
> product bits... sigh...
>
>
>> 6. Could you get rid of the new goto’s?
>
> I believe there is only one left from Carsten's prototype:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>
>> intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) {
> <snip>
>> } else if (mark->has_monitor()) {
>> ObjectMonitorHandle omh;
>> if (!omh.save_om_ptr(obj, mark)) {
>> // Lost a race with async deflation so try again.
>> assert(AsyncDeflateIdleMonitors, "sanity check");
>> goto Retry;
>> }
>
> I can change FastHashCode() to use the same "while (do_loop)" as the
> other code that needs to do retries...
>
Thank you.
>
>> 7. On the updated wiki for the hash race example:
>> Racing Threads: “T-hash is about to inc the ref_count field”
>> actually - T-hash just did - ref_count == 1 - so maybe change middle values
>
> Actually, we're talking about the set up for the race and the
> diagram shows "ref_count == 1" and should show "ref_count == 0".
> So I have fixed that on the "Racing Threads" diagram.
>
> In the following "T-deflate Wins" and "T-hash Wins" diagrams,
> "ref_count == 1" is shown in both initial race results ObjectMonitor
> box. In "T-deflate Wins", it shows ref_count being restored to
> 0 in the second ObjectMonitor box.
>
> Thanks for catching this error. I've fixed it on the wiki.
Thanks.
>
>
>>
>> 8. There is an old comment in FastHashCode
>> that
>> // WARNING:
>> // The displaced header is strictly immutable.
>> // It can NOT be changed in ANY cases.
>>
>> I presume that only applies to the displaced header for a stack lock - could you
>> possibly update that while you are in the code?
>
> Here's the whole comment:
>
>> // WARNING:
>> // The displaced header is strictly immutable.
>> // It can NOT be changed in ANY cases. So we have
>> // to inflate the header into heavyweight monitor
>> // even the current thread owns the lock. The reason
>> // is the BasicLock (stack slot) will be asynchronously
>> // read by other threads during the inflate() function.
>> // Any change to stack may not propagate to other threads
>> // correctly.
>
> That comment applies the displaced header that's in the BasicLock
> on the thread's stack and it definitely needs some cleaning up
> independent of the Async Monitor Deflation project.
Thank you.
>
>
>> Also in FastHashCode
>> // The only update to the header in the monitor (outside GC)
>> 823 // is install the hash code. If someone add new usage of
>> 824 // displaced header, please update this code
>> Can you update that comment as well? I know you’ve already updated the code logic.
>
> I'll revisit that comment as well. I believe Carsten updated it in
> his prototype, but when I backed out that change when I simplified
> the hashcode stuff due to ObjectMonitorHandles/ref_count.
Thank you. And if you are revisiting this to potentially call install_displaced_markword then this
would change yet again.
>
>
>> So I walked the logic for the hashcode interactions - I didn’t find any holes. Thank you for walking most of it in email/wiki.
>> In particular, inflate does the save_om_ptr dance to inc_ref_count, so this code above will
>> be called while preventing async deflation.
>
> Right.
>
>
>> 9. install_displaced_markword_in_object
>> What happens if the cas_set_mark fails?
>
> Here's the code in question:
>
> src/hotspot/share/runtime/objectMonitor.cpp:
>
>> void ObjectMonitor::install_displaced_markword_in_object() {
> <snip>
>> if (dmw->is_marked()) {
>> // The dmw copy is marked which means a hash was not set by a racing
>> // thread. Clear the mark from the copy in preparation for possible
>> // restoration from this thread.
>> assert(dmw->hash() == 0, "must be 0: hash=" INTPTR_FORMAT, dmw->hash());
>> dmw = dmw->set_unmarked();
>> }
>> assert(dmw->is_neutral(), "must be a neutral markword");
>>
>> oop const obj = (oop) object();
>> // Install displaced markword if object markword still points to this
>> // monitor. Both the mutator trying to enter() and the thread deflating
>> // the monitor will reach this point, but only one can win.
>> // Note: If a mutator won the cmpxchg() race above and installed a hash
>> // in _header, then the updated dmw contains that hash and we'll install
>> // it in the object's markword here.
>> obj->cas_set_mark(dmw, markOopDesc::encode(this));
>
> We don't check the return from cas_set_mark() here intentionally.
> If we have just T-enter and T-deflate racing through this code,
> then after the "if (dmw->is_marked()) {" block, both threads
> will have the same 'dmw' value. One thread will set it and the
> other thread will fail to set it, but we don't care because both
> threads wanted to set the same value... As a result of the
> cas_set_mark() call in both threads, both threads will see the
> same value in the object's header (if they happen to look).
>
> I talk about this in the "Either Wins the Second Race" sub-section
> on the wiki.
Yes for the current model of only these two callers, neither modifying the dmw other than
set/clear is_marked) bit. If we extend to FastHashCode - this gets trickier.
>
>
>> I get that today this handles the race with enter and deflate_monitor_using_JT. If we remove
>> the call from enter, is the expectation that we’ve blocked all others who did not set is_marked themselves?
>> If we remove the call from enter would it make sense to ensure that the cas_set_mark succeeds here?
>
> If we remove the install_displaced_markword_in_object() call from enter(),
> then I don't think we need install_displaced_markword_in_object() at
> all and can restore the object's header with:
>
> // Restore the header back to obj
> obj->release_set_mark(mid->header());
>
> just like ObjectSynchronizer::deflate_monitor(). The question is
> whether we think install_displaced_markword_in_object() buys us
> something other than "help" in restoring the object's header.
Yes - that was the question - it adds complexity. Does it help threads make progress.
Thinking about this more and reading Carsten’s reply - I think it does - especially
since seeing DEFLATION_MARKER can make the checker loop, but in the meantime someone
else can acquire the monitor - so that might be a non-trivial spin time.
>
>
>> 10. Is there any benefit in a bit of stress testing with something like a temporary flag that deflates in
>> mAlloc each time it is called?
>
> Maybe? :-) Something like DeflateAsyncMonitorsALot? Can you eloborate
> on your thinking a bit?
Just if you thought it might help shake multi-thread timing testing. If you think it is all shaken out
already - no need.
>
>
>> Looking forward to the performance runs as well as the latency numbers.
>
> I posted the SPECjbb2015 numbers from this past weekend earlier today.
> Rather disappointing on my T7600's... Neutral on my MacMini…
Thank you. And Claus’ response was to be sure they are meaningful before deep-diving - so
reproduce multiple times kind of thing.
>
> When you say "latency numbers", what do you mean? Do you mean how long
> ObjectMonitors that could be deflated are kept inflated? Or do you mean
> something else?
latency I was thinking about was the reduced time during safepoint cleanup - which is one of the goals of this
exercise.
>
> I think I've responded to everything. Please let me know if I missed
> something…
You got it all.
many thanks,
Karen
>
> Dan
>
>
>
>>
>> thanks,
>> Karen
>>
>>> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>> wrote:
>>>
>>> Filed:
>>>
>>> JDK-8222034 Thread-SMR functions should be updated to remove work around
>>> https://bugs.openjdk.java.net/browse/JDK-8222034 <https://bugs.openjdk.java.net/browse/JDK-8222034>
>>>
>>> Martin and Robbin, please check it out and make sure that I captured
>>> things correctly...
>>>
>>> Dan
>>>
>>>
>>>
>>> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote:
>>>> On 4/5/19 8:37 AM, Doerr, Martin wrote:
>>>>> Hi everybody,
>>>>>
>>>>>> I think was fixed with:
>>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics
>>>>>> You should get a leading sync and trailing one with the default conservative
>>>>>> model and thus get proper memory ordering.
>>>>>> Martin, I'm I correct?
>>>>> Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter).
>>>>> I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced.
>>>>
>>>> Okay so we need a new bug to update these two Thread-SMR functions:
>>>>
>>>> src/hotspot/share/runtime/threadSMR.cpp:
>>>>
>>>> void ThreadsList::dec_nested_handle_cnt() {
>>>> // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec
>>>> // backend on PPC does not yet conform to these requirements. Therefore
>>>> // the decrement is simulated with an Atomic::sub(1, &addr).
>>>> // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism
>>>> // is not generally safe to use.
>>>> Atomic::sub(1, &_nested_handle_cnt);
>>>> }
>>>>
>>>> void ThreadsList::inc_nested_handle_cnt() {
>>>> // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>>> // backend on PPC does not yet conform to these requirements. Therefore
>>>> // the increment is simulated with a load phi; cas phi + 1; loop.
>>>> // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism
>>>> // is not generally safe to use.
>>>> intx sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>>>> for (;;) {
>>>> if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) {
>>>> return;
>>>> } else {
>>>> sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>>>> }
>>>> }
>>>> }
>>>>
>>>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make
>>>> sure we're all in agreement. Once we decide that Thread-SMR's
>>>> functions look like, I'll adapt my Async Monitor Deflation
>>>> functions...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Robbin Ehn <robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>>
>>>>> Sent: Freitag, 5. April 2019 14:07
>>>>> To: daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>; hotspot-runtime-dev at openjdk.java.net <mailto:hotspot-runtime-dev at openjdk.java.net>; Carsten Varming <varming at gmail.com <mailto:varming at gmail.com>>; Roman Kennke <rkennke at redhat.com <mailto:rkennke at redhat.com>>; Doerr, Martin <martin.doerr at sap.com <mailto:martin.doerr at sap.com>>
>>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>>>>>
>>>>> Hi Dan,
>>>>>
>>>>> (Martin there is question for you last in this email)
>>>>>
>>>>> After first pass I did not find any real issues.
>>>>> Considering what you had to work with, it looks good!
>>>>>
>>>>> #1
>>>>> There are some assert which are redundant (to me at least) like:
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>> L445
>>>>> if (!dmw->is_marked() && dmw->hash() == 0) {
>>>>> // This dmw is neutral and has not yet started the restoration
>>>>> // protocol so we mark a copy of the dmw to begin the protocol.
>>>>> markOop marked_dmw = dmw->set_marked();
>>>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0,
>>>>> "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT,
>>>>> marked_dmw->is_marked(), marked_dmw->hash());
>>>>>
>>>>> That assert is basically a test that set_marked worked?
>>>>>
>>>>> L505
>>>>> if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>>>> assert(_succ != Self, "invariant");
>>>>> assert(_owner == Self, "invariant");
>>>>>
>>>>> Assert on _owner checks that our cmpxchg is not broken?
>>>>>
>>>>> I think it's easier to read the code if some on the most obvious asserts are
>>>>> removed. Maybe comments instead.
>>>>>
>>>>> #2
>>>>> Not your doing but I think we should remove TRAPS/Thread * Self and use
>>>>> JavaThread* instead.
>>>>> E.g. so we can change:
>>>>> void ObjectMonitor::EnterI(TRAPS) {
>>>>> Thread * const Self = THREAD;
>>>>> assert(Self->is_Java_thread(), "invariant");
>>>>> assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant");
>>>>>
>>>>> to:
>>>>>
>>>>> void ObjectMonitor::EnterI(JavaThread* Self) {
>>>>> assert(Self->thread_state() == _thread_blocked, "invariant");
>>>>>
>>>>> #3
>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>>> 164 inline void ObjectMonitor::inc_ref_count() {
>>>>> 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>>>> 166 // backend on PPC does not yet conform to these requirements. Therefore
>>>>> 167 // the increment is simulated with a load phi; cas phi + 1; loop.
>>>>> 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors
>>>>> 169 // is not safe.
>>>>>
>>>>> I think was fixed with:
>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics
>>>>> You should get a leading sync and trailing one with the default conservative
>>>>> model and thus get proper memory ordering.
>>>>> Martin, I'm I correct?
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>>>>
>>>>>> JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 <https://bugs.openjdk.java.net/browse/JDK-8153224>
>>>>>>
>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation <https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation>
>>>>>>
>>>>>> Here's the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ <http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/>
>>>>>>
>>>>>> Here's a link to Carsten's original webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ <http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/>
>>>>>>
>>>>>> Earlier versions of this patch have been through several rounds of
>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>>>>> Roman for their preliminary code review comments. A very special
>>>>>> thanks to Robbin and Roman for building and testing the patch in
>>>>>> their own environments (including specJBB2015).
>>>>>>
>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>> Oracle's usual set of platforms. Earlier versions have been run
>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>>>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink
>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>>>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>>>>> fastdebug and slowdebug).
>>>>>>
>>>>>> All of the testing done on earlier versions will be redone on the
>>>>>> latest version of the patch.
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>> P.S.
>>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>>>>> to characterize/analyze this failure for more than a week now. At
>>>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>>>> an existing bug. However, I plan to have a better handle on that
>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list