RFR(L) 8153224 Monitor deflation prolong safepoints (CR11/v2.11/14-for-jdk15)

Wed May 13 17:49:50 UTC 2020

Hi Erik,

On 5/13/20 8:40 AM, Erik Österlund wrote:
> Hi Dan,
>
> On 2020-05-12 22:32, Daniel D. Daugherty wrote:
>> Hi Erik,
>>
>> On 5/12/20 10:15 AM, Erik Österlund wrote:
>>> Hi Dan,
>>>
>>> This looks great overall.
>>
>> Thanks! And thanks for your many reviews of this project's code.
>>
>>
>>> Looking through the full webrev.
>>>
>>> Spotted a tiny logging error in objectMonitor.cpp:
>>>
>>>  455   // Install displaced mark word if the object's header still 
>>> points
>>>  456   // to this ObjectMonitor. All racing callers to this function 
>>> will
>>>  457   // reach this point, but only one can win.
>>>  458   markWord res = obj->cas_set_mark(dmw, markWord::encode(this));
>>>  459   if (res != markWord::encode(this)) {
>>>  460     // This should be rare so log at the Info level when it 
>>> happens.
>>>  461 
>>> log_info(monitorinflation)("install_displaced_markword_in_object: "
>>>  462                                "failed cas_set_mark: new_mark=" 
>>> INTPTR_FORMAT
>>>  463                                ", old_mark=" INTPTR_FORMAT ", 
>>> res=" INTPTR_FORMAT,
>>>  464                                dmw.value(), 
>>> markWord::encode(this).value(),
>>>  465                                res.value());
>>>  466   }
>>>
>>> Note that res will contain the value that was there before the CAS. 
>>> Therefore, the if statement
>>> should compare if (res != dmw) instead. Otherwise this will log 
>>> almost every time.
>>
>> Hmmmm.... Here's the code for cas_set_mark():
>>
>> src/hotspot/share/oops/oop.inline.hpp:
>>
>> markWord oopDesc::cas_set_mark(markWord new_mark, markWord old_mark) {
>>   uintptr_t v = HeapAccess<>::atomic_cmpxchg_at(as_oop(), 
>> mark_offset_in_bytes(), old_mark.value(), new_mark.value());
>>   return markWord(v);
>> }
>>
>> so I'm passing 'dmw' as new_mark and 'markWord::encode(this)' as 
>> old_mark.
>> The return value from cas_set_mark() is the current/old value in the 
>> location
>> before we try to cmpxchg() it...
>>
>> My if-statement is this:
>>
>>     L459:   if (res != markWord::encode(this)) {
>>
>> which I read as "if the current/old value in the location is not 
>> old_mark",
>> then we do the logging... Or to rephrase: if I didn't change old_mark to
>> new_mark, then do the logging...
>>
>> I'm not trying to log when we successfully change old_mark to new_mark
>> because we'll do that once per deflation. What I'm trying to log is 
>> when we
>> have two racing calls to install_displaced_markword_in_object(), both of
>> them happen to reach the cas_set_mark() call (happens rarely), and 
>> one of
>> them loses. I want to log that rare losing condition.
>>
>> This losing condition is interesting because it is rare. In an 8 hour
>> Kitchensink run, I see just a few hundred of these log messages.
>>
>> I need to change this comment since L443-446 can weed out some racers:
>>
>>     L455:   // Install displaced mark word if the object's header 
>> still points
>>     L456:   // to this ObjectMonitor. All racing callers to this 
>> function will
>>     L457:   // reach this point, but only one can win.
>>
>> to this:
>>
>>     L455:   // Install displaced mark word if the object's header 
>> still points
>>     L456:   // to this ObjectMonitor. More than one racing caller to 
>> this function
>>     L457:   // can rarely reach this point, but only one can win.
>>
>> Hopefully, what I'm trying to do here is more clear now. I don't 
>> think I'm
>> wrong about the logic, but please let me know if I am...
>
> That makes sense. I think I got confused by the argument orders of 
> cas_set_mark() being the opposite
> of all other CAS functions.

Glad we're now on the same page...

>
>>
>>> In synchronizer.cpp:
>>>
>>>  495     if (AsyncDeflateIdleMonitors) {
>>>  496       // An async deflation can race us before we manage to 
>>> make the
>>>  497       // ObjectMonitor busy by setting the owner below. If we 
>>> detect
>>>  498       // that race we just bail out to the slow-path here.
>>>  499       if (m->object() == NULL) {
>>>  500         return false;
>>>  501       }
>>>  502     } else {
>>>  503       assert(m->object() == obj, "invariant");
>>>  504     }
>>>
>>> The new special handling of m->object() == NULL does not seem 
>>> necessary. The described race
>>> will sort itself out once we try to acquire the owner and fail. I'm 
>>> guessing you changed this
>>> because the assert triggers otherwise when m->object() == NULL. 
>>> Maybe the assert should just
>>> be modified instead to something like assert(m->object() == obj || 
>>> m->is_being_async_deflated()),
>>> or something like that.
>>
>> Hmmmm... I think I'm going to have to disagree here... :-)
>>
>> The _object field is set to NULL by deflate_monitor_using_JT()'s call to
>> clear_using_JT() (being renamed to clear_common()). At that point in 
>> time,
>> the object no longer refers to the ObjectMonitor* because of the call to
>> install_displaced_markword_in_object() that preceded clear_using_JT().
>>
>> So "(m->object() == NULL)" means the object and ObjectMonitor* are no
>> longer associated with each other, i.e., a clear disconnect.
>>
>> ObjectSynchronizer::quick_enter() is a very hot code path so bailing out
>> to the slow-path as soon as we detect the "disconnect" is better than
>> executing the following code to get to same slow-path return:
>>
>>     L505:     Thread* const owner = (Thread *) m->_owner;
>>     L506:
>>     L512:     if (owner == self) {
>>     L515:     }
>>     L527: lock->set_displaced_header(markWord::unused_mark());
>>     L529:     if (owner == NULL && m->try_set_owner_from(NULL, self) 
>> == NULL) {
>>     L532:     }
>>     L533:   }
>>     L542:   return false;        // revert to slow-path
>
> Right. But then you have added checks to bail out a bit earlier in the 
> uncommon case of entering a concurrently
> deflating monitor (probably very rare).

I added _one_ check:

     L499:       if (m->object() == NULL) {
     L500:         return false;

and that check is only made when AsyncDeflateIdleMonitors is true.

> But the same checks are run by the common case (no concurrent deflation),
> making it slower.

No, the (m->object() == NULL) check is _not_ made in the common case.
Nor does it need to be. With safepoint based deflation, the object field
cannot be NULL'ed except at a safepoint and quick_enter() cannot be
executing at a safepoint...

It occurs to be that when you say "no concurrent deflation" above, you might
mean quick_enter() is not racing with the async deflater thread at this
moment in time... You may not mean the !AsyncDeflateIdleMonitors case...
(which is what my first paragraph is about above)...

For that possibility, I agree that I've added _one_ NULL check here.

> In orther words, you have made an uncommon case slightly faster and 
> the common case
> slightly slower. That does not seem like an obvious win for me. But I 
> won't insist if you still think it is a good idea.

Thanks for not insisting here. If I drop that NULL check, I'll have to
validate it with another multi-day round of stress testing since this
quick_enter() function has been shown to be racy in more than one round
of testing...

On the plus side, relative to v2.10, we got rid of:

     old L495:       ObjectMonitorHandle omh;
     old L496:       if (!omh.save_om_ptr(obj, mark)) {

Both of which showed up in Eric C's perf testing where he was analyzing
the DaCapo-h2 perf degradation in v2.10...

>
>>
>>> I also noticed that when entering a monitor, you help async 
>>> deflation finish unlinking of the mark
>>> word, so that a re-try will make progress, and not act as a spin 
>>> lock, waiting for the deflating thread
>>> to make progress. That seems great.
>>
>> Right. That used to happen with the ObjectMonitorHandle helper via the
>> associated save_om_ptr() call. Carsten had similar code to what I have
>> now in ObjectMonitor::enter() in his prototype.
>>
>> So when I took out ObjectMonitorHandle, I added back the direct call to
>> install_displaced_markword_in_object() in ObjectMonitor::enter()...
>
> Okay.
>
>>
>>> However, when a hashCode is computed, you just re-try instead,
>>> without helping deflation along the way. It seems to me this creates 
>>> a similar spin lock kind of
>>> situation, where the thread computing a hashCode will have to wait 
>>> quite a bit if the deflating
>>> thread gets preempted at the wrong time.
>>
>> Again, that used to happen with the ObjectMonitorHandle helper via the
>> associated save_om_ptr() call. Obviously when I took out 
>> ObjectMonitorHandle,
>> I had to decide how to handle the race. I chose to simply retry.
>>
>> In Carsten's prototype, he had some logic that tried to convert the
>> object's header from an ObjectMonitor* ref to the 'dmw' with the
>> hashcode value included. That logic depended on the marking of the
>> dmw that used to be done by install_displaced_markword_in_object().
>>
>> I chose to simplify the interaction of async deflation and hashcode
>> installation races. That allowed me to drop the marking of 'dmw' in
>> the ObjectMonitor in install_displaced_markword_in_object() which
>> meant one less write to ObjectMonitor's _header/dmw field. It also
>> meant dropping the reads necessary to see if the mark bit was set.
>
> That sounds good. I was not a fan of exposing is_marked() markWords 
> outside of safepoints
> as they are inherently not designed to ever be outside of safepoints. 
> There are checks that no
> longer work, such as has_monitor() returning true for all marked 
> markWords regardless of whether
> they encode a monitor or not. So yeah, happy we won't have to deal 
> with convincing ourselves
> that the marked markWords are never exposed to code that will subtly 
> break.

But, that was so much fun for you and I to verify in one of the previous
bug hunts! I live for chasing possible leakage of markWords into other
places in the VM :-)

More seriously, I agree that dropping the marking of the dmw/header field
in ObjectMonitor greatly simplifies analysis of these set of changes!!

>
>> Here's where I retry in FastHashCode:
>>
>>     L1099:       if (monitor->is_being_async_deflated()) {
>>     L1100:         // If we detect that async deflation has occurred, 
>> then we
>>     L1101:         // simply retry so that the hash value can be 
>> stored in either
>>     L1102:         // the object's header or in the re-inflated 
>> ObjectMonitor's
>>     L1103:         // header as appropriate.
>>     L1104:         continue;
>>     L1105:       }
>>
>> First, I just realized that I forget to check AsyncDeflateIdleMonitors
>> before that call to is_being_async_deflated()... oops.
>
> Yeah a spontaneous thought is that maybe is_being_async_deflated 
> should check if !AsyncDeflateIdleMonitors
> and then return false. It reads quite intuitively then that if 
> AsyncDeflateIdleMonitors is disabled, then it is
> obviously also not being async deflated.

Well, I could do that and it does make sense. Currently all calls to
is_being_async_deflated() are now guarded with AsyncDeflateIdleMonitors
so move the check inside is_being_async_deflated() doesn't invalidate
my current testing...

If Async Monitor Deflation makes it into JDK15, then I plan to remove
AsyncDeflateIdleMonitors and friends early in JDK16...

>
>> Second, I have a good object ref there ('obj') so I could easily add a
>> call to "install_displaced_markword_in_object(obj)" before the continue
>> and we would only retry once. That would get us out of the spin lock
>> like situation. What do you think?
>
> That sounds okay. The more general issue though is that everywhere 
> there is a retry, we have
> to remember to call install_displaced_markword_in_object somewhere or 
> there will be issues.

If by "issues" you mean we might do a spin loop, then I agree. I don't
think there are correctness issues...

> I guess there are not that many places though in the end, so maybe 
> just sprinkling in another
> install_displaced_markword_in_object is fine.

Definitely fewer calls to install_displaced_markword_in_object() than we
used to have with ObjectMonitorHandle and save_om_ptr(). I believe we have
a total of four call sites in the current code. In v2.10, we only had two
call sites, but save_om_ptr() was called from many more places...

>
>>
>>> Perhaps one way of dealing with this is to move the async monitor
>>> deflation helping from entering the monitor to 
>>> ObjectSynchronizer::inflate. When inflate() find that
>>> the header has a monitor, now it just returns that monitor. Perhaps 
>>> it could instead check if it is being
>>> deflated, help with deflation, and re-inflate a new monitor. That 
>>> way, any code that grabs a monitor,
>>> tries to do something with it, and fails, can just retry, without 
>>> having to deal with the re-try not
>>> making any progress unless manually helping async deflation. Just an 
>>> idea.
>>
>> Definitely something to mull on, but not for this fix. Keep in mind that
>> the async deflation could happen just after the call to inflate() here
>> on L588:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>>     L551: void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, 
>> TRAPS) {
>> <snip>
>>     L587:   while (true) {
>>     L588:     ObjectMonitor* monitor = inflate(THREAD, obj(), 
>> inflate_cause_monitor_enter);
>>     L589:     if (monitor->enter(THREAD)) {
>>
>> and before L297 in the ObjectMonitor::enter() call we make on L589 
>> above:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp:
>>
>>     L243: bool ObjectMonitor::enter(TRAPS) {
>> <snip>
>>     L297:   Atomic::inc(&_contentions);
>>     L298:   if (AsyncDeflateIdleMonitors && is_being_async_deflated()) {
>>     L299:     // Async deflation is in progress and our contentions 
>> increment
>>     L300:     // above lost the race to async deflation. Undo the 
>> work and
>>     L301:     // force the caller to retry.
>>     L302:     const oop l_object = (oop)object();
>>     L303:     if (l_object != NULL) {
>>     L304:       // Attempt to restore the header/dmw to the object's 
>> header so that
>>     L305:       // we only retry once if the deflater thread happens 
>> to be slow.
>>     L306: install_displaced_markword_in_object(l_object);
>>
>> So we still need the check in ObjectMonitor::enter() even if we add
>> one in inflate(). Of course, when we had ref_counting, we didn't have
>> this problem because the > 0 ref_count would have kept that async
>> deflation from landing the way that I described here.
>
> Yes. The idea was that potentially failing things like enter and 
> hashCode only have to check
> if it is async deflating, and then fail, without having to remember 
> that failing followed by retry
> would need another manual call somewhere to 
> install_displaced_markword_in_object; at least
> that part would be done for you, and you'd only have to return false 
> and say it failed. But
> again, there are kind of few places this happens so I am okay with a 
> less generic solution.

I don't see how we can do a generic solution that works (in one place).
We had a generic solution when we had ObjectMonitorHandle, ref_counting
and save_om_ptr(), but we had to give that up due to the big impact on
performance by the ref_counting... (sad now :-( )

>
>>
>>> Related to the hashCode stuff:
>>>
>>> 044     } else if (mark.has_monitor()) {
>>> 1045       // The first stage of a racing async deflation won't 
>>> affect the
>>> 1046       // hash value if this ObjectMonitor happens to already 
>>> have one.
>>> 1047       monitor = mark.monitor();
>>> 1048       temp = monitor->header();
>>> 1049       assert(temp.is_neutral(), "invariant: header=" 
>>> INTPTR_FORMAT, temp.value());
>>> 1050       hash = temp.hash();
>>> 1051       if (hash != 0) {                  // if it has a hash, 
>>> just return it
>>> 1052         return hash;
>>> 1053       }
>>> 1054       // Fall thru so we only have one place that installs the 
>>> hash in
>>> 1055       // the ObjectMonitor.
>>>
>>> The comment says that if the hashCode has been set (not 0), then it 
>>> is safe to use it without any weird
>>> async deflation interactions. I am not convinded that is true. In 
>>> the previous version of this, you had
>>> a way of poisoning the displaced mark word during deflation, so that 
>>> the CAS installing a new hashCode
>>> would fail. But now with this new approach, that code is gone 
>>> (unless I missed something).
>>
>> Actually in the previous version of this, the ref_count would have kept
>> async deflation from interfering with saving the hash code value in the
>> ObjectMonitor's _header/dmw field. The logic was different in much older
>> versions of the code...
>
> Ah yes. I recalled the even older logic. This has been going on for a 
> while. :)

Yes, yes it has... I have been living with this code for much of my time
for the last year plus... :-|

>
>> The reason the above is safe is because the hash code value that we 
>> found
>> in the ObjectMonitor will be the same hash code value that will be 
>> restored
>> to the object's header by the deflation process.
>>
>> Or at least that was my thinking when I wrote that comment... more 
>> below...
>> (Yes, I was wrong!)
>
> :)
>
>>
>>> I think this
>>> can yield an unfortunate race when two threads call hashCode() while 
>>> it is being deflated.
>>>
>>> JT1: load monitor from object
>>> JT2: load monitor from object
>>> ServiceThread: deflate monitor
>>> JT1: Check if there is an existing hashCode. There is not.
>>> JT1: Install new displaced mark word successfully with new non-zero 
>>> hashCode
>>> JT1: Figure out that this monitor has been deflated, and retry
>>> JT2: Check if there is an existing hashCode. There is; the discarded 
>>> result of JT1.
>>> JT1: Install new hashCode, not knowing that JT1 caught the old one.
>> Correction: you mean 'JT2 caught the old one' here.
>>
>>
>>> Now JT1 and JT2 have acquired different values for hashCode.
>>
>> Definitely an interesting scenario. Sigh... this is yet another race 
>> that
>> wasn't possible with ref_counting. Grumble, grumble, grumble... :-)
>
> Yes, that was nice with the ObjectMonitorHandle abstraction.

One of the "problems" I've run into with back tracking away from
ObjectMonitorHandles and the ref_counting is that I removed/changed
code that no longer had to worry about races because ref_counting
made things _safe_.

I've had to very carefully compare the baseline to v2.11 and make
sure that any code that I removed wasn't needed for race safety.
It has been painful...

We had ObjectMonitorHandle and ref_counting for most of this project
so getting out of that "of course, things are safe" mindset is hard.

>
>>
>>> Maybe one could poke the contended counter,
>>> or find a convenient way of poisoning the displaced mark word so 
>>> that the CAS installing it on JT1 fails.
>>> Anyway, there is some thinking to be done here I think.
>>
>> Agreed. More mulling is/was needed. Here's the revised block:
>>
>>     } else if (mark.has_monitor()) {
>>       monitor = mark.monitor();
>>       temp = monitor->header();
>>       assert(temp.is_neutral(), "invariant: header=" INTPTR_FORMAT, 
>> temp.value());
>>       hash = temp.hash();
>>       if (hash != 0) {
>>         // It has a hash.
>>         if (AsyncDeflateIdleMonitors && 
>> monitor->is_being_async_deflated()) {
>>           // But we can't safely use the hash if we detect that async
>>           // deflation has occurred. So we attempt to restore the
>>           // header/dmw to the object's header so that we only retry
>>           // once if the deflater thread happens to be slow.
>>           install_displaced_markword_in_object(obj);
>>           continue;
>>         }
>>         return hash;
>>       }
>
> Yeah, something like that should do the trick. Although, unfortunately 
> I think we have to sprinkle some awkward
> fencing in there to make sure the load of the displaced mark word and 
> the loads in is_being_async_deflated()
> are not reordered.

So is_being_async_deflated() looks like this right now:

inline bool ObjectMonitor::is_being_async_deflated() {
   return AsyncDeflateIdleMonitors && owner_is_DEFLATER_MARKER() && 
contentions() < 0;
}

contentions() uses an Atomic::load() as of the v2.11 patch. However,
owner_is_DEFLATER_MARKER() does not:

inline bool ObjectMonitor::owner_is_DEFLATER_MARKER() {
   return _owner == DEFLATER_MARKER;
}

So I'm going to change it to:

inline bool ObjectMonitor::owner_is_DEFLATER_MARKER() {
   return Atomic::load(&_owner) == DEFLATER_MARKER;
}

It should have been that way when I created owner_is_DEFLATER_MARKER(),
but I was sloppy. Bad Dan!

Here's how the dmw/header field is fetched:

inline markWord ObjectMonitor::header() const {
   return Atomic::load(&_header);
}

So if all three of the players (_header, _owner, _contentions) are
loaded with Atomic::load(), doesn't that keep them from being
reordered?

Thinking more, that should keep the loads() from being reordered
relative to each other, but I haven't done anything with respect
to the stores... sigh...

> In other places, I have seen that it is being used after an atomic 
> operation, hence not needing
> further fencing. But here there are only loads involved. They will not 
> have a consistent global ordering, and the lack
> thereof can lead us back to the race still happening. The intuitive 
> solution would to put in an OrderAccess::loadload()
> right before asking if monitor->is_being_async_deflated(). However, a 
> loadload on non-multiple copy atomic (nMCA)
> machine, does not protect against the loads happening in an 
> inconsistent order, when the stores are performed
> on different threads. In this case, if the current thread is A, then 
> the displaced mark word could be set by a thread
> B, and the state in is_being_async_deflated() could be updated by a 
> thread C. Let's say there is a fourth thread,
> D, that ends up in the same situation as thread A, then on a nMCA 
> machine, even if there is a loadload() before calling
> is_being_async_deflated(), the ordering of the two loads could be 
> different on thread A and thread D. This is the famous
> Independent Reads of Independent Writes (IRIW) scenario that is widely 
> believed to not matter in practice as there are
> barely any algorithms where it matters. Here it does, because thread A 
> might think that the racy hashCode is okay,
> while thread C says it is not okay. That means that one of the two 
> reading threads may get an inconsistent hashCode
> value, unless there is a full fence before is_being_async_deflated(), 
> depending on what side of the IRIW race they
> end up in. We need them both to always think it is *not* okay, and 
> hence end up on the same side of the IRIW race.
>
> In order to synchronize this appropriately on a nMCA machine, you need 
> a full fence() (instead of a loadload) between
> the load of the displaced mark word, and the loads in 
> is_being_async_deflated(). The fence will make sure that if you
> read the updated (non-zero hashCode) displaced markWord, then it will 
> globally sync stale memory view (caches) such that when
> the loads in is_being_async_deflated() are being executed, they 
> observe non-stale values(). Normally you would
> need MO_SEQ_CST loads for all of these loads to really be able to 
> reason about its correctness in a sane way, but since
> we have the domain knowledge that it is only the flavour of race where 
> the DMW is not stale and the is_being_async_deflated
> values are, that we end up in trouble. So while a sane model would 
> have leading fences on all the loads involved, we
> only need one. On normal platforms a loadload() is sufficient. You can 
> check which one you are running on like this:
>
> // Long incomprehensible comment describing what on earth the point of 
> this synchronization voodoo is
> if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
>   OrderAccess::fence();
> } else {
>   OrderAccess::loadload();
> }
>
> Or we could go for a different solution that is easier to reason 
> about, if you prefer that.

Based on what you're saying here, the first call to 
is_being_async_deflated()
in FastHashCode() needs to have an OrderAccess::loadload() barrier to ensure
proper ordering.

The is_being_async_deflated() call in ObjectMonitor::enter() is okay because
it follows inc_contentions() which has a barrier. The second call to
is_being_async_deflated() call in FastHashCode should be okay because it
is preceded by an Atomic::cmpxchg() call...

For now, I'm going with this:

       temp = monitor->header();
       assert(temp.is_neutral(), "invariant: header=" INTPTR_FORMAT, 
temp.value());
       hash = temp.hash();
       if (hash != 0) {
         // It has a hash.

         // Separate load of dmw/header above from the loads in
         // is_being_async_deflated(). Has to be re-examined for a
         // non-multiple copy atomic (nMCA) machine.
         OrderAccess::loadload();
         if (monitor->is_being_async_deflated()) {

I've seen discussions about nMCA machines and the Independent Reads of 
Independent
Writes (IRIW) scenario, but I don't know enough about it yet to be 
comfortable
with putting an alternate fence() call in there...

>
>>
>>> Another thing I noticed...
>>>
>>> 1248       if (!mid->is_free() && mid->object() != NULL) {
>>> 1249         // Only process with closure if the object is set.
>>> 1250
>>> 1251         // monitors_iterate() is only called at a safepoint or 
>>> when the
>>> 1252         // target thread is suspended or when the target thread is
>>> 1253         // operating on itself. The closures are only 
>>> interested in an
>>> 1254         // owned ObjectMonitor and ownership cannot be dropped 
>>> under the
>>> 1255         // calling contexts so the ObjectMonitor cannot be 
>>> async deflated.
>>> 1256         closure->do_monitor(mid);
>>> 1257       }
>>>
>>> Your new comment says that this is safe with the pre-condition that 
>>> the user of the API filter away
>>> anything that is not an owned monitor. If that is true (I presume it 
>>> is), then it seems that the new
>>> mid->is_free() check should not be needed either. An owned monitor 
>>> will never be mid->is_free(). Also,
>>> clear_using_JT both clears the owner and sets the state to free.
>>
>> Correction: clear_using_JT() clears the object field and sets the 
>> state to free.
>
> Ah yes. I meant that, I just wrote something else that started with 
> 'o'. Oh dear...
>
>>
>>> So the null check would also filter
>>> away monitors that have been freed. But then again, using monitors 
>>> that have been freed is not safe for
>>> any user of this API.
>>
>> You are right that the "!mid->is_free()" part of the check is not needed
>> so I'll remove that. The baseline code did not call 
>> closure->do_monitor(mid)
>> when the object field was not set and that is sufficient. I'm basing 
>> that
>> on the fact that clear_using_JT() (renamed to clear_common()), clears 
>> the
>> object field before setting the state to free.
>
> Great.

Robbin spotted another use of is_free() that wasn't part of debug code so
I've also fixed that one.

>
>> This part of the comment:
>>
>>     L1253:                                 The closures are only 
>> interested in an
>>     L1254:         // owned ObjectMonitor and ownership cannot be 
>> dropped under the
>>     L1255:         // calling contexts so the ObjectMonitor cannot be 
>> async deflated.
>>
>> is meant to describe how the closures passed to the current callers of
>> monitors_iterate() work today and why they aren't a problem for async
>> monitor deflation. I wasn't trying to claim that monitor_iterate()
>> required that all closures only care about owned monitors.
>>
>> I tweaked the comment a little bit:
>>
>>         //                      The current closures in use today are
>>         // only interested in an owned ObjectMonitor and ownership
>>         // cannot be dropped under the calling contexts so the
>>         // ObjectMonitor cannot be async deflated.
>
> Excellent.
>
>>
>>> Perhaps that should be removed, and the name of the function 
>>> ObjectSynchronizer::monitors_iterate should
>>> change to owned_monitors_iterate(Thread* owner), where owner == NULL 
>>> implies all owned monitors, which
>>> is only valid in a safepoint. This makes the API safer to use, and 
>>> supports only what we can safely
>>> support, instead of handing off non-sticky monitors, hoping the 
>>> arbitrary user closer will know
>>> what monitors it can and can not use safely.
>>
>> For now, I think what we have is safe with respect to async deflation.
>> I don't want to redesign ObjectSynchronizer::monitors_iterate() as part
>> of this bug fix. I can see filing an RFE to follow up with additional
>> work on ObjectSynchronizer::monitors_iterate().
>
> Sounds fine, we can fix that another day.

Thanks for agreeing. I'll post when I file the new RFE...

>
>>
>>> Thanks,
>>> /Erik
>>
>> Thanks for the thorough review! I always appreciate it when you
>> crawl through my code. :-)
>
> Thanks for moving this mountain closer to jdk-jdk!

Hey now! This mountain has become much smaller as we've split things
off into sub-task after sub-task... :-)

Dan

>
> /Erik
>
>> Dan
>>
>>
>>>
>>>
>>> On 2020-05-07 19:08, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have made changes to the Async Monitor Deflation code in response to
>>>> the CR10/v2.10/13-for-jdk15 code review cycle and DaCapo-h2 perf 
>>>> testing.
>>>> Thanks to Erik O., Robbin and David H. for their OpenJDK reviews in 
>>>> the
>>>> v2.10 round! Thanks to Eric C. for his help in isolating the DaCapo-h2
>>>> performance regression.
>>>>
>>>> With the removal of ref_counting and the ObjectMonitorHandle class, 
>>>> the
>>>> Async Monitor Deflation project is now closer to Carsten's original
>>>> prototype. While ref_counting gave us ObjectMonitor* safety 
>>>> enforced by
>>>> code, I saw a ~22.8% slow down with -XX:-AsyncDeflateIdleMonitors 
>>>> ("off"
>>>> mode). The slow down with "on" mode -XX:+AsyncDeflateIdleMonitors 
>>>> is ~17%.
>>>>
>>>> I have attached the change list from CR10 to CR11 instead of 
>>>> putting it in
>>>> the body of this email. I've also added a link to the 
>>>> CR10-to-CR11-changes
>>>> file to the webrevs so it should be easy to find.
>>>>
>>>> Main bug URL:
>>>>
>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> The project is currently baselined on jdk-15+21.
>>>>
>>>> Here's the full webrev URL for those folks that want to see all of the
>>>> current Async Monitor Deflation code in one go (v2.11 full):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for-jdk15%2b21.v2.11.full/ 
>>>>
>>>>
>>>> Some folks might want to see just what has changed since the last 
>>>> review
>>>> cycle so here's a webrev for that (v2.11 inc):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for-jdk15%2b21.v2.11.inc/ 
>>>>
>>>>
>>>> Because of the removal of ref_counting and the ObjectMonitorHandle 
>>>> class, the
>>>> incremental webrev is a bit noisier than I would have preferred.
>>>>
>>>>
>>>> The OpenJDK wiki has NOT YET been updated for this round of changes:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> The jdk-15+21 based v2.11 version of the patch has been thru Mach5 
>>>> tier[1-6]
>>>> testing on Oracle's usual set of platforms. Mach5 tier[78] are 
>>>> still running.
>>>> I'm running the v2.11 patch through my usual set of stress testing on
>>>> Linux-X64 and macOSX.
>>>>
>>>> I'm planning to do a SPECjbb2015, DaCapo-h2 and volano round on the
>>>> CR11/v2.11/14-for-jdk15 bits.
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 2/26/20 5:22 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have made changes to the Async Monitor Deflation code in 
>>>>> response to
>>>>> the CR9/v2.09/12-for-jdk14 code review cycle. Thanks to Robbin and 
>>>>> Erik O.
>>>>> for their comments in this round!
>>>>>
>>>>> With the extraction and push of {8235931,8236035,8235795} to 
>>>>> JDK15, the
>>>>> Async Monitor Deflation code is back to "just" async deflation 
>>>>> changes!
>>>>>
>>>>> I have attached the change list from CR9 to CR10 instead of 
>>>>> putting it in
>>>>> the body of this email. I've also added a link to the 
>>>>> CR9-to-CR10-changes
>>>>> file to the webrevs so it should be easy to find.
>>>>>
>>>>> Main bug URL:
>>>>>
>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>
>>>>> The project is currently baselined on jdk-15+11.
>>>>>
>>>>> Here's the full webrev URL for those folks that want to see all of 
>>>>> the
>>>>> current Async Monitor Deflation code in one go (v2.10 full):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.full/ 
>>>>>
>>>>>
>>>>> Some folks might want to see just what has changed since the last 
>>>>> review
>>>>> cycle so here's a webrev for that (v2.10 inc):
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.inc/ 
>>>>>
>>>>>
>>>>> Since we backed out the HandshakeAfterDeflateIdleMonitors option 
>>>>> and the
>>>>> C2 ref_count changes and updated the copyright years, the "inc" 
>>>>> webrev has
>>>>> a bit more noise in it than usual. Sorry about that!
>>>>>
>>>>> The OpenJDK wiki has been updated for this round of changes:
>>>>>
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>
>>>>> The jdk-15+11 based v2.10 version of the patch has been thru Mach5 
>>>>> tier[1-7]
>>>>> testing on Oracle's usual set of platforms. Mach5 tier8 is still 
>>>>> running.
>>>>> I'm running the v2.10 patch through my usual set of stress testing on
>>>>> Linux-X64 and macOSX.
>>>>>
>>>>> I'm planning to do a SPECjbb2015 round on the 
>>>>> CR10/v2.20/13-for-jdk15 bits.
>>>>>
>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 2/4/20 9:41 AM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> This project is no longer targeted to JDK14 so this is NOT an 
>>>>>> urgent code
>>>>>> review request.
>>>>>>
>>>>>> I've extracted the following three fixes from the Async Monitor 
>>>>>> Deflation
>>>>>> project code:
>>>>>>
>>>>>>     JDK-8235931 add OM_CACHE_LINE_SIZE and use smaller size on 
>>>>>> SPARCv9 and X64
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235931
>>>>>>
>>>>>>     JDK-8236035 refactor ObjectMonitor::set_owner() and _owner 
>>>>>> field setting
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8236035
>>>>>>
>>>>>>     JDK-8235795 replace monitor list 
>>>>>> mux{Acquire,Release}(&gListLock) with spin locks
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235795
>>>>>>
>>>>>> Each of these has been reviewed separately and will be pushed to 
>>>>>> JDK15
>>>>>> in the near future (possibly by the end of this week). Of course, 
>>>>>> there
>>>>>> were improvements during these review cycles and the purpose of this
>>>>>> e-mail is to provided updated webrevs for this fix 
>>>>>> (CR9/v2.09/12-for-jdk14)
>>>>>> within the revised context provided by {8235931, 8236035, 8235795}.
>>>>>>
>>>>>> Main bug URL:
>>>>>>
>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>
>>>>>> The project is currently baselined on jdk-14+34.
>>>>>>
>>>>>> Here's the full webrev URL for those folks that want to see all 
>>>>>> of the
>>>>>> current Async Monitor Deflation code along with {8235931, 
>>>>>> 8236035, 8235795}
>>>>>> in one go (v2.09b full):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.full/ 
>>>>>>
>>>>>>
>>>>>> Compare the open.patch file in 12-for-jdk14.v2.09.full and 
>>>>>> 12-for-jdk14.v2.09b.full
>>>>>> using your favorite file comparison/merge tool to see how Async 
>>>>>> Monitor Deflation
>>>>>> evolved due to {8235931, 8236035, 8235795}.
>>>>>>
>>>>>> Some folks might want to see just the Async Monitor Deflation 
>>>>>> code on top of
>>>>>> {8235931, 8236035, 8235795} so here's a webrev for that (v2.09b 
>>>>>> inc):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.inc/ 
>>>>>>
>>>>>>
>>>>>> These webrevs have gone thru several Mach5 Tier[1-8] runs along with
>>>>>> my usual stress testing and SPECjbb2015 testing and there aren't any
>>>>>> surprises relative to CR9/v2.09/12-for-jdk14.
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have made changes to the Async Monitor Deflation code in 
>>>>>>> response to
>>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. Thanks to David 
>>>>>>> H., Robbin
>>>>>>> and Erik O. for their comments!
>>>>>>>
>>>>>>> This project is no longer targeted to JDK14 so this is NOT an 
>>>>>>> urgent code
>>>>>>> review request. The primary purpose of this webrev is simply to 
>>>>>>> close the
>>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to let folks see how 
>>>>>>> I resolved
>>>>>>> the code review comments from that round.
>>>>>>>
>>>>>>> Most of the comments in the CR8/v2.08/11-for-jdk14 code review 
>>>>>>> cycle were
>>>>>>> on the monitor list changes so I'm going to take a look at 
>>>>>>> extracting those
>>>>>>> changes into a standalone patch. Switching from 
>>>>>>> Thread::muxAcquire(&gListLock)
>>>>>>> and Thread::muxRelease(&gListLock) to finer grained internal 
>>>>>>> spin locks needs
>>>>>>> to be thoroughly reviewed and the best way to do that is 
>>>>>>> separately from the
>>>>>>> Async Monitor Deflation changes. Thanks to Coleen for suggesting 
>>>>>>> doing this
>>>>>>> extraction earlier.
>>>>>>>
>>>>>>> I have attached the change list from CR8 to CR9 instead of 
>>>>>>> putting it in
>>>>>>> the body of this email. I've also added a link to the 
>>>>>>> CR8-to-CR9-changes
>>>>>>> file to the webrevs so it should be easy to find.
>>>>>>>
>>>>>>> Main bug URL:
>>>>>>>
>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>
>>>>>>> The project is currently baselined on jdk-14+26.
>>>>>>>
>>>>>>> Here's the full webrev URL for those folks that want to see all 
>>>>>>> of the
>>>>>>> current Async Monitor Deflation code in one go (v2.09 full):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09.full/ 
>>>>>>>
>>>>>>>
>>>>>>> Some folks might want to see just what has changed since the 
>>>>>>> last review
>>>>>>> cycle so here's a webrev for that (v2.09 inc):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09.inc/ 
>>>>>>>
>>>>>>>
>>>>>>> The OpenJDK wiki has NOT yet been updated for this round of 
>>>>>>> changes:
>>>>>>>
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>
>>>>>>>
>>>>>>> The jdk-14+26 based v2.09 version of the patch has been thru 
>>>>>>> Mach5 tier[1-7]
>>>>>>> testing on Oracle's usual set of platforms. Mach5 tier8 is still 
>>>>>>> running.
>>>>>>> A slightly older version of the v2.09 patch has also been 
>>>>>>> through my usual
>>>>>>> set of stress testing on Linux-X64 and macOSX with the addition 
>>>>>>> of Robbin's
>>>>>>> "MoCrazy 1024" test running in parallel on Linux-X64 with the 
>>>>>>> other tests in
>>>>>>> my lab. The "MoCrazy 1024" has been going for > 5 days and 6700+ 
>>>>>>> iterations
>>>>>>> without any failures.
>>>>>>>
>>>>>>> I'm planning to do a SPECjbb2015 round on the 
>>>>>>> CR9/v2.09/12-for-jdk14 bits.
>>>>>>>
>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have made changes to the Async Monitor Deflation code in 
>>>>>>>> response to
>>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David 
>>>>>>>> H., Robbin
>>>>>>>> and Erik O. for their comments!
>>>>>>>>
>>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the 
>>>>>>>> Async Monitor
>>>>>>>> Deflation project needs to push before Nov. 12, 2019 in order 
>>>>>>>> to allow
>>>>>>>> for sufficient bake time for such a big change. Nov. 12 is 
>>>>>>>> _next_ Tuesday
>>>>>>>> so we have 8 days from today to finish this code review cycle 
>>>>>>>> and push
>>>>>>>> this code for JDK14.
>>>>>>>>
>>>>>>>> Carsten and Roman! Time for you guys to chime in again on the 
>>>>>>>> code reviews.
>>>>>>>>
>>>>>>>> I have attached the change list from CR7 to CR8 instead of 
>>>>>>>> putting it in
>>>>>>>> the body of this email. I've also added a link to the 
>>>>>>>> CR7-to-CR8-changes
>>>>>>>> file to the webrevs so it should be easy to find.
>>>>>>>>
>>>>>>>> Main bug URL:
>>>>>>>>
>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>
>>>>>>>> The project is currently baselined on jdk-14+21.
>>>>>>>>
>>>>>>>> Here's the full webrev URL for those folks that want to see all 
>>>>>>>> of the
>>>>>>>> current Async Monitor Deflation code in one go (v2.08 full):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full 
>>>>>>>>
>>>>>>>>
>>>>>>>> Some folks might want to see just what has changed since the 
>>>>>>>> last review
>>>>>>>> cycle so here's a webrev for that (v2.08 inc):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/ 
>>>>>>>>
>>>>>>>>
>>>>>>>> The OpenJDK wiki did not need any changes for this round:
>>>>>>>>
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>
>>>>>>>>
>>>>>>>> The jdk-14+21 based v2.08 version of the patch has been thru 
>>>>>>>> Mach5 tier[1-8]
>>>>>>>> testing on Oracle's usual set of platforms. It has also been 
>>>>>>>> through my usual
>>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with 
>>>>>>>> the addition
>>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the 
>>>>>>>> other tests in
>>>>>>>> my lab. Some testing is still running, but so far there are no 
>>>>>>>> new regressions.
>>>>>>>>
>>>>>>>> I have not yet done a SPECjbb2015 round on the 
>>>>>>>> CR8/v2.08/11-for-jdk14 bits.
>>>>>>>>
>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> The Async Monitor Deflation project is reaching the end game. 
>>>>>>>>> I have no
>>>>>>>>> changes planned for the project at this time so all that is 
>>>>>>>>> left is code
>>>>>>>>> review and any changes that results from those reviews.
>>>>>>>>>
>>>>>>>>> Carsten and Roman! Time for you guys to chime in again on the 
>>>>>>>>> code reviews.
>>>>>>>>>
>>>>>>>>> I have attached the list of fixes from CR6 to CR7 instead of 
>>>>>>>>> putting it
>>>>>>>>> in the main body of this email.
>>>>>>>>>
>>>>>>>>> Main bug URL:
>>>>>>>>>
>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>
>>>>>>>>> The project is currently baselined on jdk-14+19.
>>>>>>>>>
>>>>>>>>> Here's the full webrev URL for those folks that want to see 
>>>>>>>>> all of the
>>>>>>>>> current Async Monitor Deflation code in one go (v2.07 full):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Some folks might want to see just what has changed since the 
>>>>>>>>> last review
>>>>>>>>> cycle so here's a webrev for that (v2.07 inc):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The OpenJDK wiki has been updated to match the 
>>>>>>>>> CR7/v2.07/10-for-jdk14 changes:
>>>>>>>>>
>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The jdk-14+18 based v2.07 version of the patch has been thru 
>>>>>>>>> Mach5 tier[1-8]
>>>>>>>>> testing on Oracle's usual set of platforms. It has also been 
>>>>>>>>> through my usual
>>>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 
>>>>>>>>> with the addition
>>>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the 
>>>>>>>>> other tests in
>>>>>>>>> my lab.
>>>>>>>>>
>>>>>>>>> The jdk-14+19 based v2.07 version of the patch has been thru 
>>>>>>>>> Mach5 tier[1-3]
>>>>>>>>> test on Oracle's usual set of platforms. Mach5 tier[4-8] are 
>>>>>>>>> in process.
>>>>>>>>>
>>>>>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora 
>>>>>>>>> Performance lab
>>>>>>>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>>>>>>>>
>>>>>>>>>     - "base" is jdk-14+18
>>>>>>>>>     - "v2.07" is the latest version and includes C2 
>>>>>>>>> inc_om_ref_count() support
>>>>>>>>>       on LP64 X64 and the new 
>>>>>>>>> HandshakeAfterDeflateIdleMonitors option
>>>>>>>>>     - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>>>>>>>>>     - "handshake" is with 
>>>>>>>>> -XX:+HandshakeAfterDeflateIdleMonitors specified
>>>>>>>>>
>>>>>>>>>          hbIR           hbIR
>>>>>>>>>     (max attempted)  (settled)  max-jOPS critical-jOPS runtime
>>>>>>>>>     ---------------  ---------  -------- ------------- -------
>>>>>>>>>            34282.00   30635.90  28831.30 20969.20 3841.30 base
>>>>>>>>>            34282.00   30973.00  29345.80 21025.20 3964.10 v2.07
>>>>>>>>>            34282.00   31105.60  29174.30 21074.00 3931.30 
>>>>>>>>> v2.07_handshake
>>>>>>>>>            34282.00   30789.70  27151.60 19839.10 3850.20 
>>>>>>>>> v2.07_off
>>>>>>>>>
>>>>>>>>>     - The Aurora Perf comparison tool reports:
>>>>>>>>>
>>>>>>>>>         Comparison              max-jOPS critical-jOPS
>>>>>>>>>         ---------------------- -------------------- 
>>>>>>>>> --------------------
>>>>>>>>>         base vs 2.07            +1.78% (s, p=0.000) +0.27% 
>>>>>>>>> (ns, p=0.790)
>>>>>>>>>         base vs 2.07_handshake  +1.19% (s, p=0.007) +0.58% 
>>>>>>>>> (ns, p=0.536)
>>>>>>>>>         base vs 2.07_off        -5.83% (ns, p=0.394) -5.39% 
>>>>>>>>> (ns, p=0.347)
>>>>>>>>>
>>>>>>>>>         (s) - significant  (ns) - not-significant
>>>>>>>>>
>>>>>>>>>     - For historical comparison, the Aurora Perf comparision tool
>>>>>>>>>         reported for v2.06 with a baseline of jdk-13+31:
>>>>>>>>>
>>>>>>>>>         Comparison              max-jOPS critical-jOPS
>>>>>>>>>         ---------------------- -------------------- 
>>>>>>>>> --------------------
>>>>>>>>>         base vs 2.06            -0.32% (ns, p=0.345) +0.71% 
>>>>>>>>> (ns, p=0.646)
>>>>>>>>>         base vs 2.06_off        +0.49% (ns, p=0.292) -1.21% 
>>>>>>>>> (ns, p=0.481)
>>>>>>>>>
>>>>>>>>>         (s) - significant  (ns) - not-significant
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> The Async Monitor Deflation project has rebased to JDK14 so 
>>>>>>>>>> it's time
>>>>>>>>>> for our first code review in that new context!!
>>>>>>>>>>
>>>>>>>>>> I've been focused on changing the monitor list management 
>>>>>>>>>> code to be
>>>>>>>>>> lock-free in order to make SPECjbb2015 happier. Of course 
>>>>>>>>>> with a change
>>>>>>>>>> like that, it takes a while to chase down all the new and 
>>>>>>>>>> wonderful
>>>>>>>>>> races. At this point, I have the code back to the same 
>>>>>>>>>> stability that
>>>>>>>>>> I had with CR5/v2.05/8-for-jdk13.
>>>>>>>>>>
>>>>>>>>>> To lay the ground work for this round of review, I pushed the 
>>>>>>>>>> following
>>>>>>>>>> two fixes to jdk/jdk earlier today:
>>>>>>>>>>
>>>>>>>>>>     JDK-8230184 rename, whitespace, indent and comments 
>>>>>>>>>> changes in preparation
>>>>>>>>>>                 for lock free Monitor lists
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230184
>>>>>>>>>>
>>>>>>>>>>     JDK-8230317 serviceability/sa/ClhsdbPrintStatics.java 
>>>>>>>>>> fails after 8230184
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230317
>>>>>>>>>>
>>>>>>>>>> I have attached the list of fixes from CR5 to CR6 instead of 
>>>>>>>>>> putting
>>>>>>>>>> in the main body of this email.
>>>>>>>>>>
>>>>>>>>>> Main bug URL:
>>>>>>>>>>
>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>
>>>>>>>>>> The project is currently baselined on jdk-14+11 plus the 
>>>>>>>>>> fixes for
>>>>>>>>>> JDK-8230184 and JDK-8230317.
>>>>>>>>>>
>>>>>>>>>> Here's the full webrev URL for those folks that want to see 
>>>>>>>>>> all of the
>>>>>>>>>> current Async Monitor Deflation code in one go (v2.06 full):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.full/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The primary focus of this review cycle is on the lock-free 
>>>>>>>>>> Monitor List
>>>>>>>>>> management changes so here's a webrev for just that patch 
>>>>>>>>>> (v2.06c):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06c.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The secondary focus of this review cycle is on the bug fixes 
>>>>>>>>>> that have
>>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so here's a webrev for 
>>>>>>>>>> just that
>>>>>>>>>> patch (v2.06b):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06b.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The third and final bucket for this review cycle is the 
>>>>>>>>>> rename, whitespace,
>>>>>>>>>> indent and comments changes made in preparation for lock free 
>>>>>>>>>> Monitor list
>>>>>>>>>> management. Almost all of that was extracted into JDK-8230184 
>>>>>>>>>> for the
>>>>>>>>>> baseline so this bucket now has just a few comment changes 
>>>>>>>>>> relative to
>>>>>>>>>> CR5/v2.05/8-for-jdk13. Here's a webrev for the remainder 
>>>>>>>>>> (v2.06a):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06a.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Some folks might want to see just what has changed since the 
>>>>>>>>>> last review
>>>>>>>>>> cycle so here's a webrev for that (v2.06 inc):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Last, but not least, some folks might want to see the code 
>>>>>>>>>> before the
>>>>>>>>>> addition of lock-free Monitor List management so here's a 
>>>>>>>>>> webrev for
>>>>>>>>>> that (v2.00 -> v2.05):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.05.inc/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The OpenJDK wiki will need minor updates to match the CR6 
>>>>>>>>>> changes:
>>>>>>>>>>
>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> but that should only be changes to describe per-thread list 
>>>>>>>>>> async monitor
>>>>>>>>>> deflation being done by the ServiceThread.
>>>>>>>>>>
>>>>>>>>>> (I did update the OpenJDK wiki for the CR5 changes back on 
>>>>>>>>>> 2019.08.14)
>>>>>>>>>>
>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] 
>>>>>>>>>> testing on
>>>>>>>>>> Oracle's usual set of platforms. It has also been through my 
>>>>>>>>>> usual set
>>>>>>>>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>>>>>>>>
>>>>>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora 
>>>>>>>>>> Performance lab
>>>>>>>>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs. 
>>>>>>>>>> This was using
>>>>>>>>>> this patch baselined on jdk-13+31 (for stability):
>>>>>>>>>>
>>>>>>>>>>           hbIR           hbIR
>>>>>>>>>>      (max attempted)  (settled)  max-jOPS critical-jOPS runtime
>>>>>>>>>>      ---------------  ---------  -------- ------------- -------
>>>>>>>>>>             34282.00   28837.20 27905.20 19817.40 3658.10 base
>>>>>>>>>>             34965.70   29798.80 27814.90 19959.00 3514.60 v2.06d
>>>>>>>>>>             34282.00   29100.70 28042.50 19577.00 3701.90 
>>>>>>>>>> v2.06d_off
>>>>>>>>>>             34282.00   29218.50 27562.80 19397.30 3657.60 
>>>>>>>>>> v2.06d_ocache
>>>>>>>>>>             34965.70   29838.30 26512.40 19170.60 3569.90 v2.05
>>>>>>>>>>             34282.00   28926.10 27734.00 19835.10 3588.40 
>>>>>>>>>> v2.05_off
>>>>>>>>>>
>>>>>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors 
>>>>>>>>>> specified and
>>>>>>>>>> the "ocache" config is with 128 byte cache line sizes instead 
>>>>>>>>>> of 64 byte
>>>>>>>>>> cache lines sizes. "v2.06d" is the last set of changes that I 
>>>>>>>>>> made before
>>>>>>>>>> those changes were distributed into the "v2.06a", "v2.06b" 
>>>>>>>>>> and "v2.06c"
>>>>>>>>>> buckets for this review recycle.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> I've been focused on chasing down and fixing the rare test 
>>>>>>>>>>> failures
>>>>>>>>>>> that only pop up rarely. So this round is primarily fixes 
>>>>>>>>>>> for races
>>>>>>>>>>> with a few additional fixes that came from Karen's review of 
>>>>>>>>>>> CR4.
>>>>>>>>>>> Thanks Karen!
>>>>>>>>>>>
>>>>>>>>>>> I have attached the list of fixes from CR4 to CR5 instead of 
>>>>>>>>>>> putting
>>>>>>>>>>> in the main body of this email.
>>>>>>>>>>>
>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>
>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>
>>>>>>>>>>> The project is currently baselined on jdk-13+29. This will 
>>>>>>>>>>> likely be
>>>>>>>>>>> the last JDK13 baseline for this project and I'll roll to 
>>>>>>>>>>> the JDK14
>>>>>>>>>>> (jdk/jdk) repo soon...
>>>>>>>>>>>
>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have not yet checked the OpenJDK wiki to see if it needs 
>>>>>>>>>>> any updates
>>>>>>>>>>> to match the CR5 changes:
>>>>>>>>>>>
>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 changes back on 
>>>>>>>>>>> 2019.06.26)
>>>>>>>>>>>
>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] 
>>>>>>>>>>> testing on
>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running 
>>>>>>>>>>> now and
>>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off the usual stress 
>>>>>>>>>>> testing
>>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as those machines 
>>>>>>>>>>> become available.
>>>>>>>>>>> Since I haven't made any performance changes in this round, 
>>>>>>>>>>> I'll only
>>>>>>>>>>> be running SPECjbb2015 to gather the latest monitorinflation 
>>>>>>>>>>> logs.
>>>>>>>>>>>
>>>>>>>>>>> Next up:
>>>>>>>>>>>
>>>>>>>>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>>>>>>>>>   Linux-X64 and we've determined that some of that comes from
>>>>>>>>>>>   contention on the gListLock. So I'm going to investigate 
>>>>>>>>>>> removing
>>>>>>>>>>>   the gListLock. Yes, another lock free set of changes is 
>>>>>>>>>>> coming!
>>>>>>>>>>> - Of course, going lock free often causes new races and new 
>>>>>>>>>>> failures
>>>>>>>>>>>   so that's a good reason for make those changes isolated in 
>>>>>>>>>>> their
>>>>>>>>>>>   own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>>>>>>>>> - I finally have a potential fix for the Win* failure with
>>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>>>   but I haven't run it through Mach5 yet so it'll be in the 
>>>>>>>>>>> next round.
>>>>>>>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm 
>>>>>>>>>>> seeing some
>>>>>>>>>>>   monitor related failures there. I suspect that I need to 
>>>>>>>>>>> go take a
>>>>>>>>>>>   look at the C2 RTM macro assembler code and look for 
>>>>>>>>>>> things that might
>>>>>>>>>>>   conflict if Async Monitor Deflation. If you're interested 
>>>>>>>>>>> in that kind
>>>>>>>>>>>   of issue, then see the macroAssembler_x86.cpp sanity check 
>>>>>>>>>>> that I
>>>>>>>>>>>   added in this round!
>>>>>>>>>>>
>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>
>>>>>>>>>>>> I have a fix for an issue that came up during performance 
>>>>>>>>>>>> testing.
>>>>>>>>>>>> Many thanks to Robbin for diagnosing the issue in his 
>>>>>>>>>>>> SPECjbb2015
>>>>>>>>>>>> experiments.
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>>>>>>>>> verbose due to the complexity of the issue, but the changes
>>>>>>>>>>>> themselves are not that big.
>>>>>>>>>>>>
>>>>>>>>>>>> Functional:
>>>>>>>>>>>>   - Change SafepointSynchronize::is_cleanup_needed() from 
>>>>>>>>>>>> calling
>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() to calling
>>>>>>>>>>>> ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>>>>>>>>>     - is_safepoint_deflation_needed() returns the result of
>>>>>>>>>>>>       monitors_used_above_threshold() for safepoint based
>>>>>>>>>>>>       monitor deflation (!AsyncDeflateIdleMonitors).
>>>>>>>>>>>>     - For AsyncDeflateIdleMonitors, it only returns true if
>>>>>>>>>>>>       there is a special deflation request, e.g., System.gc()
>>>>>>>>>>>>       - This solves a bug where there are a bunch of Cleanup
>>>>>>>>>>>>         safepoints that simply request async deflation which
>>>>>>>>>>>>         keeps the async JavaThreads from making progress on
>>>>>>>>>>>>         their async deflation work.
>>>>>>>>>>>>   - Add AsyncDeflationInterval diagnostic option. Description:
>>>>>>>>>>>>       Async deflate idle monitors every so many 
>>>>>>>>>>>> milliseconds when
>>>>>>>>>>>>       MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>>>>>>>>>   - Replace 
>>>>>>>>>>>> ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>>>>>>>>> ObjectSynchronizer::is_async_deflation_needed():
>>>>>>>>>>>>     - is_async_deflation_needed() returns true when
>>>>>>>>>>>>       is_async_cleanup_requested() is true or when
>>>>>>>>>>>>       monitors_used_above_threshold() is true (but no more 
>>>>>>>>>>>> often than
>>>>>>>>>>>>       AsyncDeflationInterval).
>>>>>>>>>>>>     - if AsyncDeflateIdleMonitors Service_lock->wait() now 
>>>>>>>>>>>> waits for
>>>>>>>>>>>>       at most GuaranteedSafepointInterval millis:
>>>>>>>>>>>>       - This allows is_async_deflation_needed() to be 
>>>>>>>>>>>> checked at
>>>>>>>>>>>>         the same interval as GuaranteedSafepointInterval.
>>>>>>>>>>>>         (default is 1000 millis/1 second)
>>>>>>>>>>>>       - Once is_async_deflation_needed() has returned true, it
>>>>>>>>>>>>         generally cannot return true for 
>>>>>>>>>>>> AsyncDeflationInterval.
>>>>>>>>>>>>         This is to prevent async deflation from swamping the
>>>>>>>>>>>>         ServiceThread.
>>>>>>>>>>>>   - The ServiceThread still handles async deflation of the 
>>>>>>>>>>>> global
>>>>>>>>>>>>     in-use list and now it also marks JavaThreads for async 
>>>>>>>>>>>> deflation
>>>>>>>>>>>>     of their in-use lists.
>>>>>>>>>>>>     - The ServiceThread will check for async deflation work 
>>>>>>>>>>>> every
>>>>>>>>>>>>       GuaranteedSafepointInterval.
>>>>>>>>>>>>     - A safepoint can still cause the ServiceThread to 
>>>>>>>>>>>> check for
>>>>>>>>>>>>       async deflation work via is_async_deflation_requested.
>>>>>>>>>>>>   - Refactor code from 
>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() into
>>>>>>>>>>>>     monitors_used_above_threshold() and remove 
>>>>>>>>>>>> is_cleanup_needed().
>>>>>>>>>>>>   - In addition to System.gc(), the VM_Exit VM op and the 
>>>>>>>>>>>> final
>>>>>>>>>>>>     VMThread safepoint now set the 
>>>>>>>>>>>> is_special_deflation_requested
>>>>>>>>>>>>     flag to reduce the in-use monitor population that is 
>>>>>>>>>>>> reported by
>>>>>>>>>>>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
>>>>>>>>>>>>
>>>>>>>>>>>> Test update:
>>>>>>>>>>>>   - test/hotspot/gtest/oops/test_markOop.cpp is updated to 
>>>>>>>>>>>> work with
>>>>>>>>>>>>     AsyncDeflateIdleMonitors.
>>>>>>>>>>>>
>>>>>>>>>>>> Collateral:
>>>>>>>>>>>>   - Add/clarify/update some logging messages.
>>>>>>>>>>>>
>>>>>>>>>>>> Cleanup:
>>>>>>>>>>>>   - Updated comments based on Karen's code review.
>>>>>>>>>>>>   - Change 'special cleanup' -> 'special deflation' and
>>>>>>>>>>>>     'async cleanup' -> 'async deflation'.
>>>>>>>>>>>>     - comment and function name changes
>>>>>>>>>>>>   - Clarify MonitorUsedDeflationThreshold description;
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>
>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>
>>>>>>>>>>>> The project is currently baselined on jdk-13+22.
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.full/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.inc/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I have not updated the OpenJDK wiki to reflect the CR4 
>>>>>>>>>>>> changes:
>>>>>>>>>>>>
>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The wiki doesn't say a whole lot about the async deflation 
>>>>>>>>>>>> invocation
>>>>>>>>>>>> mechanism so I have to figure out how to add that content.
>>>>>>>>>>>>
>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] 
>>>>>>>>>>>> testing on
>>>>>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit 
>>>>>>>>>>>> run is
>>>>>>>>>>>> running now. Kitchensink8H on product, fastdebug, and 
>>>>>>>>>>>> slowdebug bits
>>>>>>>>>>>> are running on Linux-X64, MacOSX and Solaris-X64. I still 
>>>>>>>>>>>> have to run
>>>>>>>>>>>> my stress kit on Linux-X64. I still have to run the 
>>>>>>>>>>>> SPECjbb2015
>>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had some discussions with Karen about a race that was in 
>>>>>>>>>>>>> the
>>>>>>>>>>>>> ObjectMonitor::enter() code in CR2/v2.02/5-for-jdk13. This 
>>>>>>>>>>>>> race was
>>>>>>>>>>>>> theoretical and I had no test failures due to it. The fix 
>>>>>>>>>>>>> is pretty
>>>>>>>>>>>>> simple: remove the special case code for async deflation 
>>>>>>>>>>>>> in the
>>>>>>>>>>>>> ObjectMonitor::enter() function and rely solely on the 
>>>>>>>>>>>>> ref_count
>>>>>>>>>>>>> for ObjectMonitor::enter() protection.
>>>>>>>>>>>>>
>>>>>>>>>>>>> During those discussions Karen also floated the idea of 
>>>>>>>>>>>>> using the
>>>>>>>>>>>>> ref_count field instead of the contentions field for the 
>>>>>>>>>>>>> Async
>>>>>>>>>>>>> Monitor Deflation protocol. I decided to go ahead and code 
>>>>>>>>>>>>> up that
>>>>>>>>>>>>> change and I have run it through the usual stress and 
>>>>>>>>>>>>> Mach5 testing
>>>>>>>>>>>>> with no issues. It's also known as v2.03 (for those for 
>>>>>>>>>>>>> with the
>>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for those with webrev 
>>>>>>>>>>>>> URLs).
>>>>>>>>>>>>> Sorry for all the names...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>
>>>>>>>>>>>>> The project is currently baselined on jdk-13+18.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.full/ 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.inc/ 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have also updated the OpenJDK wiki to reflect the CR3 
>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] 
>>>>>>>>>>>>> testing on
>>>>>>>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit 
>>>>>>>>>>>>> run had
>>>>>>>>>>>>> no issues. Kitchensink8H on product, fastdebug, and 
>>>>>>>>>>>>> slowdebug bits
>>>>>>>>>>>>> had no failures on Linux-X64; MacOSX fastdebug and 
>>>>>>>>>>>>> slowdebug and
>>>>>>>>>>>>> Solaris-X64 release had the usual "Too large time diff" 
>>>>>>>>>>>>> complaints.
>>>>>>>>>>>>> 12 hour Inflate2 runs on product, fastdebug and slowdebug 
>>>>>>>>>>>>> bits on
>>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no failures. My 
>>>>>>>>>>>>> Linux-X64
>>>>>>>>>>>>> stress kit is running right now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've done the SPECjbb2015 baseline and CR3 runs. I need to 
>>>>>>>>>>>>> gather
>>>>>>>>>>>>> the results and analyze them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have a small but important bug fix for the Async 
>>>>>>>>>>>>>> Monitor Deflation
>>>>>>>>>>>>>> project ready to go. It's also known as v2.02 (for those 
>>>>>>>>>>>>>> for with the
>>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for those with webrev 
>>>>>>>>>>>>>> URLs). Sorry
>>>>>>>>>>>>>> for all the names...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two days ago so that 
>>>>>>>>>>>>>> baseline patch
>>>>>>>>>>>>>> is out of our hair.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The project is currently baselined on jdk-13+17.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the full webrev URL:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/ 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/ 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the 
>>>>>>>>>>>>>> CR2 changes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-6] 
>>>>>>>>>>>>>> testing on
>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[7-8] is 
>>>>>>>>>>>>>> running now.
>>>>>>>>>>>>>> My stress kit is running on Solaris-X64 now. 
>>>>>>>>>>>>>> Kitchensink8H is running
>>>>>>>>>>>>>> now on product, fastdebug, and slowdebug bits on 
>>>>>>>>>>>>>> Linux-X64, MacOSX
>>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs are running now on 
>>>>>>>>>>>>>> product,
>>>>>>>>>>>>>> fastdebug and slowdebug bits on Linux-X64, MacOSX and 
>>>>>>>>>>>>>> Solaris-X64.
>>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 sometime on 
>>>>>>>>>>>>>> Sunday (after
>>>>>>>>>>>>>> my jdk-13+18 stress run is done).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 runs after all the 
>>>>>>>>>>>>>> stress
>>>>>>>>>>>>>> testing is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I finally have CR1 for the Async Monitor Deflation 
>>>>>>>>>>>>>>> project ready to
>>>>>>>>>>>>>>> go. It's also known as v2.01 (for those for with the 
>>>>>>>>>>>>>>> patches) and as
>>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry 
>>>>>>>>>>>>>>> for all the
>>>>>>>>>>>>>>> names...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Main bug URL:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Baseline bug fixes URL:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     JDK-8222295 more baseline cleanups from Async 
>>>>>>>>>>>>>>> Monitor Deflation project
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The project is currently baselined on jdk-13+15.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's the webrev for the latest baseline changes 
>>>>>>>>>>>>>>> (JDK-8222295):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 only):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/ 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/ 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I'm looking for reviews for both JDK-8222295 and the 
>>>>>>>>>>>>>>> latest version
>>>>>>>>>>>>>>> of JDK-8153224...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I still have to update the OpenJDK wiki to reflect the 
>>>>>>>>>>>>>>> CR changes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] 
>>>>>>>>>>>>>>> testing on
>>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is 
>>>>>>>>>>>>>>> running now and
>>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. My stress kit on 
>>>>>>>>>>>>>>> Solaris-X64
>>>>>>>>>>>>>>> is running now. Linux-X64 stress testing will start on 
>>>>>>>>>>>>>>> Sunday. I'm
>>>>>>>>>>>>>>> planning to do Kitchensink runs, SPECjbb2015 runs and my 
>>>>>>>>>>>>>>> monitor
>>>>>>>>>>>>>>> inflation stress tests on Linux-X64, MacOSX and 
>>>>>>>>>>>>>>> Solaris-X64.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread for my port of 
>>>>>>>>>>>>>>>> Carsten's work on:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here's a link to Carsten's original webrev:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Earlier versions of this patch have been through 
>>>>>>>>>>>>>>>> several rounds of
>>>>>>>>>>>>>>>> preliminary review. Many thanks to Carsten, Coleen, 
>>>>>>>>>>>>>>>> Robbin, and
>>>>>>>>>>>>>>>> Roman for their preliminary code review comments. A 
>>>>>>>>>>>>>>>> very special
>>>>>>>>>>>>>>>> thanks to Robbin and Roman for building and testing the 
>>>>>>>>>>>>>>>> patch in
>>>>>>>>>>>>>>>> their own environments (including specJBB2015).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] 
>>>>>>>>>>>>>>>> testing on
>>>>>>>>>>>>>>>> Oracle's usual set of platforms. Earlier versions have 
>>>>>>>>>>>>>>>> been run
>>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 
>>>>>>>>>>>>>>>> servers
>>>>>>>>>>>>>>>> (product, fastdebug, slowdebug).Earlier versions have 
>>>>>>>>>>>>>>>> run Kitchensink
>>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 
>>>>>>>>>>>>>>>> (product, fastdebug
>>>>>>>>>>>>>>>> and slowdebug). Earlier versions have run my monitor 
>>>>>>>>>>>>>>>> inflation stress
>>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 
>>>>>>>>>>>>>>>> (product,
>>>>>>>>>>>>>>>> fastdebug and slowdebug).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> All of the testing done on earlier versions will be 
>>>>>>>>>>>>>>>> redone on the
>>>>>>>>>>>>>>>> latest version of the patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or 
>>>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> P.S.
>>>>>>>>>>>>>>>> One subtest in 
>>>>>>>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've 
>>>>>>>>>>>>>>>> been trying
>>>>>>>>>>>>>>>> to characterize/analyze this failure for more than a 
>>>>>>>>>>>>>>>> week now. At
>>>>>>>>>>>>>>>> this point I'm convinced that Async Monitor Deflation 
>>>>>>>>>>>>>>>> is aggravating
>>>>>>>>>>>>>>>> an existing bug. However, I plan to have a better 
>>>>>>>>>>>>>>>> handle on that
>>>>>>>>>>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>