RFR(L) 8153224 Monitor deflation prolong safepoints (CR8/v2.08/11-for-jdk14)

Tue Dec 10 21:10:55 UTC 2019

Hi Robbin,

Again, thanks for the thorough review!!

As always, replies are embedded below...

On 11/8/19 8:35 AM, Robbin Ehn wrote:
> Hi Dan,
>
> Thanks for looking into this, some comments on v8:
>
> ##################
> src/hotspot/cpu/sparc/globalDefinitions_sparc.hpp
> src/hotspot/cpu/x86/globalDefinitions_x86.hpp
> src/hotspot/share/logging/logTag.hpp
> src/hotspot/share/oops/markWord.hpp
> src/hotspot/share/runtime/basicLock.cpp
> src/hotspot/share/runtime/safepoint.cpp
> src/hotspot/share/runtime/serviceThread.cpp
> src/hotspot/share/runtime/sharedRuntime.cpp
> src/hotspot/share/runtime/synchronizer.hpp
> src/hotspot/share/runtime/vmOperations.cpp
> src/hotspot/share/runtime/vmOperations.hpp
> src/hotspot/share/runtime/vmStructs.cpp
> src/hotspot/share/runtime/vmThread.cpp
> test/hotspot/gtest/oops/test_markWord.cpp
>
> No comments.
>
> ##################
> I don't see the benefit of having the 
> -HandshakeAfterDeflateIdleMonitors code paths.
> Removing that option would mean these files can be reverted:
> src/hotspot/cpu/aarch64/globals_aarch64.hpp
> src/hotspot/cpu/arm/globals_arm.hpp
> src/hotspot/cpu/ppc/globals_ppc.hpp
> src/hotspot/cpu/s390/globals_s390.hpp
> src/hotspot/cpu/sparc/globals_sparc.hpp
> src/hotspot/cpu/x86/globals_x86.hpp
> src/hotspot/cpu/x86/macroAssembler_x86.cpp
> src/hotspot/cpu/x86/macroAssembler_x86.hpp
> src/hotspot/cpu/zero/globals_zero.hpp
>
> And one less option here:
> src/hotspot/share/runtime/globals.hpp

The fate of the -XX:[+-]HandshakeAfterDeflateIdleMonitors option will be
decided after we reach another stabilization point and have perf tested
the changes again.

> ##################
> src/hotspot/share/prims/jvm.cpp
>
> Unclear if this is a good idea.

The whole is_special_deflation_requested() mechanism was necessary
when we had JavaThreads deflating their own ObjectMonitors. Now that
the ServiceThread is doing the async deflation, the mechanism is only
around as a potential work around for unforeseen issues.

Even if it stays for the initial push of this project, I can see it
going away in the short term future like the AsyncDeflateIdleMonitors
option and ADIM_guarantee()...

> ##################
> src/hotspot/share/prims/whitebox.cpp
>
> This would assume the test expects the right thing, but that is not 
> obvious.

This is the equivalent of the System.gc() support for special deflation
requests in src/hotspot/share/prims/jvm.cpp. As usual, we have more than
one way of doing things in this VM... :-) Although the whitebox APIs here
are more fine grained about the requested GC operation than System.gc().

The fate of this change is like that of src/hotspot/share/prims/jvm.cpp
changes above.

> ##################
> src/hotspot/share/prims/jvmtiEnvBase.cpp
>
> The current pending and waiting monitor is only changed by the 
> JavaThread itself.
> It only sets it after _contentions is increased.
> It clears it before _contentions is decreased.
> We are depending on safepoint or the thread is suspended, so it can't 
> be deflated since _contentions are > 0.
> Plus the thread have already increased the ref count and can't 
> decrease it (since at safepoint or suspended).

Context: This comment is about these functions:

   L649: JvmtiEnvBase::get_current_contended_monitor(...
   L724: JvmtiEnvBase::get_locked_objects_in_frame(...

which both call:

     Thread::current_waiting_monitor(&omh)
     Thread::current_pending_monitor(&omh)

It is also about:

   L686: JvmtiEnvBase::get_owned_monitors(...

which calls get_locked_objects_in_frame().

The _current_pending_monitor is set and cleared by
set_current_pending_monitor(ObjectMonitor* monitor).
The _current_waiting_monitor is set and cleared by
set_current_waiting_monitor(ObjectMonitor* monitor).
Both of the setter functions have been updated to manage the
ObjectMonitor's ref_count (inc or dec as needed).

JvmtiEnvBase::get_current_contended_monitor() is called from two places:

     JvmtiEnv::GetCurrentContendedMonitor()
         - when the target thread is the current thread

     VM_GetCurrentContendedMonitor::doit()
         - when the target thread is NOT the current thread

When the target thread is the current thread it cannot make a call to
set_current_pending_monitor(NULL) or set_current_waiting_monitor(NULL)
since it is executing JVM/TI code likely called from an event handler.

When the target thread is NOT the current thread, we use a safepoint
VM operation to do the work so the target thread is at a safepoint and
it cannot make a call to set_current_pending_monitor(NULL) or
set_current_waiting_monitor(NULL).

So JvmtiEnvBase::get_current_contended_monitor() does not need to use
an ObjectMonitorHandle to protect the ObjectMonitor* that we get back
from Thread::current_waiting_monitor() or Thread::current_pending_monitor().

Note: There is also ThreadService::get_current_contended_monitor() but
     that one is in a different source file so we'll skip it for now.

JvmtiEnvBase::get_locked_objects_in_frame() is only called from
JvmtiEnvBase::get_owned_monitors() and that function is called from
three places:

     JvmtiEnv::GetOwnedMonitorInfo()
         - when the target thread is the current thread

     JvmtiEnv::GetOwnedMonitorStackDepthInfo()
         - when the target thread is the current thread

     VM_GetOwnedMonitorInfo::doit()
         - when the target thread is NOT the current thread

Again when the target thread is the current thread it cannot make a call
to set_current_pending_monitor(NULL) or set_current_waiting_monitor(NULL)
since it is executing JVM/TI code likely called from an event handler.

Again when the target thread is NOT the current thread, we use a safepoint
VM operation to do the work so the target thread is at a safepoint and
it cannot make a call to set_current_pending_monitor(NULL) or
set_current_waiting_monitor(NULL).

So JvmtiEnvBase::get_locked_objects_in_frame() does not need to use
an ObjectMonitorHandle to protect the ObjectMonitor* that we get back
from Thread::current_waiting_monitor() or Thread::current_pending_monitor().

Okay, let's look at other callers of Thread::current_waiting_monitor():

     javaVFrame::locked_monitors()
         - only called when at a safepoint or by the current thread

     ThreadService::get_current_contended_monitor()
         - ThreadSnapshot::initialize() which is called by:
           - ThreadDumpResult::add_thread_snapshot() which is called by:
             - VM_ThreadDump::doit()
               - called at a safepoint VM operation
             - VM_ThreadDump::snapshot_thread()
               - called by VM_ThreadDump::doit() (safepoint VM op)
             - jmm_GetThreadInfo()
               - This is a JVM_ENTRY point called via:
                 jmmInterface_1_ jmm_interface
               - src/java.management/share/native/libmanagement/ThreadImpl.c

                 JNIEXPORT void JNICALL
                 Java_sun_management_ThreadImpl_getThreadInfo1
                   (JNIEnv *env, jclass cls, jlongArray ids, jint maxDepth,
                    jobjectArray infoArray)
                 {
                    jmm_interface->GetThreadInfo(env, ids, maxDepth, 
infoArray);
                 }

So java/lang/management/ThreadInfo gives a Java program access to
ThreadService::get_current_contended_monitor() for any target thread
without going to a safepoint:

https://docs.oracle.com/en/java/javase/11/docs/api/java.management/java/lang/management/ThreadInfo.html

This should sound familar. We went down into this code when we did
the Thread-SMR work. We had to get ThreadLists in place to prevent a
JavaThread from exiting while we were trying to query it... we weren't
at a safepoint so the JavaThread could exit at anytime...

ThreadService::get_current_contended_monitor() gets us to both:

     Thread::current_waiting_monitor(&omh)
     Thread::current_pending_monitor(&omh)

So we have one call site of these two functions that needs the
ObjectMonitorHandle for safety. I could change those two functions
to accept a NULL for those call sites where we have reasoned out
that the world is safe and add a comment to those call sites
explaining why they are safe. However, we have been trying very
hard to do things in code to make things safe rather than relying
on comments.

Okay, for completeness, let's look at other callers of
Thread::current_pending_monitor():

     GrowableArray<JavaThread*>* 
Threads::get_pending_threads(ThreadsList * t_list,
       - Only uses the return from current_pending_monitor() as a compare
         value against the target monitor. Only saves thread references that
         match so the ObjectMonitor* ref does not need an 
ObjectMonitorHandle.

       So this another case where we have reasoned about the call site and
       determined it to be safe.

       Finishing the caller analysis for completeness:

       - called by JvmtiEnvBase::get_object_monitor_usage(JavaThread* 
calling_thread,
         jobject object, jvmtiMonitorUsage* info_ptr) {
         - Initially called at a non-safepoint if the thread that owns the
           object's monitor and all pending threads are suspended. If any
           are not suspended, then we return an intermediate error and try
           again at a safepoint.

           Note: This means that the first get_pending_threads() call 
does its
           work at a non-safepoint, but we've determined from code analysis
           that get_pending_thread() is safe.
         - Called at a safepoint if we detect a target thread isn't 
suspended.

     javaVFrame::print_lock_info_on(outputStream* st, int frame_count) {
        - Discussed below in src/hotspot/share/runtime/vframe.cpp analysis.

     ThreadService::find_deadlocks_at_safepoint(ThreadsList * t_list, 
bool concurrent_locks) {
     DeadlockCycle::print_on_with(ThreadsList * t_list, outputStream* 
st) const {
        - Discussed below in 
src/hotspot/share/services/threadService.cpp analysis.

     Handle ThreadService::get_current_contended_monitor(JavaThread* 
thread) {
       - already analyzed above.

My current plan is to leave the jvmtiEnvBase.cpp changes in place because
the ObjectMonitorHandle is needed by M&M version of the API. Please let me
know if this decision is acceptable.

Update: I've gone through and added a comment at the decl for the
ObjectMonitorHandles associated the calls to:

     Thread::current_waiting_monitor(&omh)
     Thread::current_pending_monitor(&omh)

Hopefully, it makes things more clear.

> ##################
> src/hotspot/share/runtime/objectMonitor.cpp
>
> ###1
> You have several these (and in other files):
> 242   jint l_ref_count = ref_count();
> 243   ADIM_guarantee(l_ref_count > 0, "must be positive: 
> l_ref_count=%d, ref_count=%d", l_ref_count, ref_count());
> Please use Atomic::load() in ref_count.
> Since this is dependent on ref_count being volatile, otherwise the 
> compiler may only do one load.

As discussed in a sub-thread of this thread (CR8/v2.08/11-for-jdk14),
I'll change ref_count() to use Atomic::load().

> ###2
> 307   // Prevent deflation. See ObjectSynchronizer::deflate_monitor(),
> ...
> 311   Atomic::add(1, &_contentions);
> In ObjectSynchronizer::deflate_monitor if you would check ref count 
> instead of _contetion, we could remove contention.

We do check ref_count in ObjectSynchronizer::deflate_monitor().
We had to add it because an ObjectMonitorHandle can be held across
a safepoint to keep an ObjectMonitor* alive.

> Since all waiters also have a ref count it looks like we don't need 
> waiters either.
> In ObjectSynchronizer::deflate_monitor:
> if (mid->_contentions != 0 || mid->_waiters != 0) {
> Why not just do:
> if (mid->ref_count()) {
> ?

More context from v2.08:

  307   // Prevent deflation. See ObjectSynchronizer::deflate_monitor(),
  308   // ObjectSynchronizer::deflate_monitor_using_JT() and is_busy().
  309   // Ensure the object <-> monitor relationship remains stable while
  310   // there's contention.
  311   Atomic::add(1, &_contentions);

The comment is a bit stale with the addition of ref_count since a
ref_count > 0 will keep the ObjectMonitor from being deflated.

The code switched from:

     old L299:   Atomic::inc(&_contentions);

to:

     new L311:   Atomic::add(1, &_contentions);

but we can go back to the original code since we don't care about the
return value from Atomic::add() anymore (Carsten's prototype did care).
I missed this reversion from when we added ref_count (and stopped using
contentions for the async deflate protocol).

Also notice that both _contentions and _waiters provide information to
other parts of the system:

- jvmtiEnvBase.cpp: JvmtiEnvBase::get_object_monitor_usage() does this:

   if (mon != NULL) {
     // this object has a heavyweight monitor
     nWant = mon->contentions(); // # of threads contending for monitor
     nWait = mon->waiters();     // # of threads in Object.wait()
     ret.waiter_count = nWant + nWait;
     ret.notify_waiter_count = nWait;

   For JVM/TI, there is a bunch more code that uses waiters().

- threadService.hpp: JavaThreadBlockedOnMonitorEnterState does this:

     if (is_alive() && obj_m->contentions() > 0) {
       _stat = java_thread->get_thread_stat();
       _active = contended_enter_begin(java_thread);
     }

So at this point contentions and waiters are just providers of interesting
count info for the system. The waiters count can probably be derived by
counting the number of elements on the ObjectMonitor's waiters list. The
contentions count is more difficult since we'd have to visit every thread
to see if that thread is contending on our ObjectMonitor. Neither alternate
implementation is as cheap as the _contentions or _waiters counters.

I'll change the comment and code like this:

  307   // Keep track of contention for JVM/TI and M&M queries.
  308   Atomic::inc(&_contentions);

I'll revisit other mentions of contentions and waiters in the code to see
if other adjustments would be helpful.

> ##################
> src/hotspot/share/runtime/objectMonitor.hpp
>
> ###1
>  252   intptr_t is_busy() const {
>  253     // TODO-FIXME: assert _owner == null implies _recursions = 0
>  254     // We do not include _ref_count in the is_busy() check because
>  255     // _ref_count is for indicating that the ObjectMonitor* is in
>  256     // use which is orthogonal to whether the ObjectMonitor itself
>  257     // is in use for a locking operation.
>
> But in the non-debug code we always check:
> +  if (mid->is_busy() || mid->ref_count() != 0) {
>
> So it seem like you should have a method including ref count.

The comment on L254-L257 is pretty clear on the reason for not using
ref_count in is_busy() so I won't repeat it...

Changing is_busy() to include _ref_count would cause some assert()'s
and guarantee()'s to fire, but I think you noticed that when you said:

> But in the non-debug code we always check:

It would also cause races with checks on free list entries via
ObjectSynchronizer::chk_free_entry() when an ObjectMonitorHandle
is created for an ObjectMonitor* and then abandoned because the
code detected that the ObjectMonitor* can't be used so we bail
and retry. The checking code would detect these "flickers" and
issue errors.

> ##################
> src/hotspot/share/runtime/objectMonitor.inline.hpp
>
> Use Atomic::load for ref count.

As discussed in a sub-thread of this thread (CR8/v2.08/11-for-jdk14),
I'll change ref_count() to use Atomic::load().

> ##################
> src/hotspot/share/runtime/synchronizer.cpp
>
> ###1
>  139 static volatile int g_om_free_count = 0;    // # on g_free_list
>  140 static volatile int g_om_in_use_count = 0;  // # on g_om_in_use_list
>  141 static volatile int g_om_population = 0;    // # Extant -- in 
> circulation
>  142 static volatile int g_om_wait_count = 0;    // # on g_wait_list
> No padding here, aren't they more contended than the fields in the OM?

I put all global lists except for g_block_list and all of the global
counter variables on their own cache lines.

> ###2
> 151 static bool is_next_marked(ObjectMonitor* om) {
>
> Is only used in ObjectSynchronizer::om_flush.
> Here you fetch a OM and read the next field, this do not need LA 
> semantics on supported platforms.
> This would only need Atomic::load.

Renamed is_next_marked() -> is_locked() and switched to Atomic::load().

> ###3
> 191 static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
>
> In no place you need SR,

I think you meant "RS" here since we were using release_store() here.

I changed set_next() to this:

// Set the next field in an ObjectMonitor to the specified value.
static void set_next(ObjectMonitor* om, ObjectMonitor* value) {
   Atomic::store(&om->_next_om, value);
}

> in the only places it would made a difference:
>  345       OrderAccess::storestore();
>  346       set_next(cur, next);  // Unmark the previous list head.

Context: This is the v2.08 prepend_to_common() code:

  340     if (mark_list_head(list_p, &cur, &next)) {
  341       // List head is now marked so we can safely switch it.
  342       set_next(m, cur);  // m now points to cur (and unmarks m)
  343       *list_p = m;  // Switch list head to unmarked m.
  344       // mark_list_head() used cmpxchg() above, switching list 
head can be lazier:
  345       OrderAccess::storestore();
  346       set_next(cur, next);  // Unmark the previous list head.

This is what I've changed it to:

     if ((cur = get_list_head_locked(list_p)) != NULL) {
       // List head is now locked so we can safely switch it.
       set_next(m, cur);  // m now points to cur (and unlocks m)
       Atomic::store(list_p, m);  // Switch list head to unlocked m.
       om_unlock(cur);

The "Atomic::store(list_p, m)" is still lazy and I don't
think I need the OrderAccess::storestore() anymore.

> and
> 1714     OrderAccess::storestore();
> 1715     set_next(in_use_list, next);

Context: This is the v2.08 om_flush() code:

1711     // Clear the in-use list head (which also unmarks it):
1712     self->om_in_use_list = (ObjectMonitor*)NULL;
1713     // mark_list_head() used cmpxchg() above, clearing the 
disconnected list head can be lazier:
1714     OrderAccess::storestore();
1715     set_next(in_use_list, next);
1716   }

This is what I've changed it to:

     // Clear the in-use list head (which also unlocks it):
     Atomic::store(&self->om_in_use_list, (ObjectMonitor*)NULL);
     om_unlock(in_use_list);
   }

The "Atomic::store(&self->om_in_use_list, (ObjectMonitor*)NULL)" is still
lazy and I don't think I need the OrderAccess::storestore() anymore.

>
> You have a storestore already!

No longer have the storestore() in those two locations, but I think
switching the list head updates to use Atomic::store() keeps my
updates lazy (but still correct!).

> This code reads as:
> OrderAccess::storestore();
> OrderAccess::loadstore();
> OrderAccess::storestore();
> om->_next_om = value
>
> So it should be an Atomic::store.

I pasted the new code above and it now uses Atomic::store().

> ###4
> 198 static bool mark_list_head(ObjectMonitor* volatile * list_p
>
> Since the mark is an embedded spinlock I think the terminology should 
> be changed.

Yup. We talked about it in June using "marking the next field" for
the concept and I literally coded exactly that. David H. had a similar
comment about not hiding the fact that this is really a spinlock.

Renamed mark_list_head() -> get_list_head_locked().

> (that the spinlock is inside a the next pointer should be abstracted 
> away)

It is mostly abstracted away now. I also put the lock related static
functions in a "Spinlock functions" section. I've only left two
functions that mention marking: mark_om_ptr() and unmarked_next().
I didn't like the name lock_om_ptr() and add_lock_bit_to_om_ptr()
is too long. Similarly, I didn't like the name unlocked_next" and
get_next_without_lock_bit() is also too long.

> E.g. mark_next_loop would just be lock.

I renamed mark_next_loop() -> om_lock() instead of just lock().

> The load of the list heads should use Atmoic:load.

Fixed. Also made sure that all list head updates use Atomic::store().
I also dropped 'volatile' from the list variables and parameters.
We're now relying on Atomic::load() and Atomic::store() to keep the
compiler from messing with the list variables.

I've also dropped 'volatile' from all the list counters and parameters.
With similar reliance on Atomic::load(), Atomic::store(), inc(), and
dec(). I'm getting a start on JDK-8234192 related work since I have to
touch these fields and functions anyway.

> It also seem a bit wired to return next for the locking method.
> And output parameter can just be returned, and return NULL if list 
> head is NULL.
> E.g.
>
>  198 static ObjectMonitor* get_list_head_locked(ObjectMonitor* 
> volatile * list_p) {

I considered renaming mark_list_head() -> lock_list_head(), but ended
up doing mark_list_head() -> get_list_head_locked() since I also changed
the return type from 'bool' to 'ObjectMonitor*' so 'mid' is now returned
by the function instead of as a return parameter.

I wanted to keep the return of the 'unmarked next field' via the 'next_p'
return parameter since every caller will need that ObjectMonitor* value to
make an update to something in the list... but a number of those 
"somethings"
are calls to set_next() which is going to (mostly) change to om_unlock(). It
doesn't feel right to pass the "unlock value" to om_unlock().

>  200 while (true) {
>  201     ObjectMonitor* mid = Atomic::load(list_p);
>  202     if (mid == NULL) {
>  203       return NULL;  // The list is empty.
>  204     }
>  205     if (try_lock(mid)) {

I renamed mark_next() -> try_om_lock() instead of try_lock().

>  206       if (Atmoic::load(list_p) != mid) {
>  207         // The list head changed so we have to retry.
>  208         unlock(mid);

I added om_unlock() instead of unlock().
I kept set_next() for those cases where we were setting a specific value.

>  210       } else {
>              return mid;
>        }
>  214     }
>          // Yield ?

I'm not a fan of yield() since it hasn't worked properly on all
platforms for a very long time. I do have one intentional delay
when an exiting thread in om_flush() runs into a deflater thread.
I'm not worried about that os::naked_short_sleep(1) because that
JavaThread is exiting.

>  215 }
>  216 }
>
> With colleteral changes.

Yup. Collateral changes for mark_list_head() -> get_list_head_locked()
rename are done.

> ###5
> 220 static ObjectMonitor* unmarked_next(ObjectMonitor* om)
> Atomic::store is what needed.

I think you mean that Atomic::load() is what is needed. And that is
now what is called.

> ###6
> 333 static void prepend_to_common(
>
>  345       OrderAccess::storestore();
>  346       set_next(cur, next);  // Unmark the previous list head.
> Double storestore. (fixed by changing set_next to Atomic::store)

Mentioned above as part of another comment. Fixed code shown above.

> ###7
>  375 static ObjectMonitor* take_from_start_of_common(ObjectMonitor* 
> volatile * list_p,
>
> Triple storestore here.
>
>  386   Atomic::dec(count_p);
>  387   // mark_list_head() used cmpxchg() above, switching list head 
> can be lazier:
>  388   OrderAccess::storestore();
>  389   // Unmark take, but leave the next value for any lagging list
>  390   // walkers. It will get cleaned up when take is prepended to
>  391   // the in-use list:
>  392   set_next(take, next);
>  393   return take;
>
> Reads:
> count_p--
> OrderAccess::loadstore();
> OrderAccess::storestore();
> OrderAccess::storestore();
> OrderAccess::loadstore();
> OrderAccess::storestore();
> take->_next_om = next;
>
> Fixed by changing set_next to Atomic::store and removing the 
> OrderAccess::storestore();

Context: here is more of the v2.08 take_from_start_of_common():

  379   // Mark the list head to guard against A-B-A race:
  380   if (!mark_list_head(list_p, &take, &next)) {
  381     return NULL;  // None are available.
  382   }
  383   // Switch marked list head to next (which unmarks the list head, but
  384   // leaves take marked):
  385   *list_p = next;
  386   Atomic::dec(count_p);
  387   // mark_list_head() used cmpxchg() above, switching list head 
can be lazier:
  388   OrderAccess::storestore();
  389   // Unmark take, but leave the next value for any lagging list
  390   // walkers. It will get cleaned up when take is prepended to
  391   // the in-use list:
  392   set_next(take, next);
  393   return take;

And here is what I changed it to:

   // Lock the list head to guard against A-B-A race:
   if ((take = get_list_head_locked(list_p)) == NULL) {
     return NULL;  // None are available.
   }
   ObjectMonitor* next = unmarked_next(take);
   // Switch locked list head to next (which unlocks the list head, but
   // leaves take locked):
   Atomic::store(list_p, next);
   Atomic::dec(count_p);
   // Unlock take, but leave the next value for any lagging list
   // walkers. It will get cleaned up when take is prepended to
   // the in-use list:
   om_unlock(take);
   return take;

The "Atomic::store(list_p, next)" is still lazy and I agree that I don't
need the OrderAccess::storestore() anymore.

> ###8
> ObjectSynchronizer::om_release(
>
> 1591       if (m == mid) {
> 1592         // We found 'm' on the per-thread in-use list so try to 
> extract it.
> 1593         if (cur_mid_in_use == NULL) {
> 1594           // mid is the list head and it is marked. Switch the 
> list head
> 1595           // to next which unmarks the list head, but leaves mid 
> marked:
> 1596           self->om_in_use_list = next;
> 1597           // mark_list_head() used cmpxchg() above, switching 
> list head can be lazier:
> 1598           OrderAccess::storestore();
> 1599         } else {
> 1600           // mid and cur_mid_in_use are marked. Switch 
> cur_mid_in_use's
> 1601           // next field to next which unmarks cur_mid_in_use, but 
> leaves
> 1602           // mid marked:
> 1603 OrderAccess::release_store(&cur_mid_in_use->_next_om, next);
> 1604         }
> 1605         extracted = true;
> 1606         Atomic::dec(&self->om_in_use_count);
> 1607         // Unmark mid, but leave the next value for any lagging list
> 1608         // walkers. It will get cleaned up when mid is prepended to
> 1609         // the thread's free list:
> 1610         set_next(mid, next);
> 1611         break;
> 1612       }
>
> This does not look correct.

Here's what that block looks like now:

       if (m == mid) {
         // We found 'm' on the per-thread in-use list so try to extract it.
         if (cur_mid_in_use == NULL) {
           // mid is the list head and it is locked. Switch the list head
           // to next which unlocks the list head, but leaves mid locked:
           Atomic::store(&self->om_in_use_list, next);
         } else {
           // mid and cur_mid_in_use are locked. Switch cur_mid_in_use's
           // next field to next which unlocks cur_mid_in_use, but leaves
           // mid locked:
           set_next(cur_mid_in_use, next);
         }
         extracted = true;
         Atomic::dec(&self->om_in_use_count);
         // Unlock mid, but leave the next value for any lagging list
         // walkers. It will get cleaned up when mid is prepended to
         // the thread's free list:
         om_unlock(mid);
         break;
       }

The "Atomic::store(&self->om_in_use_list, next)" is still lazy and I
agree that I don't need the OrderAccess::storestore() anymore.

> Before taking this branch we have done a cmpxchg in mark_list_head or 
> mark_next_loop.
> This is how it reads:
> OrderAccess::storestore(); // from previous cmpxchg
> OrderAccess::loadstore(); // from previous cmpxchg
> 1591       if (m == mid) {
> 1593         if (cur_mid_in_use == NULL) {
> 1596           self->om_in_use_list = next;
> 1598           OrderAccess::storestore();
> 1599         } else {
>                OrderAccess::storestore();
>                OrderAccess::loadstore();
> 1603           cur_mid_in_use->_next_om = next;
> 1604         }
> 1605         extracted = true;
>              OrderAccess::storestore();
>              OrderAccess::fence(); // 
> storestore|storeload|loadstore|loadload
>          self->om_in_use_count--; // Atomic::dec
>              OrderAccess::storestore();
>              OrderAccess::loadstore();
>              OrderAccess::storestore();
>              OrderAccess::loadstore();
>          mid->_next_om = next; // Atomic::store
> 1611         break;
> 1612       }
>
> extracted is local variable so you so not need any orderaccess before 
> it set.
> Fixed by changing set_next to Atomic::store, removing the 
> OrderAccess::storestore() and changing OrderAccess::release_store to 
> Atmoic::store();

The OrderAccess::storestore() is gone and we've switched to using
Atomic::store() so these changes are all done.

> ###9
> 1653 void ObjectSynchronizer::om_flush(Thread* self) {
>
> 1714     OrderAccess::storestore();
> 1715     set_next(in_use_list, next);
> Fixed by changing set_next to Atomic::store.

Context: here's more of the v2.08 om_flush() code:

1711     // Clear the in-use list head (which also unmarks it):
1712     self->om_in_use_list = (ObjectMonitor*)NULL;
1713     // mark_list_head() used cmpxchg() above, clearing the 
disconnected list head can be lazier:
1714     OrderAccess::storestore();
1715     set_next(in_use_list, next);
1716   }

And here's how the code is fixed:

     // Clear the in-use list head (which also unlocks it):
     Atomic::store(&self->om_in_use_list, (ObjectMonitor*)NULL);
     om_unlock(in_use_list);
   }

The "Atomic::store(&self->om_in_use_list, (ObjectMonitor*)NULL)" is still
lazy and I agree that I don't need the OrderAccess::storestore() anymore.

> ###10
> 1737     self->om_free_list = NULL;
> 1738     OrderAccess::storestore();  // Lazier memory is okay for list 
> walkers.
>
> prepend_list_to_g_free_list/prepend_list_to_g_om_in_use_list does 
> first thing cmpxchg so there is no need for this storestore.

Here's the fixed code:

     Atomic::store(&self->om_free_list, (ObjectMonitor*)NULL);

The "Atomic::store(&self->om_free_list, (ObjectMonitor*)NULL)" is still
lazy and I agree that I don't need the OrderAccess::storestore() anymore.

> ###11
> 1797 void ObjectSynchronizer::inflate(ObjectMonitorHandle* omh_p, 
> Thread* self,
>
> 1938       // Once ObjectMonitor is configured and the object is 
> associated
> 1939       // with the ObjectMonitor, it is safe to allow async 
> deflation:
> 1940       assert(m->is_new(), "freshly allocated monitor must be new");
> 1941       m->set_allocation_state(ObjectMonitor::Old);
>
> So we use ref count, contention, waiter, owner and allocation state to 
> keep OM alive in different scenarios.

If you look at is_busy() you'll see that even more fields are checked...
It's kinda crazy, but it's all there for a reason... maybe not a good
reason after so many years, but a reason...

> There is not way for me to keep track of that.

I completely agree. I should consider follow on work to simplify how
deflation eligibility is determined.

> I don't see why you would need more than owner and ref count.
> If you allocate the om with ref count 1 you can remove 
> _allocation_state and just decrease ref count here instead.

_allocation_state provided a means of delaying async deflation by one
pass in Carsten's prototype. So an eligible ObjectMonitor had to be
seen twice by async deflation in order to be deflated. According to
Carsten's comments and JEP, this was done to prevent tight cycles of:

     inflate -> async-deflate -> inflate -> async-deflate

Karen proposed that I test changing the allocation state to 'Old' in
inflate() and see if we got any of these "inflate <-> async-deflate"
storms. I never saw anything like that in many, many months of testing
so I've left that change in place.

Currently  _allocation_state provides a state that's useful for
assert()'s and guarantee()'s and also provides a useful debugging
crumb when trying to figure out races and crashes.

I can see removing _allocation_state down the road...

> ###12
> 2079 bool ObjectSynchronizer::deflate_monitor
>
> 2112     if (AsyncDeflateIdleMonitors) {
> 2113       // clear() expects the owner field to be NULL and we won't 
> race
> 2114       // with the simple C2 ObjectMonitor
>
> The macro assambler code is not just executed by C2, so this comment 
> is a bit misleading. (there are some more also)

I'll have to disagree (somewhat)...

src/hotspot/cpu/x86/macroAssembler_x86.cpp:

1297 #ifdef COMPILER2

1680 // fast_lock and fast_unlock used by C2

1756 void MacroAssembler::fast_lock(Register objReg, Register boxReg, 
Register tmpReg,

1977 void MacroAssembler::fast_unlock(Register objReg, Register boxReg, 
Register tmpReg, bool use_rtm) {

2173 #endif // COMPILER2

so if COMPILER2 is not built, then the code that I'm talking about
is not in the system...

> ###13
> 2306 int ObjectSynchronizer::deflate_monitor_list(
>
> Same issue as ObjectSynchronizer::om_release.
> Fixed by changing set_next to Atomic::store, removing the 
> OrderAccess::storestore() and changing OrderAccess::release_store to 
> Atmoic::store();

Context: Here's the v2.08 deflate_monitor_list() code:

2327       if (cur_mid_in_use == NULL) {
2328         // mid is the list head and it is marked. Switch the list head
2329         // to next which unmarks the list head, but leaves mid marked:
2330         *list_p = next;
2331         // mark_list_head() used cmpxchg() above, switching list 
head can be lazier:
2332         OrderAccess::storestore();
2333       } else {
2334         // mid is marked. Switch cur_mid_in_use's next field to next
2335         // which is safe because we have no parallel list deletions,
2336         // but we leave mid marked:
2337 OrderAccess::release_store(&cur_mid_in_use->_next_om, next);
2338       }

And here's the fixed code:

       if (cur_mid_in_use == NULL) {
         // mid is the list head and it is locked. Switch the list head
         // to next which unlocks the list head, but leaves mid locked:
         Atomic::store(list_p, next);
       } else {
         // mid is locked. Switch cur_mid_in_use's next field to next
         // which is safe because we have no parallel list deletions,
         // but we leave mid locked:
         set_next(cur_mid_in_use, next);
       }

The "Atomic::store(list_p, next)" is still lazy and I agree that I don't
need the OrderAccess::storestore() anymore.

> ###14
> 2474       if (SafepointSynchronize::is_synchronizing() &&
>
> This is the wrong method to call, it should 
> SafepointMechanism::should_block(Thread* thread);

Hmmm... is_synchronizing() is defined here:

src/hotspot/share/runtime/safepoint.hpp:

   static bool is_synchronizing()                  { return _state == 
_synchronizing; }

and has been the way to ask if we're beginning a safepoint...

There are still a few existing calls to is_synchronizing():

src/hotspot/share/jvmci/jvmciRuntime.cpp:
     JRT_LEAF(jboolean, JVMCIRuntime::object_notify(JavaThread *thread, 
oopDesc* obj))
     JRT_LEAF(jboolean, JVMCIRuntime::object_notifyAll(JavaThread 
*thread, oopDesc* obj))

src/hotspot/share/opto/runtime.cpp:
     JRT_BLOCK_ENTRY(void, OptoRuntime::monitor_notify_C(oopDesc* obj, 
JavaThread *thread))
     JRT_BLOCK_ENTRY(void, OptoRuntime::monitor_notifyAll_C(oopDesc* 
obj, JavaThread *thread))

src/hotspot/share/runtime/sharedRuntime.cpp:
     JRT_BLOCK_ENTRY(void, 
SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* 
lock, JavaThread* thread))

src/hotspot/share/runtime/thread.inline.hpp:
     void JavaThread::enter_critical() {

src/hotspot/share/utilities/ostream.cpp:
     intx defaultStream::hold(intx writer_id) {

src/hotspot/share/utilities/vmError.cpp:
     void VMError::report(outputStream* st, bool _verbose) {

The two calls that I added are in:

src/hotspot/share/runtime/synchronizer.cpp:
     int 
ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** list_p,
     void ObjectSynchronizer::deflate_common_idle_monitors_using_JT(bool 
is_global, JavaThread* target) {

So here's should_block():

src/hotspot/share/runtime/safepointMechanism.inline.hpp:

bool SafepointMechanism::should_block(Thread* thread) {
   if (uses_thread_local_poll()) {
     return local_poll(thread);
   } else {
     return global_poll();
   }
}

and here's global_poll():

bool SafepointMechanism::global_poll() {
   return (SafepointSynchronize::_state != 
SafepointSynchronize::_not_synchronized);
}

Ohhhhhh.... by using is_synchronizing() all those call sites will
do or not do whatever is being controlled based on whether a
safepoint is beginning, but they won't check for a local_poll()...

I think that means that a local_poll() request will take longer to
be honored right?

I'll change the new calls that I added to synchronizer.cpp and I'll
let you know if testing reveals anything interesting. I think I'll
have to look at the other existing monitor related sites as a
separate bug. Please let me know if you are okay with deferring the
calls to is_synchronizing() that I didn't add in this project.

> ###15
> 2578 void ObjectSynchronizer::deflate_idle_monitors_using_JT() {
>
> 2616     g_wait_list = NULL;
> 2617     OrderAccess::storestore();  // Lazier memory sync is okay for 
> list walkers.

The code was changed to:

     Atomic::store(&g_wait_list, (ObjectMonitor*)NULL);

The "Atomic::store(&g_wait_list, (ObjectMonitor*)NULL)" is still lazy
and I agree that I don't need the OrderAccess::storestore() anymore.

>
> I don't see that g_wait_list is ever simutainously read.
> Either it is accessed by serviceThread outside a safepoint or by 
> VMThread inside a safepoint?
>
> It looks like g_wait_list can just be a local in:
> void ObjectSynchronizer::deflate_idle_monitors_using_JT()
>
> (disregarding the debug code that might read it in a safepoint)

g_wait_list is checked by 
ObjectSynchronizer::chk_global_wait_list_and_count()
which is called by ObjectSynchronizer::audit_and_print_stats().

audit_and_print_stats() is called from three places:

void exit_globals() {
       // These other two audit_and_print_stats() calls are done at the
       // Debug level at a safepoint:
       // - for safepoint based deflation auditing:
       //   ObjectSynchronizer::finish_deflate_idle_monitors()
       // - for async deflation auditing:
       //   ObjectSynchronizer::do_safepoint_work()
       ObjectSynchronizer::audit_and_print_stats(true /* on_exit */);

void ObjectSynchronizer::do_safepoint_work(DeflateMonitorCounters* 
counters) {
   if (log_is_enabled(Debug, monitorinflation)) {
     // exit_globals()'s call to audit_and_print_stats() is done
     // at the Info level and not at a safepoint.
     // For safepoint based deflation, audit_and_print_stats() is called
     // in ObjectSynchronizer::finish_deflate_idle_monitors() at the
     // Debug level at a safepoint.
     ObjectSynchronizer::audit_and_print_stats(false /* on_exit */);
   }

void 
ObjectSynchronizer::finish_deflate_idle_monitors(DeflateMonitorCounters* 
counters) {
   if (log_is_enabled(Debug, monitorinflation)) {
     // exit_globals()'s call to audit_and_print_stats() is done
     // at the Info level and not at a safepoint.
     // For async deflation, audit_and_print_stats() is called in
     // ObjectSynchronizer::do_safepoint_work() at the Debug level
     // at a safepoint.
     ObjectSynchronizer::audit_and_print_stats(false /* on_exit */);

So g_wait_list can be read simultaneously at a non-safepoint at VM exit
time. Since I don't want auditing failures to crop up at VM exit time,
g_wait_list needs to remain where it is. And yes, some of my stress runs
have showed crashes at VM exit time due to auditing failures when things
are kept "in sync"...

> ###16
> 2722         assert(SafepointSynchronize::is_synchronizing(), "sanity 
> check");
>
> This is the wrong method to call, it should 
> SafepointMechanism::should_block(Thread* thread);

See above reply about is_synchronizing() and should_block().

> ##################
> src/hotspot/share/runtime/vframe.cpp
>
> We are at safepoint or current thread or in a handshake, current 
> pending and waiting monitor is already stable.

Context:

116 GrowableArray<MonitorInfo*>* javaVFrame::locked_monitors() {

is calling both:

   126   ObjectMonitor *waiting_monitor = 
thread()->current_waiting_monitor(&omh);
   129     pending_monitor = thread()->current_pending_monitor(&omh);

and:

   163 void javaVFrame::print_lock_info_on(outputStream* st, int 
frame_count) {

is calling:

   241                 mark.monitor() == 
thread()->current_pending_monitor(&omh)

We know from the long analysis for src/hotspot/share/prims/jvmtiEnvBase.cpp
that javaVFrame::locked_monitors() is only called:

  117   assert(SafepointSynchronize::is_at_safepoint() || 
JavaThread::current() == thread(),
  118          "must be at safepoint or it's a java frame of the current 
thread");

but I didn't do any analysis for javaVFrame::print_lock_info_on().

javaVFrame::print_lock_info_on() is called by:
     The calling thread is operating on itself so the ObjectMonitorHandle
     is not strictly necessary.

     Do the caller analysis for completeness:
     JavaThread::print_stack_on() which is called by:
         JRT_LEAF(void, JVMCIRuntime::monitorexit(JavaThread* thread, 
oopDesc* obj, BasicLock* lock))
           - for an assert() failure diagnostic
         WB_ENTRY(jint, WB_HandshakeWalkStack(JNIEnv* env, jobject wb, 
jobject thread_handle, jboolean all_threads))
           - not sure of the calling context for this one...
         Threads::print_on(outputStream* st, bool print_stacks,
           - is called at safepoint by VM_PrintThreads operation
             so the caller is the VMThread...
         Threads::print_stack()
           - is called from many places for diagnostic purposes; some of
             those places are not assert() or abort() paths.
         void DeadlockCycle::print_on_with()
           - called at a safepoint IIRC

I think the code in javaVFrame::print_lock_info_on() is very careful about
its call to thread()->current_pending_monitor(&omh) so if we do decide to
allow for passing NULL to current_pending_monitor(), then this would be a
spot where I think it would be safe. However, we have been trying very hard
to do things in code to make things safe rather than relying on comments.

Resolving this comment relies on how we decide to resolve the long thread
for src/hotspot/share/prims/jvmtiEnvBase.cpp above.

Update: I've gone through and added a comment at the decl for the
ObjectMonitorHandles associated the calls to:

     Thread::current_waiting_monitor(&omh)
     Thread::current_pending_monitor(&omh)

Hopefully, it makes things more clear.

> ##################
> src/hotspot/share/services/threadService.cpp
>
> These changes are only needed for the 
> -HandshakeAfterDeflateIdleMonitors path.

The long analysis in src/hotspot/share/prims/jvmtiEnvBase.cpp above
shows ThreadService::get_current_contended_monitor() can be called
by M&M code on arbitrary threads that are not at a safepoint. And
handshakes after deflation doesn't solve this location because the
object field going NULL happens during deflation and doesn't wait
for the handshake to be completed. See deflate_monitor_using_JT().
In other words, the racy damage is already done if we don't have an
ObjectMonitorHandle.

The jt->current_pending_monitor(&omh) calls in the deadlock detection
code could safely be passed NULL, if we decide to go that route because
the deadlock detection code is called only at a safepoint. However, we
have been trying very hard to do things in code to make things safe
rather than relying on comments.

Resolving some of this comment relies on how we decide to resolve the long
thread for src/hotspot/share/prims/jvmtiEnvBase.cpp above.

Update: I've gone through and added a comment at the decl for the
ObjectMonitorHandles associated the calls to:

     Thread::current_waiting_monitor(&omh)
     Thread::current_pending_monitor(&omh)

Hopefully, it makes things more clear.

> ##################
> test/jdk/java/rmi/server/UnicastRemoteObject/unexportObject/UnexportLeak.java 
>
>
> Note: if OM had a weak to object instead this would not be needed.

I discussed changing ObjectMonitor to have a weak ref for _object with
David H., but we quickly ran into something that required it to be a
strong reference. I don't remember the details off the top of my head...

I'm planning to do a few test runs with this fix (and others) disabled
to see if the ServiceThread doing the async deflation is "good enough"
to avoid intermittent failures.

I think I've replied to all of the comments. Please let me know if I
missed anything... Thanks again for the crawl thru code review!

Dan

>
> Thanks, Robbin
>
>
> On 11/4/19 10:03 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have made changes to the Async Monitor Deflation code in response to
>> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David H., Robbin
>> and Erik O. for their comments!
>>
>> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the Async 
>> Monitor
>> Deflation project needs to push before Nov. 12, 2019 in order to allow
>> for sufficient bake time for such a big change. Nov. 12 is _next_ 
>> Tuesday
>> so we have 8 days from today to finish this code review cycle and push
>> this code for JDK14.
>>
>> Carsten and Roman! Time for you guys to chime in again on the code 
>> reviews.
>>
>> I have attached the change list from CR7 to CR8 instead of putting it in
>> the body of this email. I've also added a link to the CR7-to-CR8-changes
>> file to the webrevs so it should be easy to find.
>>
>> Main bug URL:
>>
>>      JDK-8153224 Monitor deflation prolong safepoints
>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>
>> The project is currently baselined on jdk-14+21.
>>
>> Here's the full webrev URL for those folks that want to see all of the
>> current Async Monitor Deflation code in one go (v2.08 full):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full 
>>
>>
>> Some folks might want to see just what has changed since the last review
>> cycle so here's a webrev for that (v2.08 inc):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/ 
>>
>>
>> The OpenJDK wiki did not need any changes for this round:
>>
>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>
>> The jdk-14+21 based v2.08 version of the patch has been thru Mach5 
>> tier[1-8]
>> testing on Oracle's usual set of platforms. It has also been through 
>> my usual
>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the 
>> addition
>> of Robbin's "MoCrazy 1024" test running in parallel with the other 
>> tests in
>> my lab. Some testing is still running, but so far there are no new 
>> regressions.
>>
>> I have not yet done a SPECjbb2015 round on the CR8/v2.08/11-for-jdk14 
>> bits.
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>>
>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> The Async Monitor Deflation project is reaching the end game. I have no
>>> changes planned for the project at this time so all that is left is 
>>> code
>>> review and any changes that results from those reviews.
>>>
>>> Carsten and Roman! Time for you guys to chime in again on the code 
>>> reviews.
>>>
>>> I have attached the list of fixes from CR6 to CR7 instead of putting it
>>> in the main body of this email.
>>>
>>> Main bug URL:
>>>
>>>     JDK-8153224 Monitor deflation prolong safepoints
>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>
>>> The project is currently baselined on jdk-14+19.
>>>
>>> Here's the full webrev URL for those folks that want to see all of the
>>> current Async Monitor Deflation code in one go (v2.07 full):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full 
>>>
>>>
>>> Some folks might want to see just what has changed since the last 
>>> review
>>> cycle so here's a webrev for that (v2.07 inc):
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/ 
>>>
>>>
>>> The OpenJDK wiki has been updated to match the 
>>> CR7/v2.07/10-for-jdk14 changes:
>>>
>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>
>>> The jdk-14+18 based v2.07 version of the patch has been thru Mach5 
>>> tier[1-8]
>>> testing on Oracle's usual set of platforms. It has also been through 
>>> my usual
>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the 
>>> addition
>>> of Robbin's "MoCrazy 1024" test running in parallel with the other 
>>> tests in
>>> my lab.
>>>
>>> The jdk-14+19 based v2.07 version of the patch has been thru Mach5 
>>> tier[1-3]
>>> test on Oracle's usual set of platforms. Mach5 tier[4-8] are in 
>>> process.
>>>
>>> I did another round of SPECjbb2015 testing in Oracle's Aurora 
>>> Performance lab
>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>>
>>>     - "base" is jdk-14+18
>>>     - "v2.07" is the latest version and includes C2 
>>> inc_om_ref_count() support
>>>       on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors option
>>>     - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>>>     - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors 
>>> specified
>>>
>>>          hbIR           hbIR
>>>     (max attempted)  (settled)  max-jOPS  critical-jOPS runtime
>>>     ---------------  ---------  --------  ------------- -------
>>>            34282.00   30635.90  28831.30       20969.20 3841.30 base
>>>            34282.00   30973.00  29345.80       21025.20 3964.10 v2.07
>>>            34282.00   31105.60  29174.30       21074.00 3931.30 
>>> v2.07_handshake
>>>            34282.00   30789.70  27151.60       19839.10 3850.20 
>>> v2.07_off
>>>
>>>     - The Aurora Perf comparison tool reports:
>>>
>>>         Comparison              max-jOPS critical-jOPS
>>>         ----------------------  -------------------- 
>>> --------------------
>>>         base vs 2.07            +1.78% (s, p=0.000)   +0.27% (ns, 
>>> p=0.790)
>>>         base vs 2.07_handshake  +1.19% (s, p=0.007)   +0.58% (ns, 
>>> p=0.536)
>>>         base vs 2.07_off        -5.83% (ns, p=0.394)  -5.39% (ns, 
>>> p=0.347)
>>>
>>>         (s) - significant  (ns) - not-significant
>>>
>>>     - For historical comparison, the Aurora Perf comparision tool
>>>         reported for v2.06 with a baseline of jdk-13+31:
>>>
>>>         Comparison              max-jOPS critical-jOPS
>>>         ----------------------  -------------------- 
>>> --------------------
>>>         base vs 2.06            -0.32% (ns, p=0.345)  +0.71% (ns, 
>>> p=0.646)
>>>         base vs 2.06_off        +0.49% (ns, p=0.292)  -1.21% (ns, 
>>> p=0.481)
>>>
>>>         (s) - significant  (ns) - not-significant
>>>
>>> Thanks, in advance, for any questions, comments or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> The Async Monitor Deflation project has rebased to JDK14 so it's time
>>>> for our first code review in that new context!!
>>>>
>>>> I've been focused on changing the monitor list management code to be
>>>> lock-free in order to make SPECjbb2015 happier. Of course with a 
>>>> change
>>>> like that, it takes a while to chase down all the new and wonderful
>>>> races. At this point, I have the code back to the same stability that
>>>> I had with CR5/v2.05/8-for-jdk13.
>>>>
>>>> To lay the ground work for this round of review, I pushed the 
>>>> following
>>>> two fixes to jdk/jdk earlier today:
>>>>
>>>>     JDK-8230184 rename, whitespace, indent and comments changes in 
>>>> preparation
>>>>                 for lock free Monitor lists
>>>> https://bugs.openjdk.java.net/browse/JDK-8230184
>>>>
>>>>     JDK-8230317 serviceability/sa/ClhsdbPrintStatics.java fails 
>>>> after 8230184
>>>> https://bugs.openjdk.java.net/browse/JDK-8230317
>>>>
>>>> I have attached the list of fixes from CR5 to CR6 instead of putting
>>>> in the main body of this email.
>>>>
>>>> Main bug URL:
>>>>
>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> The project is currently baselined on jdk-14+11 plus the fixes for
>>>> JDK-8230184 and JDK-8230317.
>>>>
>>>> Here's the full webrev URL for those folks that want to see all of the
>>>> current Async Monitor Deflation code in one go (v2.06 full):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.full/ 
>>>>
>>>>
>>>>
>>>> The primary focus of this review cycle is on the lock-free Monitor 
>>>> List
>>>> management changes so here's a webrev for just that patch (v2.06c):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06c.inc/ 
>>>>
>>>>
>>>> The secondary focus of this review cycle is on the bug fixes that have
>>>> been made since CR5/v2.05/8-for-jdk13 so here's a webrev for just that
>>>> patch (v2.06b):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06b.inc/ 
>>>>
>>>>
>>>> The third and final bucket for this review cycle is the rename, 
>>>> whitespace,
>>>> indent and comments changes made in preparation for lock free 
>>>> Monitor list
>>>> management. Almost all of that was extracted into JDK-8230184 for the
>>>> baseline so this bucket now has just a few comment changes relative to
>>>> CR5/v2.05/8-for-jdk13. Here's a webrev for the remainder (v2.06a):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06a.inc/ 
>>>>
>>>>
>>>>
>>>> Some folks might want to see just what has changed since the last 
>>>> review
>>>> cycle so here's a webrev for that (v2.06 inc):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.06.inc/ 
>>>>
>>>>
>>>>
>>>> Last, but not least, some folks might want to see the code before the
>>>> addition of lock-free Monitor List management so here's a webrev for
>>>> that (v2.00 -> v2.05):
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for-jdk14.v2.05.inc/ 
>>>>
>>>>
>>>> The OpenJDK wiki will need minor updates to match the CR6 changes:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> but that should only be changes to describe per-thread list async 
>>>> monitor
>>>> deflation being done by the ServiceThread.
>>>>
>>>> (I did update the OpenJDK wiki for the CR5 changes back on 2019.08.14)
>>>>
>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>> Oracle's usual set of platforms. It has also been through my usual set
>>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>>
>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora Performance 
>>>> lab
>>>> using using their tuned SPECjbb2015 Linux-X64 G1 configs. This was 
>>>> using
>>>> this patch baselined on jdk-13+31 (for stability):
>>>>
>>>>           hbIR           hbIR
>>>>      (max attempted)  (settled)  max-jOPS  critical-jOPS runtime
>>>>      ---------------  ---------  --------  ------------- -------
>>>>             34282.00   28837.20  27905.20       19817.40 3658.10 base
>>>>             34965.70   29798.80  27814.90       19959.00 3514.60 
>>>> v2.06d
>>>>             34282.00   29100.70  28042.50       19577.00 3701.90 
>>>> v2.06d_off
>>>>             34282.00   29218.50  27562.80       19397.30 3657.60 
>>>> v2.06d_ocache
>>>>             34965.70   29838.30  26512.40       19170.60 3569.90 v2.05
>>>>             34282.00   28926.10  27734.00       19835.10 3588.40 
>>>> v2.05_off
>>>>
>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors specified and
>>>> the "ocache" config is with 128 byte cache line sizes instead of 64 
>>>> byte
>>>> cache lines sizes. "v2.06d" is the last set of changes that I made 
>>>> before
>>>> those changes were distributed into the "v2.06a", "v2.06b" and 
>>>> "v2.06c"
>>>> buckets for this review recycle.
>>>>
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I've been focused on chasing down and fixing the rare test failures
>>>>> that only pop up rarely. So this round is primarily fixes for races
>>>>> with a few additional fixes that came from Karen's review of CR4.
>>>>> Thanks Karen!
>>>>>
>>>>> I have attached the list of fixes from CR4 to CR5 instead of putting
>>>>> in the main body of this email.
>>>>>
>>>>> Main bug URL:
>>>>>
>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>
>>>>> The project is currently baselined on jdk-13+29. This will likely be
>>>>> the last JDK13 baseline for this project and I'll roll to the JDK14
>>>>> (jdk/jdk) repo soon...
>>>>>
>>>>> Here's the full webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/
>>>>>
>>>>> Here's the incremental webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/
>>>>>
>>>>> I have not yet checked the OpenJDK wiki to see if it needs any 
>>>>> updates
>>>>> to match the CR5 changes:
>>>>>
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>
>>>>> (I did update the OpenJDK wiki for the CR4 changes back on 
>>>>> 2019.06.26)
>>>>>
>>>>> This version of the patch has been thru Mach5 tier[1-3] testing on
>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>>>>> Mach5 tier[78] will follow. I'll kick off the usual stress testing
>>>>> on Linux-X64, macOSX and Solaris-X64 as those machines become 
>>>>> available.
>>>>> Since I haven't made any performance changes in this round, I'll only
>>>>> be running SPECjbb2015 to gather the latest monitorinflation logs.
>>>>>
>>>>> Next up:
>>>>>
>>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>>>   Linux-X64 and we've determined that some of that comes from
>>>>>   contention on the gListLock. So I'm going to investigate removing
>>>>>   the gListLock. Yes, another lock free set of changes is coming!
>>>>> - Of course, going lock free often causes new races and new failures
>>>>>   so that's a good reason for make those changes isolated in their
>>>>>   own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>>> - I finally have a potential fix for the Win* failure with
>>>>>     gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>   but I haven't run it through Mach5 yet so it'll be in the next 
>>>>> round.
>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm seeing 
>>>>> some
>>>>>   monitor related failures there. I suspect that I need to go take a
>>>>>   look at the C2 RTM macro assembler code and look for things that 
>>>>> might
>>>>>   conflict if Async Monitor Deflation. If you're interested in 
>>>>> that kind
>>>>>   of issue, then see the macroAssembler_x86.cpp sanity check that I
>>>>>   added in this round!
>>>>>
>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have a fix for an issue that came up during performance testing.
>>>>>> Many thanks to Robbin for diagnosing the issue in his SPECjbb2015
>>>>>> experiments.
>>>>>>
>>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>>> verbose due to the complexity of the issue, but the changes
>>>>>> themselves are not that big.
>>>>>>
>>>>>> Functional:
>>>>>>   - Change SafepointSynchronize::is_cleanup_needed() from calling
>>>>>>     ObjectSynchronizer::is_cleanup_needed() to calling
>>>>>>     ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>>>     - is_safepoint_deflation_needed() returns the result of
>>>>>>       monitors_used_above_threshold() for safepoint based
>>>>>>       monitor deflation (!AsyncDeflateIdleMonitors).
>>>>>>     - For AsyncDeflateIdleMonitors, it only returns true if
>>>>>>       there is a special deflation request, e.g., System.gc()
>>>>>>       - This solves a bug where there are a bunch of Cleanup
>>>>>>         safepoints that simply request async deflation which
>>>>>>         keeps the async JavaThreads from making progress on
>>>>>>         their async deflation work.
>>>>>>   - Add AsyncDeflationInterval diagnostic option. Description:
>>>>>>       Async deflate idle monitors every so many milliseconds when
>>>>>>       MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>>>   - Replace ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>>>     ObjectSynchronizer::is_async_deflation_needed():
>>>>>>     - is_async_deflation_needed() returns true when
>>>>>>       is_async_cleanup_requested() is true or when
>>>>>>       monitors_used_above_threshold() is true (but no more often 
>>>>>> than
>>>>>>       AsyncDeflationInterval).
>>>>>>     - if AsyncDeflateIdleMonitors Service_lock->wait() now waits for
>>>>>>       at most GuaranteedSafepointInterval millis:
>>>>>>       - This allows is_async_deflation_needed() to be checked at
>>>>>>         the same interval as GuaranteedSafepointInterval.
>>>>>>         (default is 1000 millis/1 second)
>>>>>>       - Once is_async_deflation_needed() has returned true, it
>>>>>>         generally cannot return true for AsyncDeflationInterval.
>>>>>>         This is to prevent async deflation from swamping the
>>>>>>         ServiceThread.
>>>>>>   - The ServiceThread still handles async deflation of the global
>>>>>>     in-use list and now it also marks JavaThreads for async 
>>>>>> deflation
>>>>>>     of their in-use lists.
>>>>>>     - The ServiceThread will check for async deflation work every
>>>>>>       GuaranteedSafepointInterval.
>>>>>>     - A safepoint can still cause the ServiceThread to check for
>>>>>>       async deflation work via is_async_deflation_requested.
>>>>>>   - Refactor code from ObjectSynchronizer::is_cleanup_needed() into
>>>>>>     monitors_used_above_threshold() and remove is_cleanup_needed().
>>>>>>   - In addition to System.gc(), the VM_Exit VM op and the final
>>>>>>     VMThread safepoint now set the is_special_deflation_requested
>>>>>>     flag to reduce the in-use monitor population that is reported by
>>>>>>     ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
>>>>>>
>>>>>> Test update:
>>>>>>   - test/hotspot/gtest/oops/test_markOop.cpp is updated to work with
>>>>>>     AsyncDeflateIdleMonitors.
>>>>>>
>>>>>> Collateral:
>>>>>>   - Add/clarify/update some logging messages.
>>>>>>
>>>>>> Cleanup:
>>>>>>   - Updated comments based on Karen's code review.
>>>>>>   - Change 'special cleanup' -> 'special deflation' and
>>>>>>     'async cleanup' -> 'async deflation'.
>>>>>>     - comment and function name changes
>>>>>>   - Clarify MonitorUsedDeflationThreshold description;
>>>>>>
>>>>>>
>>>>>> Main bug URL:
>>>>>>
>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>
>>>>>> The project is currently baselined on jdk-13+22.
>>>>>>
>>>>>> Here's the full webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.full/
>>>>>>
>>>>>> Here's the incremental webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for-jdk13.inc/
>>>>>>
>>>>>> I have not updated the OpenJDK wiki to reflect the CR4 changes:
>>>>>>
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>
>>>>>>
>>>>>> The wiki doesn't say a whole lot about the async deflation 
>>>>>> invocation
>>>>>> mechanism so I have to figure out how to add that content.
>>>>>>
>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run is
>>>>>> running now. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>>> are running on Linux-X64, MacOSX and Solaris-X64. I still have to 
>>>>>> run
>>>>>> my stress kit on Linux-X64. I still have to run the SPECjbb2015
>>>>>> baseline and CR4 runs on Linux-X64, MacOSX and Solaris-X64.
>>>>>>
>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>> On 5/6/19 11:52 AM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I had some discussions with Karen about a race that was in the
>>>>>>> ObjectMonitor::enter() code in CR2/v2.02/5-for-jdk13. This race was
>>>>>>> theoretical and I had no test failures due to it. The fix is pretty
>>>>>>> simple: remove the special case code for async deflation in the
>>>>>>> ObjectMonitor::enter() function and rely solely on the ref_count
>>>>>>> for ObjectMonitor::enter() protection.
>>>>>>>
>>>>>>> During those discussions Karen also floated the idea of using the
>>>>>>> ref_count field instead of the contentions field for the Async
>>>>>>> Monitor Deflation protocol. I decided to go ahead and code up that
>>>>>>> change and I have run it through the usual stress and Mach5 testing
>>>>>>> with no issues. It's also known as v2.03 (for those for with the
>>>>>>> patches) and as webrev/6-for-jdk13 (for those with webrev URLs).
>>>>>>> Sorry for all the names...
>>>>>>>
>>>>>>> Main bug URL:
>>>>>>>
>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>
>>>>>>> The project is currently baselined on jdk-13+18.
>>>>>>>
>>>>>>> Here's the full webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.full/
>>>>>>>
>>>>>>> Here's the incremental webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for-jdk13.inc/
>>>>>>>
>>>>>>> I have also updated the OpenJDK wiki to reflect the CR3 changes:
>>>>>>>
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>
>>>>>>>
>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>>>>> Oracle's usual set of platforms. My Solaris-X64 stress kit run had
>>>>>>> no issues. Kitchensink8H on product, fastdebug, and slowdebug bits
>>>>>>> had no failures on Linux-X64; MacOSX fastdebug and slowdebug and
>>>>>>> Solaris-X64 release had the usual "Too large time diff" complaints.
>>>>>>> 12 hour Inflate2 runs on product, fastdebug and slowdebug bits on
>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no failures. My Linux-X64
>>>>>>> stress kit is running right now.
>>>>>>>
>>>>>>> I've done the SPECjbb2015 baseline and CR3 runs. I need to gather
>>>>>>> the results and analyze them.
>>>>>>>
>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a small but important bug fix for the Async Monitor 
>>>>>>>> Deflation
>>>>>>>> project ready to go. It's also known as v2.02 (for those for 
>>>>>>>> with the
>>>>>>>> patches) and as webrev/5-for-jdk13 (for those with webrev 
>>>>>>>> URLs). Sorry
>>>>>>>> for all the names...
>>>>>>>>
>>>>>>>> JDK-8222295 was pushed to jdk/jdk two days ago so that baseline 
>>>>>>>> patch
>>>>>>>> is out of our hair.
>>>>>>>>
>>>>>>>> Main bug URL:
>>>>>>>>
>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>
>>>>>>>> The project is currently baselined on jdk-13+17.
>>>>>>>>
>>>>>>>> Here's the full webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/ 
>>>>>>>>
>>>>>>>>
>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/
>>>>>>>>
>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR2 
>>>>>>>> changes:
>>>>>>>>
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>
>>>>>>>>
>>>>>>>> This version of the patch has been thru Mach5 tier[1-6] testing on
>>>>>>>> Oracle's usual set of platforms. Mach5 tier[7-8] is running now.
>>>>>>>> My stress kit is running on Solaris-X64 now. Kitchensink8H is 
>>>>>>>> running
>>>>>>>> now on product, fastdebug, and slowdebug bits on Linux-X64, MacOSX
>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs are running now on product,
>>>>>>>> fastdebug and slowdebug bits on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>> I'll start my my stress kit on Linux-X64 sometime on Sunday (after
>>>>>>>> my jdk-13+18 stress run is done).
>>>>>>>>
>>>>>>>> I'll do SPECjbb2015 baseline and CR2 runs after all the stress
>>>>>>>> testing is done.
>>>>>>>>
>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> I finally have CR1 for the Async Monitor Deflation project 
>>>>>>>>> ready to
>>>>>>>>> go. It's also known as v2.01 (for those for with the patches) 
>>>>>>>>> and as
>>>>>>>>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all 
>>>>>>>>> the
>>>>>>>>> names...
>>>>>>>>>
>>>>>>>>> Main bug URL:
>>>>>>>>>
>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>
>>>>>>>>> Baseline bug fixes URL:
>>>>>>>>>
>>>>>>>>>     JDK-8222295 more baseline cleanups from Async Monitor 
>>>>>>>>> Deflation project
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295
>>>>>>>>>
>>>>>>>>> The project is currently baselined on jdk-13+15.
>>>>>>>>>
>>>>>>>>> Here's the webrev for the latest baseline changes (JDK-8222295):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here's the full webrev URL (JDK-8153224 only):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here's the incremental webrev URL (JDK-8153224):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So I'm looking for reviews for both JDK-8222295 and the latest 
>>>>>>>>> version
>>>>>>>>> of JDK-8153224...
>>>>>>>>>
>>>>>>>>> I still have to update the OpenJDK wiki to reflect the CR 
>>>>>>>>> changes:
>>>>>>>>>
>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] 
>>>>>>>>> testing on
>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running 
>>>>>>>>> now and
>>>>>>>>> Mach5 tier[78] will be run later today. My stress kit on 
>>>>>>>>> Solaris-X64
>>>>>>>>> is running now. Linux-X64 stress testing will start on Sunday. 
>>>>>>>>> I'm
>>>>>>>>> planning to do Kitchensink runs, SPECjbb2015 runs and my monitor
>>>>>>>>> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> Welcome to the OpenJDK review thread for my port of Carsten's 
>>>>>>>>>> work on:
>>>>>>>>>>
>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>>>>>
>>>>>>>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>>>>>>>
>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>>>>>>>>
>>>>>>>>>> Here's a link to Carsten's original webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>>>>>>>>
>>>>>>>>>> Earlier versions of this patch have been through several 
>>>>>>>>>> rounds of
>>>>>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>>>>>>>>> Roman for their preliminary code review comments. A very special
>>>>>>>>>> thanks to Robbin and Roman for building and testing the patch in
>>>>>>>>>> their own environments (including specJBB2015).
>>>>>>>>>>
>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] 
>>>>>>>>>> testing on
>>>>>>>>>> Oracle's usual set of platforms. Earlier versions have been run
>>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>>>>>>>>> (product, fastdebug, slowdebug).Earlier versions have run 
>>>>>>>>>> Kitchensink
>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, 
>>>>>>>>>> fastdebug
>>>>>>>>>> and slowdebug). Earlier versions have run my monitor 
>>>>>>>>>> inflation stress
>>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 
>>>>>>>>>> (product,
>>>>>>>>>> fastdebug and slowdebug).
>>>>>>>>>>
>>>>>>>>>> All of the testing done on earlier versions will be redone on 
>>>>>>>>>> the
>>>>>>>>>> latest version of the patch.
>>>>>>>>>>
>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>> P.S.
>>>>>>>>>> One subtest in 
>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've been 
>>>>>>>>>> trying
>>>>>>>>>> to characterize/analyze this failure for more than a week 
>>>>>>>>>> now. At
>>>>>>>>>> this point I'm convinced that Async Monitor Deflation is 
>>>>>>>>>> aggravating
>>>>>>>>>> an existing bug. However, I plan to have a better handle on that
>>>>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>