RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS

Fri Dec 7 17:48:48 UTC 2012

Hi John/Jon/Ramki,

All proposed recommendations and requested changes have been implemented. We are going to test it on Monday. You will get the new tested patch soon. 

The attached code here just got compiled, no test executed yet, it might contain a bug, but you can quickly review it and send your comments.

Best regards
Michal

// Wait until the next synchronous GC, a concurrent full gc request,
// or a timeout, whichever is earlier.
void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long t_millis) {
  // Wait time in millis or 0 value representing infinite wait for a scavenge
  assert(t_millis >= 0, "Wait time for scavenge should be 0 or positive");

  GenCollectedHeap* gch = GenCollectedHeap::heap();
  double start_time_secs = os::elapsedTime();
  double end_time_secs = start_time_secs + (t_millis / ((double) MILLIUNITS));

  // Total collections count before waiting loop
  unsigned int before_count;
  {
    MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
    before_count = gch->total_collections();
  }

  unsigned int loop_count = 0;

  while(!_should_terminate) {
    double now_time = os::elapsedTime();
    long wait_time_millis;

    if(t_millis != 0) {
      // New wait limit
      wait_time_millis = (long) ((end_time_secs - now_time) * MILLIUNITS);
      if(wait_time_millis <= 0) {
        // Wait time is over
        break;
      }
    } else {
      // No wait limit, wait if necessary forever
      wait_time_millis = 0;
    }

    // Wait until the next event or the remaining timeout
    {
      MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);

      set_CMS_flag(CMS_cms_wants_token);   // to provoke notifies
      if (_should_terminate || _collector->_full_gc_requested) {
        return;
      }
      assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
      CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
      clear_CMS_flag(CMS_cms_wants_token);
      assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
             "Should not be set");
    }

    // Extra wait time check before entering the heap lock to get the collection count
    if(t_millis != 0 && os::elapsedTime() >= end_time_secs) {
      // Wait time is over
      break;
    }

    // Total collections count after the event
    unsigned int after_count;
    {
      MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
      after_count = gch->total_collections();
    }

    if(before_count != after_count) {
      // There was a collection - success
      break;
    }

    // Too many loops warning
    if(++loop_count == 0) {
      warning("wait_on_cms_lock_for_scavenge() has looped %d times", loop_count - 1);
    }
  }
}

void ConcurrentMarkSweepThread::sleepBeforeNextCycle() {
  while (!_should_terminate) {
    if (CMSIncrementalMode) {
      icms_wait();
      if(CMSWaitDuration >= 0) {
        // Wait until the next synchronous GC, a concurrent full gc
        // request or a timeout, whichever is earlier.
        wait_on_cms_lock_for_scavenge(CMSWaitDuration);
      }
      return;
    } else {
      if(CMSWaitDuration >= 0) {
        // Wait until the next synchronous GC, a concurrent full gc
        // request or a timeout, whichever is earlier.
        wait_on_cms_lock_for_scavenge(CMSWaitDuration);
      } else {
        // Wait until any cms_lock event not to call shouldConcurrentCollect permanently
        wait_on_cms_lock(0);
      }
    }
    // Check if we should start a CMS collection cycle
    if (_collector->shouldConcurrentCollect()) {
      return;
    }
    // .. collection criterion not yet met, let's go back
    // and wait some more
  }
}

Od: hotspot-gc-dev-bounces at openjdk.java.net
Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" john.cuthbertson at oracle.com
Kopie: hotspot-gc-dev at openjdk.java.net
Datum: Thu, 6 Dec 2012 23:43:29 -0800
Předmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS

> Hi John --
> 
> wrt the changes posted, i see the intent of the code and agree with
> it. I have a few minor suggestions on the
> details of how it's implemented. My comments are inline below,
> interleaved with the code:
> 
>  317 // Wait until the next synchronous GC, a concurrent full gc request,
>  318 // or a timeout, whichever is earlier.
>  319 void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long
> t_millis) {
>  320   // Wait for any cms_lock event when timeout not specified (0 millis)
>  321   if (t_millis == 0) {
>  322     wait_on_cms_lock(t_millis);
>  323     return;
>  324   }
> 
> I'd completely avoid the special case above because it would miss the
> part about waiting for a
> scavenge, instead dealing with that case in the code in the loop below
> directly. The idea
> of the "0" value is not to ask that we return immediately, but that we
> wait, if necessary
> forever, for a scavenge. The "0" really represents the value infinity
> in that sense. This would
> be in keeping with our use of wait() with a "0" value for timeout at
> other places in the JVM as
> well, so it's consistent.
> 
>  325
>  326   GenCollectedHeap* gch = GenCollectedHeap::heap();
>  327   double start_time = os::elapsedTime();
>  328   double end_time = start_time + (t_millis / 1000.0);
> 
> Note how, the end_time == start_time for the special case of t_millis
> == 0, so we need to treat that
> case specially below.
> 
>  329
>  330   // Total collections count before waiting loop
>  331   unsigned int before_count;
>  332   {
>  333     MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>  334     before_count = gch->total_collections();
>  335   }
> 
> Good.
> 
>  336
>  337   while (true) {
>  338     double now_time = os::elapsedTime();
>  339     long wait_time_millis = (long)((end_time - now_time) * 1000.0);
>  340
>  341     if (wait_time_millis <= 0) {
>  342       // Wait time is over
>  343       break;
>  344     }
> 
> Modify to:
>            if (t_millis != 0) {
>               if  (wait_time_millis <= 0)  {
>                  // Wait time is over
>                  break;
>              }
>           } else {
>              wait_time_millis = 0;     // for use in wait() below
>          }
> 
>  345
>  346     // Wait until the next event or the remaining timeout
>  347     {
>  348       MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
>  349       if (_should_terminate || _collector->_full_gc_requested) {
>  350         return;
>  351       }
>  352       set_CMS_flag(CMS_cms_wants_token);   // to provoke notifies
> 
> insert:     assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
> 
>  353       CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
>  354       clear_CMS_flag(CMS_cms_wants_token);
>  355       assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
>  356              "Should not be set");
>  357     }
>  358
>  359     // Extra wait time check before entering the heap lock to get
> the collection count
>  360     if (os::elapsedTime() >= end_time) {
>  361       // Wait time is over
>  362       break;
>  363     }
> 
> Modify above wait time check to make an exception for t_miliis == 0:
>            // Extra wait time check before checking collection count
>            if (t_millis != 0 && os::elapsedTime() >= end_time) {
>               // wait time exceeded
>               break;
>            }
> 
>  364
>  365     // Total collections count after the event
>  366     unsigned int after_count;
>  367     {
>  368       MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>  369       after_count = gch->total_collections();
>  370     }
>  371
>  372     if (before_count != after_count) {
>  373       // There was a collection - success
>  374       break;
>  375     }
>  376   }
>  377 }
> 
> While it is true that we do not have a case where the method is called
> with a time of "0", I think we
> want that value to be treated correctly as "infinity". For the case
> where we do not want a wait at all,
> we should use a small positive value, like "1 ms" to signal that
> intent, i.e. -XX:CMSWaitDuration=1,
> reserving CMSWaitDuration=0 to signal infinity. (We could also do that
> by reserving negative values to
> signal infinity, but that would make the code in the loop a bit more fiddly.)
> 
> As mentioned in my previous email, I'd like to see this tested with
> CMSWaitDuration set to 0, positive and
> negative values (if necessary, we can reject negative value settings),
> and with ExplicitGCInvokesConcurrent.
> 
> Rest looks OK to me, although I am not sure how this behaves with
> iCMS, as I have forgotten that part of the
> code.
> 
> Finally, in current code (before these changes) there are two callers
> of the former wait_for_cms_lock() method,
> one here in sleepBeforeNextCycle() and one from the precleaning loop.
> I think the right thing has been done
> in terms of leaving the latter alone.
> 
> It would be good if this were checked with CMSInitiatingOccupancy set
> to 0 (or a small value), CMSWaitDuration set to 0,
> -+PromotionFailureALot and checking that (1) it does not deadlock (2)
> CMS cycles start very soon after the end of
> a scavenge (and not at random times as Michal has observed earlier,
> although i am guessing that is difficult to test).
> It would be good to repeat the above test with iCMS as well.
> 
> thanks!
> -- ramki
> 
> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna  wrote:
> >
> > Thanks Jon for the pointer:
> >
> >
> > On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu  wrote:
> >>
> >>
> >>
> >> On 12/05/12 14:47, Srinivas Ramakrishna wrote:
> >>>
> >>> The high level idea looks correct. I'll look at the details in a bit (seriously this time; sorry it dropped off my plate last time I promised).
> >>> Does anyone have a pointer to the related discussion thread on this aias from earlier in the year, by chance, so one could refresh one's
> >>> memory of that discussion?
> >>
> >>
> >> subj: CMSWaitDuration unstable behavior
> >>
> >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html
> >>
> >>
> >
> > also: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html
> >
> > On to it later this afternoon, and TTYL w/review.
> > - ramki