Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS
Michal Frajt
michal at frajt.eu
Wed Dec 12 12:35:25 UTC 2012
All,
Find the attached patch. It implements proposed recommendations and requested changes. Please mind that the CMSWaitDuration set to -1 (never wait) requires new parameter CMSCheckInterval (develop only, 1000 milliseconds default - constant). The parameter defines the next CMS cycle start check interval in the case there are no desynchronization (notifications) events on the CGC_lock.
Tested with the Solaris/amd64 build
CMS
+ CMSWaitDuration>0 OK
+ CMSWaitDuration=0 OK
+ CMSWaitDuration<0 OK
iCMS
+ CMSWaitDuration>0 OK
+ CMSWaitDuration=0 OK
+ CMSWaitDuration<0 OK
Regards,
Michal
Od: hotspot-gc-dev-bounces at openjdk.java.net
Komu: hotspot-gc-dev at openjdk.java.net
Kopie:
Datum: Fri, 7 Dec 2012 18:48:48 +0100
Předmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS
> Hi John/Jon/Ramki,
>
> All proposed recommendations and requested changes have been implemented. We are going to test it on Monday. You will get the new tested patch soon.
>
> The attached code here just got compiled, no test executed yet, it might contain a bug, but you can quickly review it and send your comments.
>
> Best regards
> Michal
>
>
> // Wait until the next synchronous GC, a concurrent full gc request,
> // or a timeout, whichever is earlier.
> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long t_millis) {
> // Wait time in millis or 0 value representing infinite wait for a scavenge
> assert(t_millis >= 0, "Wait time for scavenge should be 0 or positive");
>
> GenCollectedHeap* gch = GenCollectedHeap::heap();
> double start_time_secs = os::elapsedTime();
> double end_time_secs = start_time_secs + (t_millis / ((double) MILLIUNITS));
>
> // Total collections count before waiting loop
> unsigned int before_count;
> {
> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
> before_count = gch->total_collections();
> }
>
> unsigned int loop_count = 0;
>
> while(!_should_terminate) {
> double now_time = os::elapsedTime();
> long wait_time_millis;
>
> if(t_millis != 0) {
> // New wait limit
> wait_time_millis = (long) ((end_time_secs - now_time) * MILLIUNITS);
> if(wait_time_millis <= 0) {
> // Wait time is over
> break;
> }
> } else {
> // No wait limit, wait if necessary forever
> wait_time_millis = 0;
> }
>
> // Wait until the next event or the remaining timeout
> {
> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
>
> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies
> if (_should_terminate || _collector->_full_gc_requested) {
> return;
> }
> assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
> CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
> clear_CMS_flag(CMS_cms_wants_token);
> assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
> "Should not be set");
> }
>
> // Extra wait time check before entering the heap lock to get the collection count
> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) {
> // Wait time is over
> break;
> }
>
> // Total collections count after the event
> unsigned int after_count;
> {
> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
> after_count = gch->total_collections();
> }
>
> if(before_count != after_count) {
> // There was a collection - success
> break;
> }
>
> // Too many loops warning
> if(++loop_count == 0) {
> warning("wait_on_cms_lock_for_scavenge() has looped %d times", loop_count - 1);
> }
> }
> }
>
> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() {
> while (!_should_terminate) {
> if (CMSIncrementalMode) {
> icms_wait();
> if(CMSWaitDuration >= 0) {
> // Wait until the next synchronous GC, a concurrent full gc
> // request or a timeout, whichever is earlier.
> wait_on_cms_lock_for_scavenge(CMSWaitDuration);
> }
> return;
> } else {
> if(CMSWaitDuration >= 0) {
> // Wait until the next synchronous GC, a concurrent full gc
> // request or a timeout, whichever is earlier.
> wait_on_cms_lock_for_scavenge(CMSWaitDuration);
> } else {
> // Wait until any cms_lock event not to call shouldConcurrentCollect permanently
> wait_on_cms_lock(0);
> }
> }
> // Check if we should start a CMS collection cycle
> if (_collector->shouldConcurrentCollect()) {
> return;
> }
> // .. collection criterion not yet met, let's go back
> // and wait some more
> }
> }
>
>
>
> Od: hotspot-gc-dev-bounces at openjdk.java.net
> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" john.cuthbertson at oracle.com
> Kopie: hotspot-gc-dev at openjdk.java.net
> Datum: Thu, 6 Dec 2012 23:43:29 -0800
> Předmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS
>
> > Hi John --
> >
> > wrt the changes posted, i see the intent of the code and agree with
> > it. I have a few minor suggestions on the
> > details of how it's implemented. My comments are inline below,
> > interleaved with the code:
> >
> > 317 // Wait until the next synchronous GC, a concurrent full gc request,
> > 318 // or a timeout, whichever is earlier.
> > 319 void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long
> > t_millis) {
> > 320 // Wait for any cms_lock event when timeout not specified (0 millis)
> > 321 if (t_millis == 0) {
> > 322 wait_on_cms_lock(t_millis);
> > 323 return;
> > 324 }
> >
> > I'd completely avoid the special case above because it would miss the
> > part about waiting for a
> > scavenge, instead dealing with that case in the code in the loop below
> > directly. The idea
> > of the "0" value is not to ask that we return immediately, but that we
> > wait, if necessary
> > forever, for a scavenge. The "0" really represents the value infinity
> > in that sense. This would
> > be in keeping with our use of wait() with a "0" value for timeout at
> > other places in the JVM as
> > well, so it's consistent.
> >
> > 325
> > 326 GenCollectedHeap* gch = GenCollectedHeap::heap();
> > 327 double start_time = os::elapsedTime();
> > 328 double end_time = start_time + (t_millis / 1000.0);
> >
> > Note how, the end_time == start_time for the special case of t_millis
> > == 0, so we need to treat that
> > case specially below.
> >
> > 329
> > 330 // Total collections count before waiting loop
> > 331 unsigned int before_count;
> > 332 {
> > 333 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
> > 334 before_count = gch->total_collections();
> > 335 }
> >
> > Good.
> >
> > 336
> > 337 while (true) {
> > 338 double now_time = os::elapsedTime();
> > 339 long wait_time_millis = (long)((end_time - now_time) * 1000.0);
> > 340
> > 341 if (wait_time_millis <= 0) {
> > 342 // Wait time is over
> > 343 break;
> > 344 }
> >
> > Modify to:
> > if (t_millis != 0) {
> > if (wait_time_millis <= 0) {
> > // Wait time is over
> > break;
> > }
> > } else {
> > wait_time_millis = 0; // for use in wait() below
> > }
> >
> > 345
> > 346 // Wait until the next event or the remaining timeout
> > 347 {
> > 348 MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
> > 349 if (_should_terminate || _collector->_full_gc_requested) {
> > 350 return;
> > 351 }
> > 352 set_CMS_flag(CMS_cms_wants_token); // to provoke notifies
> >
> > insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
> >
> > 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
> > 354 clear_CMS_flag(CMS_cms_wants_token);
> > 355 assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
> > 356 "Should not be set");
> > 357 }
> > 358
> > 359 // Extra wait time check before entering the heap lock to get
> > the collection count
> > 360 if (os::elapsedTime() >= end_time) {
> > 361 // Wait time is over
> > 362 break;
> > 363 }
> >
> > Modify above wait time check to make an exception for t_miliis == 0:
> > // Extra wait time check before checking collection count
> > if (t_millis != 0 && os::elapsedTime() >= end_time) {
> > // wait time exceeded
> > break;
> > }
> >
> > 364
> > 365 // Total collections count after the event
> > 366 unsigned int after_count;
> > 367 {
> > 368 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
> > 369 after_count = gch->total_collections();
> > 370 }
> > 371
> > 372 if (before_count != after_count) {
> > 373 // There was a collection - success
> > 374 break;
> > 375 }
> > 376 }
> > 377 }
> >
> > While it is true that we do not have a case where the method is called
> > with a time of "0", I think we
> > want that value to be treated correctly as "infinity". For the case
> > where we do not want a wait at all,
> > we should use a small positive value, like "1 ms" to signal that
> > intent, i.e. -XX:CMSWaitDuration=1,
> > reserving CMSWaitDuration=0 to signal infinity. (We could also do that
> > by reserving negative values to
> > signal infinity, but that would make the code in the loop a bit more fiddly.)
> >
> > As mentioned in my previous email, I'd like to see this tested with
> > CMSWaitDuration set to 0, positive and
> > negative values (if necessary, we can reject negative value settings),
> > and with ExplicitGCInvokesConcurrent.
> >
> > Rest looks OK to me, although I am not sure how this behaves with
> > iCMS, as I have forgotten that part of the
> > code.
> >
> > Finally, in current code (before these changes) there are two callers
> > of the former wait_for_cms_lock() method,
> > one here in sleepBeforeNextCycle() and one from the precleaning loop.
> > I think the right thing has been done
> > in terms of leaving the latter alone.
> >
> > It would be good if this were checked with CMSInitiatingOccupancy set
> > to 0 (or a small value), CMSWaitDuration set to 0,
> > -+PromotionFailureALot and checking that (1) it does not deadlock (2)
> > CMS cycles start very soon after the end of
> > a scavenge (and not at random times as Michal has observed earlier,
> > although i am guessing that is difficult to test).
> > It would be good to repeat the above test with iCMS as well.
> >
> > thanks!
> > -- ramki
> >
> > On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote:
> > >
> > > Thanks Jon for the pointer:
> > >
> > >
> > > On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote:
> > >>
> > >>
> > >>
> > >> On 12/05/12 14:47, Srinivas Ramakrishna wrote:
> > >>>
> > >>> The high level idea looks correct. I'll look at the details in a bit (seriously this time; sorry it dropped off my plate last time I promised).
> > >>> Does anyone have a pointer to the related discussion thread on this aias from earlier in the year, by chance, so one could refresh one's
> > >>> memory of that discussion?
> > >>
> > >>
> > >> subj: CMSWaitDuration unstable behavior
> > >>
> > >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html
> > >>
> > >>
> > >
> > > also: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html
> > >
> > > On to it later this afternoon, and TTYL w/review.
> > > - ramki
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openjdk7u-7189971.patch
Type: application/octet-stream
Size: 6454 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20121212/3b09bc69/openjdk7u-7189971.patch>
More information about the hotspot-gc-dev
mailing list