RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS
John Cuthbertson
john.cuthbertson at oracle.com
Thu Jan 10 20:20:28 UTC 2013
Hi Michal,
On 1/10/2013 11:46 AM, John Cuthbertson wrote:
> Hi Michal,
>
> Many apologies for the delay in generating a new webrev for this
> change but here is the new one:
> http://cr.openjdk.java.net/~johnc/7189971/webrev.1/
>
> Can you verify the webrev to make sure that changes have been applied
> correctly? Looking at the new webrev it seems that the setting of the
> CMS has been moved back above the return out of the loop. Was this
> intentional?
The above should be "... setting of the CMS token has been ...".
JohnC
>
> I've done a couple of sanity tests with GCOld with CMSWaitDuration=0
> and CMSWaitDuration=1500 with CMS.
>
> Regards,
>
> JohnC
>
> On 12/12/2012 4:35 AM, Michal Frajt wrote:
>> All,
>> Find the attached patch. It implements the proposed recommendations and
>> requested changes. Please note that CMSWaitDuration set to -1 (never
>> wait) requires the new parameter CMSCheckInterval (develop-only,
>> constant, 1000 milliseconds default). The parameter defines the
>> interval at which the next CMS cycle start is re-checked when there are
>> no desynchronization (notification) events on the CGC_lock.
>>
>> Tested with the Solaris/amd64 build
>> CMS
>> + CMSWaitDuration>0 OK
>> + CMSWaitDuration=0 OK
>> + CMSWaitDuration<0 OK
>> iCMS
>> + CMSWaitDuration>0 OK
>> + CMSWaitDuration=0 OK
>> + CMSWaitDuration<0 OK
>> Regards,
>> Michal
>> From: hotspot-gc-dev-bounces at openjdk.java.net
>> To: hotspot-gc-dev at openjdk.java.net
>> Cc:
>> Date: Fri, 7 Dec 2012 18:48:48 +0100
>> Subject: Re: RFR(S): 7189971: Implement CMSWaitDuration for
>> non-incremental mode of CMS
>>
>>> Hi John/Jon/Ramki,
>>>
>>> All proposed recommendations and requested changes have been
>>> implemented. We are going to test it on Monday. You will get the new
>>> tested patch soon.
>>>
>>> The attached code here just got compiled, no test executed yet, it
>>> might contain a bug, but you can quickly review it and send your
>>> comments.
>>>
>>> Best regards
>>> Michal
>>>
>>>
>>> // Wait until the next synchronous GC, a concurrent full gc request,
>>> // or a timeout, whichever is earlier.
>>> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long t_millis) {
>>>   // Wait time in millis or 0 value representing infinite wait for a scavenge
>>>   assert(t_millis >= 0, "Wait time for scavenge should be 0 or positive");
>>>
>>>   GenCollectedHeap* gch = GenCollectedHeap::heap();
>>>   double start_time_secs = os::elapsedTime();
>>>   double end_time_secs = start_time_secs + (t_millis / ((double) MILLIUNITS));
>>>
>>>   // Total collections count before waiting loop
>>>   unsigned int before_count;
>>>   {
>>>     MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>>>     before_count = gch->total_collections();
>>>   }
>>>
>>>   unsigned int loop_count = 0;
>>>
>>>   while (!_should_terminate) {
>>>     double now_time = os::elapsedTime();
>>>     long wait_time_millis;
>>>
>>>     if (t_millis != 0) {
>>>       // New wait limit
>>>       wait_time_millis = (long) ((end_time_secs - now_time) * MILLIUNITS);
>>>       if (wait_time_millis <= 0) {
>>>         // Wait time is over
>>>         break;
>>>       }
>>>     } else {
>>>       // No wait limit, wait if necessary forever
>>>       wait_time_millis = 0;
>>>     }
>>>
>>>     // Wait until the next event or the remaining timeout
>>>     {
>>>       MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
>>>
>>>       set_CMS_flag(CMS_cms_wants_token);   // to provoke notifies
>>>       if (_should_terminate || _collector->_full_gc_requested) {
>>>         return;
>>>       }
>>>       assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
>>>       CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
>>>       clear_CMS_flag(CMS_cms_wants_token);
>>>       assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
>>>              "Should not be set");
>>>     }
>>>
>>>     // Extra wait time check before entering the heap lock to get the collection count
>>>     if (t_millis != 0 && os::elapsedTime() >= end_time_secs) {
>>>       // Wait time is over
>>>       break;
>>>     }
>>>
>>>     // Total collections count after the event
>>>     unsigned int after_count;
>>>     {
>>>       MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>>>       after_count = gch->total_collections();
>>>     }
>>>
>>>     if (before_count != after_count) {
>>>       // There was a collection - success
>>>       break;
>>>     }
>>>
>>>     // Too many loops warning
>>>     if (++loop_count == 0) {
>>>       warning("wait_on_cms_lock_for_scavenge() has looped %u times", loop_count - 1);
>>>     }
>>>   }
>>> }
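The wait-until-collection-or-timeout pattern in the patch can be illustrated with a simplified, standalone sketch. This is illustrative only and makes assumptions: HotSpot uses its own Mutex/MutexLockerEx primitives and CMS flags, not std::condition_variable, and the type and function names here are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Simplified sketch of the pattern above: block until the collection
// counter advances past `before`, or until a timeout expires, whichever
// comes first. t_millis == 0 means "wait forever", as in the patch.
struct ScavengeWaiter {
  std::mutex m;
  std::condition_variable cv;
  unsigned int collections = 0;

  // Called by the "GC" side when a collection completes.
  void record_collection() {
    { std::lock_guard<std::mutex> lk(m); ++collections; }
    cv.notify_all();
  }

  // Returns true if a collection happened (relative to `before`)
  // during the wait; false if the timed wait expired first.
  bool wait_for_collection(unsigned int before, long t_millis) {
    std::unique_lock<std::mutex> lk(m);
    auto changed = [&] { return collections != before; };
    if (t_millis == 0) {          // infinite wait for a scavenge
      cv.wait(lk, changed);
      return true;
    }
    return cv.wait_for(lk, std::chrono::milliseconds(t_millis), changed);
  }
};
```

Capturing `before` outside the wait, as the patch captures before_count under the Heap_lock, is what makes a collection that completes between the snapshot and the wait still count as success.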
>>>
>>> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() {
>>>   while (!_should_terminate) {
>>>     if (CMSIncrementalMode) {
>>>       icms_wait();
>>>       if (CMSWaitDuration >= 0) {
>>>         // Wait until the next synchronous GC, a concurrent full gc
>>>         // request or a timeout, whichever is earlier.
>>>         wait_on_cms_lock_for_scavenge(CMSWaitDuration);
>>>       }
>>>       return;
>>>     } else {
>>>       if (CMSWaitDuration >= 0) {
>>>         // Wait until the next synchronous GC, a concurrent full gc
>>>         // request or a timeout, whichever is earlier.
>>>         wait_on_cms_lock_for_scavenge(CMSWaitDuration);
>>>       } else {
>>>         // Wait for any cms_lock event so we do not spin calling
>>>         // shouldConcurrentCollect permanently
>>>         wait_on_cms_lock(0);
>>>       }
>>>     }
>>>     // Check if we should start a CMS collection cycle
>>>     if (_collector->shouldConcurrentCollect()) {
>>>       return;
>>>     }
>>>     // .. collection criterion not yet met, let's go back
>>>     // and wait some more
>>>   }
>>> }
>>>
>>> From: hotspot-gc-dev-bounces at openjdk.java.net
>>> To: "Jon Masamitsu" jon.masamitsu at oracle.com, "John Cuthbertson"
>>> john.cuthbertson at oracle.com
>>> Cc: hotspot-gc-dev at openjdk.java.net
>>> Date: Thu, 6 Dec 2012 23:43:29 -0800
>>> Subject: Re: RFR(S): 7189971: Implement CMSWaitDuration for
>>> non-incremental mode of CMS
>>>
>>>> Hi John --
>>>>
>>>> wrt the changes posted, i see the intent of the code and agree with
>>>> it. I have a few minor suggestions on the
>>>> details of how it's implemented. My comments are inline below,
>>>> interleaved with the code:
>>>>
>>>> 317 // Wait until the next synchronous GC, a concurrent full gc request,
>>>> 318 // or a timeout, whichever is earlier.
>>>> 319 void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long t_millis) {
>>>> 320   // Wait for any cms_lock event when timeout not specified (0 millis)
>>>> 321   if (t_millis == 0) {
>>>> 322     wait_on_cms_lock(t_millis);
>>>> 323     return;
>>>> 324   }
>>>>
>>>> I'd completely avoid the special case above because it would miss the
>>>> part about waiting for a
>>>> scavenge, instead dealing with that case in the code in the loop below
>>>> directly. The idea
>>>> of the "0" value is not to ask that we return immediately, but that we
>>>> wait, if necessary
>>>> forever, for a scavenge. The "0" really represents the value infinity
>>>> in that sense. This would
>>>> be in keeping with our use of wait() with a "0" value for timeout at
>>>> other places in the JVM as
>>>> well, so it's consistent.
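The 0-means-infinity convention described above can be sketched with a small helper. The helper name is hypothetical, not HotSpot code; it only illustrates that 0 is passed straight through to wait() as "block forever" while positive values yield a shrinking bounded timeout:

```cpp
#include <cassert>

// t_millis == 0 means "wait forever" and is passed through to wait()
// unchanged; a positive t_millis yields the remaining bounded wait,
// where a result <= 0 means the deadline has already passed.
long remaining_wait_millis(long t_millis, double start_secs, double now_secs) {
  if (t_millis == 0) {
    return 0;  // 0 is the wait() idiom for an infinite wait
  }
  double end_secs = start_secs + (t_millis / 1000.0);
  return (long)((end_secs - now_secs) * 1000.0);
}
```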
>>>>
>>>> 325
>>>> 326   GenCollectedHeap* gch = GenCollectedHeap::heap();
>>>> 327   double start_time = os::elapsedTime();
>>>> 328   double end_time = start_time + (t_millis / 1000.0);
>>>>
>>>> Note how end_time == start_time for the special case of t_millis == 0,
>>>> so we need to treat that case specially below.
>>>>
>>>> 329
>>>> 330   // Total collections count before waiting loop
>>>> 331   unsigned int before_count;
>>>> 332   {
>>>> 333     MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>>>> 334     before_count = gch->total_collections();
>>>> 335   }
>>>>
>>>> Good.
>>>>
>>>> 336
>>>> 337   while (true) {
>>>> 338     double now_time = os::elapsedTime();
>>>> 339     long wait_time_millis = (long)((end_time - now_time) * 1000.0);
>>>> 340
>>>> 341     if (wait_time_millis <= 0) {
>>>> 342       // Wait time is over
>>>> 343       break;
>>>> 344     }
>>>>
>>>> Modify to:
>>>>     if (t_millis != 0) {
>>>>       if (wait_time_millis <= 0) {
>>>>         // Wait time is over
>>>>         break;
>>>>       }
>>>>     } else {
>>>>       wait_time_millis = 0;  // for use in wait() below
>>>>     }
>>>>
>>>> 345
>>>> 346     // Wait until the next event or the remaining timeout
>>>> 347     {
>>>> 348       MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
>>>> 349       if (_should_terminate || _collector->_full_gc_requested) {
>>>> 350         return;
>>>> 351       }
>>>> 352       set_CMS_flag(CMS_cms_wants_token);  // to provoke notifies
>>>>
>>>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
>>>>
>>>> 353       CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
>>>> 354       clear_CMS_flag(CMS_cms_wants_token);
>>>> 355       assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
>>>> 356              "Should not be set");
>>>> 357     }
>>>> 358
>>>> 359     // Extra wait time check before entering the heap lock to get the collection count
>>>> 360     if (os::elapsedTime() >= end_time) {
>>>> 361       // Wait time is over
>>>> 362       break;
>>>> 363     }
>>>>
>>>> Modify above wait time check to make an exception for t_millis == 0:
>>>>     // Extra wait time check before checking collection count
>>>>     if (t_millis != 0 && os::elapsedTime() >= end_time) {
>>>>       // wait time exceeded
>>>>       break;
>>>>     }
>>>>
>>>> 364
>>>> 365     // Total collections count after the event
>>>> 366     unsigned int after_count;
>>>> 367     {
>>>> 368       MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
>>>> 369       after_count = gch->total_collections();
>>>> 370     }
>>>> 371
>>>> 372     if (before_count != after_count) {
>>>> 373       // There was a collection - success
>>>> 374       break;
>>>> 375     }
>>>> 376   }
>>>> 377 }
>>>>
>>>> While it is true that we do not have a case where the method is called
>>>> with a time of "0", I think we
>>>> want that value to be treated correctly as "infinity". For the case
>>>> where we do not want a wait at all,
>>>> we should use a small positive value, like "1 ms" to signal that
>>>> intent, i.e. -XX:CMSWaitDuration=1,
>>>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that
>>>> by reserving negative values to
>>>> signal infinity, but that would make the code in the loop a bit
>>>> more fiddly.)
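The flag semantics proposed above, with 0 reserved for infinity, small positive values like 1 ms for "almost no wait", and negative settings possibly rejected, could be summarized with a hypothetical helper (not part of the patch, names illustrative):

```cpp
#include <cassert>

// Sketch of the proposed CMSWaitDuration semantics: 0 is reserved to
// mean "wait forever" for a scavenge, any positive value is a bounded
// wait, and negative settings are simply rejected.
enum class WaitPolicy { Forever, Timed, Rejected };

WaitPolicy classify_wait_duration(long cms_wait_duration) {
  if (cms_wait_duration == 0) return WaitPolicy::Forever;
  if (cms_wait_duration > 0)  return WaitPolicy::Timed;
  return WaitPolicy::Rejected;
}
```

Reserving 0 rather than negative values for infinity keeps the loop body simple: the same wait_time_millis value can be handed to wait() without an extra sentinel check.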
>>>>
>>>> As mentioned in my previous email, I'd like to see this tested with
>>>> CMSWaitDuration set to 0, positive and
>>>> negative values (if necessary, we can reject negative value settings),
>>>> and with ExplicitGCInvokesConcurrent.
>>>>
>>>> Rest looks OK to me, although I am not sure how this behaves with
>>>> iCMS, as I have forgotten that part of the
>>>> code.
>>>>
>>>> Finally, in current code (before these changes) there are two callers
>>>> of the former wait_for_cms_lock() method,
>>>> one here in sleepBeforeNextCycle() and one from the precleaning loop.
>>>> I think the right thing has been done
>>>> in terms of leaving the latter alone.
>>>>
>>>> It would be good if this were checked with CMSInitiatingOccupancyFraction
>>>> set to 0 (or a small value), CMSWaitDuration set to 0,
>>>> -XX:+PromotionFailureALot, checking that (1) it does not deadlock and (2)
>>>> CMS cycles start very soon after the end of a scavenge (and not at random
>>>> times, as Michal has observed earlier, although i am guessing that is
>>>> difficult to test).
>>>> It would be good to repeat the above test with iCMS as well.
>>>>
>>>> thanks!
>>>> -- ramki
>>>>
>>>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote:
>>>>> Thanks Jon for the pointer:
>>>>>
>>>>>
>>>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote:
>>>>>>
>>>>>>
>>>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote:
>>>>>>> The high level idea looks correct. I'll look at the details in a
>>>>>>> bit (seriously this time; sorry it dropped off my plate last
>>>>>>> time I promised).
>>>>>>> Does anyone have a pointer to the related discussion thread on
>>>>>>> this issue from earlier in the year, by chance, so one could
>>>>>>> refresh one's memory of that discussion?
>>>>>>
>>>>>> subj: CMSWaitDuration unstable behavior
>>>>>>
>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html
>>>>>>
>>>>>>
>>>>>>
>>>>> also:
>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html
>>>>>
>>>>> On to it later this afternoon, and TTYL w/review.
>>>>> - ramki
>