RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS
Srinivas Ramakrishna
ysr1729 at gmail.com
Fri Dec 7 07:43:29 UTC 2012
Hi John --
wrt the changes posted, i see the intent of the code and agree with
it. I have a few minor suggestions on the
details of how it's implemented. My comments are inline below,
interleaved with the code:
317 // Wait until the next synchronous GC, a concurrent full gc request,
318 // or a timeout, whichever is earlier.
319 void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long
t_millis) {
320 // Wait for any cms_lock event when timeout not specified (0 millis)
321 if (t_millis == 0) {
322 wait_on_cms_lock(t_millis);
323 return;
324 }
I'd completely avoid the special case above because it would miss the
part about waiting for a
scavenge, instead dealing with that case in the code in the loop below
directly. The idea
of the "0" value is not to ask that we return immediately, but that we
wait, if necessary
forever, for a scavenge. The "0" really represents the value infinity
in that sense. This would
be in keeping with our use of wait() with a "0" value for timeout at
other places in the JVM as
well, so it's consistent.
325
326 GenCollectedHeap* gch = GenCollectedHeap::heap();
327 double start_time = os::elapsedTime();
328 double end_time = start_time + (t_millis / 1000.0);
Note how, the end_time == start_time for the special case of t_millis
== 0, so we need to treat that
case specially below.
329
330 // Total collections count before waiting loop
331 unsigned int before_count;
332 {
333 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
334 before_count = gch->total_collections();
335 }
Good.
336
337 while (true) {
338 double now_time = os::elapsedTime();
339 long wait_time_millis = (long)((end_time - now_time) * 1000.0);
340
341 if (wait_time_millis <= 0) {
342 // Wait time is over
343 break;
344 }
Modify to:
if (t_millis != 0) {
if (wait_time_millis <= 0) {
// Wait time is over
break;
}
} else {
wait_time_millis = 0; // for use in wait() below
}
345
346 // Wait until the next event or the remaining timeout
347 {
348 MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
349 if (_should_terminate || _collector->_full_gc_requested) {
350 return;
351 }
352 set_CMS_flag(CMS_cms_wants_token); // to provoke notifies
insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity");
353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis);
354 clear_CMS_flag(CMS_cms_wants_token);
355 assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token),
356 "Should not be set");
357 }
358
359 // Extra wait time check before entering the heap lock to get
the collection count
360 if (os::elapsedTime() >= end_time) {
361 // Wait time is over
362 break;
363 }
Modify above wait time check to make an exception for t_miliis == 0:
// Extra wait time check before checking collection count
if (t_millis != 0 && os::elapsedTime() >= end_time) {
// wait time exceeded
break;
}
364
365 // Total collections count after the event
366 unsigned int after_count;
367 {
368 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag);
369 after_count = gch->total_collections();
370 }
371
372 if (before_count != after_count) {
373 // There was a collection - success
374 break;
375 }
376 }
377 }
While it is true that we do not have a case where the method is called
with a time of "0", I think we
want that value to be treated correctly as "infinity". For the case
where we do not want a wait at all,
we should use a small positive value, like "1 ms" to signal that
intent, i.e. -XX:CMSWaitDuration=1,
reserving CMSWaitDuration=0 to signal infinity. (We could also do that
by reserving negative values to
signal infinity, but that would make the code in the loop a bit more fiddly.)
As mentioned in my previous email, I'd like to see this tested with
CMSWaitDuration set to 0, positive and
negative values (if necessary, we can reject negative value settings),
and with ExplicitGCInvokesConcurrent.
Rest looks OK to me, although I am not sure how this behaves with
iCMS, as I have forgotten that part of the
code.
Finally, in current code (before these changes) there are two callers
of the former wait_for_cms_lock() method,
one here in sleepBeforeNextCycle() and one from the precleaning loop.
I think the right thing has been done
in terms of leaving the latter alone.
It would be good if this were checked with CMSInitiatingOccupancy set
to 0 (or a small value), CMSWaitDuration set to 0,
-+PromotionFailureALot and checking that (1) it does not deadlock (2)
CMS cycles start very soon after the end of
a scavenge (and not at random times as Michal has observed earlier,
although i am guessing that is difficult to test).
It would be good to repeat the above test with iCMS as well.
thanks!
-- ramki
On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:
>
> Thanks Jon for the pointer:
>
>
> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>>
>>
>>
>> On 12/05/12 14:47, Srinivas Ramakrishna wrote:
>>>
>>> The high level idea looks correct. I'll look at the details in a bit (seriously this time; sorry it dropped off my plate last time I promised).
>>> Does anyone have a pointer to the related discussion thread on this aias from earlier in the year, by chance, so one could refresh one's
>>> memory of that discussion?
>>
>>
>> subj: CMSWaitDuration unstable behavior
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html
>>
>>
>
> also: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html
>
> On to it later this afternoon, and TTYL w/review.
> - ramki
More information about the hotspot-gc-dev
mailing list