CMS parallel initial mark

Fri Jun 7 18:17:42 UTC 2013

On 6/7/2013 4:23 AM, Thomas Schatzl wrote:
> Hi all,
>
>    some notes about the second patch, the modified eden boundary
> sampling; in general I think the idea is good (Jon suggested that it
> should become default), but imho there are some issues with the
> implementation.
>
> On Tue, 2013-05-28 at 17:24 -0700, Hiroshi Yamauchi wrote:
>> Hi,
>>
>> I'd like to have the following contributed if it makes sense.
>>
>>
>> 2) Now, here's a related issue and a suggested fix/patch for it:
>>
>> I see that the initial mark and remark pause times sometimes spike
>> with a large young generation. For example, under a 1 GB young gen / 3
>> GB heap setting, they occasionally spike up to ~500 milliseconds from
>> the normal < 100 ms range, on my machine. As far as I can tell, this
>> happens when the eden is fairly occupied (> 700 MB full) and not
>> sufficiently divided up and the parallelism decreases (at the worst
>> case it becomes almost single-threaded.)
>>
>> Here's a suggested patch in a separate patch:
>>
>>    http://cr.openjdk.java.net/~hiroshi/webrevs/edenchunks/webrev.00/
>>
>> that attempts to improve on this issue by implementing an alternative
>> way of dividing up the eden into chunks for an increased parallelism
>> (or better load balancing between the GC threads) for the young gen
>> scan portion of the remark phase (and the now-parallelized initial
>> mark phase.) It uses a CAS-based mechanism that samples the object
>> boundaries in the eden space on the slow allocation code paths (eg. at
>> the TLAB refill and large object allocation times) at all times.
> - this patch is for DefNew/CMS only it seems. Is this correct?
>
> - in the Hotspot code base explicit != NULL checks are done for whatever
> reason. To keep the same style it would be nice to update the checks
> whether to do the sampling ("if (CMSEdenChunksRecordAlways &&
> _next_gen)") accordingly.
>
> (the next point is not an issue of the patch)
> - the check whether there is a _next_gen is odd - I do not think that
> DefNew works as a generation without an older generation, but I'm not
> sure.
You're correct that DefNew needs to have a _next_gen.

> Looking at other similar code checking for _next_gen != NULL, the
> response is typically to update _next_gen and then asserting that
> _next_gen is != NULL. E.g.
>
>    if (_next_gen == NULL) {
>      GenCollectedHeap* gch = GenCollectedHeap::heap();
>      _next_gen = gch->next_gen(this);
>      assert(_next_gen != NULL,
>             "This must be the youngest gen, and not the only gen");
>    }
>
> Jon?

Yes, except in a very few situations, _next_gen should be set.
Exceptions I would expect is during initialization.  At the point
Thomas indicates and assertion would be sufficient.

assert(_next_gen != NULL, "...");

>
> - I think CMSCollector::_eden_chunk_sampling_active should be volatile
> to avoid any clever compilers skipping the read in sample_eden_chunk().
> (Although I don't think it's possible here). See also below for avoiding
> that variable completely.

Could you explain you thinking for making it volatile.  As you have, I 
didn't think
it was necessary and thought ,"If not necessary, don't do it".

>
> - in the CMSCollector::CMSCollector() initialization list, the
> initialization of _eden_chunk_sampling_active breaks the "-- ditto --"
> list, which looks strange. (Actually I think the comment that a variable
> may eventually be set in the ctor again later seems questionable anyway)
>
> - instead of checking whether _eden_chunk_array is NULL to detect
> whether you can sample I would prefer if the same predicate as in the
> initialization (gch->supports_inline_contig_alloc()) were used.
> (I think the change just copied what was done in the previous version of
> concurrentMarkSweepGeneration.cpp:4515 has been done)
>
> Maybe create a new predicate for that, used everywhere.
>
> (btw, throughout this code there is a lot of checking whether some
> NEW_C_HEAP_ARRAY returned NULL or not, and does elaborate recovery -
> however by default NEW_C_HEAP_ARRAY fails with an OOM anyway... - but
> that is certainly out of scope)

Thomas, could you file a CR for fixing that?  Thanks.

>
> Same for checking whether _survivor_chunk_array is NULL - at the moment
> sometimes _survivor_chunk_array is checked against NULL to avoid
> entering code (in print_eden_and_survivor_chunk_arrays()).
>
> The condition that could be used is the one that triggers allocation of
> _survivor_chunk_array: (CMSParallelRemarkEnabled &&
> CMSParallelSurvivorRemarkEnabled) in
> concurrentMarkSweepGeneration.cpp:728.

Personally, I like a check like if (_eden_chunk_array != NULL)
better. I think it is more future proof.   For whatever reason we don't
try to dereference _eden_chunk_array if it is NULL.  An assertion
checking that

(CMSParallelRemarkEnabled &&
CMSParallelSurvivorRemarkEnabled)

is true would we be appropriate.

Jon

>
> - the code to get the lock on _eden_chunk_sampling_active seems to be
> needlessly complicated.
>
> Please consider using a dedicated Monitor in conjunction with
> try_lock(). I.e.
>
> in CMSCollector add a
>
> Monitor * _cms_eden_sampling_lock; // initialize in the constructor
>
> and then
>
> if (_cms_eden_sampling_lock->try_lock()) {
>
>    // do sampling
>
>    _cms_eden_sampling_lock->unlock();
> }
>
> Monitor::try_lock() boils down to the exactly same code, uses a CAS to
> obtain the lock. I cannot see a reason to roll an own lock
> implementation here.
>
> Hiroshi, is it possible for you to take another look at this change? It
> would help us a lot. If there are questions, please do not hesitate to
> ask.
>
> Thanks a lot for your contributions so far,
> Thomas
>