RFR(S/M): 7112743: G1: Reduce overhead of marking closure during evacuation pauses

Thu Nov 24 16:26:41 UTC 2011

Bengt,

On 11/24/2011 02:46 AM, Bengt Rutisson wrote:
> What I was trying to say is that there is code that relies on 
> doHeapRegion() not returning true. For example code like:
>
>             bool res2 = cl->doHeapRegion(chr);
>             assert(!res2, "Should not abort");
>
> and the comment for G1CollectedHeap::heap_region_par_iterate_chunked() 
> says " For now requires that "doHeapRegion" always returns "false"".

Sure, this is because we need to make sure that each iteration claims 
all the regions so that the claim values are not left inconsistent.

> To me this seems a bit odd. I'd prefer that we'd separate out 
> abortable iteration and non-abortable iteration. It looks like we are 
> trying to using the same protocol for two different purposes.

For that we'd either need different closures or different methods 
(abortable / not abortable) on each closure (and maybe a flag saying 
what type of closure it is?) which would add an extra layer of 
complexity to the closure hierarchy... As you know I'm not a big fan of 
using closures for such iterations, but I have to say that assuming that 
some never abort is the last thing that annoys me about using them.

Tony

> Just to be clear. This is nothing I expect John to address in his 
> current change.
>
> Bengt
>
>
>>
>> Tony
>>
>> On 11/23/2011 08:15 AM, Bengt Rutisson wrote:
>>>
>>> Hi John,
>>>
>>> Overall this looks good.
>>>
>>> Some comments:
>>>
>>> concurrentMark.cpp
>>>
>>> line 2880. return statement has wrong indentation now. A space was 
>>> removed.
>>>
>>> bool doHeapRegion(HeapRegion* hr)
>>>
>>> why use the completed varaible? why not just:
>>> while (!_bm->iterate(&_bit_cl, mr));
>>>
>>> I don't really like that the completed name is so close to 
>>> HeapRegionClosure::_complete
>>>
>>> I am also a bit confused about the bool return value. It seems that 
>>> many places in the code requires that doHeapRegion() always returns 
>>> false. I hope this gets cleaned up by Tony's iterator changes. I 
>>> have not yet had time to look at them.
>>>
>>>
>>>
>>> g1CollectedHeap.cpp
>>>
>>> G1CollectedHeap::calculateStartRegion()
>>>
>>> (I saw that you just moved the code from g1RemSet, but I still have 
>>> some comments regarding it.)
>>>
>>> Shouldn't the check "if (ParallelGCThreads > 0)" be "if 
>>> (use_parallel_gc_threads())"?
>>> I realize that it is the same thing, but since we have a method for 
>>> it I assume that we should use it. In fact, it might be more correct 
>>> to have the test be "if (workers()->total_workers() > 0)" since we 
>>> would get a division by zero if that is not true. (An alternative 
>>> would be to check "if (worker_i > 0)". That would be more efficient, 
>>> but it would rely on that single threaded calls always pass 0 as 
>>> worker_i.)
>>>
>>> Also, it worries me a bit that we iterate over the collection set 
>>> for each thread to find the start. Seems like it could be a bottle 
>>> neck if we have many threads and many regions in the collection set. 
>>> We can calculate the start regions by just iterating once over the 
>>> collection set since we know the number of threads. I am not sure 
>>> where the best place to add that logic is though. Maybe we could let 
>>> G1CollectorPolicy have a method that calculates the start regions 
>>> once for all worker threads. Then G1RemSet::scanRS() and 
>>> G1ParCompleteMarkInCSTask::work() could look up their start regions 
>>> in constant time.
>>>
>>>
>>> Bengt
>>>
>>>
>>>
>>> On 2011-11-22 18:23, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> I have a new webrev for this change based upon some review comments 
>>>> from Tony. The new webrev can be found at: 
>>>> http://cr.openjdk.java.net/~johnc/7112743/webrev.2/
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>>
>>>> On 11/18/11 10:18, John Cuthbertson wrote:
>>>>> Hi Everyone,
>>>>>
>>>>> Can I have a couple of volunteers look over these changes? The 
>>>>> webrev can be found at: 
>>>>> http://cr.openjdk.java.net/~johnc/7112743/webrev.1/
>>>>>
>>>>> Summary:
>>>>> We have a workload where the serial "complete marking in 
>>>>> collection set" closure was taking a realy long time and 
>>>>> lengthening the pause times. Since this code might be going away 
>>>>> at some point (see CR 6888336:  G1: avoid explicitly marking and 
>>>>> pushing objects in survivor spaces) it was decided that a short 
>>>>> term solution would be to parallelize the existing code and have 
>>>>> the GC workers execute the code rather than the VM thread. Also 
>>>>> included are some minor tweaks to the PrintGCDetails output to 
>>>>> print out the mark closure time and the collection set freeing 
>>>>> time (since some of the original code was moved there).
>>>>>
>>>>> Testing: GC test suite (with 12 GC threads, makeing threshold of 
>>>>> 5%, marking verification enabled, and heap verification after GC 
>>>>> enabled); GC test suite on my workstation (4 GC threads, same 
>>>>> flags as before); Kitchensink; specjbb2005 configured to have at 
>>>>> leat 60% heap utilization.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JohnC
>>>>
>>>
>