RFR(S/M): 7112743: G1: Reduce overhead of marking closure during evacuation pauses

Thu Nov 24 07:46:34 UTC 2011

Tony,

On 2011-11-23 14:23, Tony Printezis wrote:
> Bengt,
>
> FYI: doHeapRegion() is supposed to return false if it wants the 
> iteration to carry on, true if it wants the iteration to exit 
> immediately (IMHO, it should have been the other way around, but 
> anyway...).

I understand this protocol. (And I agree that it is using true/false in 
an unintuitive way. ;-) )

What I was trying to say is that there is code that relies on 
doHeapRegion() not returning true. For example code like:

             bool res2 = cl->doHeapRegion(chr);
             assert(!res2, "Should not abort");

and the comment for G1CollectedHeap::heap_region_par_iterate_chunked() 
says " For now requires that "doHeapRegion" always returns "false"".

To me this seems a bit odd. I'd prefer that we'd separate out abortable 
iteration and non-abortable iteration. It looks like we are trying to 
using the same protocol for two different purposes.

Just to be clear. This is nothing I expect John to address in his 
current change.

Bengt

>
> Tony
>
> On 11/23/2011 08:15 AM, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Overall this looks good.
>>
>> Some comments:
>>
>> concurrentMark.cpp
>>
>> line 2880. return statement has wrong indentation now. A space was 
>> removed.
>>
>> bool doHeapRegion(HeapRegion* hr)
>>
>> why use the completed varaible? why not just:
>> while (!_bm->iterate(&_bit_cl, mr));
>>
>> I don't really like that the completed name is so close to 
>> HeapRegionClosure::_complete
>>
>> I am also a bit confused about the bool return value. It seems that 
>> many places in the code requires that doHeapRegion() always returns 
>> false. I hope this gets cleaned up by Tony's iterator changes. I have 
>> not yet had time to look at them.
>>
>>
>>
>> g1CollectedHeap.cpp
>>
>> G1CollectedHeap::calculateStartRegion()
>>
>> (I saw that you just moved the code from g1RemSet, but I still have 
>> some comments regarding it.)
>>
>> Shouldn't the check "if (ParallelGCThreads > 0)" be "if 
>> (use_parallel_gc_threads())"?
>> I realize that it is the same thing, but since we have a method for 
>> it I assume that we should use it. In fact, it might be more correct 
>> to have the test be "if (workers()->total_workers() > 0)" since we 
>> would get a division by zero if that is not true. (An alternative 
>> would be to check "if (worker_i > 0)". That would be more efficient, 
>> but it would rely on that single threaded calls always pass 0 as 
>> worker_i.)
>>
>> Also, it worries me a bit that we iterate over the collection set for 
>> each thread to find the start. Seems like it could be a bottle neck 
>> if we have many threads and many regions in the collection set. We 
>> can calculate the start regions by just iterating once over the 
>> collection set since we know the number of threads. I am not sure 
>> where the best place to add that logic is though. Maybe we could let 
>> G1CollectorPolicy have a method that calculates the start regions 
>> once for all worker threads. Then G1RemSet::scanRS() and 
>> G1ParCompleteMarkInCSTask::work() could look up their start regions 
>> in constant time.
>>
>>
>> Bengt
>>
>>
>>
>> On 2011-11-22 18:23, John Cuthbertson wrote:
>>> Hi Everyone,
>>>
>>> I have a new webrev for this change based upon some review comments 
>>> from Tony. The new webrev can be found at: 
>>> http://cr.openjdk.java.net/~johnc/7112743/webrev.2/
>>>
>>> Thanks,
>>>
>>> JohnC
>>>
>>> On 11/18/11 10:18, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers look over these changes? The 
>>>> webrev can be found at: 
>>>> http://cr.openjdk.java.net/~johnc/7112743/webrev.1/
>>>>
>>>> Summary:
>>>> We have a workload where the serial "complete marking in collection 
>>>> set" closure was taking a realy long time and lengthening the pause 
>>>> times. Since this code might be going away at some point (see CR 
>>>> 6888336:  G1: avoid explicitly marking and pushing objects in 
>>>> survivor spaces) it was decided that a short term solution would be 
>>>> to parallelize the existing code and have the GC workers execute 
>>>> the code rather than the VM thread. Also included are some minor 
>>>> tweaks to the PrintGCDetails output to print out the mark closure 
>>>> time and the collection set freeing time (since some of the 
>>>> original code was moved there).
>>>>
>>>> Testing: GC test suite (with 12 GC threads, makeing threshold of 
>>>> 5%, marking verification enabled, and heap verification after GC 
>>>> enabled); GC test suite on my workstation (4 GC threads, same flags 
>>>> as before); Kitchensink; specjbb2005 configured to have at leat 60% 
>>>> heap utilization.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>
>>