RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking

Mon Nov 28 13:07:20 UTC 2011

John,

A little late, but here are some comments on this webrev. I know you 
have some more improvements to this change coming, but overall I think 
it looks good. Most of my comments are just minor coding style comments.

Bengt

concurrentMark.hpp

Rename ConcurrentMark::clear() to  ConcurrentMark::clear_mark() or 
ConcurrentMark::unmark()? The commment you added is definitely needed to 
understand what this method does. But it would be even better if it was 
possible to get that from the method name itself.

It seems like everywhere we use count_marked_bytes_for(int worker_i) we 
almost directly use the array returned to index with the heap region 
that we are interested in. How about wrapping all of this is in 
something like count_set_marked_bytes_for(int worker_i, int hrs_index) 
and count_get_marked_bytes_for(int worker_i, int hrs_index) ? That way 
the data structure does not have to be exposed outside ConcurrentMark. 
It would mean that ConcurrentMark::count_region() would have to take a 
worker_i value instead of a marked_bytes_array.

If you don't agree with the suggestion above I would suggest to change 
the name from count_marked_bytes_for() to count_marked_bytes_array_for() 
since in every place that it is being called the resulting value is 
stored in a local variable called marked_bytes_array, which seems like a 
more informative name to me.

I think this comment:

   // As above - but we don't know the heap region containing the
   // object and so have to supply it.
   inline bool par_mark_and_count(oop obj, int worker_i);

should be something like "we don't know the heap region containing the 
object so we will have to look it up".

Same thing here:

   // As above - but we don't have the heap region containing the
   // object, so we have to supply it.
   // Should *not* be called from parallel code.
   inline bool mark_and_count(oop obj);

concurrentMark.cpp

Since you are changing CalcLiveObjectsClosure::doHeapRegion() anyway, 
could you please remove this unused code (1393-1397):

         /*
         gclog_or_tty->print_cr("Setting bits from %d/%d.",
                                obj_card_num - _bottom_card_num,
                                obj_last_card_num - _bottom_card_num);
         */

What about the destructor ConcurrentMark::~ConcurrentMark() ? I remember 
Tony mentioning that it won't be called. Do you still want to keep the code?

FinalCountDataUpdateClosure::set_bit_for_region()
Probably not worth it, but would it make sense to add information in a 
startsHumongous HeapRegion to be able to give you the last 
continuesHumongous region? Since we know this when we set the regions up 
it seems like a waste to have to iterate over the region list to find it.

G1ParFinalCountTask
To me it is a bit surprising that we mix in the verify code inside this 
closure. Would it be possible to extract this code out somehow?

Line 3378: "// Use fill_to_bytes". Is this something you plan on doing?

G1ParFinalCountTask::work()
Just for the record. I don't really like the way we have to set up both 
a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even though we will 
only use them if we have VerifyDuringGC enabled. I realize it is due to 
the scoping, but I still think it obstucts the code flow and introduces 
unnecessary work. Unfortunately I don't have a good suggestion for how 
to work around it.

Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are 
StackObjs I assume it is not possible to get around the issue with a 
ResourceMark.

On 2011-11-15 19:44, John Cuthbertson wrote:
> Hi Everyone,
>
> I have a new webrev for these changes which includes all the code 
> review comments made during the code walkthrough and subsequently. The 
> new webrev can be found at: 
> http://cr.openjdk.java.net/~johnc/6484965/webrev.1/
>
> Changes include:
> * Accessor function type changes (they return a pointer to the card 
> bitmap for a task rather than a reference).
> * A card bitmap and marked_bytes array is now embedded into the CMTask 
> to avoid an extra dereference.
> * Marking and counting have been combined in a single operation.
> * Mark/count routines have been placed in the .inline.hpp file.
> * Aggregation is now down in parallel and clearing of the count data 
> is performed at the same time.
> * GC Lab retirment - there's no need to be super accurate - I don't 
> need a specific routine - and can use one of the existing count 
> routines (count_region).
>
> Note this version of the code still has a card bitmap per task/worker 
> id and so incurs a footprint penalty. Removing the card bitmap(s) used 
> to scrub RSets will be done as a separate change.
>
> Testing: GC test suite and Kitchensink with a very low marking 
> threshold - 2%.
>
> Thanks,
>
> JohnC
>
> On 10/03/11 16:09, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> Can I have a couple of volunteers review the changes for this CR? The 
>> webrev can be found at: 
>> http://cr.openjdk.java.net/~johnc/6484965/webrev.0/
>>
>> Summary:
>> During some big application runs we were seeing some excessively long 
>> concurrent counting times. These long counting times were increasing 
>> the cycle time of concurrent marking and increasing the risk of 
>> evacuation failure. It was suggested that, rather than having a 
>> separate phase that walks the marking bitmap to calculate the amount 
>> of live data etc, we could track this information in some per marking 
>> task/worker thread data structures and update the information as 
>> objects were marked. These changes implement that scheme. The 
>> original bitmap walking code has been retained but turned into a 
>> verification mechanism that is enabled using 
>> -XX:+UnlockDiagnosticVMOption -XX:+VerifyDuringGC.
>>
>> Testing: GC test suite with a low marking threshold (10%); OpenDS, 
>> KitchenSink, and jprt with the additional verification code enabled.
>>
>> JohnC
>>
>>
>>
>>
>