RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking

Bengt Rutisson bengt.rutisson at oracle.com
Wed Jan 18 10:07:52 UTC 2012


Hi John,

I had a quick look through the changes. Looks good to me.

I looked a little closer at the log output and locking in 
VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to 
take the ParGCRareEvent_lock to avoid interleaved output from different 
threads, but I am having second thoughts about this.

All your output is prefixed with "Region %d" and logged with print_cr(), 
so even if several threads do logging at the same time it should be easy 
enough to parse the log files. There is one exception to this and that 
is this block:

1612     if (failures > 0 && _verbose)  {
1613       gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", ntams: "
1614                           PTR_FORMAT", top: "PTR_FORMAT", end: 
"PTR_FORMAT,
1615                           hr->hrs_index(), hr->bottom(), 
hr->next_top_at_mark_start(),
1616                           hr->top(), hr->end());
1617       gclog_or_tty->print_cr(", marked_bytes: calc/actual 
"SIZE_FORMAT"/"SIZE_FORMAT,
1618                              _calc_cl.region_marked_bytes(),
1619                              hr->next_marked_bytes());
1620     }

I guess this is what triggered my suggestion to use the lock in the last 
review. But now that I look at it I think it would be better to just 
merge the print() and print_cr() statements into one single print_cr() 
statement. If you do that I don't see the need for taking the lock.

Sorry for going back and forth on this topic.

The rest of the changes look good to me!
Bengt

On 2012-01-12 09:26, John Cuthbertson wrote:
> Hi Everyone,
>
> The latest incarnation of these changes can be found at: 
> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/
>
> The changes in this version include:
> * Conditionally using a lock so that the output of the verification 
> closure executed by different threads does not interfere with each 
> other (suggested by Bengt).
> * Merging up to the latest hotspot-gc tip (including Tony's marking 
> changes). This involved changing the evacuation failure code and 
> adding a suitable mark/count routine for use in 
> ConcurrentMark::grayRoot(). I also removed the counting changes from 
> code that has been made obsolete as a result of Tony's marking changes.
>
> Testing: a few runs of the GC test suite with low marking thresholds 
> (2 and 10%) with and without verification, and jprt.
>
> Thanks,
>
> JohnC
>
> On 12/21/2011 9:08 AM, John Cuthbertson wrote:
>> Hi Bengt,
>>
>> That's a good observation. I guess it is possible but I haven't seen 
>> it in practice (though I was typically only using 4 threads when 
>> debugging a verification failure). It won't do any harm so I'll add 
>> the locking.
>>
>> Thanks,
>>
>> JohnC
>>
>>
>>
>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>>
>>> Hi John,
>>>
>>> Thanks for updating your fix! Looks good.
>>>
>>> One quesiton:
>>> In concurrentMark.cpp it seems to me that the 
>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up 
>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure 
>>> in heapRegion.cpp. There are several workers simultaneously doing 
>>> the verification, right? Is it worth adding the same kind of locking 
>>> that Tony added?
>>>
>>> Bengt
>>>
>>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>>> Hi Bengt,
>>>>
>>>> As I mentioned earlier - thanks for the code review. I've applied 
>>>> your suggestions, merged with the the latest changeset in 
>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int <-> 
>>>> size_t issue you also mentioned, and retested using the GC test 
>>>> suite. A new webrev can be found at: 
>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>>
>>>> Specific replies are inline.
>>>>
>>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>>
>>>>> John,
>>>>>
>>>>> A little late, but here are some comments on this webrev. I know 
>>>>> you have some more improvements to this change coming, but overall 
>>>>> I think it looks good. Most of my comments are just minor coding 
>>>>> style comments.
>>>>>
>>>>> Bengt
>>>>>
>>>>> concurrentMark.hpp
>>>>>
>>>>> Rename ConcurrentMark::clear() to  ConcurrentMark::clear_mark() or 
>>>>> ConcurrentMark::unmark()? The commment you added is definitely 
>>>>> needed to understand what this method does. But it would be even 
>>>>> better if it was possible to get that from the method name itself.
>>>>
>>>> Done.
>>>>
>>>>> It seems like everywhere we use count_marked_bytes_for(int 
>>>>> worker_i) we almost directly use the array returned to index with 
>>>>> the heap region that we are interested in. How about wrapping all 
>>>>> of this is in something like count_set_marked_bytes_for(int 
>>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int 
>>>>> worker_i, int hrs_index) ? That way the data structure does not 
>>>>> have to be exposed outside ConcurrentMark. It would mean that 
>>>>> ConcurrentMark::count_region() would have to take a worker_i value 
>>>>> instead of a marked_bytes_array.
>>>>
>>>> I did not do this. I embed the marked_bytes array for a worker into 
>>>> the CMTask for that worker to save a de-reference. This was one of 
>>>> the requests from the original code walk-through. Avoiding the 
>>>> de-reference in the CMTask::do_marking_step() shaves a couple of 
>>>> points off the marking time. I think your suggestion would 
>>>> reinstate the de-reference again and we would lose those few 
>>>> percentage points again.
>>>>
>>>>> If you don't agree with the suggestion above I would suggest to 
>>>>> change the name from count_marked_bytes_for() to 
>>>>> count_marked_bytes_array_for() since in every place that it is 
>>>>> being called the resulting value is stored in a local variable 
>>>>> called marked_bytes_array, which seems like a more informative 
>>>>> name to me.
>>>>
>>>> Done. I agree - the new name sounds better.
>>>>
>>>>> I think this comment:
>>>>>
>>>>>   // As above - but we don't know the heap region containing the
>>>>>   // object and so have to supply it.
>>>>>   inline bool par_mark_and_count(oop obj, int worker_i);
>>>>>
>>>>> should be something like "we don't know the heap region containing 
>>>>> the object so we will have to look it up".
>>>>>
>>>>> Same thing here:
>>>>>
>>>>>   // As above - but we don't have the heap region containing the
>>>>>   // object, so we have to supply it.
>>>>>   // Should *not* be called from parallel code.
>>>>>   inline bool mark_and_count(oop obj);
>>>>>
>>>>>
>>>>
>>>> Comments were changed to:
>>>>
>>>>
>>>>> concurrentMark.cpp
>>>>>
>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() 
>>>>> anyway, could you please remove this unused code (1393-1397):
>>>>>
>>>>>         /*
>>>>>         gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>>>                                obj_card_num - _bottom_card_num,
>>>>>                                obj_last_card_num - _bottom_card_num);
>>>>>         */
>>>>>
>>>>>
>>>>
>>>> Done.
>>>>
>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I 
>>>>> remember Tony mentioning that it won't be called. Do you still 
>>>>> want to keep the code?
>>>>
>>>> I removed the entire destructor - I don't see it being called in 
>>>> the experiments I've run.
>>>>
>>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>>> Probably not worth it, but would it make sense to add information 
>>>>> in a startsHumongous HeapRegion to be able to give you the last 
>>>>> continuesHumongous region? Since we know this when we set the 
>>>>> regions up it seems like a waste to have to iterate over the 
>>>>> region list to find it.
>>>>
>>>> If you read the original comment - the original author did not want 
>>>> to make any assumptions about the internal field values of the 
>>>> HeapRegions spanned by a humongous object and so used the loop 
>>>> technique. I think you are correct and I now use the information in 
>>>> the startsHumongous region to find the index of the last 
>>>> continuesHumongous region spaned by the H-obj.
>>>>
>>>>> G1ParFinalCountTask
>>>>> To me it is a bit surprising that we mix in the verify code inside 
>>>>> this closure. Would it be possible to extract this code out somehow?
>>>>
>>>> I did it this way to avoid another iteration over the heap regions. 
>>>> But it probably does make more sense to separate them and use 
>>>> another iteration to do the verify. Done.
>>>>
>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on 
>>>>> doing?
>>>>
>>>> I removed the comment. I was thinking of doing this as 
>>>> fill_to_bytes is typically implemented using (a possibly 
>>>> specialized version of) memset. But it's probably not worth it in 
>>>> this case.
>>>>
>>>>> G1ParFinalCountTask::work()
>>>>> Just for the record. I don't really like the way we have to set up 
>>>>> both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even 
>>>>> though we will only use them if we have VerifyDuringGC enabled. I 
>>>>> realize it is due to the scoping, but I still think it obstucts 
>>>>> the code flow and introduces unnecessary work. Unfortunately I 
>>>>> don't have a good suggestion for how to work around it.
>>>>>
>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are 
>>>>> StackObjs I assume it is not possible to get around the issue with 
>>>>> a ResourceMark.
>>>>
>>>> Now that the verification is performed in a separate iteration of 
>>>> the heap regions there's no need to create the 
>>>> VerifyLiveObjectDataHRClosure  and Mux2HRClosure instances here. 
>>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>>
>>>> Hopefully the new webrev addresses these comments.
>>>>
>>>> Thanks again for looking.
>>>>
>>>> JohnC
>>>>
>>>
>>
>




More information about the hotspot-gc-dev mailing list