RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking

Wed Jan 18 17:53:28 UTC 2012

Hi Bengt,

Not a problem. I'll take out the locking and combine the print 
statements. If you don't mind I'll also change the prints in the 
verification code to use HR_FORMAT and HR_FORMAT_PARAMS to make them a 
bit more consistent with others.

Thanks again.

JohnC

On 1/18/2012 2:07 AM, Bengt Rutisson wrote:
>
> Hi John,
>
> I had a quick look through the changes. Looks good to me.
>
> I looked a little closer at the log output and locking in 
> VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to 
> take the ParGCRareEvent_lock to avoid interleaved output from 
> different threads, but I am having second thoughts about this.
>
> All your output is prefixed with "Region %d" and logged with 
> print_cr(), so even if several threads do logging at the same time it 
> should be easy enough to parse the log files. There is one exception 
> to this and that is this block:
>
> 1612     if (failures > 0 && _verbose)  {
> 1613       gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", ntams: "
> 1614                           PTR_FORMAT", top: "PTR_FORMAT", end: 
> "PTR_FORMAT,
> 1615                           hr->hrs_index(), hr->bottom(), 
> hr->next_top_at_mark_start(),
> 1616                           hr->top(), hr->end());
> 1617       gclog_or_tty->print_cr(", marked_bytes: calc/actual 
> "SIZE_FORMAT"/"SIZE_FORMAT,
> 1618                              _calc_cl.region_marked_bytes(),
> 1619                              hr->next_marked_bytes());
> 1620     }
>
> I guess this is what triggered my suggestion to use the lock in the 
> last review. But now that I look at it I think it would be better to 
> just merge the print() and print_cr() statements into one single 
> print_cr() statement. If you do that I don't see the need for taking 
> the lock.
>
> Sorry for going back and forth on this topic.
>
> The rest of the changes look good to me!
> Bengt
>
> On 2012-01-12 09:26, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> The latest incarnation of these changes can be found at: 
>> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/
>>
>> The changes in this version include:
>> * Conditionally using a lock so that the output of the verification 
>> closure executed by different threads does not interfere with each 
>> other (suggested by Bengt).
>> * Merging up to the latest hotspot-gc tip (including Tony's marking 
>> changes). This involved changing the evacuation failure code and 
>> adding a suitable mark/count routine for use in 
>> ConcurrentMark::grayRoot(). I also removed the counting changes from 
>> code that has been made obsolete as a result of Tony's marking changes.
>>
>> Testing: a few runs of the GC test suite with low marking thresholds 
>> (2 and 10%) with and without verification, and jprt.
>>
>> Thanks,
>>
>> JohnC
>>
>> On 12/21/2011 9:08 AM, John Cuthbertson wrote:
>>> Hi Bengt,
>>>
>>> That's a good observation. I guess it is possible but I haven't seen 
>>> it in practice (though I was typically only using 4 threads when 
>>> debugging a verification failure). It won't do any harm so I'll add 
>>> the locking.
>>>
>>> Thanks,
>>>
>>> JohnC
>>>
>>>
>>>
>>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>>>
>>>> Hi John,
>>>>
>>>> Thanks for updating your fix! Looks good.
>>>>
>>>> One quesiton:
>>>> In concurrentMark.cpp it seems to me that the 
>>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up 
>>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure 
>>>> in heapRegion.cpp. There are several workers simultaneously doing 
>>>> the verification, right? Is it worth adding the same kind of 
>>>> locking that Tony added?
>>>>
>>>> Bengt
>>>>
>>>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>>>> Hi Bengt,
>>>>>
>>>>> As I mentioned earlier - thanks for the code review. I've applied 
>>>>> your suggestions, merged with the the latest changeset in 
>>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int 
>>>>> <-> size_t issue you also mentioned, and retested using the GC 
>>>>> test suite. A new webrev can be found at: 
>>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>>>
>>>>> Specific replies are inline.
>>>>>
>>>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>>>
>>>>>> John,
>>>>>>
>>>>>> A little late, but here are some comments on this webrev. I know 
>>>>>> you have some more improvements to this change coming, but 
>>>>>> overall I think it looks good. Most of my comments are just minor 
>>>>>> coding style comments.
>>>>>>
>>>>>> Bengt
>>>>>>
>>>>>> concurrentMark.hpp
>>>>>>
>>>>>> Rename ConcurrentMark::clear() to  ConcurrentMark::clear_mark() 
>>>>>> or ConcurrentMark::unmark()? The commment you added is definitely 
>>>>>> needed to understand what this method does. But it would be even 
>>>>>> better if it was possible to get that from the method name itself.
>>>>>
>>>>> Done.
>>>>>
>>>>>> It seems like everywhere we use count_marked_bytes_for(int 
>>>>>> worker_i) we almost directly use the array returned to index with 
>>>>>> the heap region that we are interested in. How about wrapping all 
>>>>>> of this is in something like count_set_marked_bytes_for(int 
>>>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int 
>>>>>> worker_i, int hrs_index) ? That way the data structure does not 
>>>>>> have to be exposed outside ConcurrentMark. It would mean that 
>>>>>> ConcurrentMark::count_region() would have to take a worker_i 
>>>>>> value instead of a marked_bytes_array.
>>>>>
>>>>> I did not do this. I embed the marked_bytes array for a worker 
>>>>> into the CMTask for that worker to save a de-reference. This was 
>>>>> one of the requests from the original code walk-through. Avoiding 
>>>>> the de-reference in the CMTask::do_marking_step() shaves a couple 
>>>>> of points off the marking time. I think your suggestion would 
>>>>> reinstate the de-reference again and we would lose those few 
>>>>> percentage points again.
>>>>>
>>>>>> If you don't agree with the suggestion above I would suggest to 
>>>>>> change the name from count_marked_bytes_for() to 
>>>>>> count_marked_bytes_array_for() since in every place that it is 
>>>>>> being called the resulting value is stored in a local variable 
>>>>>> called marked_bytes_array, which seems like a more informative 
>>>>>> name to me.
>>>>>
>>>>> Done. I agree - the new name sounds better.
>>>>>
>>>>>> I think this comment:
>>>>>>
>>>>>>   // As above - but we don't know the heap region containing the
>>>>>>   // object and so have to supply it.
>>>>>>   inline bool par_mark_and_count(oop obj, int worker_i);
>>>>>>
>>>>>> should be something like "we don't know the heap region 
>>>>>> containing the object so we will have to look it up".
>>>>>>
>>>>>> Same thing here:
>>>>>>
>>>>>>   // As above - but we don't have the heap region containing the
>>>>>>   // object, so we have to supply it.
>>>>>>   // Should *not* be called from parallel code.
>>>>>>   inline bool mark_and_count(oop obj);
>>>>>>
>>>>>>
>>>>>
>>>>> Comments were changed to:
>>>>>
>>>>>
>>>>>> concurrentMark.cpp
>>>>>>
>>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() 
>>>>>> anyway, could you please remove this unused code (1393-1397):
>>>>>>
>>>>>>         /*
>>>>>>         gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>>>>                                obj_card_num - _bottom_card_num,
>>>>>>                                obj_last_card_num - 
>>>>>> _bottom_card_num);
>>>>>>         */
>>>>>>
>>>>>>
>>>>>
>>>>> Done.
>>>>>
>>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I 
>>>>>> remember Tony mentioning that it won't be called. Do you still 
>>>>>> want to keep the code?
>>>>>
>>>>> I removed the entire destructor - I don't see it being called in 
>>>>> the experiments I've run.
>>>>>
>>>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>>>> Probably not worth it, but would it make sense to add information 
>>>>>> in a startsHumongous HeapRegion to be able to give you the last 
>>>>>> continuesHumongous region? Since we know this when we set the 
>>>>>> regions up it seems like a waste to have to iterate over the 
>>>>>> region list to find it.
>>>>>
>>>>> If you read the original comment - the original author did not 
>>>>> want to make any assumptions about the internal field values of 
>>>>> the HeapRegions spanned by a humongous object and so used the loop 
>>>>> technique. I think you are correct and I now use the information 
>>>>> in the startsHumongous region to find the index of the last 
>>>>> continuesHumongous region spaned by the H-obj.
>>>>>
>>>>>> G1ParFinalCountTask
>>>>>> To me it is a bit surprising that we mix in the verify code 
>>>>>> inside this closure. Would it be possible to extract this code 
>>>>>> out somehow?
>>>>>
>>>>> I did it this way to avoid another iteration over the heap 
>>>>> regions. But it probably does make more sense to separate them and 
>>>>> use another iteration to do the verify. Done.
>>>>>
>>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on 
>>>>>> doing?
>>>>>
>>>>> I removed the comment. I was thinking of doing this as 
>>>>> fill_to_bytes is typically implemented using (a possibly 
>>>>> specialized version of) memset. But it's probably not worth it in 
>>>>> this case.
>>>>>
>>>>>> G1ParFinalCountTask::work()
>>>>>> Just for the record. I don't really like the way we have to set 
>>>>>> up both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even 
>>>>>> though we will only use them if we have VerifyDuringGC enabled. I 
>>>>>> realize it is due to the scoping, but I still think it obstucts 
>>>>>> the code flow and introduces unnecessary work. Unfortunately I 
>>>>>> don't have a good suggestion for how to work around it.
>>>>>>
>>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are 
>>>>>> StackObjs I assume it is not possible to get around the issue 
>>>>>> with a ResourceMark.
>>>>>
>>>>> Now that the verification is performed in a separate iteration of 
>>>>> the heap regions there's no need to create the 
>>>>> VerifyLiveObjectDataHRClosure  and Mux2HRClosure instances here. 
>>>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>>>
>>>>> Hopefully the new webrev addresses these comments.
>>>>>
>>>>> Thanks again for looking.
>>>>>
>>>>> JohnC
>>>>>
>>>>
>>>
>>
>