RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking
John Cuthbertson
john.cuthbertson at oracle.com
Wed Jan 18 17:53:28 UTC 2012
Hi Bengt,
Not a problem. I'll take out the locking and combine the print
statements. If you don't mind I'll also change the prints in the
verification code to use HR_FORMAT and HR_FORMAT_PARAMS to make them a
bit more consistent with others.
Thanks again.
JohnC
On 1/18/2012 2:07 AM, Bengt Rutisson wrote:
>
> Hi John,
>
> I had a quick look through the changes. Looks good to me.
>
> I looked a little closer at the log output and locking in
> VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to
> take the ParGCRareEvent_lock to avoid interleaved output from
> different threads, but I am having second thoughts about this.
>
> All your output is prefixed with "Region %d" and logged with
> print_cr(), so even if several threads do logging at the same time it
> should be easy enough to parse the log files. There is one exception
> to this and that is this block:
>
> 1612 if (failures > 0 && _verbose) {
> 1613 gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", ntams: "
> 1614 PTR_FORMAT", top: "PTR_FORMAT", end:
> "PTR_FORMAT,
> 1615 hr->hrs_index(), hr->bottom(),
> hr->next_top_at_mark_start(),
> 1616 hr->top(), hr->end());
> 1617 gclog_or_tty->print_cr(", marked_bytes: calc/actual
> "SIZE_FORMAT"/"SIZE_FORMAT,
> 1618 _calc_cl.region_marked_bytes(),
> 1619 hr->next_marked_bytes());
> 1620 }
>
> I guess this is what triggered my suggestion to use the lock in the
> last review. But now that I look at it I think it would be better to
> just merge the print() and print_cr() statements into one single
> print_cr() statement. If you do that I don't see the need for taking
> the lock.
>
> Sorry for going back and forth on this topic.
>
> The rest of the changes look good to me!
> Bengt
>
> On 2012-01-12 09:26, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> The latest incarnation of these changes can be found at:
>> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/
>>
>> The changes in this version include:
>> * Conditionally using a lock so that the output of the verification
>> closure executed by different threads does not interfere with each
>> other (suggested by Bengt).
>> * Merging up to the latest hotspot-gc tip (including Tony's marking
>> changes). This involved changing the evacuation failure code and
>> adding a suitable mark/count routine for use in
>> ConcurrentMark::grayRoot(). I also removed the counting changes from
>> code that has been made obsolete as a result of Tony's marking changes.
>>
>> Testing: a few runs of the GC test suite with low marking thresholds
>> (2 and 10%) with and without verification, and jprt.
>>
>> Thanks,
>>
>> JohnC
>>
>> On 12/21/2011 9:08 AM, John Cuthbertson wrote:
>>> Hi Bengt,
>>>
>>> That's a good observation. I guess it is possible but I haven't seen
>>> it in practice (though I was typically only using 4 threads when
>>> debugging a verification failure). It won't do any harm so I'll add
>>> the locking.
>>>
>>> Thanks,
>>>
>>> JohnC
>>>
>>>
>>>
>>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>>>
>>>> Hi John,
>>>>
>>>> Thanks for updating your fix! Looks good.
>>>>
>>>> One quesiton:
>>>> In concurrentMark.cpp it seems to me that the
>>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up
>>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure
>>>> in heapRegion.cpp. There are several workers simultaneously doing
>>>> the verification, right? Is it worth adding the same kind of
>>>> locking that Tony added?
>>>>
>>>> Bengt
>>>>
>>>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>>>> Hi Bengt,
>>>>>
>>>>> As I mentioned earlier - thanks for the code review. I've applied
>>>>> your suggestions, merged with the the latest changeset in
>>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int
>>>>> <-> size_t issue you also mentioned, and retested using the GC
>>>>> test suite. A new webrev can be found at:
>>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>>>
>>>>> Specific replies are inline.
>>>>>
>>>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>>>
>>>>>> John,
>>>>>>
>>>>>> A little late, but here are some comments on this webrev. I know
>>>>>> you have some more improvements to this change coming, but
>>>>>> overall I think it looks good. Most of my comments are just minor
>>>>>> coding style comments.
>>>>>>
>>>>>> Bengt
>>>>>>
>>>>>> concurrentMark.hpp
>>>>>>
>>>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark()
>>>>>> or ConcurrentMark::unmark()? The commment you added is definitely
>>>>>> needed to understand what this method does. But it would be even
>>>>>> better if it was possible to get that from the method name itself.
>>>>>
>>>>> Done.
>>>>>
>>>>>> It seems like everywhere we use count_marked_bytes_for(int
>>>>>> worker_i) we almost directly use the array returned to index with
>>>>>> the heap region that we are interested in. How about wrapping all
>>>>>> of this is in something like count_set_marked_bytes_for(int
>>>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int
>>>>>> worker_i, int hrs_index) ? That way the data structure does not
>>>>>> have to be exposed outside ConcurrentMark. It would mean that
>>>>>> ConcurrentMark::count_region() would have to take a worker_i
>>>>>> value instead of a marked_bytes_array.
>>>>>
>>>>> I did not do this. I embed the marked_bytes array for a worker
>>>>> into the CMTask for that worker to save a de-reference. This was
>>>>> one of the requests from the original code walk-through. Avoiding
>>>>> the de-reference in the CMTask::do_marking_step() shaves a couple
>>>>> of points off the marking time. I think your suggestion would
>>>>> reinstate the de-reference again and we would lose those few
>>>>> percentage points again.
>>>>>
>>>>>> If you don't agree with the suggestion above I would suggest to
>>>>>> change the name from count_marked_bytes_for() to
>>>>>> count_marked_bytes_array_for() since in every place that it is
>>>>>> being called the resulting value is stored in a local variable
>>>>>> called marked_bytes_array, which seems like a more informative
>>>>>> name to me.
>>>>>
>>>>> Done. I agree - the new name sounds better.
>>>>>
>>>>>> I think this comment:
>>>>>>
>>>>>> // As above - but we don't know the heap region containing the
>>>>>> // object and so have to supply it.
>>>>>> inline bool par_mark_and_count(oop obj, int worker_i);
>>>>>>
>>>>>> should be something like "we don't know the heap region
>>>>>> containing the object so we will have to look it up".
>>>>>>
>>>>>> Same thing here:
>>>>>>
>>>>>> // As above - but we don't have the heap region containing the
>>>>>> // object, so we have to supply it.
>>>>>> // Should *not* be called from parallel code.
>>>>>> inline bool mark_and_count(oop obj);
>>>>>>
>>>>>>
>>>>>
>>>>> Comments were changed to:
>>>>>
>>>>>
>>>>>> concurrentMark.cpp
>>>>>>
>>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion()
>>>>>> anyway, could you please remove this unused code (1393-1397):
>>>>>>
>>>>>> /*
>>>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>>>> obj_card_num - _bottom_card_num,
>>>>>> obj_last_card_num -
>>>>>> _bottom_card_num);
>>>>>> */
>>>>>>
>>>>>>
>>>>>
>>>>> Done.
>>>>>
>>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I
>>>>>> remember Tony mentioning that it won't be called. Do you still
>>>>>> want to keep the code?
>>>>>
>>>>> I removed the entire destructor - I don't see it being called in
>>>>> the experiments I've run.
>>>>>
>>>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>>>> Probably not worth it, but would it make sense to add information
>>>>>> in a startsHumongous HeapRegion to be able to give you the last
>>>>>> continuesHumongous region? Since we know this when we set the
>>>>>> regions up it seems like a waste to have to iterate over the
>>>>>> region list to find it.
>>>>>
>>>>> If you read the original comment - the original author did not
>>>>> want to make any assumptions about the internal field values of
>>>>> the HeapRegions spanned by a humongous object and so used the loop
>>>>> technique. I think you are correct and I now use the information
>>>>> in the startsHumongous region to find the index of the last
>>>>> continuesHumongous region spaned by the H-obj.
>>>>>
>>>>>> G1ParFinalCountTask
>>>>>> To me it is a bit surprising that we mix in the verify code
>>>>>> inside this closure. Would it be possible to extract this code
>>>>>> out somehow?
>>>>>
>>>>> I did it this way to avoid another iteration over the heap
>>>>> regions. But it probably does make more sense to separate them and
>>>>> use another iteration to do the verify. Done.
>>>>>
>>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on
>>>>>> doing?
>>>>>
>>>>> I removed the comment. I was thinking of doing this as
>>>>> fill_to_bytes is typically implemented using (a possibly
>>>>> specialized version of) memset. But it's probably not worth it in
>>>>> this case.
>>>>>
>>>>>> G1ParFinalCountTask::work()
>>>>>> Just for the record. I don't really like the way we have to set
>>>>>> up both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even
>>>>>> though we will only use them if we have VerifyDuringGC enabled. I
>>>>>> realize it is due to the scoping, but I still think it obstucts
>>>>>> the code flow and introduces unnecessary work. Unfortunately I
>>>>>> don't have a good suggestion for how to work around it.
>>>>>>
>>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are
>>>>>> StackObjs I assume it is not possible to get around the issue
>>>>>> with a ResourceMark.
>>>>>
>>>>> Now that the verification is performed in a separate iteration of
>>>>> the heap regions there's no need to create the
>>>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here.
>>>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>>>
>>>>> Hopefully the new webrev addresses these comments.
>>>>>
>>>>> Thanks again for looking.
>>>>>
>>>>> JohnC
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-gc-dev
mailing list