RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking
Bengt Rutisson
bengt.rutisson at oracle.com
Wed Jan 18 19:19:36 UTC 2012
John,
On 2012-01-18 18:53, John Cuthbertson wrote:
> Hi Bengt,
>
> Not a problem. I'll take out the locking and combine the print
> statements. If you don't mind I'll also change the prints in the
> verification code to use HR_FORMAT and HR_FORMAT_PARAMS to make them a
> bit more consistent with others.
Sounds great. Ship it!
Thanks,
Bengt
>
> Thanks again.
>
> JohnC
>
> On 1/18/2012 2:07 AM, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> I had a quick look through the changes. Looks good to me.
>>
>> I looked a little closer at the log output and locking in
>> VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to
>> take the ParGCRareEvent_lock to avoid interleaved output from
>> different threads, but I am having second thoughts about this.
>>
>> All your output is prefixed with "Region %d" and logged with
>> print_cr(), so even if several threads do logging at the same time it
>> should be easy enough to parse the log files. There is one exception
>> to this and that is this block:
>>
>> 1612 if (failures > 0 && _verbose) {
>> 1613 gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT",
>> ntams: "
>> 1614 PTR_FORMAT", top: "PTR_FORMAT", end:
>> "PTR_FORMAT,
>> 1615 hr->hrs_index(), hr->bottom(),
>> hr->next_top_at_mark_start(),
>> 1616 hr->top(), hr->end());
>> 1617 gclog_or_tty->print_cr(", marked_bytes: calc/actual
>> "SIZE_FORMAT"/"SIZE_FORMAT,
>> 1618 _calc_cl.region_marked_bytes(),
>> 1619 hr->next_marked_bytes());
>> 1620 }
>>
>> I guess this is what triggered my suggestion to use the lock in the
>> last review. But now that I look at it I think it would be better to
>> just merge the print() and print_cr() statements into one single
>> print_cr() statement. If you do that I don't see the need for taking
>> the lock.
>>
>> Sorry for going back and forth on this topic.
>>
>> The rest of the changes look good to me!
>> Bengt
>>
>> On 2012-01-12 09:26, John Cuthbertson wrote:
>>> Hi Everyone,
>>>
>>> The latest incarnation of these changes can be found at:
>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/
>>>
>>> The changes in this version include:
>>> * Conditionally using a lock so that the output of the verification
>>> closure executed by different threads does not interfere with each
>>> other (suggested by Bengt).
>>> * Merging up to the latest hotspot-gc tip (including Tony's marking
>>> changes). This involved changing the evacuation failure code and
>>> adding a suitable mark/count routine for use in
>>> ConcurrentMark::grayRoot(). I also removed the counting changes from
>>> code that has been made obsolete as a result of Tony's marking changes.
>>>
>>> Testing: a few runs of the GC test suite with low marking thresholds
>>> (2 and 10%) with and without verification, and jprt.
>>>
>>> Thanks,
>>>
>>> JohnC
>>>
>>> On 12/21/2011 9:08 AM, John Cuthbertson wrote:
>>>> Hi Bengt,
>>>>
>>>> That's a good observation. I guess it is possible but I haven't
>>>> seen it in practice (though I was typically only using 4 threads
>>>> when debugging a verification failure). It won't do any harm so
>>>> I'll add the locking.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>>
>>>>
>>>>
>>>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi John,
>>>>>
>>>>> Thanks for updating your fix! Looks good.
>>>>>
>>>>> One quesiton:
>>>>> In concurrentMark.cpp it seems to me that the
>>>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up
>>>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure
>>>>> in heapRegion.cpp. There are several workers simultaneously doing
>>>>> the verification, right? Is it worth adding the same kind of
>>>>> locking that Tony added?
>>>>>
>>>>> Bengt
>>>>>
>>>>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>>>>> Hi Bengt,
>>>>>>
>>>>>> As I mentioned earlier - thanks for the code review. I've applied
>>>>>> your suggestions, merged with the the latest changeset in
>>>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int
>>>>>> <-> size_t issue you also mentioned, and retested using the GC
>>>>>> test suite. A new webrev can be found at:
>>>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>>>>
>>>>>> Specific replies are inline.
>>>>>>
>>>>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>>>>
>>>>>>> John,
>>>>>>>
>>>>>>> A little late, but here are some comments on this webrev. I know
>>>>>>> you have some more improvements to this change coming, but
>>>>>>> overall I think it looks good. Most of my comments are just
>>>>>>> minor coding style comments.
>>>>>>>
>>>>>>> Bengt
>>>>>>>
>>>>>>> concurrentMark.hpp
>>>>>>>
>>>>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark()
>>>>>>> or ConcurrentMark::unmark()? The commment you added is
>>>>>>> definitely needed to understand what this method does. But it
>>>>>>> would be even better if it was possible to get that from the
>>>>>>> method name itself.
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>> It seems like everywhere we use count_marked_bytes_for(int
>>>>>>> worker_i) we almost directly use the array returned to index
>>>>>>> with the heap region that we are interested in. How about
>>>>>>> wrapping all of this is in something like
>>>>>>> count_set_marked_bytes_for(int worker_i, int hrs_index) and
>>>>>>> count_get_marked_bytes_for(int worker_i, int hrs_index) ? That
>>>>>>> way the data structure does not have to be exposed outside
>>>>>>> ConcurrentMark. It would mean that
>>>>>>> ConcurrentMark::count_region() would have to take a worker_i
>>>>>>> value instead of a marked_bytes_array.
>>>>>>
>>>>>> I did not do this. I embed the marked_bytes array for a worker
>>>>>> into the CMTask for that worker to save a de-reference. This was
>>>>>> one of the requests from the original code walk-through. Avoiding
>>>>>> the de-reference in the CMTask::do_marking_step() shaves a couple
>>>>>> of points off the marking time. I think your suggestion would
>>>>>> reinstate the de-reference again and we would lose those few
>>>>>> percentage points again.
>>>>>>
>>>>>>> If you don't agree with the suggestion above I would suggest to
>>>>>>> change the name from count_marked_bytes_for() to
>>>>>>> count_marked_bytes_array_for() since in every place that it is
>>>>>>> being called the resulting value is stored in a local variable
>>>>>>> called marked_bytes_array, which seems like a more informative
>>>>>>> name to me.
>>>>>>
>>>>>> Done. I agree - the new name sounds better.
>>>>>>
>>>>>>> I think this comment:
>>>>>>>
>>>>>>> // As above - but we don't know the heap region containing the
>>>>>>> // object and so have to supply it.
>>>>>>> inline bool par_mark_and_count(oop obj, int worker_i);
>>>>>>>
>>>>>>> should be something like "we don't know the heap region
>>>>>>> containing the object so we will have to look it up".
>>>>>>>
>>>>>>> Same thing here:
>>>>>>>
>>>>>>> // As above - but we don't have the heap region containing the
>>>>>>> // object, so we have to supply it.
>>>>>>> // Should *not* be called from parallel code.
>>>>>>> inline bool mark_and_count(oop obj);
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Comments were changed to:
>>>>>>
>>>>>>
>>>>>>> concurrentMark.cpp
>>>>>>>
>>>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion()
>>>>>>> anyway, could you please remove this unused code (1393-1397):
>>>>>>>
>>>>>>> /*
>>>>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>>>>> obj_card_num - _bottom_card_num,
>>>>>>> obj_last_card_num -
>>>>>>> _bottom_card_num);
>>>>>>> */
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I
>>>>>>> remember Tony mentioning that it won't be called. Do you still
>>>>>>> want to keep the code?
>>>>>>
>>>>>> I removed the entire destructor - I don't see it being called in
>>>>>> the experiments I've run.
>>>>>>
>>>>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>>>>> Probably not worth it, but would it make sense to add
>>>>>>> information in a startsHumongous HeapRegion to be able to give
>>>>>>> you the last continuesHumongous region? Since we know this when
>>>>>>> we set the regions up it seems like a waste to have to iterate
>>>>>>> over the region list to find it.
>>>>>>
>>>>>> If you read the original comment - the original author did not
>>>>>> want to make any assumptions about the internal field values of
>>>>>> the HeapRegions spanned by a humongous object and so used the
>>>>>> loop technique. I think you are correct and I now use the
>>>>>> information in the startsHumongous region to find the index of
>>>>>> the last continuesHumongous region spaned by the H-obj.
>>>>>>
>>>>>>> G1ParFinalCountTask
>>>>>>> To me it is a bit surprising that we mix in the verify code
>>>>>>> inside this closure. Would it be possible to extract this code
>>>>>>> out somehow?
>>>>>>
>>>>>> I did it this way to avoid another iteration over the heap
>>>>>> regions. But it probably does make more sense to separate them
>>>>>> and use another iteration to do the verify. Done.
>>>>>>
>>>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on
>>>>>>> doing?
>>>>>>
>>>>>> I removed the comment. I was thinking of doing this as
>>>>>> fill_to_bytes is typically implemented using (a possibly
>>>>>> specialized version of) memset. But it's probably not worth it in
>>>>>> this case.
>>>>>>
>>>>>>> G1ParFinalCountTask::work()
>>>>>>> Just for the record. I don't really like the way we have to set
>>>>>>> up both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even
>>>>>>> though we will only use them if we have VerifyDuringGC enabled.
>>>>>>> I realize it is due to the scoping, but I still think it
>>>>>>> obstucts the code flow and introduces unnecessary work.
>>>>>>> Unfortunately I don't have a good suggestion for how to work
>>>>>>> around it.
>>>>>>>
>>>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are
>>>>>>> StackObjs I assume it is not possible to get around the issue
>>>>>>> with a ResourceMark.
>>>>>>
>>>>>> Now that the verification is performed in a separate iteration of
>>>>>> the heap regions there's no need to create the
>>>>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here.
>>>>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>>>>
>>>>>> Hopefully the new webrev addresses these comments.
>>>>>>
>>>>>> Thanks again for looking.
>>>>>>
>>>>>> JohnC
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-gc-dev
mailing list