RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking
John Cuthbertson
john.cuthbertson at oracle.com
Thu Jan 12 08:26:25 UTC 2012
Hi Everyone,
The latest incarnation of these changes can be found at:
http://cr.openjdk.java.net/~johnc/6484965/webrev.3/
The changes in this version include:
* Conditionally using a lock so that the output of the verification
closure executed by different threads does not interfere with each other
(suggested by Bengt).
* Merging up to the latest hotspot-gc tip (including Tony's marking
changes). This involved changing the evacuation failure code and adding
a suitable mark/count routine for use in ConcurrentMark::grayRoot(). I
also removed the counting changes from code that has been made obsolete
as a result of Tony's marking changes.
Testing: a few runs of the GC test suite with low marking thresholds (2
and 10%) with and without verification, and jprt.
Thanks,
JohnC
On 12/21/2011 9:08 AM, John Cuthbertson wrote:
> Hi Bengt,
>
> That's a good observation. I guess it is possible but I haven't seen
> it in practice (though I was typically only using 4 threads when
> debugging a verification failure). It won't do any harm so I'll add
> the locking.
>
> Thanks,
>
> JohnC
>
>
>
> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Thanks for updating your fix! Looks good.
>>
>> One quesiton:
>> In concurrentMark.cpp it seems to me that the
>> VerifyLiveObjectDataHRClosure could get the same kind of messed up
>> output that Tony just fixed with 7123165 for the VerifyLiveClosure in
>> heapRegion.cpp. There are several workers simultaneously doing the
>> verification, right? Is it worth adding the same kind of locking that
>> Tony added?
>>
>> Bengt
>>
>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>> Hi Bengt,
>>>
>>> As I mentioned earlier - thanks for the code review. I've applied
>>> your suggestions, merged with the the latest changeset in
>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int <->
>>> size_t issue you also mentioned, and retested using the GC test
>>> suite. A new webrev can be found at:
>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>
>>> Specific replies are inline.
>>>
>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>
>>>> John,
>>>>
>>>> A little late, but here are some comments on this webrev. I know
>>>> you have some more improvements to this change coming, but overall
>>>> I think it looks good. Most of my comments are just minor coding
>>>> style comments.
>>>>
>>>> Bengt
>>>>
>>>> concurrentMark.hpp
>>>>
>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() or
>>>> ConcurrentMark::unmark()? The commment you added is definitely
>>>> needed to understand what this method does. But it would be even
>>>> better if it was possible to get that from the method name itself.
>>>
>>> Done.
>>>
>>>> It seems like everywhere we use count_marked_bytes_for(int
>>>> worker_i) we almost directly use the array returned to index with
>>>> the heap region that we are interested in. How about wrapping all
>>>> of this is in something like count_set_marked_bytes_for(int
>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int
>>>> worker_i, int hrs_index) ? That way the data structure does not
>>>> have to be exposed outside ConcurrentMark. It would mean that
>>>> ConcurrentMark::count_region() would have to take a worker_i value
>>>> instead of a marked_bytes_array.
>>>
>>> I did not do this. I embed the marked_bytes array for a worker into
>>> the CMTask for that worker to save a de-reference. This was one of
>>> the requests from the original code walk-through. Avoiding the
>>> de-reference in the CMTask::do_marking_step() shaves a couple of
>>> points off the marking time. I think your suggestion would reinstate
>>> the de-reference again and we would lose those few percentage points
>>> again.
>>>
>>>> If you don't agree with the suggestion above I would suggest to
>>>> change the name from count_marked_bytes_for() to
>>>> count_marked_bytes_array_for() since in every place that it is
>>>> being called the resulting value is stored in a local variable
>>>> called marked_bytes_array, which seems like a more informative name
>>>> to me.
>>>
>>> Done. I agree - the new name sounds better.
>>>
>>>> I think this comment:
>>>>
>>>> // As above - but we don't know the heap region containing the
>>>> // object and so have to supply it.
>>>> inline bool par_mark_and_count(oop obj, int worker_i);
>>>>
>>>> should be something like "we don't know the heap region containing
>>>> the object so we will have to look it up".
>>>>
>>>> Same thing here:
>>>>
>>>> // As above - but we don't have the heap region containing the
>>>> // object, so we have to supply it.
>>>> // Should *not* be called from parallel code.
>>>> inline bool mark_and_count(oop obj);
>>>>
>>>>
>>>
>>> Comments were changed to:
>>>
>>>
>>>> concurrentMark.cpp
>>>>
>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion()
>>>> anyway, could you please remove this unused code (1393-1397):
>>>>
>>>> /*
>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>> obj_card_num - _bottom_card_num,
>>>> obj_last_card_num - _bottom_card_num);
>>>> */
>>>>
>>>>
>>>
>>> Done.
>>>
>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I
>>>> remember Tony mentioning that it won't be called. Do you still want
>>>> to keep the code?
>>>
>>> I removed the entire destructor - I don't see it being called in the
>>> experiments I've run.
>>>
>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>> Probably not worth it, but would it make sense to add information
>>>> in a startsHumongous HeapRegion to be able to give you the last
>>>> continuesHumongous region? Since we know this when we set the
>>>> regions up it seems like a waste to have to iterate over the region
>>>> list to find it.
>>>
>>> If you read the original comment - the original author did not want
>>> to make any assumptions about the internal field values of the
>>> HeapRegions spanned by a humongous object and so used the loop
>>> technique. I think you are correct and I now use the information in
>>> the startsHumongous region to find the index of the last
>>> continuesHumongous region spaned by the H-obj.
>>>
>>>> G1ParFinalCountTask
>>>> To me it is a bit surprising that we mix in the verify code inside
>>>> this closure. Would it be possible to extract this code out somehow?
>>>
>>> I did it this way to avoid another iteration over the heap regions.
>>> But it probably does make more sense to separate them and use
>>> another iteration to do the verify. Done.
>>>
>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on
>>>> doing?
>>>
>>> I removed the comment. I was thinking of doing this as fill_to_bytes
>>> is typically implemented using (a possibly specialized version of)
>>> memset. But it's probably not worth it in this case.
>>>
>>>> G1ParFinalCountTask::work()
>>>> Just for the record. I don't really like the way we have to set up
>>>> both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even
>>>> though we will only use them if we have VerifyDuringGC enabled. I
>>>> realize it is due to the scoping, but I still think it obstucts the
>>>> code flow and introduces unnecessary work. Unfortunately I don't
>>>> have a good suggestion for how to work around it.
>>>>
>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are
>>>> StackObjs I assume it is not possible to get around the issue with
>>>> a ResourceMark.
>>>
>>> Now that the verification is performed in a separate iteration of
>>> the heap regions there's no need to create the
>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here.
>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>
>>> Hopefully the new webrev addresses these comments.
>>>
>>> Thanks again for looking.
>>>
>>> JohnC
>>>
>>
>
More information about the hotspot-gc-dev
mailing list