CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause

Wed Jan 25 14:02:22 UTC 2012

Tony,

This looks good.

One minor thing. Is it possible to rename CMRootRegions::reset() ? From 
the name I would have expected it to kind of set its fields to NULL or 
something similar. But what it does is set up the CMRootRegion for the 
upcoming concurrent scanning. How about CMRootRegions::setup() or 
CMRootRegions::prepare() instead?

Just a reminder (I guess you will remember to do this anyway):

concurrentMark.inline.hpp
  // TODO: make sure we pass hr to par_mark_and_count() after merging
   // with John's changes.

Also, about the prefetching in void ConcurrentMark::scanRootRegion(). 
Did you measure any performance impact of this? Why did you decide to 
include it?

Bengt

On 2012-01-20 18:13, Tony Printezis wrote:
> Hi all,
>
> New webrev for this based on comments from John:
>
> http://cr.openjdk.java.net/~tonyp/7127706/webrev.3/
>
> Now, the CMRootRegions::claim_next() method does not check the 
> ConcurrentMark::has_aborted() flag to know when to abort (and return 
> NULL) but, instead, a _has_aborted flag I added to the CMRootRegions 
> class. This is set to true at the start of a Full GC and before the 
> Full GC waits for the root region scan to finish.
>
> Additionally, I now call _root_regions.reset() from 
> CM::checkpointRootsInitialPost()instead of calling it explicitly from 
> do_collection_pause_at_safepoint().
>
> Tony
>
> On 01/19/2012 09:40 AM, Tony Printezis wrote:
>> Bengt (and all),
>>
>> Updated webrev using "root regions" now:
>>
>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.2/
>>
>> Tony
>>
>> On 01/19/2012 09:04 AM, Tony Printezis wrote:
>>> Bengt,
>>>
>>> Inline (again!)
>>>
>>> On 1/19/2012 4:03 AM, Bengt Rutisson wrote:
>>>>
>>>> Tony,
>>>>
>>>> On 2012-01-18 22:31, Tony Printezis wrote:
>>>>> Bengt,
>>>>>
>>>>> Here's a webrev with the renaming:
>>>>>
>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/
>>>>>
>>>>> I have to say I'm not sure I really like the term "initial-mark / 
>>>>> IM snapshot regions". I'll try to come up with an alternative name 
>>>>> for them....
>>>>
>>>> Looked quickly at the new webrev.
>>>>
>>>> I agree that IM-snapshot might not be optimal. Still I like the 
>>>> fact that it is not just "snapshot" since I think that can easily 
>>>> be confused with the SATB terms. 
>>>
>>> Well, it's supposed to be. In SATB, anything that's reachable from 
>>> the "snapshot" at initial-mark time will be retained. Since we want 
>>> to avoid explicitly marking the survivors, we make them all 
>>> implicitly live and part of the "snapshot" which is why we have to 
>>> scan them (the same way we scan roots during the initial-mark 
>>> pause). So, "snapshot regions" is not unreasonable given that if we 
>>> were not using SATB we would not be able to do this.
>>>
>>>> It is of course part of the SATB snapshot, but not the whole thing.
>>>>
>>>> Just thinking aloud here, what about not using the word "snapshot" 
>>>> at all? How about "to_be_scanned_regions", "root_regions" or 
>>>> "concurrent_roots"?
>>>
>>> I like "root regions". Let's go with that (it's shorter too!).
>>>
>>> Tony
>>>
>>>> Really not sure what a good name is here...I kind of like the log 
>>>> message "Concurrent root scanning took 0.000x ms".
>>>>
>>>> Bengt
>>>>
>>>>>
>>>>> Tony
>>>>>
>>>>> Tony Printezis wrote:
>>>>>> Hi Bengt,
>>>>>>
>>>>>> Thanks for looking at this so quickly! Inline.
>>>>>>
>>>>>> Bengt Rutisson wrote:
>>>>>>>
>>>>>>> Tony,
>>>>>>>
>>>>>>> Overall this looks really good. Thanks for fixing it.
>>>>>>>
>>>>>>> Some comments:
>>>>>>>
>>>>>>> First, a general question regarding naming and logging. We now 
>>>>>>> talk about "snapshot" a lot. It is a pretty good name, but maybe 
>>>>>>> it needs some more context to be understandable in the code and 
>>>>>>> the GC log. I don't have any really good names, but maybe 
>>>>>>> "survivor_snapshot"
>>>>>>
>>>>>> I'd rather not mention "survivors" given that we might add 
>>>>>> non-survivor regions in the future.
>>>>>>
>>>>>>> or "initial_mark_snapshot"?
>>>>>>
>>>>>> I like "initial-mark snapshot" better. Having said that 
>>>>>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions 
>>>>>> are kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and 
>>>>>> _im_snapshot_regions if that's OK.
>>>>>>
>>>>>>> concurrentMark.inline.hpp
>>>>>>>
>>>>>>>   if (hr == NULL) {
>>>>>>>     hr = _g1h->heap_region_containing_raw(addr);
>>>>>>>     // Given that we're looking for a region that contains an 
>>>>>>> object
>>>>>>>     // header it's impossible to get back a HC region.
>>>>>>>     assert(!hr->continuesHumongous(), "sanity");
>>>>>>>   } else {
>>>>>>>     assert(hr->is_in(addr), "pre-condition");
>>>>>>>   }
>>>>>>>
>>>>>>> The first assert should probably hold even for regions that are 
>>>>>>> passed in to grayRoot() right? So, maybe something like:
>>>>>>>
>>>>>>>   if (hr == NULL) {
>>>>>>>     hr = _g1h->heap_region_containing_raw(addr);
>>>>>>>   } else {
>>>>>>>     assert(hr->is_in(addr), "pre-condition");
>>>>>>>   }
>>>>>>>   // Given that we need a region that contains an object
>>>>>>>   // header it's impossible for it to be a HC region.
>>>>>>>   assert(!hr->continuesHumongous(), "sanity");
>>>>>>
>>>>>> Good observation! I changed to the above.
>>>>>>
>>>>>>> concurrentMarkThread.cpp
>>>>>>>
>>>>>>> ConcurrentMarkThread::run()
>>>>>>>
>>>>>>> Why do we do the explicit time/date stamping?
>>>>>>>
>>>>>>>           gclog_or_tty->date_stamp(PrintGCDateStamps);
>>>>>>>           gclog_or_tty->stamp(PrintGCTimeStamps);
>>>>>>>           gclog_or_tty->print_cr("[GC 
>>>>>>> concurrent-snapshot-scan-start]");
>>>>>>>
>>>>>>> why is it not enough with the normal -XX:+PrintGCTimeStamps 
>>>>>>> information? 
>>>>>>
>>>>>> Not quite sure what you mean with "is it not enough with the 
>>>>>> normal ... information". Each log record needs either a GC time 
>>>>>> stamp or a GC date stamp and we have to print either or both 
>>>>>> depending on the two -XX parameters. Unfortunately, the logging 
>>>>>> code has not been well abstracted and/or refactored so we have 
>>>>>> this unfortunate replication throughout the GCs.
>>>>>>
>>>>>>> This is probably correct since I see this pattern in other 
>>>>>>> places. But I would like to understand why we do it.
>>>>>>>
>>>>>>>
>>>>>>> g1CollectedHeap.cpp:
>>>>>>>
>>>>>>> G1CollectedHeap::do_collection()
>>>>>>>
>>>>>>> Is it worth logging how long we had to wait in 
>>>>>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same 
>>>>>>> way that we do in 
>>>>>>> G1CollectedHeap::do_collection_pause_at_safepoint()?
>>>>>>
>>>>>> Currently, the GC log records for the evacuation pauses have a 
>>>>>> lot of extra information when +PrintGCDetails is set and it was 
>>>>>> reasonable to add an extra record with the wait time. And it's 
>>>>>> more important to know how the wait for snapshot region scanning 
>>>>>> affects evacuation pauses, which are more critical. The Full GC 
>>>>>> log records are currently one line and I don't think we want to 
>>>>>> extend them further (at least, not before we put a decent GC 
>>>>>> logging framework in place). On the other hand, the snapshot 
>>>>>> region scanning aborts when a marking cycle is aborted due to a 
>>>>>> Full GC. So, this wait time should not be long. How about I add a 
>>>>>> comment in the code saying that, when we introduce a more 
>>>>>> extensible logging framework, we could add the wait time to the 
>>>>>> Full GC log records? Something like:
>>>>>>
>>>>>>    // Note: When we have a more flexible GC logging framework that
>>>>>>    // allows us to add optional attributes to a GC log record we
>>>>>>    // could consider timing and reporting how long we wait in the
>>>>>>    // following two methods.
>>>>>>     wait_while_free_regions_coming();
>>>>>>    // ...
>>>>>>    _cm->snapshot_regions()->wait_until_scan_finished();
>>>>>>
>>>>>>
>>>>>>> Finally, just some food for thought. Could this be generalized 
>>>>>>> to more roots? I mean take a snapshot and scan it concurrently.
>>>>>>
>>>>>> By scanning the IM snapshot regions we say "these guys are all 
>>>>>> roots, instead of scanning them during the GC we will scan them 
>>>>>> concurrently". And we can do that for any object / region a) as 
>>>>>> long as we know they will not move while they are being scanned 
>>>>>> and b) because we have the pre-barrier. If any references on the 
>>>>>> snapshot objects are updated, the pre-barrier ensures that their 
>>>>>> values at the start of marking will be enqueued and processed.
>>>>>>
>>>>>> For external arbitrary roots we don't have a write barrier (and 
>>>>>> we shouldn't as it'd be too expensive). So, we cannot do that for 
>>>>>> non-object roots without a "pre-barrier"-type mechanism.
>>>>>>
>>>>>> Tony
>>>>>>
>>>>>>
>>>>>>> Bengt
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2012-01-18 00:48, Tony Printezis wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Can I have a couple of code reviews for this change that 
>>>>>>>> re-enables the use of survivor regions during the initial-mark 
>>>>>>>> pause?
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/
>>>>>>>>
>>>>>>>> From the CR:
>>>>>>>>
>>>>>>>> We could scan the survivors as we're copying them, however this 
>>>>>>>> will require more work during the initial-mark GCs (and in 
>>>>>>>> particular: special-case code in the fast path).
>>>>>>>>
>>>>>>>> A better approach is to let the concurrent marking threads scan 
>>>>>>>> the survivors and mark everything reachable from them a) before 
>>>>>>>> any more concurrent marking work is done (so that we can just 
>>>>>>>> mark the objects, without needing to push them on a stack, and 
>>>>>>>> let the "finger" algorithm discover them) and b) before the 
>>>>>>>> next GC starts (since, if we copy them, we won't know which of 
>>>>>>>> the new survivors are the ones we need to scan).
>>>>>>>>
>>>>>>>> This approach has the advantage that it does not require any 
>>>>>>>> extra work during the initial-mark GCs and all the work is done 
>>>>>>>> by the concurrent marking threads. However, it has the 
>>>>>>>> disadvantage that the survivor scanning might hold up the next 
>>>>>>>> GC. In most cases this should not be an issue as GCs take place 
>>>>>>>> at a reasonably low rate. If it does become a problem we could 
>>>>>>>> consider the following:
>>>>>>>>
>>>>>>>> - like when the GC locker is active, try to extend the eden to 
>>>>>>>> give a bit more time to the marking threads to finish scanning 
>>>>>>>> the survivors
>>>>>>>> - instead of waiting for the marking threads, a GC can take 
>>>>>>>> over and finish up scanning the remaining survivors (typically, 
>>>>>>>> we have more GC threads than marking threads, so the overhead 
>>>>>>>> will be reduced)
>>>>>>>> - if we supported region pinning, we could pin all the regions 
>>>>>>>> that were not scanned by the time the GC started so that the 
>>>>>>>> marking threads can resume scanning them after the GC completes
>>>>>>>>
>>>>>>>> Implementation notes:
>>>>>>>>
>>>>>>>> I introduced the concept of a "snapshot regions" in the 
>>>>>>>> ConcurrentMark which is a set of regions that need to be 
>>>>>>>> scanned at the start of a concurrent cycle. Currently, these 
>>>>>>>> can only be survivors but maybe we can use the same concept for 
>>>>>>>> something else in the future.
>>>>>>>>
>>>>>>>> Tony
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>