CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces

Tue Dec 27 18:05:12 UTC 2011

Hi all,

Here's an updated webrev for this change that takes into account the new 
approach of chunking object arrays (see previous e-mails on 7121623):

http://cr.openjdk.java.net/~tonyp/6888336/webrev.1/

If anything, the new approach simplified the code a bit, since now we 
can always read an object's size from its from-image instead of having 
to check one or the other depending on whether it's a chunked array or 
not. I also moved the body of some methods from heapRegion.hpp to the 
.inline.hpp and .cpp files (as they were getting a bit large to keep in 
the .hpp file).

Tony

On 12/21/2011 05:37 PM, Tony Printezis wrote:
> Hi all,
>
> I'd like a couple of code reviews for the following non-trivial 
> changes (large, not necessarily in lines of code modified, but 
> because the evacuation pause / concurrent marking interaction is 
> changed quite dramatically):
>
> http://cr.openjdk.java.net/~tonyp/6888336/webrev.0/
>
> Here's some background, motivation, and a summary of the changes (I 
> felt that it was important to write a longer than usual explanation):
>
> * Background / Motivation
>
> Each G1 heap region has a field top-at-mark-start (aka TAMS) which 
> denotes where the top of the region was when marking started. An 
> object is considered implicitly live if it's over TAMS (i.e., it was 
> allocated since marking started) or explicitly live if it's below TAMS 
> (i.e., it was allocated before marking started) and marked on the 
> bitmap. (It follows that it's unnecessary to explicitly mark objects 
> over TAMS.)
>
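
A minimal sketch of the liveness rule described above, assuming a toy
region model with word-indexed addresses and one mark bit per address.
RegionSketch and all of its members are hypothetical names used only for
illustration; this is not HotSpot's HeapRegion. "Over TAMS" is taken
here to mean "at or above TAMS".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a region (word-indexed addresses).
// This is not HotSpot's HeapRegion; all names here are illustrative.
struct RegionSketch {
    std::size_t bottom;        // first address in the region
    std::size_t top;           // next free address (bump pointer)
    std::size_t tams;          // value of top when marking started
    std::vector<bool> bitmap;  // one mark bit per address, offset by bottom

    RegionSketch(std::size_t bottom_, std::size_t capacity)
        : bottom(bottom_), top(bottom_), tams(bottom_),
          bitmap(capacity, false) {}

    // Bump-allocate `words` addresses and return the object's start.
    std::size_t allocate(std::size_t words) {
        assert(top + words <= bottom + bitmap.size());
        std::size_t obj = top;
        top += words;
        return obj;
    }

    // Record where top is at the moment marking starts.
    void note_start_of_marking() { tams = top; }

    // Explicit marks are only needed (and only allowed here) below TAMS.
    void mark(std::size_t obj) {
        assert(obj >= bottom && obj < tams);
        bitmap[obj - bottom] = true;
    }

    // Allocated since marking started: live without a mark bit.
    bool is_implicitly_live(std::size_t obj) const { return obj >= tams; }

    // Allocated before marking started: live only if marked on the bitmap.
    bool is_explicitly_live(std::size_t obj) const {
        return obj < tams && bitmap[obj - bottom];
    }

    bool is_live(std::size_t obj) const {
        return is_implicitly_live(obj) || is_explicitly_live(obj);
    }
};
```

In this model an object allocated after note_start_of_marking() is live
without ever touching the bitmap, which is the property the rest of the
proposal relies on.
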
> In fact, we have two copies of the above marking information: "Next 
> TAMS / Next Bitmap" and "Prev TAMS / Prev Bitmap". Prev is the copy 
> that was obtained by the last marking cycle that was successfully 
> completed (so, it is consistent: all live objects should appear as 
> live in the prev marking information). Next is the copy that will be 
> obtained / is currently being obtained and it's not consistent because 
> it's not guaranteed to be complete.
>
> G1 uses SATB marking, which has the advantage of not requiring objects 
> allocated since the start of marking to be visited at all by the 
> marking threads (they are implicitly live and they do not need to be 
> scanned). So, the active marking cycle can totally ignore objects over 
> NTAMS (since they have been allocated since marking started).
>
> The current interaction between evacuation pauses (let's call these 
> "GCs" from now on) and concurrent marking is very tricky. Even though 
> marking ignores all objects over NTAMS (currently: all objects in Eden 
> regions) it still has to visit and mark objects in the Survivor 
> regions. But those will be moved by subsequent GCs. So, a GC needs to 
> be aware that it's moving objects that have been marked by the marking 
> threads and not only propagate those marks but also notify the marking 
> threads that said objects have been moved. For that we use several 
> data structures: pushes to the global marking stack and also to what's 
> referred to as the "region stack" which is only used by the GC to push 
> a group of objects instead of pushing them individually ("region" 
> here is a mem region and smaller than a G1 region).
>
> Additionally, because the marking threads could come across objects 
> that could potentially move we have to make sure that we don't leave 
> references to regions that have been evacuated on any marking data 
> structure. To do that we treat as roots all entries on the taskqueues 
> / global stack and drain all SATB buffers (both active buffers and 
> also enqueued buffers).
>
> The first issue with the above interaction is performance. Draining 
> all SATB buffers and scanning the mark stack and taskqueues has been 
> shown to be very time-consuming in some cases. 
> Also, having to check whether objects are marked and propagate the 
> marks appropriately during GC is an extra overhead.
>
> The second issue is that it has been shown to be very fragile. We have 
> discovered and fixed many issues over time which were subtle and hard 
> to reproduce.
>
> We really need to simplify the GC/marking interaction to both improve 
> performance of GCs during marking, as well as improve our reliability. 
> This changeset does exactly that.
>
> * Explanation of the changes
>
> The goal is to ensure that all the objects that are copied by the GC 
> do not need to be visited by the marking threads and as a result do 
> not need to be explicitly marked, pushed, etc.
>
> The first observation is that most objects copied during a GC are 
> allocated after marking starts and are therefore implicitly live. This 
> is the case for all objects on Eden regions, as well as most objects 
> on Survivor regions. The only exceptions are objects on the Survivor 
> regions during the initial-mark pause. Unfortunately, it's not easy to 
> track those separately as they will get mixed in with future 
> Survivors. The first decision to deal with this is to turn off 
> Survivors during the initial-mark pause. This ensures that each 
> subsequent GC will only copy objects that have been allocated since 
> marking started and are therefore implicitly live (i.e., over 
> NTAMS). This allows us to totally eliminate the code that 
> propagates marks during the GC. We just have to make sure that all 
> copied objects are over NTAMS. Turning off Survivors during an 
> initial-mark pause is a bit of a "big hammer" approach, but it will 
> suffice for now. We have ideas on how to re-enable them in the future 
> and we'll explore a couple of alternatives.
>
> Given that the GC only copies objects that are implicitly marked it 
> follows that none of the objects that are copied during any GC should 
> appear on either the taskqueues or the global marking stack. Also 
> remember that we filter SATB buffers before enqueueing them which will 
> filter out all implicitly marked objects. It follows that no enqueued 
> SATB buffer should have references to objects that are being moved. 
> This leaves the currently active SATB buffers given that the code that 
> populates them is unconditional. But if we run the filtering on those 
> during each GC such "offending" references are also quickly 
> eliminated. So, instead of having to scan all stacks and all SATB 
> buffers we only have to filter the active SATB buffers, which should 
> be much, much faster.
>
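
The filtering step above can be sketched as follows. This is only an
illustration under stated assumptions: addresses are plain integers,
RegionBounds, is_implicitly_marked() and filter_satb_buffer() are
hypothetical names, and the filter applies only the "implicitly marked"
criterion mentioned in this message (the real filter may drop other
entries as well).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of a region: [bottom, end) plus its NTAMS.
struct RegionBounds {
    std::size_t bottom;  // inclusive
    std::size_t end;     // exclusive
    std::size_t ntams;   // next top-at-mark-start for this region
};

// An address is implicitly marked if it lies at or above the NTAMS of
// the region that contains it. Addresses outside every known region are
// treated as "not implicitly marked" so that they are kept.
inline bool is_implicitly_marked(const std::vector<RegionBounds>& regions,
                                 std::size_t addr) {
    for (const RegionBounds& r : regions) {
        if (addr >= r.bottom && addr < r.end) {
            return addr >= r.ntams;
        }
    }
    return false;
}

// Remove every implicitly marked entry from an active SATB buffer, in
// place, preserving the order of the remaining entries. Returns the
// number of entries that were removed.
inline std::size_t filter_satb_buffer(std::vector<std::size_t>& buffer,
                                      const std::vector<RegionBounds>& regions) {
    const std::size_t before = buffer.size();
    buffer.erase(
        std::remove_if(buffer.begin(), buffer.end(),
                       [&regions](std::size_t addr) {
                           return is_implicitly_marked(regions, addr);
                       }),
        buffer.end());
    return before - buffer.size();
}
```

Since every object the GC moves is over NTAMS, running such a filter on
the active buffers during each GC leaves no entry that refers to a moved
object, without scanning the stacks or the enqueued buffers.
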
> * Implementation Notes
>
> The actual changes are not too extensive as they mostly disable 
> functionality in the GC code. The tricky part was to get the 
> TAMS fields correct at various phases (start of copying, start of 
> marking, etc.) and especially when an evacuation failure occurs. I put 
> all that functionality in methods on HeapRegion which do the right 
> thing when a GC starts, a marking starts, etc.
>
> The most important changes are in the "main" GC code, i.e. 
> G1ParCopyHelper::do_oop_work() and 
> G1ParCopyHelper::copy_to_survivor_space(). Instead of having to 
> propagate marks we now only need to mark objects directly reachable 
> from roots during the initial-mark pause. The resulting code is much 
> simplified (and hopefully more performant!).
>
> I also added a method verify_no_cset_oops() which checks that indeed 
> all the marking data structures do not point to regions that are being 
> GCed at the start / end of each GC. (BTW, I'm considering adding a 
> develop flag to enable this on demand.)
>
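
A rough sketch of the kind of check described above, assuming the
marking data structures can be viewed as flat lists of addresses and the
collection set as a list of address ranges. AddrRange, in_cset() and
no_cset_oops() are hypothetical names; the verify_no_cset_oops() in the
webrev is the authoritative version and may differ substantially.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical half-open address range [start, end) for one region in
// the collection set.
struct AddrRange {
    std::size_t start;
    std::size_t end;
    bool contains(std::size_t addr) const {
        return addr >= start && addr < end;
    }
};

// True if addr falls inside any region of the collection set.
inline bool in_cset(const std::vector<AddrRange>& cset, std::size_t addr) {
    return std::any_of(cset.begin(), cset.end(),
                       [addr](const AddrRange& r) { return r.contains(addr); });
}

// True if none of the entries of one marking data structure (mark
// stack, task queue, SATB buffer, ...) points into the collection set.
inline bool no_cset_oops(const std::vector<std::size_t>& entries,
                         const std::vector<AddrRange>& cset) {
    return std::none_of(entries.begin(), entries.end(),
                        [&cset](std::size_t addr) { return in_cset(cset, addr); });
}
```

A caller would run this over each marking data structure at the start
and end of a GC and fail hard if it returns false.
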
> I should point out that this changeset will leave a lot of dead code. 
> However, I took the decision to keep the changes to a minimum in order 
> not to overwhelm the code reviewers and to make the important changes 
> clearer. (I also discussed this with a couple of potential code 
> reviewers and they agreed that this is a good approach.) I temporarily 
> added guarantees to ensure that methods that should not be called are 
> not called. I will remove all dead code with a future push.
>
> I also have to apologize to John Cuthbertson for removing a lot of 
> code he's added to deal with various bugs we had in the GC/marking 
> interaction. Hopefully the new code will be less fragile compared to 
> what we've had so far and John will be able to concentrate on more 
> interesting features than trying to track down hard-to-reproduce 
> failures!
>
> Tony
>