RFR: Traveral GC heuristics
Zhengyu Gu
zgu at redhat.com
Wed Jan 17 17:10:41 UTC 2018
shenandoahOopClosures.hpp:
Missing string dedup version
shenandoahSupport.cpp
L#615 - 656
L#3537 - 3556
L#3981 - 4056
indent
sharedRuntime.cpp
213 assert(oopDesc::is_oop(orig, true /* ignore mark word */), "Error");
214 // store the original value that was in the field reference
215 if (UseShenandoahGC) { ShenandoahBarrierSet::enqueue(orig); }
216 return;
217 thread->satb_mark_queue().enqueue(orig);
218 JRT_END
L#216: does not look right. Should it be inside UseShenandoahGC block?
Thanks,
-Zhengyu
On 01/17/2018 09:37 AM, Roman Kennke wrote:
> Testing showed up some regressions in non-traversal code and two issues
> that I introduced (or haven't fixed) when single-flag patch arrived.
>
> The following now passes hotspot_gc_shenandoah tests and runs of specjvm
> with fastdebug with -XX:+ShenandoahVerify
> -XX:+ShenandoahGCHeuristics=traversal, with -XX:TieredStopAtLevel=0|1|4
>
> Differential:
> http://cr.openjdk.java.net/~rkennke/traversal/webrev.01.diff/
> Full:
> http://cr.openjdk.java.net/~rkennke/traversal/webrev.01/
>
> Please review, test, comment, etc. :-)
>
> Cheers, Roman
>
>> This started out as a smallish partial-GC experiment, then into a
>> clone of partial GC, and ended up as a standalone GC mode for
>> Shenandoah, which is a frankensteinization of
>> partial+concurrent-marking, with some goodies :-)
>>
>> The idea is to do everything, marking+evacuation+update-refs, in one
>> single phase. This is not very difficult to do: while traversing,
>> evacuate objects that are in the Cset, and update references as we go.
>> I chose to traverse the heap using an incremental-update approach,
>> mostly because this is what partial GC does, and as said above, this
>> started out as a clone of partial :-)
>>
>> The tricky part is to choose the Cset: I made it such that each GC
>> cycle collects liveness information, and bases the decision about Cset
>> in the next cycle on that liveness information. Yes, this means the
>> first cycle does not collect anything (except immediate garbage).
>>
>> Advantages:
>> - obviously, touching all live objects only once means less time spent
>> in GC. Measurements show that traversing the heap and doing everything
>> is only slightly longer than Shenandoah's marking phase, and this
>> might actually be because we also need to mark through newly allocated
>> objects.
>> - Traversal-order evacuation gives us 10x increase in
>> ordering-sensitive microbenchmark:
>> https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality/
>>
>> - Simpler barriers: i-u style barriers don't need to load the
>> pre-value, and can be optimized much better (hoisted out of hot paths,
>> etc). Some of it is already done in this patch, but there are plenty
>> of opportunities to make it even better.
>> - Possibly less floating garbage because we trace through newly
>> allocated objects too, and don't treat it implicitely live.
>> - we don't need a keep-alive-barrier for Reference.get() which means
>> we keep fewer referents alive just because they happen to be accessed
>> during GC.
>> - MWF is only a switch away (if I understand MWF correctly):
>> -XX:+ShenandoahMWF
>> - It does not need RBs in the WB fast-path, because outside of the
>> single phase, nothing is ever forwarded.
>> - It does not need the membar stuff in the WBs because we turn on/off
>> the phase during safepoint
>>
>> Disadvantages:
>> - Store-value barrier needs to be a WB, RB is not sufficient. The
>> storeval barrier is there to ensure only to-space values ever get
>> written to fields during update-refs. 3-phase Shenandoah doesn't
>> evacuate during update-refs, and therefore RB is enough. We need WB
>> here. (I believe this is off-set by optimization opportunities, see
>> above)
>> - Known I-U problem: mutators can outrun the GC with allocations and
>> let us not terminate.
>> - It needs barriers for constants (need to check this).
>>
>> Stuff left to do:
>> - Implement sane degeneration: if we hit OOM, we simply restart and go
>> into full-GC.
>> - Depending on degen: make heuristics adaptive. Currently it requires
>> manual tweaking of thresholds.
>>
>> Relevant knobs:
>> - ShenandoahGarbageThreshold: regions with more garbage than this go
>> into the Cset. Notice that this is based on the *previous* cycle, so
>> we may actually have much more garbage (but not less).
>> - ShenandoahFreeThreshold: start GC when we have less than that much
>> free heap.
>>
>> I'll not go into all the details for now and give you the code:
>> http://cr.openjdk.java.net/~rkennke/traversal/webrev.00/
>>
>>
>> Roman
>
More information about the shenandoah-dev
mailing list