RFR: Traveral GC heuristics
Roman Kennke
rkennke at redhat.com
Tue Jan 23 20:41:10 UTC 2018
Am 23.01.2018 um 17:06 schrieb Aleksey Shipilev:
> On 01/22/2018 11:17 PM, Roman Kennke wrote:
>> - when growing the heap, we must make sure that the TAMS points to end for the new regions,
>> otherwise we'd treat them implicitely marked.
>
> Is this the source of this spooky change?
>
> inline ShenandoahHeapRegion* get(size_t i) const {
> - assert (i < _active_end, "sanity");
> + assert (i < _reserved_end, "sanity");
> return _regions[i];
> }
>
> get() is supposed to only return added regions. I think if you return something farther the
> _active_end, you read garbage.
All SHR are created and added to the regions list at the start. Which
means iterating to active_end actually does what I wanted. Reverted back
that change.
>> - folded 'SATB' queue processing into thread stack scanning. The problem here is that iterating
>> the threads 2x is cumbersome because of the claiming protocol: we need to fire the task/workers
>> 2x: once for the SATB queues, once for the thread scanning. I folded it into one pass. This
>> required a (trivial) extension in the upstream parallel thread scanning/iteration protocol.
>
> So this is the source for extension of RootProcessor and Thread methods? I am a bit uneasy about
> this (mostly because it raises backporting questions). I'd rather include this into RootProcessor
> right away, and assert nothing passes non-NULL ThreadClosure there. Then, sh/jdk8u, sh/jdk9 and
> sh/jdk10 versions would agree on the shape of RootProcessor methods and the calls to it, while
> sh/jdk10 would call RootProcessor with non-NULL ThreadClosure, and it would *also* implement the
> relevant parts in Thread.
Done in separate patch.
>> - I tripped an assert in SHR::increase_live_data(). I think the reason is that we have a race
>> here: a GC thread might not yet see the updated SHR::_top but already accounts for the updated
>> live data. I excluded conc-traversal from that check. This could probably be fixed by doing the
>> proper concurrency membars, but do we care? For assertion code?
>
> But wait, this change means "s" is greater than max_jint on some paths during Traversal?!
>
> inline void ShenandoahHeapRegion::increase_live_data_words(size_t s) {
> - assert (s <= (size_t)max_jint, "sanity");
> + assert (s <= (size_t)max_jint || _heap->is_concurrent_traversal_in_progress(), "sanity");
> increase_live_data_words((int)s);
> }
I removed the change to that assert. We only need the other one.
> Also, I am confused where Traversal calls increase_live_data_words(size_t), because both call sites
> are already protected:
>
> if (!sh->is_concurrent_traversal_in_progress()) {
> r->increase_live_data_words(used_words);
> }
>
> ...
>
> if (!ShenandoahHeap::heap()->is_concurrent_traversal_in_progress()) {
> r->increase_live_data_words(word_size);
> }
It's called from Traveral's own code.
> Since we have "adaptive" failures with C2 and/or -ShWBMemBar, I propose we chicken out, and drop all
> C2 changes (apart from the actual enqueue_barrier) from this change, then follow up on optimization
> story in subsequent changesets. This way we could integrated Traversal GC, and not risk immediate
> regression in non-Traversal code.
>
> This is done, along with other minor touchups here (apply over webrev.04):
> http://cr.openjdk.java.net/~shade/shenandoah/traversal-shade-updates-1.patch
Cool, thanks.
Differential patch:
http://cr.openjdk.java.net/~rkennke/traversal/webrev.05.diff/
Full patch, including your changes:
http://cr.openjdk.java.net/~rkennke/traversal/webrev.05/
(give it some seconds to fully upload)
Roman
More information about the shenandoah-dev
mailing list