RFR: Traveral GC heuristics

Tue Jan 23 17:23:49 UTC 2018

Am 23. Januar 2018 17:06:20 MEZ schrieb Aleksey Shipilev <shade at redhat.com>:
>On 01/22/2018 11:17 PM, Roman Kennke wrote:
>> - when growing the heap, we must make sure that the TAMS points to
>end for the new regions, 
>> otherwise we'd treat them implicitely marked.
>
>Is this the source of this spooky change?
>
>   inline ShenandoahHeapRegion* get(size_t i) const {
>-    assert (i < _active_end, "sanity");
>+    assert (i < _reserved_end, "sanity");
>     return _regions[i];
>   }
>
>get() is supposed to only return added regions. I think if you return
>something farther the
>_active_end, you read garbage.

I think all regions are initialized, but no memory allocated?

>> - folded 'SATB' queue processing into thread stack scanning. The
>problem here is that iterating 
>> the threads 2x is cumbersome because of the claiming protocol: we
>need to fire the task/workers
>> 2x: once for the SATB queues, once for the thread scanning. I folded
>it into one pass. This
>> required a (trivial) extension in the upstream parallel thread
>scanning/iteration protocol.
>
>So this is the source for extension of RootProcessor and Thread
>methods?

Yes.

> I am a bit uneasy about
>this (mostly because it raises backporting questions). I'd rather
>include this into RootProcessor
>right away, and assert nothing passes non-NULL ThreadClosure there.

OK, can break that out of the patch.

>Then, sh/jdk8u, sh/jdk9 and
>sh/jdk10 versions would agree on the shape of RootProcessor methods and
>the calls to it, while
>sh/jdk10 would call RootProcessor with non-NULL ThreadClosure, and it
>would *also* implement the
>relevant parts in Thread.

Makes sense.

>> - I tripped an assert in SHR::increase_live_data(). I think the
>reason is that we have a race 
>> here: a GC thread might not yet see the updated SHR::_top but already
>accounts for the updated 
>> live data. I excluded conc-traversal from that check. This could
>probably be fixed by doing the
>> proper concurrency membars, but do we care? For assertion code?
>
>But wait, this change means "s" is greater than max_jint on some paths
>during Traversal?!
>
> inline void ShenandoahHeapRegion::increase_live_data_words(size_t s) {
>-  assert (s <= (size_t)max_jint, "sanity");
>+  assert (s <= (size_t)max_jint ||
>_heap->is_concurrent_traversal_in_progress(), "sanity");
>   increase_live_data_words((int)s);
> }

Gah. No. Will fix it.

Stay tuned for updated patch.

>
>Also, I am confused where Traversal calls
>increase_live_data_words(size_t), because both call sites
>are already protected:
>
>    if (!sh->is_concurrent_traversal_in_progress()) {
>      r->increase_live_data_words(used_words);
>    }
>
>...
>
>  if (!ShenandoahHeap::heap()->is_concurrent_traversal_in_progress()) {
>      r->increase_live_data_words(word_size);
>    }
>
>> Full:
>> http://cr.openjdk.java.net/~rkennke/traversal/webrev.04/
>
>Since we have "adaptive" failures with C2 and/or -ShWBMemBar, I propose
>we chicken out, and drop all
>C2 changes (apart from the actual enqueue_barrier) from this change,
>then follow up on optimization
>story in subsequent changesets. This way we could integrated Traversal
>GC, and not risk immediate
>regression in non-Traversal code.
>
>This is done, along with other minor touchups here (apply over
>webrev.04):
>http://cr.openjdk.java.net/~shade/shenandoah/traversal-shade-updates-1.patch
>
>Thanks,
>-Aleksey

-- 
Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.