RFR: Implement protocol for safe OOM during evacuation handling

Zhengyu Gu zgu at redhat.com
Mon Mar 5 15:08:54 UTC 2018


Hi Roman,

> ShenandoahOOMDuringEvacScope
> - The protocol has been designed to allow repeated calls into
> evac_object even if OOM-during-evac is active:
>    - workers may work in strides. OOMing at one object doesn't mean it's
> not attempted again for the next
>    - write-barriers may return, and go into another write-barrier before
> reaching a safepoint
> 
> ... it is ok to do that, the protocol will lead to simple+safe RB when
> calling into evac_object() again.
> 
> - There are situations when we need to *leave* the scope. Most
> importantly, workers (in partial and traversal) need to signal the
> terminator that they are ready, which will cause them to wait for other
> workers to finish, and in which they will *not* be able to give up the
> OOM-counter. We must leave the scope before signalling the terminator,
> and we have ShenandoahOOMDuringEvacScopeLeaver for that. There are a few
> other situations where we need to leave the scope to avoid nested
> scoping. Leaving the scope like this is ok because of the above
> mentioned design to allow repeated calls into the protocol.

This sounds suspicious ... you have a counter that drops to 0 and then
comes back up; I think there can be a race here.

shenandoahTraversalGC.cpp

  483     for (uint i = 0; i < stride; i++) {
  484       if ((q->pop_buffer(task) ||
  485            q->pop_local(task) ||
  486            q->pop_overflow(task) ||
  487            (DO_SATB && satb_mq_set.apply_closure_to_completed_buffer(&satb_cl) && q->pop_buffer(task)) ||
  488            queues->steal(worker_id, &seed, task))) {
  489         conc_mark->do_task<T, true>(q, cl, live_data, &task);
  490       } else {
  491         ShenandoahOOMDuringEvacScopeLeaver oom_scope_leaver;
  492         if (terminator->offer_termination()) return;
  493       }


E.g. at L#491 the counter drops to 0 -> WB returns -> the worker fails to
terminate -> can it evacuate again?
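
To make the sequence concrete, here is a minimal single-threaded sketch of
the counter movement I am worried about. It is not the code from the webrev:
EvacCounter, EvacScope, EvacScopeLeaver and trace_failed_termination() are
hypothetical stand-ins, and I am assuming the leaver simply decrements the
counter on construction and increments it again on destruction. It only
replays the counter values one worker would produce when it takes the else
branch and offer_termination() returns false.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the OOM-during-evac counter (not the webrev code).
class EvacCounter {
  std::atomic<int> _threads_in_evac{0};
public:
  void enter() { _threads_in_evac.fetch_add(1); }
  void leave() {
    int prev = _threads_in_evac.fetch_sub(1);
    assert(prev > 0 && "leave() without matching enter()");
    (void)prev;  // silence unused warning when asserts are compiled out
  }
  int value() const { return _threads_in_evac.load(); }
};

// RAII scope: the thread is counted as "in evacuation" while this lives.
class EvacScope {
  EvacCounter& _c;
public:
  explicit EvacScope(EvacCounter& c) : _c(c) { _c.enter(); }
  ~EvacScope() { _c.leave(); }
};

// RAII leaver: gives up the counter on construction, re-enters on destruction.
class EvacScopeLeaver {
  EvacCounter& _c;
public:
  explicit EvacScopeLeaver(EvacCounter& c) : _c(c) { _c.leave(); }
  ~EvacScopeLeaver() { _c.enter(); }
};

// Replays one worker: inside the scope, it leaves to offer termination,
// termination fails, and the leaver's destructor re-enters the scope.
std::vector<int> trace_failed_termination() {
  EvacCounter counter;
  std::vector<int> trace;
  {
    EvacScope scope(counter);
    trace.push_back(counter.value());    // worker is inside the scope
    {
      EvacScopeLeaver leaver(counter);
      trace.push_back(counter.value());  // counter given up before terminator
      // offer_termination() would return false at this point
    }
    trace.push_back(counter.value());    // leaver destroyed, counter is back
  }
  trace.push_back(counter.value());      // scope closed
  return trace;
}
```

With a single worker the trace is 1, 0, 1, 0. The window I am asking about is
the second entry: any other thread that observes 0 there may conclude that
nobody is evacuating, yet this worker comes back and continues with the next
stride.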


Thanks,

-Zhengyu


> 
> 
> http://cr.openjdk.java.net/~rkennke/safe-oom-during-evac/webrev.08/
> 
> Passes: all hotspot_gc_shenandoah fastdebug/release and specjvm
> fastdebug/release
> 
> ok to push (after the eliminate wb-stub went in) ?
> 
> Roman
> 

