RFR (S): Fix races on full GC request

Aleksey Shipilev shade at redhat.com
Mon Dec 12 14:36:04 UTC 2016


Hi,

There is yet another semi-race in scheduling Full GC, here:

 173 void ShenandoahConcurrentThread::do_full_gc(GCCause::Cause cause) {
 175   assert(Thread::current()->is_Java_thread(), "expect Java thread here");
 176
 177   MonitorLockerEx ml(&_full_gc_lock);
 178   schedule_full_gc(); // sets _do_full_gc = true
 179   _full_gc_cause = cause;
 180
 181   // Now that full GC is scheduled, we can abort everything else
 182   ShenandoahHeap::heap()->cancel_concgc(cause);
 183
 184   while (_do_full_gc) {
 185     ml.wait();
 186     OrderAccess::storeload();
 187   }
 188   assert(!_do_full_gc, "expect full GC to have completed");
 189 }

If there is a thread that blocked on _full_gc_lock when Full GC had started, but
re-entered after Full GC is completed, it would try to schedule full GC / cancel
conc GC again! This mostly happens when full GCs are really short.

In our current code, this also fails the assert in Shenandoah control thread
that every cancellation should have a reason, like impending full GC. This
interesting result is because there are racy unlocked gets of _do_full_gc in
assertion code.

Both are solved by turning _do_full_gc updates atomic/lock-free, and using the
lock only for wait/notifies:
  http://cr.openjdk.java.net/~shade/shenandoah/cancel-races-again/webrev.02/

Testing: hotspot_gc_shenandoah, jcstress

Thanks,
-Aleksey

P.S. I swear to God, another race there, and I will burn the entire termination
protocol thing down.



More information about the shenandoah-dev mailing list