RFR (S): Fix races on full GC request
Aleksey Shipilev
shade at redhat.com
Mon Dec 12 14:36:04 UTC 2016
Hi,
There is yet another semi-race in scheduling Full GC, here:

  void ShenandoahConcurrentThread::do_full_gc(GCCause::Cause cause) {
    assert(Thread::current()->is_Java_thread(), "expect Java thread here");

    MonitorLockerEx ml(&_full_gc_lock);
    schedule_full_gc(); // sets _do_full_gc = true
    _full_gc_cause = cause;

    // Now that full GC is scheduled, we can abort everything else
    ShenandoahHeap::heap()->cancel_concgc(cause);

    while (_do_full_gc) {
      ml.wait();
      OrderAccess::storeload();
    }
    assert(!_do_full_gc, "expect full GC to have completed");
  }

If a thread blocks on _full_gc_lock while a Full GC is in progress, but acquires
the lock only after that Full GC has completed, it will try to schedule full GC
and cancel conc GC again! This mostly happens when full GCs are really short.
In our current code, this also fails the assert in the Shenandoah control thread
that every cancellation should have a reason, such as an impending full GC. This
interesting result comes from the racy unlocked reads of _do_full_gc in the
assertion code.
Both are solved by making _do_full_gc updates atomic/lock-free, and using the
lock only for wait/notify:
http://cr.openjdk.java.net/~shade/shenandoah/cancel-races-again/webrev.02/
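
As a rough illustration of the idea (this is not the webrev change itself), here
is a minimal standalone C++ sketch. It assumes std::atomic, std::mutex and
std::condition_variable as stand-ins for HotSpot's Atomic and Monitor; the class
FullGCRequest and the helper count_schedulers are hypothetical names made up for
this example. Only the thread that wins the CAS on the flag would go on to set
the cause and cancel conc GC; everyone else just waits.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the control thread's full GC state.
class FullGCRequest {
public:
  // Lock-free: exactly one caller flips the flag from false to true.
  bool try_schedule() {
    bool expected = false;
    return _do_full_gc.compare_exchange_strong(expected, true);
  }

  bool is_pending() const { return _do_full_gc.load(); }

  // The lock is used only to block until the flag is cleared.
  void wait_for_completion() {
    std::unique_lock<std::mutex> ml(_lock);
    _cv.wait(ml, [this] { return !_do_full_gc.load(); });
  }

  // Control thread side: clear the flag, then wake all waiters.
  void complete() {
    _do_full_gc.store(false);
    std::lock_guard<std::mutex> ml(_lock);
    _cv.notify_all();
  }

private:
  std::atomic<bool> _do_full_gc{false};
  std::mutex _lock;
  std::condition_variable _cv;
};

// Runs `requesters` threads against one GC cycle and returns how many of
// them actually scheduled it (and so would have cancelled concurrent GC).
int count_schedulers(int requesters) {
  assert(requesters > 0);
  FullGCRequest req;
  std::atomic<int> winners{0};
  std::atomic<int> arrived{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < requesters; i++) {
    threads.emplace_back([&] {
      if (req.try_schedule()) {
        winners.fetch_add(1);  // only the winner would cancel conc GC
      }
      arrived.fetch_add(1);
      req.wait_for_completion();
    });
  }
  // Stand-in control thread: "finish" the GC once every request is in.
  while (arrived.load() < requesters) {
    std::this_thread::yield();
  }
  req.complete();
  for (auto& t : threads) t.join();
  return winners.load();
}
```

In this sketch, a thread that shows up while a request is still pending fails the
CAS and only waits, so it never reschedules or cancels anything. A request made
after complete() succeeds again, which is the intended behavior for a genuinely
new Full GC request.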
Testing: hotspot_gc_shenandoah, jcstress
Thanks,
-Aleksey
P.S. I swear to God, another race there, and I will burn the entire termination
protocol thing down.