RFR (S): Fix races on full GC request
Roman Kennke
rkennke at redhat.com
Mon Dec 12 15:20:12 UTC 2016
Hi,
> 173 void ShenandoahConcurrentThread::do_full_gc(GCCause::Cause cause) {
> 175   assert(Thread::current()->is_Java_thread(), "expect Java thread here");
> 176
> 177   MonitorLockerEx ml(&_full_gc_lock);
> 178   schedule_full_gc(); // sets _do_full_gc = true
> 179   _full_gc_cause = cause;
> 180
> 181   // Now that full GC is scheduled, we can abort everything else
> 182   ShenandoahHeap::heap()->cancel_concgc(cause);
> 183
> 184   while (_do_full_gc) {
> 185     ml.wait();
> 186     OrderAccess::storeload();
> 187   }
> 188   assert(!_do_full_gc, "expect full GC to have completed");
> 189 }
>
> If a thread blocked on _full_gc_lock while a Full GC was already
> running, but only acquired the lock after that Full GC completed, it
> would schedule a full GC and cancel concurrent GC all over again!
> This mostly happens when full GCs are really short.
>
> In our current code, this also trips the assert in the Shenandoah
> control thread that every cancellation must have a reason, such as an
> impending full GC. This surprising result comes from racy unlocked
> reads of _do_full_gc in the assertion code.
>
> Both problems are solved by making the _do_full_gc updates
> atomic/lock-free, and using the lock only for waits/notifies:
> http://cr.openjdk.java.net/~shade/shenandoah/cancel-races-again/webrev.02/
Looks good to me.
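
For the archives: one way to realize "atomic flag updates, lock only
for wait/notify" is a CAS on the flag, roughly like the standalone
C++11 sketch below. The class and method names and the std::
primitives are mine, a stand-in for HotSpot's Atomic and
MonitorLockerEx, not the actual webrev code:

  #include <atomic>
  #include <condition_variable>
  #include <mutex>

  // Hypothetical illustration of the pattern, not the Shenandoah code.
  class FullGCCoordinator {
    std::atomic<bool>       _do_full_gc{false};
    std::mutex              _full_gc_lock;   // guards only the wait/notify
    std::condition_variable _full_gc_cond;

  public:
    // Called by a Java thread requesting a full GC.
    void do_full_gc() {
      bool expected = false;
      // Lock-free flag flip: only the thread that turns the flag from
      // false to true cancels concurrent GC. A late arrival that finds
      // the flag already set just waits for the cycle in progress.
      if (_do_full_gc.compare_exchange_strong(expected, true)) {
        cancel_concgc();
      }
      std::unique_lock<std::mutex> ml(_full_gc_lock);
      _full_gc_cond.wait(ml, [this] { return !_do_full_gc.load(); });
    }

    // Called by the control thread once the full GC cycle has finished.
    void notify_full_gc_done() {
      {
        // Clear the flag under the lock so a waiter cannot check the
        // predicate and park between our store and our notify.
        std::lock_guard<std::mutex> lg(_full_gc_lock);
        _do_full_gc.store(false);
      }
      _full_gc_cond.notify_all();
    }

  private:
    void cancel_concgc() { /* abort concurrent marking etc.; elided */ }
  };

The nice property is that the monitor no longer guards any state
transition: whoever wins the CAS owns the cancellation, and everyone
else just parks until the flag is cleared by the control thread.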
> P.S. I swear to God, another race there, and I will burn the entire
> termination protocol thing down.
:-D
I can't count how many races and strange conditions we have already
fixed there. One entire problem class went *puff* away when Zhengyu
suggested simplifying the JNI critical regions. I seriously hope this
is the last one. Otherwise we simply give up on terminating? :-P
Roman