RFR [8]: Fix shutdown deadlock due to blocking SATB flush periodic task
Aleksey Shipilev
shade at redhat.com
Tue Oct 2 14:13:00 UTC 2018
Found the reason for rare shutdown deadlock in sh/jdk8.
It goes like this. There is a DestroyVMThread that does the shutdown sequence. Before notifying
ShHeap about termination via ShHeap::stop(), it stops WatcherThread that executes periodic tasks.
One of those periodic tasks is from Shenandoah, which calls ShHeap::force_satb_flush_all_threads.
The deadlock happens when GC cycle wants to synchronize for safepoint at the same time. To do that,
safepoint synchronizer acquires Threads_lock for iterating the threads, and waits for threads to
report back. At the same time, DestroyVMThread enters the shutdown sequence, and tries to stop
WatcherThread, blocking until WatcherThread is dead, without checking for pending safepoint. But
WatcherThread is locked on Threads_lock inside ShHeap::force_satb_flush_all_threads, which is
already held by safepoint synchronizer, so the whole thing deadlocks. -XX:+SafepointTimeout then
predictably fails with "DestroyVMThread did not reach safepoint".
I think this is our fault: we should not really have blocking periodic tasks. In sh/jdk11 and later,
we have Thread SMR facility that does thread iteration without locking. In sh/jdk8 we can avoid
indefinite blocking by try_lock'ing the Threads_lock. We can give up on setting the flush flag,
because it is a hint anyway.
Goes like this:
diff -r ff9961cece0a src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp
--- a/src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp Tue Oct 02 13:17:33 2018 +0200
+++ b/src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp Tue Oct 02 15:49:45 2018 +0200
@@ -1588,18 +1588,28 @@
void ShenandoahHeap::force_satb_flush_all_threads() {
if (!is_concurrent_mark_in_progress()) {
// No need to flush SATBs
return;
}
- MutexLocker ml(Threads_lock);
- JavaThread::set_force_satb_flush_all_threads(true);
-
- // The threads are not "acquiring" their thread-local data, but it does not
- // hurt to "release" the updates here anyway.
- OrderAccess::fence();
+ // Do not block if Threads lock is busy. This avoids the potential deadlock
+ // when this code is called from the periodic task, and something else is
+ // expecting the periodic task to complete without blocking. On the off-chance
+ // Threads lock is busy momentarily, try to acquire several times.
+ for (int t = 0; t < 10; t++) {
+ if (Threads_lock->try_lock()) {
+ JavaThread::set_force_satb_flush_all_threads(true);
+ Threads_lock->unlock();
+
+ // The threads are not "acquiring" their thread-local data, but it does not
+ // hurt to "release" the updates here anyway.
+ OrderAccess::fence();
+ break;
+ }
+ os::naked_short_sleep(1);
+ }
}
Testing: hotspot_tier3_gc_shenandoah, deadlock stress tests (changed config for EvilSyncBug, shall
contribute it separately to sh/jdk first)
Thanks,
-Aleksey
More information about the shenandoah-dev
mailing list