RFR: 8266432: ZGC: GC allocation stalls can trigger deadlocks

MnS github.com+21689873+mehdinosrati at openjdk.java.net
Mon May 3 17:06:52 UTC 2021


On Mon, 3 May 2021 11:09:36 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:

> A deadlock can happen when a relocating thread holds the _ref_lock and tries to log information about the current thread, which can trigger a load barrier and a secondary relocation. The first relocation is holding the _ref_lock, and the second relocation hangs when trying to reacquire it. This is the stack trace:
> 
> #1 0x00007ff375c1fc90 in os::PlatformMonitor::wait
> #2 0x00007ff3760cbf92 in ZForwarding::wait_page_released
> #3 0x00007ff376118065 in ZRelocate::relocate_object
> #4 0x00007ff374f7097b in AccessInternal::PostRuntimeDispatch<ZBarrierSet::AccessBarrier<286790ul, ZBarrierSet>,
> #5 0x00007ff375176134 in oopDesc::obj_field
> #6 0x00007ff37551c5cb in java_lang_Thread::name
> #7 0x00007ff375f2e7ee in JavaThread::get_thread_name_string
> #8 0x00007ff376126323 in ZStatPhase::log_end
> #9 0x00007ff3761272e8 in ZStatCriticalPhase::register_end
> #10 0x00007ff3760cc0b0 in ZForwarding::wait_page_released
> #11 0x00007ff376118065 in ZRelocate::relocate_object
> #12 0x00007ff3760989c5 in ZLoadBarrierOopClosure::do_oop
> #13 0x00007ff375408ca8 in HandleArea::oops_do
> #14 0x00007ff375f2e0e9 in JavaThread::oops_do_no_frames
> 
> This started to happen after:
> 8261759: ZGC: ZWorker Threads Continue Marking After System.exit() called 
> 
> The proposed patch moves the scope of the logging to outside the lock scope. The first _ref_lock check isn't really necessary. It was introduced to limit the allocation stall logging, but if a thread entered wait_page_released it really was on its way to stall.
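
The scoping change described above can be illustrated with a minimal, standalone C++ sketch. This is not the HotSpot code or the actual patch: TrackedLock, ScopedPhase, log_end and both wait_* functions are hypothetical names invented for illustration. The sketch only assumes what the description states, namely that the logging runs when a scoped phase object ends, and that it must not run while the lock is held. Instead of deadlocking, it records whether the lock was held at the moment of logging.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the lock in the report. It tracks whether it is
// held so the hazard can be observed instead of actually deadlocking.
struct TrackedLock {
  std::mutex mutex;
  bool held = false;
  void lock()   { mutex.lock(); held = true; }
  void unlock() { held = false; mutex.unlock(); }
};

// Hypothetical scoped phase: runs its callback from the destructor, the way a
// scoped statistics phase registers its end when it goes out of scope.
class ScopedPhase {
  std::function<void()> _on_end;
public:
  explicit ScopedPhase(std::function<void()> on_end) : _on_end(std::move(on_end)) {}
  ~ScopedPhase() { _on_end(); }
};

// One entry per log call: was the lock held while logging?
std::vector<bool> lock_held_while_logging;

// Stand-in for the logging step. In the report, this step reads the thread
// name, which can pass through a load barrier and re-enter the path that
// needs the same lock.
void log_end(const TrackedLock& lock) {
  lock_held_while_logging.push_back(lock.held);
}

// Problematic shape: the phase is declared after the guard, so it is
// destroyed first and logs while the lock is still held.
void wait_logging_inside_lock(TrackedLock& lock) {
  std::lock_guard<TrackedLock> guard(lock);
  ScopedPhase phase([&lock] { log_end(lock); });
  // ... wait for the condition here ...
}

// Shape after moving the logging scope outside the lock scope: the guard is
// destroyed first, so logging happens after the lock is released.
void wait_logging_outside_lock(TrackedLock& lock) {
  ScopedPhase phase([&lock] { log_end(lock); });
  std::lock_guard<TrackedLock> guard(lock);
  // ... wait for the condition here ...
}
```

Calling wait_logging_inside_lock and then wait_logging_outside_lock on the same TrackedLock records one log call made under the lock followed by one made after release. The design relies only on C++ destroying local objects in reverse order of declaration, so which object is declared first decides whether the end-of-phase logging runs inside or outside the lock.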

Marked as reviewed by MehdiNosrati at github.com (no known OpenJDK username).

-------------

PR: https://git.openjdk.java.net/jdk/pull/3839


