RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v6]

Tue Sep 29 14:14:51 UTC 2020

On Tue, 29 Sep 2020 13:13:55 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> I see; thank you for the explanation.
>
> Hi Erik,
> 
> I have been playing with this patch for past a few days.  Great work!
> 
> I found that this patch seems to break an early assumption.
> We have a comment in JavaThread::exit() says:
> 
> <pre>
>   // We need to cache the thread name for logging purposes below as once
>   // we have called on_thread_detach this thread must not access any oops.
> </pre>
> 
> Then in method :
> <pre><code>
> void Threads::remove(JavaThread* p, bool is_daemon)  {
>  ...
>   BarrierSet::barrier_set()->on_thread_detach(p);
> 
>   // Extra scope needed for Thread_lock, so we can check
>   // that we do not remove thread without safepoint code notice
>   { MonitorLocker ml(Threads_lock);
> ..
> }
> </code></pre>
> It calls barrier's on_thread_detach(), acquires Threads_lock.
> The lock acquisition triggers stack processing, that potential accesses oops.
> <pre><code>
> V  [libjvm.so+0x10c6f5e]  StackWatermark::start_processing()+0x6a
> V  [libjvm.so+0x10c77e8]  StackWatermarkSet::start_processing(JavaThread*, StackWatermarkKind)+0x82
> V  [libjvm.so+0xfd7757]  SafepointMechanism::process(JavaThread*)+0x37
> V  [libjvm.so+0xfd796b]  SafepointMechanism::process_if_requested_slow(JavaThread*)+0x1d
> V  [libjvm.so+0x4b3683]  SafepointMechanism::process_if_requested(JavaThread*)+0x2b
> V  [libjvm.so+0xe87f0d]  ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck()+0x5f
> V  [libjvm.so+0xe86700]  Mutex::lock_contended(Thread*)+0x12c
> V  [libjvm.so+0xe867d8]  Mutex::lock(Thread*)+0x96
> V  [libjvm.so+0xe86823]  Mutex::lock()+0x23
> V  [libjvm.so+0x29b4bc]  MutexLocker::MutexLocker(Mutex*, Mutex::SafepointCheckFlag)+0xe2
> V  [libjvm.so+0x29b533]  MonitorLocker::MonitorLocker(Monitor*, Mutex::SafepointCheckFlag)+0x29
> V  [libjvm.so+0x119f2ce]  Threads::remove(JavaThread*, bool)+0x56
> V  [libjvm.so+0x1198a2b]  JavaThread::exit(bool, JavaThread::ExitType)+0x905
> V  [libjvm.so+0x1197fde]  JavaThread::post_run()+0x22
> V  [libjvm.so+0x1193eae]  Thread::call_run()+0x230
> V  [libjvm.so+0xef3e38]  thread_native_entry(Thread*)+0x1e4
> </code></pre>
> 
> This is a problem for Shenandoah, as it flushes SATB buffers during on_thread_detach() and does not expect to see any
> more SATB traffic.
> Thanks.

> Hi Erik,
> 
> I have been playing with this patch for past a few days. Great work!
> 
> I found that this patch seems to break an early assumption.
> We have a comment in JavaThread::exit() says:
> 
>   // We need to cache the thread name for logging purposes below as once
>   // we have called on_thread_detach this thread must not access any oops.
> 
> Then in method :
> 
> ```
> 
> void Threads::remove(JavaThread* p, bool is_daemon)  {
>  ...
>   BarrierSet::barrier_set()->on_thread_detach(p);
> 
>   // Extra scope needed for Thread_lock, so we can check
>   // that we do not remove thread without safepoint code notice
>   { MonitorLocker ml(Threads_lock);
> ..
> }
> ```
> 
> It calls barrier's on_thread_detach(), acquires Threads_lock.
> The lock acquisition triggers stack processing, that potential accesses oops.
> 
> ```
> 
> V  [libjvm.so+0x10c6f5e]  StackWatermark::start_processing()+0x6a
> V  [libjvm.so+0x10c77e8]  StackWatermarkSet::start_processing(JavaThread*, StackWatermarkKind)+0x82
> V  [libjvm.so+0xfd7757]  SafepointMechanism::process(JavaThread*)+0x37
> V  [libjvm.so+0xfd796b]  SafepointMechanism::process_if_requested_slow(JavaThread*)+0x1d
> V  [libjvm.so+0x4b3683]  SafepointMechanism::process_if_requested(JavaThread*)+0x2b
> V  [libjvm.so+0xe87f0d]  ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck()+0x5f
> V  [libjvm.so+0xe86700]  Mutex::lock_contended(Thread*)+0x12c
> V  [libjvm.so+0xe867d8]  Mutex::lock(Thread*)+0x96
> V  [libjvm.so+0xe86823]  Mutex::lock()+0x23
> V  [libjvm.so+0x29b4bc]  MutexLocker::MutexLocker(Mutex*, Mutex::SafepointCheckFlag)+0xe2
> V  [libjvm.so+0x29b533]  MonitorLocker::MonitorLocker(Monitor*, Mutex::SafepointCheckFlag)+0x29
> V  [libjvm.so+0x119f2ce]  Threads::remove(JavaThread*, bool)+0x56
> V  [libjvm.so+0x1198a2b]  JavaThread::exit(bool, JavaThread::ExitType)+0x905
> V  [libjvm.so+0x1197fde]  JavaThread::post_run()+0x22
> V  [libjvm.so+0x1193eae]  Thread::call_run()+0x230
> V  [libjvm.so+0xef3e38]  thread_native_entry(Thread*)+0x1e4
> ```
> 
> This is a problem for Shenandoah, as it flushes SATB buffers during on_thread_detach() and does not expect to see any
> more SATB traffic.
> Thanks.

What oop are you encountering here? You should have no frames left at this point, and all oops should have been
cleared. At least that is the theory, and that was why the thread oop moved out from the thread (to enforce that). So I
am curious what oop you have found to still be around at this point.

Anyway, you can try moving the GC hook into the critical section. That should help.

-------------

PR: https://git.openjdk.java.net/jdk/pull/296