RFR: 8213231: ThreadSnapshot::_threadObj can become stale

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Jan 22 21:32:33 UTC 2019


On 1/22/19 9:59 AM, Erik Helin wrote:
> Hi all,
>
> this patch fixes a problem where the oop in ThreadSnapshot::_threadObj
> can become stale. The issue is that the ThreadSnapshot::oops_do method
> only gets called when a ThreadSnapshot instance has been registered in 
> a ThreadDumpResult (via the ThreadDumpResult::add_thread_snapshot 
> method). But, in order to register a ThreadSnapshot instance, you must 
> first create it. The problem is that the ThreadSnapshot constructor 
> first sets _threadObj to thread->threadObj() and then further down 
> might call ObjectSynchronizer::get_lock_owner. The call to
> ObjectSynchronizer::get_lock_owner can result in a VM_RevokeBias VM
> operation being executed. If a GC VM operation is already enqueued,
> then that GC VM operation will run when the VM_RevokeBias VM operation
> is executed. That GC VM operation will not update the oop in
> ThreadSnapshot::_threadObj, because that ThreadSnapshot instance has
> not yet been registered in any ThreadDumpResult (recall that the
> ThreadSnapshot is being constructed), so the GC has no way to find
> it. The oop in ThreadSnapshot::_threadObj will then become dangling,
> which most likely will cause the JVM to get a SIGSEGV some time later.
>
> The issue was found when debugging why an instance of 
> java/lang/management/ThreadInfo on the Java heap had a stale pointer 
> in its threadName field. Turns out that the java.lang.Thread instance 
> passed to the ThreadInfo was stale most likely for the reason outlined 
> in the paragraph above.
>
> This patch fixes the issue by ensuring that a ThreadSnapshot is always 
> registered in a ThreadDumpResult before the initialization of the 
> ThreadSnapshot is done. This ensures that the GC will always be able 
> to find the oop ThreadSnapshot::_threadObj via ThreadDumpResult::oops_do.
>
> Webrev:
> http://cr.openjdk.java.net/~ehelin/8213231/00/
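
For readers following along, the ordering problem described in the quoted
text can be sketched with a toy model. This is not HotSpot code: ToyHeap,
ToySnapshot, register_root and collect are hypothetical stand-ins for the
Java heap, ThreadSnapshot, ThreadDumpResult registration and a moving GC
running at a safepoint. The sketch only assumes that the GC updates
pointers held in registered slots and nothing else.

```cpp
#include <vector>

// Toy stand-ins, NOT HotSpot code: every name below is hypothetical.
struct ToyObject { int id; };

// A two-space "heap" whose toy GC moves the single live object and
// updates only the pointer slots that were registered as roots.
struct ToyHeap {
    ToyObject spaces[2] = {{42}, {-1}};
    int current = 0;
    std::vector<ToyObject**> roots;   // slots the GC knows how to fix up

    ToyObject* thread_obj() { return &spaces[current]; }
    void register_root(ToyObject** slot) { roots.push_back(slot); }

    // Simulates a moving GC that runs at a safepoint.
    void collect() {
        ToyObject* from = &spaces[current];
        ToyObject* to = &spaces[1 - current];
        *to = *from;      // copy the object to its new location
        from->id = -1;    // poison the old copy
        for (ToyObject** slot : roots)
            if (*slot == from) *slot = to;   // only registered slots are updated
        current = 1 - current;
    }
};

// Plays the role of a snapshot holding a raw pointer into the heap.
struct ToySnapshot {
    ToyObject* thread_obj = nullptr;
};

// Problematic order: fill in the pointer, hit a safepoint, register afterwards.
bool stale_when_registered_late() {
    ToyHeap heap;
    ToySnapshot snap;
    snap.thread_obj = heap.thread_obj();   // pointer captured in the "constructor"
    heap.collect();                        // GC runs before registration
    heap.register_root(&snap.thread_obj);  // too late: the pointer was not updated
    return snap.thread_obj != heap.thread_obj();
}

// Order used by the fix as described: register first, then initialize.
bool stale_when_registered_first() {
    ToyHeap heap;
    ToySnapshot snap;
    heap.register_root(&snap.thread_obj);  // GC can now find the slot
    snap.thread_obj = heap.thread_obj();
    heap.collect();                        // GC updates the registered slot
    return snap.thread_obj != heap.thread_obj();
}
```

In this toy model stale_when_registered_late() reports a stale pointer and
stale_when_registered_first() does not, which mirrors the reasoning for
registering the snapshot before its initialization completes.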

This one caught my eye since I touched the ThreadSnapshot code in the
Thread-SMR project...

src/hotspot/share/runtime/vmOperations.cpp
     No comments.

src/hotspot/share/runtime/vmOperations.hpp
     No comments.

src/hotspot/share/services/management.cpp
     No comments.

src/hotspot/share/services/threadService.cpp
     No comments.

src/hotspot/share/services/threadService.hpp
     No comments.

Thumbs up!

Dan


>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8213231
>
> Testing:
> - Tier 1, 2 and 3 on Windows, Mac, Linux (all x86-64)
> - RunThese30M (multiple runs) and RunThese24h on Linux x86-64
>   (please note that I never managed to reproduce the issue, all 
> analysis was done based on a core file)
>
> Thanks,
> Erik
>


