RFR: 8240588: _threadObj cannot be used on an exiting JavaThread

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed May 13 15:41:44 UTC 2020


Hi David,

On 5/12/20 9:27 PM, David Holmes wrote:
> webrev: http://cr.openjdk.java.net/~dholmes/8240588/webrev.v3/

src/hotspot/share/memory/universe.cpp
     No comments.

src/hotspot/share/prims/whitebox.cpp
     L2238:     THROW_MSG(vmSymbols::java_lang_RuntimeException(), 
"Target thread not found in ThreadsListHandle!");
         Please consider: s/ThreadsListHandle/ThreadsList/

         The ThreadsListHandle is just a wrapper around the underlying
         ThreadsList. find_JavaThread_from_java_tid() searched the
         ThreadsList.

     L2248:     os::naked_short_sleep(0);
         I didn't realize/remember that '0' means minimum time supported 
by the OS.

     L2253:   // Now release the GC inducing thread - we have to 
re-resolve the external oop that
     L2254:   // was passed in as GC may have occurred and we don't know 
if we can trust t->threadObj() now.
     L2255:   oop original = JNIHandles::resolve_non_null(target_handle);
     L2256:   java_lang_Thread::set_priority(original, 
ThreadPriority(NormPriority + 2));
         So target_handle refers to the target thread that we're tracking
         and have allowed to exit. What good does it do to bump the priority
         on the target thread when it has (mostly) exited? By mostly, I mean
         that the JavaThread is still around because there is a ThreadsList
         that stills contains the target thread, but it is past executing
         code...

         Update: Now that I've read the test, I understand that you're using
         the priority field for test state transitions.

         Please add after this line (I also added a period at the end of 
L2225):

         L2225: // See 
test/hotspot/jtreg/runtime/Thread/ThreadObjAccessAtExit.java.
                // It explains how the thread's priority field is used 
for test
                // state coordination.

src/hotspot/share/runtime/thread.cpp
     No comments.

src/hotspot/share/runtime/thread.hpp
     No comments.

src/hotspot/share/runtime/threadSMR.cpp
     L1196:   Holder * h = new Holder(thread, _exiting_threads);
         nit - please delete space before '*'.

     L1210:     if (current->_thread == thread) {
     L1211:         *prev_next = current->_next;
     L1212:         break;
     L1213:     }
         nit - indent too much by two spaces on L1211-2.

         I like the rewrite to use prev_next. It's slightly more
         complicated than your original, but not overly so.

src/hotspot/share/runtime/threadSMR.hpp
     L93: //   Threads_lock) and visits the _threadObj oop of each 
JavaThread
         nit - needs a period at the end.

     L175:   DEBUG_ONLY(static bool contains_exiting_thread(JavaThread* 
thread);)
         This should be "#ifdef ASSERT ... #endif" to match the .cpp 
definition.

         nit - Also swap L175 and L176 to match the declaration order with
         the definition order in the .cpp.


test/lib/sun/hotspot/WhiteBox.java
     No comments.

test/hotspot/jtreg/runtime/Thread/ThreadObjAccessAtExit.java
     L29:  *          removed itself from the ThreadsList.
         s/the ThreadsList/the main ThreadsList/

     L39: /*
     L56: */
         nit - It's odd to see this style of comment block without a
             leading ' * ' before each line. Better still is the '//' style.

         This comment clarifies the SetPriority() calls in whitebox.cpp
         so now I know what's going on there.

     I like this test!!

Thumbs up! I don't need to see a new webrev if you decide to make
any tweaks based on my comments above.

Dan


> bug: https://bugs.openjdk.java.net/browse/JDK-8240588
>
> When a thread starts to terminate and removes itself from the main 
> ThreadsList it is no longer visited by GC (through the oops_do 
> mechanism). But that thread can still be found in secondary 
> ThreadsLists (via ThreadsListHandles), by code that then tries to 
> access its threadObj() oop, which can be invalid due to the fact it 
> has not been visited by the GC.
>
> As per the bug report I looked into a range of ways of addressing this:
> - make all ThreadsLists visible to GC
> - make the threadObj() a global handle of some form
> - fortify the call-sites to try to guard against a bad oop
>
> but I ended up with a very simple and clean solution that maintains an 
> auxiliary list of exiting threads (guarded by Threads_lock within 
> existing ThreadSMR code) which is walked via Universe::oops_do such 
> that all the threadObj() oops are visited and kept valid.
>
> Thanks to Erik, Dan, and Robbin for pre-review of this code and 
> suggested improvements.
>
> Thanks to Kim for explaining why handle approaches failed and the 
> limitations of oop access by a terminating thread. As a result of that 
> there is an additional small fix in thread.cpp to ensure the existing 
> thread doesn't try to access its own threadObj() oop when the thread 
> is not permitted to do so.
>
> Testing:
>
> I managed to devise a regression test which may not be future-proof 
> (in that the test may trivially pass because no oop relocation occurs) 
> but with which I was able to observe failures today with all GCs 
> without the fix, and success with the fix.
>
> The regression test was tested locally on Linux with each of Serial, 
> G1, Z and Shenadoah GCs, with product bits and fastdebug bits, and 
> with the fix disabled and enabled. With the fix disabled the test 
> reported an error in all configurations except product with ZGC. With 
> the fix enabled it passed on all configurations.
>
> The regression test was also tested in the CI:
> - linux, macOS x product,fastdebug x serial, G1, Z
> - windows x product, fastdebug, x serial, G1
>
> With the fix disabled the test reported an error in all configurations 
> (including product with ZGC!). With the fix enabled it passed on all 
> configurations.
>
> General testing: tiers 1-3
>
> Thanks,
> David



More information about the hotspot-runtime-dev mailing list