RFR: 8240588: _threadObj cannot be used on an exiting JavaThread
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed May 13 15:41:44 UTC 2020
Hi David,
On 5/12/20 9:27 PM, David Holmes wrote:
> webrev: http://cr.openjdk.java.net/~dholmes/8240588/webrev.v3/
src/hotspot/share/memory/universe.cpp
No comments.
src/hotspot/share/prims/whitebox.cpp
L2238: THROW_MSG(vmSymbols::java_lang_RuntimeException(),
"Target thread not found in ThreadsListHandle!");
Please consider: s/ThreadsListHandle/ThreadsList/
The ThreadsListHandle is just a wrapper around the underlying
ThreadsList. find_JavaThread_from_java_tid() searched the
ThreadsList.
L2248: os::naked_short_sleep(0);
I didn't realize/remember that '0' means minimum time supported
by the OS.
L2253: // Now release the GC inducing thread - we have to
re-resolve the external oop that
L2254: // was passed in as GC may have occurred and we don't know
if we can trust t->threadObj() now.
L2255: oop original = JNIHandles::resolve_non_null(target_handle);
L2256: java_lang_Thread::set_priority(original,
ThreadPriority(NormPriority + 2));
So target_handle refers to the target thread that we're tracking
and have allowed to exit. What good does it do to bump the priority
on the target thread when it has (mostly) exited? By mostly, I mean
that the JavaThread is still around because there is a ThreadsList
that stills contains the target thread, but it is past executing
code...
Update: Now that I've read the test, I understand that you're using
the priority field for test state transitions.
Please add after this line (I also added a period at the end of
L2225):
L2225: // See
test/hotspot/jtreg/runtime/Thread/ThreadObjAccessAtExit.java.
// It explains how the thread's priority field is used
for test
// state coordination.
src/hotspot/share/runtime/thread.cpp
No comments.
src/hotspot/share/runtime/thread.hpp
No comments.
src/hotspot/share/runtime/threadSMR.cpp
L1196: Holder * h = new Holder(thread, _exiting_threads);
nit - please delete space before '*'.
L1210: if (current->_thread == thread) {
L1211: *prev_next = current->_next;
L1212: break;
L1213: }
nit - indent too much by two spaces on L1211-2.
I like the rewrite to use prev_next. It's slightly more
complicated than your original, but not overly so.
src/hotspot/share/runtime/threadSMR.hpp
L93: // Threads_lock) and visits the _threadObj oop of each
JavaThread
nit - needs a period at the end.
L175: DEBUG_ONLY(static bool contains_exiting_thread(JavaThread*
thread);)
This should be "#ifdef ASSERT ... #endif" to match the .cpp
definition.
nit - Also swap L175 and L176 to match the declaration order with
the definition order in the .cpp.
test/lib/sun/hotspot/WhiteBox.java
No comments.
test/hotspot/jtreg/runtime/Thread/ThreadObjAccessAtExit.java
L29: * removed itself from the ThreadsList.
s/the ThreadsList/the main ThreadsList/
L39: /*
L56: */
nit - It's odd to see this style of comment block without a
leading ' * ' before each line. Better still is the '//' style.
This comment clarifies the SetPriority() calls in whitebox.cpp
so now I know what's going on there.
I like this test!!
Thumbs up! I don't need to see a new webrev if you decide to make
any tweaks based on my comments above.
Dan
> bug: https://bugs.openjdk.java.net/browse/JDK-8240588
>
> When a thread starts to terminate and removes itself from the main
> ThreadsList it is no longer visited by GC (through the oops_do
> mechanism). But that thread can still be found in secondary
> ThreadsLists (via ThreadsListHandles), by code that then tries to
> access its threadObj() oop, which can be invalid due to the fact it
> has not been visited by the GC.
>
> As per the bug report I looked into a range of ways of addressing this:
> - make all ThreadsLists visible to GC
> - make the threadObj() a global handle of some form
> - fortify the call-sites to try to guard against a bad oop
>
> but I ended up with a very simple and clean solution that maintains an
> auxiliary list of exiting threads (guarded by Threads_lock within
> existing ThreadSMR code) which is walked via Universe::oops_do such
> that all the threadObj() oops are visited and kept valid.
>
> Thanks to Erik, Dan, and Robbin for pre-review of this code and
> suggested improvements.
>
> Thanks to Kim for explaining why handle approaches failed and the
> limitations of oop access by a terminating thread. As a result of that
> there is an additional small fix in thread.cpp to ensure the existing
> thread doesn't try to access its own threadObj() oop when the thread
> is not permitted to do so.
>
> Testing:
>
> I managed to devise a regression test which may not be future-proof
> (in that the test may trivially pass because no oop relocation occurs)
> but with which I was able to observe failures today with all GCs
> without the fix, and success with the fix.
>
> The regression test was tested locally on Linux with each of Serial,
> G1, Z and Shenadoah GCs, with product bits and fastdebug bits, and
> with the fix disabled and enabled. With the fix disabled the test
> reported an error in all configurations except product with ZGC. With
> the fix enabled it passed on all configurations.
>
> The regression test was also tested in the CI:
> - linux, macOS x product,fastdebug x serial, G1, Z
> - windows x product, fastdebug, x serial, G1
>
> With the fix disabled the test reported an error in all configurations
> (including product with ZGC!). With the fix enabled it passed on all
> configurations.
>
> General testing: tiers 1-3
>
> Thanks,
> David
More information about the hotspot-runtime-dev
mailing list