[jdk18] RFR: 8273107: RunThese24H times out with "java.lang.management.ThreadInfo.getLockName()" is null
Erik Österlund
eosterlund at openjdk.java.net
Thu Dec 16 06:47:00 UTC 2021
On Tue, 14 Dec 2021 21:16:02 GMT, Daniel D. Daugherty <dcubed at openjdk.org> wrote:
> RunThese24H sometimes times out with a couple of error msgs:
> - "java.lang.management.ThreadInfo.getLockName()" is null
> - ForkJoin common pool thread stuck
>
> The '"java.lang.management.ThreadInfo.getLockName()" is null' error msg was
> due to RunThese's use of an older JCK test suite which has since been fixed.
>
> The 'ForkJoin common pool thread stuck' failure mode is likely due to a thread
> spending a lot of time in ObjectSynchronizer::monitors_iterate() due to a
> VM_ThreadDump::doit() call. I say "likely" because I've never been able to
> reproduce this failure mode in testing outside of Mach5. With the Mach5
> sightings, all we have are thread dumps and core files and not a live process.
>
> The VM_ThreadDump::doit() call is trying to gather owned monitor information
> for all threads in the system. I've seen sightings of this failure mode with > 2000
> threads. I've also seen passing runs with > 1.7 million monitors on the in-use list.
> Imagine searching a larger in-use list for > 2000 threads. It just doesn't scale.
Looks okay, but I found a potential issue.
src/hotspot/share/runtime/synchronizer.cpp line 79:
> 77: if (list != nullptr) {
> 78: // Add to existing list from the hash table:
> 79: list->add(om);
These two lines are executed in both the if and else clause. Consider putting it after instead.
src/hotspot/share/runtime/synchronizer.cpp line 1062:
> 1060: JavaThread* thread) {
> 1061: typedef LinkedListIterator<ObjectMonitor*> ObjectMonitorIterator;
> 1062: ObjectMonitorIterator iter = ObjectMonitorIterator(list->head());
This seems to create an object and then copy it to iter. I think the more obvious syntax is ObjectMonitorIterator iter(list->head());
src/hotspot/share/runtime/synchronizer.cpp line 1423:
> 1421: // deflate_idle_monitors() and deflate_monitor_list() can be called more than
> 1422: // once, we have to make sure the entry has not already been added.
> 1423: JavaThread* jt = (JavaThread*)mid->owner();
Hmmmm. I'm not sure this is correct. The owned of an ObjectMonitor is a JavaThread most of the time, but not always. I think that if the owner started off with a stack lock, and another thread inflated the lock due to contention, then the ObjectMonitor will get the stack lock as the owner, as the random other thread is too lazy to look up which thread stack the stack lock belongs to.
Because of that, I think we might miss locks that really are owned by a thread, as the thread key now will not map to such locks. Some kind of ownership lookup probably needs to happen here.
-------------
Changes requested by eosterlund (Reviewer).
PR: https://git.openjdk.java.net/jdk18/pull/25
More information about the hotspot-runtime-dev
mailing list