[jdk18] RFR: 8273107: RunThese24H times out with "java.lang.management.ThreadInfo.getLockName()" is null
Robbin Ehn
rehn at openjdk.java.net
Wed Dec 15 08:45:09 UTC 2021
On Tue, 14 Dec 2021 21:16:02 GMT, Daniel D. Daugherty <dcubed at openjdk.org> wrote:
> RunThese24H sometimes times out with a couple of error msgs:
> - "java.lang.management.ThreadInfo.getLockName()" is null
> - ForkJoin common pool thread stuck
>
> The '"java.lang.management.ThreadInfo.getLockName()" is null' error msg was
> due to RunThese's use of an older JCK test suite which has since been fixed.
>
> The 'ForkJoin common pool thread stuck' failure mode is likely due to a thread
> spending a lot of time in ObjectSynchronizer::monitors_iterate() due to a
> VM_ThreadDump::doit() call. I say "likely" because I've never been able to
> reproduce this failure mode in testing outside of Mach5. With the Mach5
> sightings, all we have are thread dumps and core files and not a live process.
>
> The VM_ThreadDump::doit() call is trying to gather owned monitor information
> for all threads in the system. I've seen sightings of this failure mode with > 2000
> threads. I've also seen passing runs with > 1.7 million monitors on the in-use list.
> Imagine searching a larger in-use list for > 2000 threads. It just doesn't scale.
Marked as reviewed by rehn (Reviewer).
src/hotspot/share/runtime/vmOperations.cpp line 283:
> 281:
> 282: ObjectMonitorsHashtable table;
> 283: ObjectMonitorsHashtable* tablep = nullptr;
It looks like you can remove this pointer.
ThreadStackTrace::dump_stack_at_safepoint also checks "with_locked_monitors"; if it is false, the table is ignored.
So there should be no need to pass in a null pointer.
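A minimal sketch of that point, assuming the callee really does gate every table access on the flag. The names MonitorTable and dump_thread are hypothetical stand-ins, not HotSpot code; the idea is only that a callee which checks the flag first can always be handed the real table, so the caller needs no nullable pointer.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ObjectMonitorsHashtable:
// thread id -> names of the monitors that thread owns.
using MonitorTable = std::unordered_map<int, std::vector<std::string>>;

// The callee consults the table only when with_locked_monitors is true,
// so the caller can pass the table by reference unconditionally.
std::vector<std::string> dump_thread(int thread_id,
                                     const MonitorTable& table,
                                     bool with_locked_monitors) {
  std::vector<std::string> owned;
  if (!with_locked_monitors) {
    return owned;  // the table is never touched on this path
  }
  auto it = table.find(thread_id);
  if (it != table.end()) {
    owned = it->second;
  }
  return owned;
}

// Usage: the same table is passed in both cases; only the flag differs.
void example_flag_gating() {
  MonitorTable table;
  table[1] = {"lockA", "lockB"};
  assert(dump_thread(1, table, false).empty());
  assert(dump_thread(1, table, true).size() == 2);
}
```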
src/hotspot/share/services/threadService.cpp line 692:
> 690: ObjectMonitorsHashtable::PtrList* list = table->get_entry(_thread);
> 691: if (list != nullptr) {
> 692: ObjectSynchronizer::monitors_iterate(&imc, list, _thread);
For every object, the InflatedMonitorsClosure walks the stack until the object is found, using this method:
ThreadStackTrace::is_owned_monitor_on_stack(oop object)
If you instead collect all the locks on the stack in a single pass, you don't have to walk the stack over and over, which should be a major speedup.
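The difference can be sketched as follows. This is an illustration under assumptions, not the HotSpot implementation: Obj, Frame, Stack and the three functions are hypothetical stand-ins. The per-object version costs one stack walk per monitor; the one-pass version walks the stack once and then answers each query with a hash lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: an object is an opaque address, and each
// stack frame records the objects it holds locked.
using Obj = const void*;
struct Frame { std::vector<Obj> locked; };
using Stack = std::vector<Frame>;

// Per-object approach: every query walks the frames until a match is found.
bool is_locked_on_stack(const Stack& stack, Obj obj) {
  for (const Frame& f : stack) {
    for (Obj o : f.locked) {
      if (o == obj) return true;
    }
  }
  return false;
}

// One-pass approach: walk the stack once and remember every locked object.
std::unordered_set<Obj> collect_locked(const Stack& stack) {
  std::unordered_set<Obj> seen;
  for (const Frame& f : stack) {
    seen.insert(f.locked.begin(), f.locked.end());
  }
  return seen;
}

// Counts the monitor objects that are not locked by any frame, using a
// single stack walk followed by one hash lookup per monitor.
std::size_t count_not_on_stack(const Stack& stack,
                               const std::vector<Obj>& monitor_objects) {
  const std::unordered_set<Obj> on_stack = collect_locked(stack);
  std::size_t n = 0;
  for (Obj o : monitor_objects) {
    if (on_stack.count(o) == 0) ++n;
  }
  return n;
}

// Usage: both approaches agree; only the number of stack walks differs.
void example_one_pass() {
  int a = 0, b = 0, c = 0;
  Stack stack = { Frame{{&a}}, Frame{{&b}} };
  assert(is_locked_on_stack(stack, &a));
  assert(!is_locked_on_stack(stack, &c));
  assert(count_not_on_stack(stack, {&a, &b, &c}) == 1);
}
```

With M monitors owned by a thread and S locked objects on its stack, the per-object walk does on the order of M × S comparisons, while the one-pass version does roughly S insertions plus M lookups.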
-------------
PR: https://git.openjdk.java.net/jdk18/pull/25