RFR 8150689: Thread dump report "waiting to re-lock in wait()" shows incorrectly
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Nov 21 19:52:56 UTC 2018
On 11/20/18 12:51 PM, Patricio Chilano wrote:
> Hi all,
>
> Could you review this small fix for thread dump reports?
> Now the "waiting to re-lock in wait()" message is shown in the "at
> java.lang.Object.wait(Native Method)" frame where the re-locking is
> actually occurring. In the frame where the lock was first taken we
> always show "locked". Please check if there is some scenario where you
> think the lock info on the stack is not being printed as you would
> expect.
>
> I also noticed the _thread_state attribute was printed twice so I
> removed one instance. I don't know if there was a specific reason to
> print it twice so I can add it back.
>
> Webrev URL: http://cr.openjdk.java.net/~pchilanomate/8150689.01/webrev
src/hotspot/share/runtime/safepoint.cpp
new L1122: _thread->print_thread_state_on(st);
L1123: st->print_cr("Thread: " INTPTR_FORMAT
L1124: " [0x%2x] State: %s _has_called_back %d
_at_poll_safepoint %d",
L1125: p2i(_thread),
_thread->osthread()->thread_id(), s, _has_called_back,
L1126: _at_poll_safepoint);
old L1127: _thread->print_thread_state_on(st);
The reason for moving the line isn't jumping out at me (yet).
Update: see next file's comments.
src/hotspot/share/runtime/thread.cpp
L2945: #ifndef PRODUCT
old L2946: print_thread_state_on(st);
L2947: _safepoint_state->print_on(st);
L2948: #endif // PRODUCT
Okay so now I understand the change to safepoint.cpp. I agree
with deleting L2946 from thread.cpp because it would result in
two print_thread_state_on() calls in non-PRODUCT bits.
However, I don't agree with moving the print_thread_state_on()
call in safepoint.cpp. If you do that then you'll change the
output order for the "Thread: ..." line to be after the
print_thread_state_on() output for all build configs.
So I just took a look at the output for baseline release bits
on my MBP and I'm surprised to see this:
"main" #1 prio=5 os_prio=31 cpu=121.30ms elapsed=1.41s
tid=0x00007f8abb808800 nid=0x2003 waiting on condition [0x0000000100aff000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
JavaThread state: _thread_blocked
Thread: 0x00007f8abb808800 [0x2003] State: _at_safepoint
_has_called_back 0 _at_poll_safepoint 0
JavaThread state: _thread_blocked
at java.lang.Thread.sleep(java.base at 12-internal/Native Method)
at Sleeper.main(Sleeper.java:5)
So even in baseline 'release' bits, I'm seeing two of these:
JavaThread state: _thread_blocked
I don't know what to think here. Can you explain why release
bits are also showing double print_thread_state_on() output?
Maybe I'm not understanding about the thread.cpp code.
So I went back to JDK8 release bits on my Solaris box:
"main" #1 prio=5 os_prio=64 tid=0x000000000041e800 nid=0x2 waiting on
condition [0xffff80ffbf1ae000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at Sleeper.main(Sleeper.java:5)
So there's no "JavaThread state: _thread_blocked" line at all.
I'm guessing we added that to 'release' bits output after JDK8.
And here's the JDK8 fastdebug bits on my Solaris box:
"main" #1 prio=5 os_prio=64 tid=0x0000000000421800 nid=0x2 waiting on
condition [0xffff80ffbf1ae000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
JavaThread state: _thread_blocked
Thread: 0x0000000000421800 [0x 2] State: _at_safepoint _has_called_back
0 _at_poll_safepoint 0
JavaThread state: _thread_blocked
at java.lang.Thread.sleep(Native Method)
at Sleeper.main(Sleeper.java:5)
Short version: I agree that there are two "JavaThread state:
_thread_blocked"
and we should fix that. I have no preference on whether the
line is before or
after the "Thread: ..." line. Others may have an opinion.
src/hotspot/share/runtime/vframe.cpp
I agree with the comment rewrite on L174-6.
I like this new code:
L182: if(java_lang_Thread::get_thread_status(thread()->threadObj())
== java_lang_Thread::BLOCKED_ON_MONITOR_ENTER){
L183: wait_state = "waiting to re-lock in wait()";
L184: }
Needs space between 'if' and '(' and between ')' and '{'.
I agree with deleting old L232->250. That's the code that was
causing the confusing "waiting to re-lock in wait()" output
in the frame(s) after frame 0.
Thumbs up!
As David says, having some example current and new output in the bug report
would help.
Dan
> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8150689
>
> Thanks,
> Patricio
More information about the hotspot-runtime-dev
mailing list