RFR 8150689: Thread dump report "waiting to re-lock in wait()" shows incorrectly

Patricio Chilano patricio.chilano.mateo at oracle.com
Wed Nov 21 22:45:17 UTC 2018


Hi Dan,

On 11/21/18 2:52 PM, Daniel D. Daugherty wrote:
> On 11/20/18 12:51 PM, Patricio Chilano wrote:
>> Hi all,
>>
>> Could you review this small fix for thread dump reports?
>> Now the "waiting to re-lock in wait()" message is shown in the "at 
>> java.lang.Object.wait(Native Method)" frame where the re-locking is 
>> actually occurring. In the frame where the lock was first taken we 
>> always show "locked". Please check if there is some scenario where 
>> you think the lock info on the stack is not being printed as you 
>> would expect.
>>
>> I also noticed the _thread_state attribute was printed twice so I 
>> removed one instance. I don't know if there was a specific reason to 
>> print it twice so I can add it back.
>>
>> Webrev URL: http://cr.openjdk.java.net/~pchilanomate/8150689.01/webrev
>
> src/hotspot/share/runtime/safepoint.cpp
>     new L1122:   _thread->print_thread_state_on(st);
>         L1123:   st->print_cr("Thread: " INTPTR_FORMAT
>         L1124:               "  [0x%2x] State: %s _has_called_back %d 
> _at_poll_safepoint %d",
>         L1125:                p2i(_thread), 
> _thread->osthread()->thread_id(), s, _has_called_back,
>         L1126:                _at_poll_safepoint);
>     old L1127:   _thread->print_thread_state_on(st);
>
>         The reason for moving the line isn't jumping out at me (yet).
>         Update: see next file's comments.
>
> src/hotspot/share/runtime/thread.cpp
>         L2945: #ifndef PRODUCT
>     old L2946:   print_thread_state_on(st);
>         L2947:   _safepoint_state->print_on(st);
>         L2948: #endif // PRODUCT
>
>         Okay so now I understand the change to safepoint.cpp. I agree
>         with deleting L2946 from thread.cpp because it would result in
>         two print_thread_state_on() calls in non-PRODUCT bits.
>
>         However, I don't agree with moving the print_thread_state_on()
>         call in safepoint.cpp. If you do that then you'll change the
>         output order for the "Thread: ..." line to be after the
>         print_thread_state_on() output for all build configs.
We could keep it in the same order in ThreadSafepointState::print_on() 
and if JavaThreadState and ThreadStatus need to be printed together 
maybe we could add a flag to decide whether to print the JavaThreadState 
in ThreadSafepointState::print_on(). (?)

> So I just took a look at the output for baseline release bits
>         on my MBP and I'm surprised to see this:
>
> "main" #1 prio=5 os_prio=31 cpu=121.30ms elapsed=1.41s 
> tid=0x00007f8abb808800 nid=0x2003 waiting on condition 
> [0x0000000100aff000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>    JavaThread state: _thread_blocked
> Thread: 0x00007f8abb808800  [0x2003] State: _at_safepoint 
> _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>     at java.lang.Thread.sleep(java.base at 12-internal/Native Method)
>     at Sleeper.main(Sleeper.java:5)
>
>         So even in baseline 'release' bits, I'm seeing two of these:
>
>            JavaThread state: _thread_blocked
>
>         I don't know what to think here. Can you explain why release
>         bits are also showing double print_thread_state_on() output?
>         Maybe I'm not understanding about the thread.cpp code.
Odd, I did the same test in my MBP for a "release" build but I don't see 
the extra output. I also tested it for Linux and it doesn't show the 
extra output either. Can you confirm this?

> So I went back to JDK8 release bits on my Solaris box:
>
> "main" #1 prio=5 os_prio=64 tid=0x000000000041e800 nid=0x2 waiting on 
> condition [0xffff80ffbf1ae000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at Sleeper.main(Sleeper.java:5)
>
>         So there's no "JavaThread state: _thread_blocked" line at all.
>         I'm guessing we added that to 'release' bits output after JDK8.
>
>         And here's the JDK8 fastdebug bits on my Solaris box:
>
> "main" #1 prio=5 os_prio=64 tid=0x0000000000421800 nid=0x2 waiting on 
> condition [0xffff80ffbf1ae000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>    JavaThread state: _thread_blocked
> Thread: 0x0000000000421800  [0x 2] State: _at_safepoint 
> _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at java.lang.Thread.sleep(Native Method)
>         at Sleeper.main(Sleeper.java:5)
>
>         Short version: I agree that there are two "JavaThread state: 
> _thread_blocked"
>         and we should fix that. I have no preference on whether the 
> line is before or
>         after the "Thread: ..." line. Others may have an opinion.
>
> src/hotspot/share/runtime/vframe.cpp
>     I agree with the comment rewrite on L174-6.
I actually kept thinking about the comment I added in lines L174-L176, 
but now I think it's not completely true. Because although that window 
of times exists there is no safepoint check inside it so the VMthread 
would never be able to stop the JavaThread during that time. Maybe I 
should remove that comment?

>     I like this new code:
>
>     L182: 
> if(java_lang_Thread::get_thread_status(thread()->threadObj()) == 
> java_lang_Thread::BLOCKED_ON_MONITOR_ENTER){
>     L183:             wait_state = "waiting to re-lock in wait()";
>     L184:           }
>         Needs space between 'if' and '(' and between ')' and '{'.
Done!
>
>     I agree with deleting old L232->250. That's the code that was
>     causing the confusing "waiting to re-lock in wait()" output
>     in the frame(s) after frame 0.
>
> Thumbs up!
>
> As David says, having some example current and new output in the bug 
> report
> would help.
Done!


Thanks!
Patricio

> Dan
>
>
>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8150689
>>
>> Thanks,
>> Patricio
>



More information about the hotspot-runtime-dev mailing list