RFR(S): 8247533: SA stack walking sometimes fails with sun.jvm.hotspot.debugger.DebuggerException: get_thread_regs failed for a lwp
David Holmes
david.holmes at oracle.com
Thu Jun 18 08:43:37 UTC 2020
On 18/06/2020 4:49 pm, Chris Plummer wrote:
> On 6/17/20 10:29 PM, David Holmes wrote:
>> On 18/06/2020 3:13 pm, Chris Plummer wrote:
>>> On 6/17/20 10:09 PM, David Holmes wrote:
>>>> On 18/06/2020 2:33 pm, Chris Plummer wrote:
>>>>> On 6/17/20 7:43 PM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> On 18/06/2020 6:34 am, Chris Plummer wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Please help review the following:
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247533
>>>>>>> http://cr.openjdk.java.net/~cjplummer/8247533/webrev.00/index.html
>>>>>>>
>>>>>>> The CR contains all the needed details. Here's a summary of
>>>>>>> changes in each file:
>>>>>>
>>>>>> The problem sounds to me like a variation of the more general
>>>>>> problem of not ensuring a thread is kept alive whilst acting upon
>>>>>> it. I don't know how the SA finds these references to the threads
>>>>>> it is going to stackwalk, but is it possible to fix this via
>>>>>> appropriate uses of ThreadsListHandle/Iterator?
>>>>> It fetches ThreadsSMRSupport::_java_thread_list.
>>>>>
>>>>> Keep in mind that once SA attaches, nothing in the VM changes. For
>>>>> example, SA can't create a wrapper to a JavaThread, only to have
>>>>> the JavaThread be freed later on. It's just not possible.
>>>>
>>>> Then how does it obtain a reference to a JavaThread for which the
>>>> native OS thread id is invalid? Any thread found in
>>>> _java_thread_list is either live or still to be started. In the
>>>> latter case the JavaThread->osThread does not have its thread_id set
>>>> yet.
>>>>
>>> My assumption was that the JavaThread is in the process of being
>>> destroyed, and it has freed its OS thread but is itself still in the
>>> thread list. I did notice that the OS thread id being used looked to
>>> be in the range of thread id #'s you would expect for the running
>>> app, so that to me indicated it was once valid, but is no more.
>>>
>>> Keep in mind that although hotspot may have synchronization code that
>>> prevents you from pulling a JavaThread off the thread list when it is
>>> in the process of being destroyed (I'm guessing it does), SA has no
>>> such protections.
>>
>> But you stated that once the SA has attached, the target VM can't
>> change. If the SA gets its set of threads from one attach then tries to
>> make queries about those threads in a separate attach, then obviously
>> it could be providing garbage thread information. So you would need to
>> re-validate the JavaThread in the target VM before trying to do
>> anything with it.
> That's not what is going on here. It's attaching and doing a stack
> trace, which involves getting the thread list and iterating through all
> threads without detaching.
Okay, so I restate my original comment - all the JavaThreads must be
alive or not yet started, so how are you encountering an invalid thread
id? Any thread you find via the ThreadsList can't have destroyed its
osThread. In any case the logic should be checking thread->osThread()
for NULL, and then osThread()->get_state() to ensure it is >=
INITIALIZED before using the thread_id().
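
A minimal sketch of that check, written in Java with hypothetical
stand-in types (OsThreadCheck, OsThread, JavaThread and the trimmed
ThreadState enum below are illustrations only, not the real SA wrappers
or HotSpot classes). It assumes that the states are ordered so that
anything before INITIALIZED means the native thread id has not been set:

```java
import java.util.OptionalLong;

/** Hypothetical model of the check; none of these types are the real SA or HotSpot classes. */
final class OsThreadCheck {
    // Assumption: declaration order mirrors the VM's ordering, so a state
    // that sorts before INITIALIZED means the thread id is not set yet.
    enum ThreadState { ALLOCATED, INITIALIZED, RUNNABLE, ZOMBIE }

    static final class OsThread {
        final ThreadState state;
        final long threadId;

        OsThread(ThreadState state, long threadId) {
            this.state = state;
            this.threadId = threadId;
        }
    }

    static final class JavaThread {
        final OsThread osThread; // may be null if there is no OS thread

        JavaThread(OsThread osThread) {
            this.osThread = osThread;
        }
    }

    /** Returns the OS thread id only when it is safe to use, otherwise empty. */
    static OptionalLong usableThreadId(JavaThread thread) {
        OsThread os = thread.osThread;
        if (os == null) {
            return OptionalLong.empty(); // no osThread at all
        }
        if (os.state.compareTo(ThreadState.INITIALIZED) < 0) {
            return OptionalLong.empty(); // thread id not assigned yet
        }
        return OptionalLong.of(os.threadId);
    }
}
```

A caller would skip the register read for any thread where the result is
empty, rather than handing a possibly invalid id to the debugger.
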
Cheers,
David
-----
> Also, even if you are using something like
> clhsdb to issue commands on addresses, if the address is no longer valid
> for the command you are executing, then you would get the appropriate
> error when there is an attempt to create a wrapper for it. I don't know
> of any command that operates directly on a JavaThread, but I think there
> are some for InstanceKlass. So if you remembered the address of an
> InstanceKlass, and then reattached and tried a command that takes an
> InstanceKlass address, you would get an exception when SA tries to
> create the wrapper for the InstanceKlass if it were no longer a valid
> address for one.
>
> Chris
>>
>> David
>> -----
>>
>>> Chris
>>>> David
>>>> -----
>>>>
>>>>> Chris
>>>>>>
>>>>>> Cheers,
>>>>>> David
>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp
>>>>>>> src/jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m
>>>>>>> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp
>>>>>>> -Instead of throwing an exception when the OS ThreadID is
>>>>>>> invalid, print a warning.
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c
>>>>>>> -Improve a print_debug message
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/BsdThread.java
>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThread.java
>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/windbg/amd64/WindbgAMD64Thread.java
>>>>>>> -Deal with the array of registers read in being null due to the
>>>>>>> OS ThreadID not being valid.
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/BsdDebuggerLocal.java
>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxDebuggerLocal.java
>>>>>>> -Fix issue with "sun.jvm.hotspot.debugger.DebuggerException"
>>>>>>> appearing twice when printing the exception.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>
>>>
>