RFR(S): 8247533: SA stack walking sometimes fails with sun.jvm.hotspot.debugger.DebuggerException: get_thread_regs failed for a lwp
Chris Plummer
chris.plummer at oracle.com
Thu Jun 25 04:08:29 UTC 2020
On 6/24/20 6:53 PM, Yasumasa Suenaga wrote:
> On 2020/06/25 10:00, Chris Plummer wrote:
>> On 6/24/20 5:17 PM, Yasumasa Suenaga wrote:
>>> On 2020/06/25 3:22, Chris Plummer wrote:
>>>> On 6/24/20 12:01 AM, Yasumasa Suenaga wrote:
>>>>> On 2020/06/24 15:32, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I think LinuxAMD64CFrame is used for pstack and what I've been
>>>>>> looking at has been jstack, and in particular
>>>>>> AMD64CurrentFrameGuess, which does use "last java frame". I'm not
>>>>>> sure why LinuxAMD64CFrame does not look at "last java frame".
>>>>>> Maybe it should.
>>>>>
>>>>> I was thinking of both cases (jstack and mixed stack) for this change.
>>>>> As you know, mixed jstack (jstack --mixed) attempts to find the top of
>>>>> the native stack via LinuxAMD64CFrame, and register values are needed
>>>>> for that (so it depends on the ptrace() call). So I guess mixed mode
>>>>> jstack (jhsdb jstack --mixed) would not show any stacks (it cannot find
>>>>> "last java frame").
>>>> Hi Yasumasa,
>>>>
>>>> I should have been more clear on what I meant by jstack and pstack.
>>>> For jstack I meant using StackTrace.java, which is what you get by
>>>> default with "jhsdb jstack" and also the clhsdb jstack command. For
>>>> pstack I meant PStack.java, which is what you get with "jhsdb
>>>> jstack --mixed" or the clhsdb pstack command.
>>>>
>>>> So this CR impacts both types of stack traces in that they will get
>>>> null registers when the lower level API fails to get the
>>>> register set. For StackTrace.java it will then defer to "last java
>>>> frame" if available. For PStack.java it will not, and will always
>>>> result in no stack trace. The code of interest is here:
>>>>
>>>>   AMD64ThreadContext context = (AMD64ThreadContext) thread.getContext();
>>>>   Address pc = context.getRegisterAsAddress(AMD64ThreadContext.RIP);
>>>>   if (pc == null) return null;
>>>>   return LinuxAMD64CFrame.getTopFrame(dbg, pc, context);
>>>>
>>>> So the question is should "last java frame" be used if pc == null.
>>>> If so, then getTopFrame() would also need to be modified to use
>>>> "last java frame" when fetching RBP.
>>>
>>> I don't think so because CFrame is defined as "Models a "C"
>>> programming language frame on the stack" in the javadoc, so it
>>> should have *valid* register values IMHO.
>>> In addition, RIP is needed for Linux AMD64 at least because it would
>>> use DWARF since JDK-8234624.
>>>
>> Hi Yasumasa,
>>
>> I don't quite understand the "C" frame nomenclature since CFrame is
>> used for non C frames also. The PStack code roughly does the following:
>>
>>   CFrame f = cdbg.topFrameForThread();
>>   ClosestSymbol sym = f.closestSymbolToPC();
>>   Address pc = f.pc();
>>   if (sym != null) {
>>     ... native symbol
>>   } else if (interp.contains(pc)) {
>>     ... print interpreter frame
>>
>> So if the CFrame was filled in with "last java frame" values, it
>> should allow PStack to print the stack starting with the "last java
>> frame". Any native frame below that point would be missed.
>
> Using "last java frame" in this case looks good because stack
> unwinding is a best-effort behavior.
> However, PStack::run is PC-driven. I want to respect that: in other
> words, it should not run if we cannot get register values, even if
> "last java frame" is available.
Ok, that sounds reasonable.
thanks,
Chris
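
For reference, the outcome agreed on above could be sketched roughly as follows. This is only an illustration under stated assumptions: TopFrameSketch, Registers and LastJavaFrame are hypothetical stand-ins, not SA classes, and the real logic lives in AMD64CurrentFrameGuess and LinuxAMD64CFrame. The Java-only walk falls back to "last java frame" when the registers are null, while the PC-driven mixed walk gives up.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration only; none of these are SA classes.
final class TopFrameSketch {
    /** Register values for one thread; a null field means the value could not be read. */
    static final class Registers {
        final Long pc, sp, fp;
        Registers(Long pc, Long sp, Long fp) { this.pc = pc; this.sp = sp; this.fp = fp; }
        /** What a caller would see when fetching the register set failed. */
        static Registers allNull() { return new Registers(null, null, null); }
    }

    /** The sp/fp pair recorded for the thread's last Java frame. */
    static final class LastJavaFrame {
        final long sp, fp;
        LastJavaFrame(long sp, long fp) { this.sp = sp; this.fp = fp; }
    }

    /** Java-only walk: prefer live registers, else fall back to the last Java frame. */
    static Optional<String> javaTopFrame(Registers regs, LastJavaFrame last) {
        if (regs.sp != null && regs.fp != null) {
            return Optional.of(String.format("registers: sp=0x%x fp=0x%x", regs.sp, regs.fp));
        }
        if (last != null) {
            return Optional.of(String.format("last java frame: sp=0x%x fp=0x%x", last.sp, last.fp));
        }
        return Optional.empty(); // nothing to start from, so no stack trace
    }

    /** Mixed walk: PC-driven, so a missing PC means no trace even if a last Java frame exists. */
    static Optional<String> mixedTopFrame(Registers regs, LastJavaFrame last) {
        if (regs.pc == null) {
            return Optional.empty();
        }
        return Optional.of(String.format("registers: pc=0x%x", regs.pc));
    }
}
```

With an all-null register set and a recorded last Java frame, javaTopFrame() still yields a starting point and mixedTopFrame() yields nothing, which matches the behavior described for StackTrace.java and PStack.java.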
>
>
> Thanks,
>
> Yasumasa
>
>
>> Chris
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 6/23/20 11:04 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Thank you for the explanation.
>>>>>>> Your change looks good (but "last java frame" would not be found
>>>>>>> on Linux AMD64 because RSP is NULL - cf. LinuxAMD64CFrame.java)
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/06/24 12:09, Chris Plummer wrote:
>>>>>>>> On 6/23/20 6:05 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Skilled troubleshooters who use jhsdb will be aware of this
>>>>>>>>> warning, and they will use other appropriate methods.
>>>>>>>>>
>>>>>>>>> However, I'm not sure it is worth continuing if SA cannot get
>>>>>>>>> the register values.
>>>>>>>>>
>>>>>>>>> For example, Linux AMD64 depends on the RIP and RSP values to find
>>>>>>>>> the top frame.
>>>>>>>>> With your change, the caller of
>>>>>>>>> getThreadIntegerRegisterSet() is responsible for dealing with
>>>>>>>>> the set of null registers. However, X86ThreadContext::data (which
>>>>>>>>> holds the raw register values) would still be zero when that
>>>>>>>>> happens.
>>>>>>>> This is what I intended to have happen. Just end up with a
>>>>>>>> register set of all nulls. Then when stack walking code gets a
>>>>>>>> null, it will revert to "last java frame" if available,
>>>>>>>> otherwise no stack dump is done.
>>>>>>>>>
>>>>>>>>> So I think the register holder (e.g. X86ThreadContext) should be
>>>>>>>>> tri-state (has registers, failed to get registers, has not yet
>>>>>>>>> attempted to get registers).
>>>>>>>>> OTOH it might be over-engineering. What do you think?
>>>>>>>> Before implementing this I looked at what would be the
>>>>>>>> easiest approach to get the desired effect of stack walking code
>>>>>>>> simply failing over to using "last java frame", and decided the
>>>>>>>> null set of registers was easiest. Other approaches involved
>>>>>>>> more changes and impacted more files.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
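
The tri-state register holder suggested in this exchange might look roughly like the sketch below. It is a hypothetical illustration, not the X86ThreadContext API: RegisterHolderSketch, its State enum and its methods are invented names. The point it shows is that an explicit state lets a caller tell a genuine register value of zero apart from a failed or never-attempted read.

```java
// Hypothetical sketch; the names below are not taken from the SA sources.
final class RegisterHolderSketch {
    /** The three states proposed in the thread. */
    enum State { NOT_ATTEMPTED, AVAILABLE, FAILED }

    private State state = State.NOT_ATTEMPTED;
    private long[] values;

    /** Record the outcome of a fetch; a null array stands for a failed read. */
    void store(long[] fetched) {
        if (fetched == null) {
            state = State.FAILED;
            values = null;
        } else {
            state = State.AVAILABLE;
            values = fetched.clone();
        }
    }

    State state() {
        return state;
    }

    /** Returns the register at index, or null unless the registers are AVAILABLE. */
    Long get(int index) {
        return state == State.AVAILABLE ? Long.valueOf(values[index]) : null;
    }
}
```

The null-register-set approach chosen in the patch gets a similar effect for the stack walking code with fewer changes, at the cost of not distinguishing "failed" from "not yet attempted".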
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/06/24 3:16, Chris Plummer wrote:
>>>>>>>>>> On 6/20/20 12:53 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> On 2020/06/20 15:20, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> ptrace is not used for core files, so the EFAULT for a bad
>>>>>>>>>>>> core file is not a possibility. However, get_lwp_regs()
>>>>>>>>>>>> does redirect to core_get_lwp_regs() for core files. It can
>>>>>>>>>>>> fail, but the only reason it ever does is if the LWP can't
>>>>>>>>>>>> be found in the core (which is never supposed to happen). I
>>>>>>>>>>>> would think if this happened due to the core being
>>>>>>>>>>>> truncated, SA would be blowing up all over the place with
>>>>>>>>>>>> exceptions, probably before we ever get to this code, but
>>>>>>>>>>>> in any case what we do here wouldn't really make a difference.
>>>>>>>>>>>
>>>>>>>>>>> You are right, sorry.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure why you prefer an exception for errors other
>>>>>>>>>>>> than ESRCH. Why should they be treated differently?
>>>>>>>>>>>> getThreadIntegerRegisterSet0() is used for finding the
>>>>>>>>>>>> current frame for stack tracing. With my changes any
>>>>>>>>>>>> failure will result in deferring to "last java frame" if
>>>>>>>>>>>> set, and otherwise just not produce a stack trace (and the
>>>>>>>>>>>> WARNING will be present in the output). This seems
>>>>>>>>>>>> preferable to completely abandoning any further thread
>>>>>>>>>>>> stack tracking.
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure we can trust the call stack when ptrace() returns
>>>>>>>>>>> an error other than ESRCH, even if "last java frame" is
>>>>>>>>>>> available. For example, doesn't ptrace() return EFAULT or EIO
>>>>>>>>>>> when something is wrong (e.g. stack corruption)? If so, it may
>>>>>>>>>>> lead the troubleshooter to a wrong analysis.
>>>>>>>>>>> I think it should at least abort dumping the call stack for that
>>>>>>>>>>> thread.
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> In general stack walking makes a best effort and can be
>>>>>>>>>> wrong, even when not getting errors like this. For any
>>>>>>>>>> actively executing thread SA needs to determine where the
>>>>>>>>>> stack starts, with register contents being the starting point
>>>>>>>>>> (SP, FP, and PC). These registers could contain anything, and
>>>>>>>>>> SA makes a best effort to determine a current frame from
>>>>>>>>>> them. However, the verification steps it takes are not 100%
>>>>>>>>>> guaranteed, and can lead to an incorrect assumption of the
>>>>>>>>>> current frame, which in turn can result in an exception later
>>>>>>>>>> on when walking the stack. See JDK-8247641.
>>>>>>>>>>
>>>>>>>>>> Keep in mind that the WARNING message will always be there.
>>>>>>>>>> This should be enough to put the troubleshooter on alert that
>>>>>>>>>> the stack trace may not be accurate. I think it's better to
>>>>>>>>>> make an attempt at a stack trace than to just abandon it and
>>>>>>>>>> not attempt to do something that may be useful.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/19/20 6:33 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I glanced at the Linux kernel code, and ESRCH seems to be
>>>>>>>>>>>>> the default errno.
>>>>>>>>>>>>> So I guess it is similar to a "generic" error code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/torvalds/linux/blob/master/kernel/ptrace.c
>>>>>>>>>>>>>
>>>>>>>>>>>>> According to the ptrace(2) manpage, it might return an errno
>>>>>>>>>>>>> other than ESRCH.
>>>>>>>>>>>>> For example, if we analyze a broken core (e.g. the core was
>>>>>>>>>>>>> dumped with the disk full), we might get EFAULT.
>>>>>>>>>>>>> Thus I prefer to handle only ESRCH in your patch, and I also
>>>>>>>>>>>>> think SA should throw DebuggerException if any other error
>>>>>>>>>>>>> occurs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://www.man7.org/linux/man-pages/man2/ptrace.2.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
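
The two error-handling policies debated in this exchange can be contrasted with a small sketch. It is a hypothetical illustration: PtraceErrnoSketch and its Action enum are invented names, and the real decision is made in the native libsaproc code rather than in Java. The errno constants are the usual Linux values from errno.h.

```java
// Hypothetical sketch; PtraceErrnoSketch and Action are illustrative names only.
final class PtraceErrnoSketch {
    // Linux errno values as defined in <errno.h>.
    static final int ESRCH = 3;
    static final int EIO = 5;
    static final int EFAULT = 14;

    enum Action { WARN_AND_RETURN_NULL_REGISTERS, THROW_DEBUGGER_EXCEPTION }

    /** Policy proposed by the reviewer: tolerate ESRCH only, throw for anything else. */
    static Action esrchOnly(int errno) {
        return errno == ESRCH
                ? Action.WARN_AND_RETURN_NULL_REGISTERS
                : Action.THROW_DEBUGGER_EXCEPTION;
    }

    /** Policy in the patch under review: every failure yields a warning and null registers. */
    static Action anyFailure(int errno) {
        return Action.WARN_AND_RETURN_NULL_REGISTERS;
    }
}
```

The two policies agree for ESRCH and differ only for errors such as EFAULT or EIO, where the first abandons the stack trace for that thread and the second still attempts one from "last java frame".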
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/06/20 5:51, Chris Plummer wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've updated the webrev based on the new finding that a
>>>>>>>>>>>>>> JavaThread cannot be on the ThreadList after its OS
>>>>>>>>>>>>>> thread has been destroyed since the JavaThread removes
>>>>>>>>>>>>>> itself from the ThreadList, and therefore must be running
>>>>>>>>>>>>>> on its OS thread. The logic of the fix is unchanged from
>>>>>>>>>>>>>> the first webrev, but I updated the comments to better
>>>>>>>>>>>>>> reflect what is going on. I also updated the CR:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247533
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8247533/webrev.01/index.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 6/19/20 12:24 AM, David Holmes wrote:
>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 19/06/2020 8:55 am, Chris Plummer wrote:
>>>>>>>>>>>>>>>> On 6/18/20 1:43 AM, David Holmes wrote:
>>>>>>>>>>>>>>>>> On 18/06/2020 4:49 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>> On 6/17/20 10:29 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> On 18/06/2020 3:13 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>> On 6/17/20 10:09 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> On 18/06/2020 2:33 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>>> On 6/17/20 7:43 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 18/06/2020 6:34 am, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Please help review the following:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247533
>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8247533/webrev.00/index.html
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The CR contains all the needed details. Here's
>>>>>>>>>>>>>>>>>>>>>>>> a summary of changes in each file:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The problem sounds to me like a variation of the
>>>>>>>>>>>>>>>>>>>>>>> more general problem of not ensuring a thread is
>>>>>>>>>>>>>>>>>>>>>>> kept alive whilst acting upon it. I don't know
>>>>>>>>>>>>>>>>>>>>>>> how the SA finds these references to the threads
>>>>>>>>>>>>>>>>>>>>>>> it is going to stackwalk, but is it possible to
>>>>>>>>>>>>>>>>>>>>>>> fix this via appropriate uses of
>>>>>>>>>>>>>>>>>>>>>>> ThreadsListHandle/Iterator?
>>>>>>>>>>>>>>>>>>>>>> It fetches ThreadsSMRSupport::_java_thread_list.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Keep in mind that once SA attaches, nothing in
>>>>>>>>>>>>>>>>>>>>>> the VM changes. For example, SA can't create a
>>>>>>>>>>>>>>>>>>>>>> wrapper to a JavaThread, only to have the
>>>>>>>>>>>>>>>>>>>>>> JavaThread be freed later on. It's just not
>>>>>>>>>>>>>>>>>>>>>> possible.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Then how does it obtain a reference to a
>>>>>>>>>>>>>>>>>>>>> JavaThread for which the native OS thread id is
>>>>>>>>>>>>>>>>>>>>> invalid? Any thread found in _java_thread_list is
>>>>>>>>>>>>>>>>>>>>> either live or still to be started. In the latter
>>>>>>>>>>>>>>>>>>>>> case the JavaThread->osThread does not have its
>>>>>>>>>>>>>>>>>>>>> thread_id set yet.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My assumption was that the JavaThread is in the
>>>>>>>>>>>>>>>>>>>> process of being destroyed, and it has freed its OS
>>>>>>>>>>>>>>>>>>>> thread but is itself still in the thread list. I
>>>>>>>>>>>>>>>>>>>> did notice that the OS thread id being used looked
>>>>>>>>>>>>>>>>>>>> to be in the range of thread id #'s you would
>>>>>>>>>>>>>>>>>>>> expect for the running app, so that to me indicated
>>>>>>>>>>>>>>>>>>>> it was once valid, but is no more.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Keep in mind that although hotspot may have
>>>>>>>>>>>>>>>>>>>> synchronization code that prevents you from pulling
>>>>>>>>>>>>>>>>>>>> a JavaThread off the thread list when it is in the
>>>>>>>>>>>>>>>>>>>> process of being destroyed (I'm guessing it does),
>>>>>>>>>>>>>>>>>>>> SA has no such protections.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> But you stated that once the SA has attached, the
>>>>>>>>>>>>>>>>>>> target VM can't change. If the SA gets its set of
>>>>>>>>>>>>>>>>>>> threads from one attach then tries to make queries
>>>>>>>>>>>>>>>>>>> about those threads in a separate attach, then
>>>>>>>>>>>>>>>>>>> obviously it could be providing garbage thread
>>>>>>>>>>>>>>>>>>> information. So you would need to re-validate the
>>>>>>>>>>>>>>>>>>> JavaThread in the target VM before trying to do
>>>>>>>>>>>>>>>>>>> anything with it.
>>>>>>>>>>>>>>>>>> That's not what is going on here. It's attaching and
>>>>>>>>>>>>>>>>>> doing a stack trace, which involves getting the
>>>>>>>>>>>>>>>>>> thread list and iterating through all threads without
>>>>>>>>>>>>>>>>>> detaching.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Okay so I restate my original comment - all the
>>>>>>>>>>>>>>>>> JavaThreads must be alive or not yet started, so how
>>>>>>>>>>>>>>>>> are you encountering an invalid thread id? Any thread
>>>>>>>>>>>>>>>>> you find via the ThreadsList can't have destroyed its
>>>>>>>>>>>>>>>>> osThread. In any case the logic should be checking
>>>>>>>>>>>>>>>>> thread->osThread() for NULL, and then
>>>>>>>>>>>>>>>>> osThread()->get_state() to ensure it is >= INITIALIZED
>>>>>>>>>>>>>>>>> before using the thread_id().
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I chatted with Dan about this, and he said since the
>>>>>>>>>>>>>>>> JavaThread is responsible for removing itself from the
>>>>>>>>>>>>>>>> ThreadList, it is impossible to have a JavaThread still
>>>>>>>>>>>>>>>> on the ThreadList, but without an underlying OS
>>>>>>>>>>>>>>>> Thread. So I'm a bit perplexed as to how I can find a
>>>>>>>>>>>>>>>> JavaThread on the ThreadList, but that results in ESRCH
>>>>>>>>>>>>>>>> when trying to access the thread with ptrace. My only
>>>>>>>>>>>>>>>> conclusion is that this failure is somehow spurious,
>>>>>>>>>>>>>>>> and maybe the issue is just that the thread is in some
>>>>>>>>>>>>>>>> temporary state that prevents its access. If so, I
>>>>>>>>>>>>>>>> still think the approach I'm taking is the correct one,
>>>>>>>>>>>>>>>> but the comments should be updated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ESRCH can have other meanings but I don't know enough
>>>>>>>>>>>>>>> about the broader context to know whether they are
>>>>>>>>>>>>>>> applicable in this case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ESRCH The specified process does not exist, or is
>>>>>>>>>>>>>>> not currently being traced by the caller, or is not stopped
>>>>>>>>>>>>>>> (for requests that require a stopped tracee).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I won't comment further on the fix/workaround as I don't
>>>>>>>>>>>>>>> know the code. I'll leave that to other folk.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I had one other finding. When this issue first turned
>>>>>>>>>>>>>>>> up, it prevented the thread from getting a stack trace
>>>>>>>>>>>>>>>> due to the exception being thrown. What I hadn't
>>>>>>>>>>>>>>>> realized is that after fixing it to not throw an
>>>>>>>>>>>>>>>> exception, which resulted in the stack walking code
>>>>>>>>>>>>>>>> getting all nulls for register values, I actually
>>>>>>>>>>>>>>>> started to see a stack trace printed:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "JLine terminal non blocking reader thread" #26 daemon
>>>>>>>>>>>>>>>> prio=5 tid=0x00007f12f0cd6420 nid=0x1f99 runnable
>>>>>>>>>>>>>>>> [0x00007f125f0f4000]
>>>>>>>>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>>>>>>>> JavaThread state: _thread_in_native
>>>>>>>>>>>>>>>> WARNING: getThreadIntegerRegisterSet0: get_lwp_regs
>>>>>>>>>>>>>>>> failed for lwp (8089)
>>>>>>>>>>>>>>>> CurrentFrameGuess: choosing last Java frame: sp =
>>>>>>>>>>>>>>>> 0x00007f125f0f4770, fp = 0x00007f125f0f47c0
>>>>>>>>>>>>>>>> - java.io.FileInputStream.read0() @bci=0 (Interpreted
>>>>>>>>>>>>>>>> frame)
>>>>>>>>>>>>>>>> - java.io.FileInputStream.read() @bci=1, line=223
>>>>>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>> jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.run()
>>>>>>>>>>>>>>>> @bci=108, line=216 (Interpreted frame)
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>> jdk.internal.org.jline.utils.NonBlockingInputStreamImpl$$Lambda$536+0x0000000800daeca0.run()
>>>>>>>>>>>>>>>> @bci=4 (Interpreted frame)
>>>>>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=832
>>>>>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The "CurrentFrameGuess" output is some debug tracing I
>>>>>>>>>>>>>>>> had enabled, and it indicates that the stack walking
>>>>>>>>>>>>>>>> code is using the "last java frame" setting, which it
>>>>>>>>>>>>>>>> will do if current register values don't indicate a
>>>>>>>>>>>>>>>> valid frame (as would be the case if sp was null). I
>>>>>>>>>>>>>>>> had previously assumed that without an underlying valid
>>>>>>>>>>>>>>>> LWP, there would be no stack trace. Given that there is
>>>>>>>>>>>>>>>> one, there must be a valid LWP. Otherwise I don't see
>>>>>>>>>>>>>>>> how the stack could have been walked. That's another
>>>>>>>>>>>>>>>> indication that the ptrace failure is spurious in nature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Also, even if you are using something like clhsdb to
>>>>>>>>>>>>>>>>>> issue commands on addresses, if the address is no
>>>>>>>>>>>>>>>>>> longer valid for the command you are executing, then
>>>>>>>>>>>>>>>>>> you would get the appropriate error when there is an
>>>>>>>>>>>>>>>>>> attempt to create a wrapper for it. I don't know of
>>>>>>>>>>>>>>>>>> any command that operates directly on a JavaThread,
>>>>>>>>>>>>>>>>>> but I think there are for InstanceKlass. So if you
>>>>>>>>>>>>>>>>>> remembered the address of an InstanceKlass, and then
>>>>>>>>>>>>>>>>>> reattached and tried a command that takes an
>>>>>>>>>>>>>>>>>> InstanceKlass address, you would get an exception
>>>>>>>>>>>>>>>>>> when SA tries to create the wrapper for the
>>>>>>>>>>>>>>>>>> InstanceKlass if it were no longer a valid address for
>>>>>>>>>>>>>>>>>> one.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -Instead of throwing an exception when the OS
>>>>>>>>>>>>>>>>>>>>>>>> ThreadID is invalid, print a warning.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -Improve a print_debug message
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/BsdThread.java
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThread.java
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/windbg/amd64/WindbgAMD64Thread.java
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -Deal with the array of registers read in being
>>>>>>>>>>>>>>>>>>>>>>>> null due to the OS ThreadID not being valid.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/BsdDebuggerLocal.java
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxDebuggerLocal.java
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -Fix issue with
>>>>>>>>>>>>>>>>>>>>>>>> "sun.jvm.hotspot.debugger.DebuggerException"
>>>>>>>>>>>>>>>>>>>>>>>> appearing twice when printing the exception.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>