RFR(XS) 8230731: SA tests fail with "Windbg Error: ReadVirtual failed"
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Wed Apr 15 21:40:08 UTC 2020
Hi Chris,
LGTM++
Thanks,
Serguei
On 4/15/20 13:42, Alex Menkov wrote:
> Hi Chris,
>
> The fix looks good.
>
> --alex
>
> On 04/15/2020 10:28, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8230731
>> http://cr.openjdk.java.net/~cjplummer/8230731/webrev.00/index.html
>>
>> SA reads in memory from the target process as needed. The lowest
>> level API that reads in a page of memory on windows is
>> WindbgDebuggerLocal.readBytesFromProcess0(). A typical stack when
>> reading in a page looks like:
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess0(Native
>> Method)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess(WindbgDebuggerLocal.java:482)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase$Fetcher.fetchPage(DebuggerBase.java:80)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getPage(PageCache.java:178)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:383)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:308)
>>
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:72)
>>
>>
>> So readBytesFromProcess() and readBytesFromProcess0() are the only
>> platform dependent bits, but all platforms implement them. Since SA
>> does a lot of guess work to determine the validity of whatever it is
>> looking at, this can result in an attempt to read at an address that
>> is not even in the process. This is suppose to result in an
>> AddressException, and then the SA code is suppose to handle that
>> properly (assuming it was doing something where it wasn't sure if the
>> address was valid).
>>
>> In the above stack trace, the AddressException (actually
>> UnmappedAddressException) is suppose to be thrown by
>> PageCache.getData(). This is suppose happen when a null result from
>> readBytesFromProcess0() works its way up the call chain. This is how
>> it has been working on all platforms...except windows. It's been
>> throwing a DebuggerException from readBytesFromProcess0() if it
>> failed. No one up the call chain knows how to handle it, so it ends
>> up aborting whatever SA command was being executed.
>>
>> The right thing for readBytesFromProcess0() to do if it cannot read
>> in the page is to return null like it does on all other platforms.
>> It's expected that sometimes an attempt to read from an invalid
>> address will be made, and null should be returned when this happens.
>>
>> With this fix, some tests that got "ReadVirtual failed", like
>> ClhsdbScanOops, now pass. Others fail for different reasons because
>> they do not expect the AddressException any more than they expected
>> the DebuggerException. 8242787 [1] is one such reason for these
>> failures, and will be fixed next.
>>
>> Note I could not get ClhsdbPstack.java to fail, which was mentioned
>> in the CR a few times recently. I tried many 100s of times both with
>> and without the fix and never saw it fail. However, looking at the
>> PStack code, it looks like it will still print the exception (now an
>> UnmappedAddressException instead of DebuggerException), and then
>> continue on with the next thread to priont. But since it is no longer
>> a DebuggerException, the test should pass (there is code in
>> ClhsdbLauncher that makes it fail if it sees a DebuggerException).
>> This relates to the email I just sent out yesterday regarding whether
>> or not it is acceptable that sometimes SA can't print a thread's
>> stack trace. I think it is, and this is an example case.
>>
>> This also fixes 8001227 [2], which I will close as a dup once I push.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8242787
>> [2] https://bugs.openjdk.java.net/browse/JDK-8001227
>>
>> thanks,
>>
>> Chris
>>
>>
More information about the serviceability-dev
mailing list