RFR(XS) 8230731: SA tests fail with "Windbg Error: ReadVirtual failed"
Chris Plummer
chris.plummer at oracle.com
Wed Apr 15 17:28:56 UTC 2020
Hello,
Please review the following:
https://bugs.openjdk.java.net/browse/JDK-8230731
http://cr.openjdk.java.net/~cjplummer/8230731/webrev.00/index.html
SA reads memory from the target process as needed. The lowest-level
API that reads in a page of memory on Windows is
WindbgDebuggerLocal.readBytesFromProcess0(). A typical stack when
reading in a page looks like:
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess0(Native Method)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess(WindbgDebuggerLocal.java:482)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase$Fetcher.fetchPage(DebuggerBase.java:80)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getPage(PageCache.java:178)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:383)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:308)
    at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:72)
So readBytesFromProcess() and readBytesFromProcess0() are the only
platform-dependent bits, but all platforms implement them. Since SA does
a lot of guesswork to determine the validity of whatever it is looking
at, this can result in an attempt to read at an address that is not even
in the process. This is supposed to result in an AddressException, and
then the SA code is supposed to handle that properly (assuming it was
doing something where it wasn't sure if the address was valid).
In the above stack trace, the AddressException (actually
UnmappedAddressException) is supposed to be thrown by
PageCache.getData(). This is supposed to happen when a null result from
readBytesFromProcess0() works its way up the call chain. This is how it
has been working on all platforms except Windows, where
readBytesFromProcess0() has been throwing a DebuggerException on
failure. No one up the call chain knows how to handle it, so it ends up
aborting whatever SA command was being executed.
The right thing for readBytesFromProcess0() to do if it cannot read in
the page is to return null like it does on all other platforms. It's
expected that sometimes an attempt to read from an invalid address will
be made, and null should be returned when this happens.
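The contract described above can be illustrated with a minimal Java sketch. This is not the SA source: the class name PageReadSketch, the exception UnmappedAddressSketchException, and the methods getData() and looksReadable() are hypothetical stand-ins, and the reader is modeled as a plain function that returns null for an unreadable page. It only shows why a null return is recoverable by callers, assuming the cache layer is what converts null into the exception.

```java
import java.util.function.LongFunction;

public class PageReadSketch {
    // Hypothetical stand-in for SA's UnmappedAddressException.
    static class UnmappedAddressSketchException extends RuntimeException {
        final long address;

        UnmappedAddressSketchException(long address) {
            super("unmapped address 0x" + Long.toHexString(address));
            this.address = address;
        }
    }

    // Hypothetical stand-in for the cache layer: a null page from the
    // low-level reader is turned into an exception callers know about.
    static byte[] getData(LongFunction<byte[]> reader, long address) {
        byte[] page = reader.apply(address);
        if (page == null) {
            throw new UnmappedAddressSketchException(address);
        }
        return page;
    }

    // A caller that is only guessing at an address can recover from the
    // expected exception instead of aborting the whole command.
    static boolean looksReadable(LongFunction<byte[]> reader, long address) {
        try {
            getData(reader, address);
            return true;
        } catch (UnmappedAddressSketchException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed reader: exactly one mapped page, null for everything else.
        LongFunction<byte[]> reader =
            addr -> addr == 0x1000L ? new byte[4096] : null;
        System.out.println(looksReadable(reader, 0x1000L)); // prints true
        System.out.println(looksReadable(reader, 0xdeadL)); // prints false
    }
}
```

If the reader threw some other exception type instead of returning null, looksReadable() would not catch it and the failure would propagate to the top, which mirrors the Windows behavior described above.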
With this fix, some tests that got "ReadVirtual failed", like
ClhsdbScanOops, now pass. Others fail for different reasons because they
do not expect the AddressException any more than they expected the
DebuggerException. 8242787 [1] is one such reason for these failures,
and will be fixed next.
Note that I could not get ClhsdbPstack.java to fail, although its failure was
mentioned in the CR a few times recently. I tried many hundreds of times, both
with and without the fix, and never saw it fail. However, looking at the PStack
code, it looks like it will still print the exception (now an
UnmappedAddressException instead of DebuggerException), and then
continue on with the next thread to print. But since it is no longer a
DebuggerException, the test should pass (there is code in ClhsdbLauncher
that makes it fail if it sees a DebuggerException). This relates to the
email I just sent out yesterday regarding whether or not it is
acceptable that sometimes SA can't print a thread's stack trace. I think
it is, and this is an example case.
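The expected test outcome can be sketched roughly as follows. This is an illustration only, not the PStack or ClhsdbLauncher code: the class PerThreadFailureSketch, the methods printAll() and outputPasses(), and the exception type are hypothetical, and the launcher check is assumed to be a simple search of the output for the string "DebuggerException".

```java
import java.util.List;
import java.util.function.Function;

public class PerThreadFailureSketch {
    // Hypothetical stand-in for SA's UnmappedAddressException.
    static class UnmappedAddressSketchException extends RuntimeException {
        UnmappedAddressSketchException(String message) {
            super(message);
        }
    }

    // Walk every thread; a failure on one thread is printed and the loop
    // moves on to the next thread instead of aborting the command.
    static String printAll(List<String> threads,
                           Function<String, String> stackWalker) {
        StringBuilder out = new StringBuilder();
        for (String thread : threads) {
            out.append(thread).append(": ");
            try {
                out.append(stackWalker.apply(thread));
            } catch (RuntimeException e) {
                out.append(e.getClass().getSimpleName());
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Assumed launcher-style check: fail only on a DebuggerException.
    static boolean outputPasses(String output) {
        return !output.contains("DebuggerException");
    }

    public static void main(String[] args) {
        // Assumed walker: thread "t2" hits an unmapped address.
        Function<String, String> walker = thread -> {
            if (thread.equals("t2")) {
                throw new UnmappedAddressSketchException("unmapped");
            }
            return "frames";
        };
        String output = printAll(List.of("t1", "t2", "t3"), walker);
        System.out.print(output);
        System.out.println(outputPasses(output)); // prints true
    }
}
```

Under these assumptions the output still shows the exception for the one bad thread, but the remaining threads are printed and the output check passes.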
This also fixes 8001227 [2], which I will close as a dup once I push.
[1] https://bugs.openjdk.java.net/browse/JDK-8242787
[2] https://bugs.openjdk.java.net/browse/JDK-8001227
thanks,
Chris