RFR(XS) 8230731: SA tests fail with "Windbg Error: ReadVirtual failed"

Chris Plummer chris.plummer at oracle.com
Wed Apr 15 17:28:56 UTC 2020


Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8230731
http://cr.openjdk.java.net/~cjplummer/8230731/webrev.00/index.html

SA reads in memory from the target process as needed. The lowest level 
API that reads in a page of memory on windows is 
WindbgDebuggerLocal.readBytesFromProcess0(). A typical stack when 
reading in a page looks like:

         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess0(Native 
Method)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess(WindbgDebuggerLocal.java:482)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase$Fetcher.fetchPage(DebuggerBase.java:80)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getPage(PageCache.java:178)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:383)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:308)
         at 
jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:72)

So readBytesFromProcess() and readBytesFromProcess0() are the only 
platform dependent bits, but all platforms implement them. Since SA does 
a lot of guess work to determine the validity of whatever it is looking 
at, this can result in an attempt to read at an address that is not even 
in the process. This is suppose to result in an AddressException, and 
then the SA code is suppose to handle that properly (assuming it was 
doing something where it wasn't sure if the address was valid).

In the above stack trace, the AddressException (actually 
UnmappedAddressException) is suppose to be thrown by 
PageCache.getData(). This is suppose happen when a null result from 
readBytesFromProcess0() works its way up the call chain. This is how it 
has been working on all platforms...except windows. It's been throwing a 
DebuggerException from readBytesFromProcess0() if it failed. No one up 
the call chain knows how to handle it, so it ends up aborting whatever 
SA command was being executed.

The right thing for readBytesFromProcess0() to do if it cannot read in 
the page is to return null like it does on all other platforms. It's 
expected that sometimes an attempt to read from an invalid address will 
be made, and null should be returned when this happens.

With this fix, some tests that got "ReadVirtual failed", like 
ClhsdbScanOops, now pass. Others fail for different reasons because they 
do not expect the AddressException any more than they expected the 
DebuggerException. 8242787 [1] is one such reason for these failures, 
and will be fixed next.

Note I could not get ClhsdbPstack.java to fail, which was mentioned in 
the CR a few times recently. I tried many 100s of times both with and 
without the fix and never saw it fail. However, looking at the PStack 
code, it looks like it will still print the exception (now an 
UnmappedAddressException instead of DebuggerException), and then 
continue on with the next thread to priont. But since it is no longer a 
DebuggerException, the test should pass (there is code in ClhsdbLauncher 
that makes it fail if it sees a DebuggerException). This relates to the 
email I just sent out yesterday regarding whether or not it is 
acceptable that sometimes SA can't print a thread's stack trace. I think 
it is, and this is an example case.

This also fixes 8001227 [2], which I will close as a dup once I push.

[1] https://bugs.openjdk.java.net/browse/JDK-8242787
[2] https://bugs.openjdk.java.net/browse/JDK-8001227

thanks,

Chris




More information about the serviceability-dev mailing list