RFR(XS) 8230731: SA tests fail with "Windbg Error: ReadVirtual failed"

Alex Menkov alexey.menkov at oracle.com
Wed Apr 15 20:42:51 UTC 2020


Hi Chris,

The fix looks good.

--alex

On 04/15/2020 10:28, Chris Plummer wrote:
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8230731
> http://cr.openjdk.java.net/~cjplummer/8230731/webrev.00/index.html
> 
> SA reads in memory from the target process as needed. The lowest level 
> API that reads in a page of memory on windows is 
> WindbgDebuggerLocal.readBytesFromProcess0(). A typical stack when 
> reading in a page looks like:
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess0(Native 
> Method)
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess(WindbgDebuggerLocal.java:482) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase$Fetcher.fetchPage(DebuggerBase.java:80) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getPage(PageCache.java:178) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:383) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:308) 
> 
>          at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:72) 
> 
> 
> So readBytesFromProcess() and readBytesFromProcess0() are the only 
> platform dependent bits, but all platforms implement them. Since SA does 
> a lot of guess work to determine the validity of whatever it is looking 
> at, this can result in an attempt to read at an address that is not even 
> in the process. This is suppose to result in an AddressException, and 
> then the SA code is suppose to handle that properly (assuming it was 
> doing something where it wasn't sure if the address was valid).
> 
> In the above stack trace, the AddressException (actually 
> UnmappedAddressException) is suppose to be thrown by 
> PageCache.getData(). This is suppose happen when a null result from 
> readBytesFromProcess0() works its way up the call chain. This is how it 
> has been working on all platforms...except windows. It's been throwing a 
> DebuggerException from readBytesFromProcess0() if it failed. No one up 
> the call chain knows how to handle it, so it ends up aborting whatever 
> SA command was being executed.
> 
> The right thing for readBytesFromProcess0() to do if it cannot read in 
> the page is to return null like it does on all other platforms. It's 
> expected that sometimes an attempt to read from an invalid address will 
> be made, and null should be returned when this happens.
> 
> With this fix, some tests that got "ReadVirtual failed", like 
> ClhsdbScanOops, now pass. Others fail for different reasons because they 
> do not expect the AddressException any more than they expected the 
> DebuggerException. 8242787 [1] is one such reason for these failures, 
> and will be fixed next.
> 
> Note I could not get ClhsdbPstack.java to fail, which was mentioned in 
> the CR a few times recently. I tried many 100s of times both with and 
> without the fix and never saw it fail. However, looking at the PStack 
> code, it looks like it will still print the exception (now an 
> UnmappedAddressException instead of DebuggerException), and then 
> continue on with the next thread to priont. But since it is no longer a 
> DebuggerException, the test should pass (there is code in ClhsdbLauncher 
> that makes it fail if it sees a DebuggerException). This relates to the 
> email I just sent out yesterday regarding whether or not it is 
> acceptable that sometimes SA can't print a thread's stack trace. I think 
> it is, and this is an example case.
> 
> This also fixes 8001227 [2], which I will close as a dup once I push.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8242787
> [2] https://bugs.openjdk.java.net/browse/JDK-8001227
> 
> thanks,
> 
> Chris
> 
> 


More information about the serviceability-dev mailing list