RFR(XS) 8230731: SA tests fail with "Windbg Error: ReadVirtual failed"

Chris Plummer chris.plummer at oracle.com
Wed Apr 15 21:42:39 UTC 2020


Thanks Serguei and Alex!

On 4/15/20 2:40 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> LGTM++
>
> Thanks,
> Serguei
>
>
> On 4/15/20 13:42, Alex Menkov wrote:
>> Hi Chris,
>>
>> The fix looks good.
>>
>> --alex
>>
>> On 04/15/2020 10:28, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8230731
>>> http://cr.openjdk.java.net/~cjplummer/8230731/webrev.00/index.html
>>>
>>> SA reads in memory from the target process as needed. The
>>> lowest-level API that reads in a page of memory on Windows is
>>> WindbgDebuggerLocal.readBytesFromProcess0(). A typical stack when
>>> reading in a page looks like:
>>>
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess0(Native Method)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readBytesFromProcess(WindbgDebuggerLocal.java:482)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase$Fetcher.fetchPage(DebuggerBase.java:80)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getPage(PageCache.java:178)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getData(PageCache.java:63)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readBytes(DebuggerBase.java:225)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:383)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:308)
>>>     at jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:72)
>>>
>>> So readBytesFromProcess() and readBytesFromProcess0() are the only
>>> platform-dependent bits, but all platforms implement them. Since SA
>>> does a lot of guesswork to determine the validity of whatever it is
>>> looking at, this can result in an attempt to read at an address that
>>> is not even in the process. This is supposed to result in an
>>> AddressException, and then the SA code is supposed to handle that
>>> properly (assuming it was doing something where it wasn't sure if
>>> the address was valid).
>>>
>>> In the above stack trace, the AddressException (actually
>>> UnmappedAddressException) is supposed to be thrown by
>>> PageCache.getData(). This is supposed to happen when a null result
>>> from readBytesFromProcess0() works its way up the call chain. This is
>>> how it has been working on all platforms...except Windows. There,
>>> readBytesFromProcess0() has been throwing a DebuggerException when
>>> the read fails. No one up the call chain knows how to handle it, so
>>> it ends up aborting whatever SA command was being executed.
>>>
>>> The right thing for readBytesFromProcess0() to do if it cannot read 
>>> in the page is to return null like it does on all other platforms. 
>>> It's expected that sometimes an attempt to read from an invalid 
>>> address will be made, and null should be returned when this happens.
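
To illustrate the contract described above, here is a minimal, self-contained sketch. It is not the SA source: PageFetcher, SimplePageCache and looksReadable are hypothetical stand-ins invented for this example, and the nested UnmappedAddressException only mimics the role of the real SA class. The assumption is simply that the platform layer returns null for an unreadable page and the cache layer turns that null into an exception that callers probing a questionable address can catch.

```java
import java.util.HashMap;
import java.util.Map;

public class PageReadSketch {
    /** Hypothetical stand-in for SA's UnmappedAddressException. */
    static class UnmappedAddressException extends RuntimeException {
        final long address;
        UnmappedAddressException(long address) {
            super("unmapped address 0x" + Long.toHexString(address));
            this.address = address;
        }
    }

    /** Hypothetical platform hook: returns null when the page cannot be read. */
    interface PageFetcher {
        byte[] fetchPage(long pageBase, int pageSize);
    }

    /** Simplified cache: a null page from the fetcher becomes an exception here. */
    static class SimplePageCache {
        private final int pageSize;
        private final PageFetcher fetcher;
        private final Map<Long, byte[]> pages = new HashMap<>();

        SimplePageCache(int pageSize, PageFetcher fetcher) {
            this.pageSize = pageSize;
            this.fetcher = fetcher;
        }

        byte getByte(long address) {
            long pageBase = address - Math.floorMod(address, (long) pageSize);
            byte[] page = pages.get(pageBase);
            if (page == null) {
                page = fetcher.fetchPage(pageBase, pageSize);
                if (page == null) {
                    // The platform layer signalled "not readable" by returning null.
                    throw new UnmappedAddressException(address);
                }
                pages.put(pageBase, page);
            }
            return page[(int) (address - pageBase)];
        }
    }

    /** A caller that is only guessing at an address handles the exception. */
    static boolean looksReadable(SimplePageCache cache, long address) {
        try {
            cache.getByte(address);
            return true;
        } catch (UnmappedAddressException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Only the page at 0x1000 is "mapped" in this toy target process.
        PageFetcher fetcher = (base, size) -> base == 0x1000L ? new byte[size] : null;
        SimplePageCache cache = new SimplePageCache(0x1000, fetcher);
        System.out.println(looksReadable(cache, 0x1004L)); // true
        System.out.println(looksReadable(cache, 0x9000L)); // false
    }
}
```

If the fetcher threw an exception of a type the caller does not catch instead of returning null, looksReadable() would never get the chance to return false, which is the failure mode described for Windows.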
>>>
>>> With this fix, some tests that got "ReadVirtual failed", like 
>>> ClhsdbScanOops, now pass. Others fail for different reasons because 
>>> they do not expect the AddressException any more than they expected 
>>> the DebuggerException. 8242787 [1] is one such reason for these 
>>> failures, and will be fixed next.
>>>
>>> Note I could not get ClhsdbPstack.java to fail, which was mentioned
>>> in the CR a few times recently. I tried many hundreds of times both
>>> with and without the fix and never saw it fail. However, looking at
>>> the PStack code, it looks like it will still print the exception (now
>>> an UnmappedAddressException instead of DebuggerException), and then
>>> continue on with the next thread to print. But since it is no
>>> longer a DebuggerException, the test should pass (there is code in
>>> ClhsdbLauncher that makes it fail if it sees a DebuggerException).
>>> This relates to the email I just sent out yesterday regarding
>>> whether or not it is acceptable that sometimes SA can't print a
>>> thread's stack trace. I think it is, and this is an example case.
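
The behavior described above can be sketched roughly as follows. This is an illustration only, not the PStack or ClhsdbLauncher code: printAllStacks, outputPasses and both exception classes are hypothetical names made up for the example, and the only assumptions taken from the description are that the exception is printed, the walk continues with the next thread, and the test harness fails when the output mentions DebuggerException.

```java
import java.util.List;
import java.util.function.Function;

public class PstackSketch {
    /** Hypothetical stand-in for SA's UnmappedAddressException. */
    static class UnmappedAddressException extends RuntimeException {
        UnmappedAddressException(long address) {
            super("0x" + Long.toHexString(address));
        }
    }

    /** Hypothetical stand-in for SA's DebuggerException. */
    static class DebuggerException extends RuntimeException {
        DebuggerException(String message) {
            super(message);
        }
    }

    /** Prints every thread; a failed stack walk is reported and the loop continues. */
    static String printAllStacks(List<String> threadNames,
                                 Function<String, String> stackWalker) {
        StringBuilder out = new StringBuilder();
        for (String name : threadNames) {
            out.append("Thread ").append(name).append('\n');
            try {
                out.append(stackWalker.apply(name)).append('\n');
            } catch (RuntimeException e) {
                // Print the exception and move on to the next thread.
                out.append(e).append('\n');
            }
        }
        return out.toString();
    }

    /** Launcher-style check (assumed): any mention of DebuggerException fails the test. */
    static boolean outputPasses(String output) {
        return !output.contains("DebuggerException");
    }

    public static void main(String[] args) {
        List<String> threads = List.of("t1", "t2", "t3");
        String out = printAllStacks(threads, name -> {
            if (name.equals("t2")) {
                throw new UnmappedAddressException(0x10L);
            }
            return "  frame of " + name;
        });
        System.out.print(out);
        System.out.println(outputPasses(out));
    }
}
```

In this sketch the stack for t2 is lost, but t3 is still printed and the output check passes because the printed exception is no longer a DebuggerException.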
>>>
>>> This also fixes 8001227 [2], which I will close as a dup once I push.
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8242787
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8001227
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>>
>



