[aarch64-port-dev ] RFR(10): 8172020: Internal Error (cpu/arm/vm/frame_arm.cpp:571): assert(obj == __null || Universe::heap()->is_in(obj)) failed: sanity check #
Chris Plummer
chris.plummer at oracle.com
Tue Feb 21 23:41:31 UTC 2017
Thanks Serguei!
On 2/21/17 3:29 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> It looks good to me.
> But let me mediate on the fix a little bit more.
> Consider it reviewed if I'll not come back in 24 hours. :)
>
> Thank you a lot for this great investigation and fix!
> Serguei
>
>
> On 2/21/17 11:31, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> http://cr.openjdk.java.net/~cjplummer/8172020/webrev.01/webrev.hotspot/
>> https://bugs.openjdk.java.net/browse/JDK-8172020
>>
>> [Note there are s390, ppc, and aarch64 changes in this bug fix that I
>> will need to have someone verify for me or I won't be pushing them.]
>>
>> This bug turns up with a closed test so the CR is marked
>> confidential. I'll try to provided the necessary details here. Sorry
>> the explanation is lengthy, but it is a complex bug (with a
>> relatively simple fix).
>>
>> If you are on the s390, ppc, or aarch64 teams and just want to help
>> verify the fix, I would say it's not necessary to understand the
>> details of the bug. What's most important is you test the changes.
>> Unfortunately you won't be able to reproduce the issue since the test
>> is closed, but you should at least verify the fix builds and
>> introduces no new issues. I did test the open aarch64 port since I
>> have the hardware, and someone was kind enough to provide me with
>> binaries both without the patch (which reproduced the bug) and with
>> the patch (which passed). No other aarch64 testing was done.
>>
>> Now on to an explanation of the bug and the fix:
>>
>> The test is testing the JDI/JVMTI forceEarlyReturn support. After the
>> debugger side spawns the debugee process and the debugee thread is
>> created, a vm.suspend() is issued to suspend all threads on the
>> debugee side. The debugger side then issues a JDI forceEarlyReturn,
>> which will cause the debugee thread to exit the currently executing
>> method. Although the intent is to force exit of the run() method, the
>> run() method does make some method calls to some very small methods
>> with "inline" in their name, such as inlineMethodReturningObject().
>> If forceEarlyReturn is done while in one of them, JDI throws
>> InvalidTargetException, which the test correctly handles and will
>> pass if that happens.
>>
>> If the run() method is not yet compiled, what *normally* happens to
>> force the thread to suspend is to swap in the interpreter safepoint
>> dispatch table. This will result in a call_VM out to the interpreter
>> to handle the safepointing. When later the thread is resumed (after
>> initiating the forceEarlyReturn), there is code on the return side of
>> call_VM to make sure the earlyret is handled immediately before any
>> other bytecodes are exectuted. Thus the method that was executing at
>> the time of the forceEarlyReturn should be the same as the method
>> when the earlyret is handled. In fact it should be the same
>> bytetcode(bci). Like I said, this is what *normally* happens, but is
>> not the way the bug is exposed.
>>
>> The bug turns up when the run() method is not yet compiled, and when
>> the suspend is attempted we are in one of the small "inline" methods,
>> and it is compiled. In this case thread suspension is achieved in a
>> different manner. When a compiled method executes its return code, it
>> first pops its frame, and then the " poll_return" safepoint polling
>> instruction is executed. This allows the thread to be suspended. We
>> are technically in the run() method at this point since the frame of
>> the "inline" method has been popped. Therefore the forceEarlyReturn
>> is allowed. When the thread is resumed, the return from the "inline"
>> method finally happens, which puts us in code generated by
>> TemplateInterpreterGenerator::generate_return_entry_for(). However,
>> there are no checks for an earlyret here, so we start executing
>> bytecodes until we finally hit an earlyret check. This happens when
>> the next invokespecial is done. There is a call_VM after the frame is
>> pushed, and the call_VM always does an earlyret check. The problem is
>> now the topmost frame (in the case where it asserts) is
>> inlineMethodReturningObject(). Since it returns an object type, the
>> earlyret code expects there to be an oop in the earlyret slot on the
>> stack. But there isn't because the forceEarlyReturn was done on the
>> void run() method. So the earlyret code fetches garbage, does an
>> validity test on it, and gets the assert you see in the bug title.
>>
>> So what is needed here is an earlyret check after the compiled
>> "inline" method returns, and before any new bytecodes are executed.
>> Since TemplateInterpreterGenerator::generate_return_entry_for() is
>> always executed when the compiled "inline" method returns, the check
>> was added here. For completeness, the popframe check was also added.
>> The header file changes are just to make these methods public so they
>> can be called.
>>
>> I tested the failed test at least 100 times on each supported
>> platform. The assert usually showed up at least 1 in 50 times. I also
>> ran a large set of tests on all supported platforms, including
>> everything we do for nightly testing except for some of the longest
>> running tests.
>>
>> thanks,
>>
>> Chris
>>
>
More information about the aarch64-port-dev
mailing list