[aarch64-port-dev ] RFR(10): 8172020: Internal Error (cpu/arm/vm/frame_arm.cpp:571): assert(obj == __null || Universe::heap()->is_in(obj)) failed: sanity check #
Volker Simonis
volker.simonis at gmail.com
Thu Feb 23 16:52:28 UTC 2017
Hi Chris,
thanks for considering ppc64 and s390x!
Your change builds and works on both platforms.
Regards,
Volker
On Tue, Feb 21, 2017 at 8:31 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> Hello,
>
> Please review the following:
>
> http://cr.openjdk.java.net/~cjplummer/8172020/webrev.01/webrev.hotspot/
> https://bugs.openjdk.java.net/browse/JDK-8172020
>
> [Note there are s390, ppc, and aarch64 changes in this bug fix that I will
> need to have someone verify for me or I won't be pushing them.]
>
> This bug turns up with a closed test so the CR is marked confidential. I'll
> try to provided the necessary details here. Sorry the explanation is
> lengthy, but it is a complex bug (with a relatively simple fix).
>
> If you are on the s390, ppc, or aarch64 teams and just want to help verify
> the fix, I would say it's not necessary to understand the details of the
> bug. What's most important is you test the changes. Unfortunately you won't
> be able to reproduce the issue since the test is closed, but you should at
> least verify the fix builds and introduces no new issues. I did test the
> open aarch64 port since I have the hardware, and someone was kind enough to
> provide me with binaries both without the patch (which reproduced the bug)
> and with the patch (which passed). No other aarch64 testing was done.
>
> Now on to an explanation of the bug and the fix:
>
> The test is testing the JDI/JVMTI forceEarlyReturn support. After the
> debugger side spawns the debugee process and the debugee thread is created,
> a vm.suspend() is issued to suspend all threads on the debugee side. The
> debugger side then issues a JDI forceEarlyReturn, which will cause the
> debugee thread to exit the currently executing method. Although the intent
> is to force exit of the run() method, the run() method does make some method
> calls to some very small methods with "inline" in their name, such as
> inlineMethodReturningObject(). If forceEarlyReturn is done while in one of
> them, JDI throws InvalidTargetException, which the test correctly handles
> and will pass if that happens.
>
> If the run() method is not yet compiled, what *normally* happens to force
> the thread to suspend is to swap in the interpreter safepoint dispatch
> table. This will result in a call_VM out to the interpreter to handle the
> safepointing. When later the thread is resumed (after initiating the
> forceEarlyReturn), there is code on the return side of call_VM to make sure
> the earlyret is handled immediately before any other bytecodes are
> exectuted. Thus the method that was executing at the time of the
> forceEarlyReturn should be the same as the method when the earlyret is
> handled. In fact it should be the same bytetcode(bci). Like I said, this is
> what *normally* happens, but is not the way the bug is exposed.
>
> The bug turns up when the run() method is not yet compiled, and when the
> suspend is attempted we are in one of the small "inline" methods, and it is
> compiled. In this case thread suspension is achieved in a different manner.
> When a compiled method executes its return code, it first pops its frame,
> and then the " poll_return" safepoint polling instruction is executed. This
> allows the thread to be suspended. We are technically in the run() method at
> this point since the frame of the "inline" method has been popped.
> Therefore the forceEarlyReturn is allowed. When the thread is resumed, the
> return from the "inline" method finally happens, which puts us in code
> generated by TemplateInterpreterGenerator::generate_return_entry_for().
> However, there are no checks for an earlyret here, so we start executing
> bytecodes until we finally hit an earlyret check. This happens when the next
> invokespecial is done. There is a call_VM after the frame is pushed, and the
> call_VM always does an earlyret check. The problem is now the topmost frame
> (in the case where it asserts) is inlineMethodReturningObject(). Since it
> returns an object type, the earlyret code expects there to be an oop in the
> earlyret slot on the stack. But there isn't because the forceEarlyReturn was
> done on the void run() method. So the earlyret code fetches garbage, does an
> validity test on it, and gets the assert you see in the bug title.
>
> So what is needed here is an earlyret check after the compiled "inline"
> method returns, and before any new bytecodes are executed. Since
> TemplateInterpreterGenerator::generate_return_entry_for() is always executed
> when the compiled "inline" method returns, the check was added here. For
> completeness, the popframe check was also added. The header file changes are
> just to make these methods public so they can be called.
>
> I tested the failed test at least 100 times on each supported platform. The
> assert usually showed up at least 1 in 50 times. I also ran a large set of
> tests on all supported platforms, including everything we do for nightly
> testing except for some of the longest running tests.
>
> thanks,
>
> Chris
>
More information about the aarch64-port-dev
mailing list