[aarch64-port-dev ] RFR(10): 8172020: Internal Error (cpu/arm/vm/frame_arm.cpp:571): assert(obj == __null || Universe::heap()->is_in(obj)) failed: sanity check #
Chris Plummer
chris.plummer at oracle.com
Tue Feb 21 19:31:07 UTC 2017
Hello,
Please review the following:
http://cr.openjdk.java.net/~cjplummer/8172020/webrev.01/webrev.hotspot/
https://bugs.openjdk.java.net/browse/JDK-8172020
[Note there are s390, ppc, and aarch64 changes in this bug fix that I
will need to have someone verify for me or I won't be pushing them.]
This bug turns up with a closed test so the CR is marked confidential.
I'll try to provided the necessary details here. Sorry the explanation
is lengthy, but it is a complex bug (with a relatively simple fix).
If you are on the s390, ppc, or aarch64 teams and just want to help
verify the fix, I would say it's not necessary to understand the details
of the bug. What's most important is you test the changes. Unfortunately
you won't be able to reproduce the issue since the test is closed, but
you should at least verify the fix builds and introduces no new issues.
I did test the open aarch64 port since I have the hardware, and someone
was kind enough to provide me with binaries both without the patch
(which reproduced the bug) and with the patch (which passed). No other
aarch64 testing was done.
Now on to an explanation of the bug and the fix:
The test is testing the JDI/JVMTI forceEarlyReturn support. After the
debugger side spawns the debugee process and the debugee thread is
created, a vm.suspend() is issued to suspend all threads on the debugee
side. The debugger side then issues a JDI forceEarlyReturn, which will
cause the debugee thread to exit the currently executing method.
Although the intent is to force exit of the run() method, the run()
method does make some method calls to some very small methods with
"inline" in their name, such as inlineMethodReturningObject(). If
forceEarlyReturn is done while in one of them, JDI throws
InvalidTargetException, which the test correctly handles and will pass
if that happens.
If the run() method is not yet compiled, what *normally* happens to
force the thread to suspend is to swap in the interpreter safepoint
dispatch table. This will result in a call_VM out to the interpreter to
handle the safepointing. When later the thread is resumed (after
initiating the forceEarlyReturn), there is code on the return side of
call_VM to make sure the earlyret is handled immediately before any
other bytecodes are exectuted. Thus the method that was executing at the
time of the forceEarlyReturn should be the same as the method when the
earlyret is handled. In fact it should be the same bytetcode(bci). Like
I said, this is what *normally* happens, but is not the way the bug is
exposed.
The bug turns up when the run() method is not yet compiled, and when the
suspend is attempted we are in one of the small "inline" methods, and it
is compiled. In this case thread suspension is achieved in a different
manner. When a compiled method executes its return code, it first pops
its frame, and then the " poll_return" safepoint polling instruction is
executed. This allows the thread to be suspended. We are technically in
the run() method at this point since the frame of the "inline" method
has been popped. Therefore the forceEarlyReturn is allowed. When the
thread is resumed, the return from the "inline" method finally happens,
which puts us in code generated by
TemplateInterpreterGenerator::generate_return_entry_for(). However,
there are no checks for an earlyret here, so we start executing
bytecodes until we finally hit an earlyret check. This happens when the
next invokespecial is done. There is a call_VM after the frame is
pushed, and the call_VM always does an earlyret check. The problem is
now the topmost frame (in the case where it asserts) is
inlineMethodReturningObject(). Since it returns an object type, the
earlyret code expects there to be an oop in the earlyret slot on the
stack. But there isn't because the forceEarlyReturn was done on the void
run() method. So the earlyret code fetches garbage, does an validity
test on it, and gets the assert you see in the bug title.
So what is needed here is an earlyret check after the compiled "inline"
method returns, and before any new bytecodes are executed. Since
TemplateInterpreterGenerator::generate_return_entry_for() is always
executed when the compiled "inline" method returns, the check was added
here. For completeness, the popframe check was also added. The header
file changes are just to make these methods public so they can be called.
I tested the failed test at least 100 times on each supported platform.
The assert usually showed up at least 1 in 50 times. I also ran a large
set of tests on all supported platforms, including everything we do for
nightly testing except for some of the longest running tests.
thanks,
Chris
More information about the aarch64-port-dev
mailing list