RFR(10): 8172020: Internal Error (cpu/arm/vm/frame_arm.cpp:571): assert(obj == __null || Universe::heap()->is_in(obj)) failed: sanity check #
Chris Plummer
chris.plummer at oracle.com
Mon Feb 27 18:41:55 UTC 2017
Thanks Volker!
On 2/23/17 8:52 AM, Volker Simonis wrote:
> Hi Chris,
>
> thanks for considering ppc64 and s390x!
> Your change builds and works on both platforms.
>
> Regards,
> Volker
>
>
> On Tue, Feb 21, 2017 at 8:31 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>> Hello,
>>
>> Please review the following:
>>
>> http://cr.openjdk.java.net/~cjplummer/8172020/webrev.01/webrev.hotspot/
>> https://bugs.openjdk.java.net/browse/JDK-8172020
>>
>> [Note there are s390, ppc, and aarch64 changes in this bug fix that I will
>> need to have someone verify for me or I won't be pushing them.]
>>
>> This bug turns up with a closed test so the CR is marked confidential. I'll
>> try to provided the necessary details here. Sorry the explanation is
>> lengthy, but it is a complex bug (with a relatively simple fix).
>>
>> If you are on the s390, ppc, or aarch64 teams and just want to help verify
>> the fix, I would say it's not necessary to understand the details of the
>> bug. What's most important is you test the changes. Unfortunately you won't
>> be able to reproduce the issue since the test is closed, but you should at
>> least verify the fix builds and introduces no new issues. I did test the
>> open aarch64 port since I have the hardware, and someone was kind enough to
>> provide me with binaries both without the patch (which reproduced the bug)
>> and with the patch (which passed). No other aarch64 testing was done.
>>
>> Now on to an explanation of the bug and the fix:
>>
>> The test is testing the JDI/JVMTI forceEarlyReturn support. After the
>> debugger side spawns the debugee process and the debugee thread is created,
>> a vm.suspend() is issued to suspend all threads on the debugee side. The
>> debugger side then issues a JDI forceEarlyReturn, which will cause the
>> debugee thread to exit the currently executing method. Although the intent
>> is to force exit of the run() method, the run() method does make some method
>> calls to some very small methods with "inline" in their name, such as
>> inlineMethodReturningObject(). If forceEarlyReturn is done while in one of
>> them, JDI throws InvalidTargetException, which the test correctly handles
>> and will pass if that happens.
>>
>> If the run() method is not yet compiled, what *normally* happens to force
>> the thread to suspend is to swap in the interpreter safepoint dispatch
>> table. This will result in a call_VM out to the interpreter to handle the
>> safepointing. When later the thread is resumed (after initiating the
>> forceEarlyReturn), there is code on the return side of call_VM to make sure
>> the earlyret is handled immediately before any other bytecodes are
>> exectuted. Thus the method that was executing at the time of the
>> forceEarlyReturn should be the same as the method when the earlyret is
>> handled. In fact it should be the same bytetcode(bci). Like I said, this is
>> what *normally* happens, but is not the way the bug is exposed.
>>
>> The bug turns up when the run() method is not yet compiled, and when the
>> suspend is attempted we are in one of the small "inline" methods, and it is
>> compiled. In this case thread suspension is achieved in a different manner.
>> When a compiled method executes its return code, it first pops its frame,
>> and then the " poll_return" safepoint polling instruction is executed. This
>> allows the thread to be suspended. We are technically in the run() method at
>> this point since the frame of the "inline" method has been popped.
>> Therefore the forceEarlyReturn is allowed. When the thread is resumed, the
>> return from the "inline" method finally happens, which puts us in code
>> generated by TemplateInterpreterGenerator::generate_return_entry_for().
>> However, there are no checks for an earlyret here, so we start executing
>> bytecodes until we finally hit an earlyret check. This happens when the next
>> invokespecial is done. There is a call_VM after the frame is pushed, and the
>> call_VM always does an earlyret check. The problem is now the topmost frame
>> (in the case where it asserts) is inlineMethodReturningObject(). Since it
>> returns an object type, the earlyret code expects there to be an oop in the
>> earlyret slot on the stack. But there isn't because the forceEarlyReturn was
>> done on the void run() method. So the earlyret code fetches garbage, does an
>> validity test on it, and gets the assert you see in the bug title.
>>
>> So what is needed here is an earlyret check after the compiled "inline"
>> method returns, and before any new bytecodes are executed. Since
>> TemplateInterpreterGenerator::generate_return_entry_for() is always executed
>> when the compiled "inline" method returns, the check was added here. For
>> completeness, the popframe check was also added. The header file changes are
>> just to make these methods public so they can be called.
>>
>> I tested the failed test at least 100 times on each supported platform. The
>> assert usually showed up at least 1 in 50 times. I also ran a large set of
>> tests on all supported platforms, including everything we do for nightly
>> testing except for some of the longest running tests.
>>
>> thanks,
>>
>> Chris
>>
More information about the ppc-aix-port-dev
mailing list