RFR: fix for crash caused by earlyret of compiled method
Hi! Please review the fix [1] for the crash happens when compiled method is forced to return early. We found this issue when running vm/jvmti/ForceEarlyReturnObject/fero001/fero00103/fero00103.html JCK test on CPU with 47 cores. The crash happens at the time when test code tries to check returned value and gets invalid oop. Despite correct object was acquired by load_earlyret_value(), it is accidentally substituted on the stack when remove_activation() is executed (see TemplateInterpreterGenerator::generate_earlyret_entry_for). Substitution, in turns, happens due to overlapping of expression and native stacks. Particularly esp equals to sp because frame with no expression is restored (that is correct). After that execution of following code sequence replaces TOS with some irrelevant value. push(tos) // -> puts value on expression stack unlock_object() // -> calls call_VM() and save registers on native stack, in fact, at the same place as esp pop(tos) // -> restores invalid value The fix restores expression stack to max possible for deoptimized method size. Please consider if this fix covers the case workarounded by [2] (I was not able to trace back and find the reason for this changeset). [1] http://cr.openjdk.java.net/~snazarki/earlyret_crash/ [2] http://hg.openjdk.java.net/aarch64-port/jdk8u60/hotspot/file/83f5fdfd56ec/sr... Sergey Nazarkin
On 13/06/17 14:24, Sergey Nazarkin wrote:
Hi!
Please review the fix [1] for the crash happens when compiled method is forced to return early. We found this issue when running vm/jvmti/ForceEarlyReturnObject/fero001/fero00103/fero00103.html JCK test on CPU with 47 cores.
The crash happens at the time when test code tries to check returned value and gets invalid oop. Despite correct object was acquired by load_earlyret_value(), it is accidentally substituted on the stack when remove_activation() is executed (see TemplateInterpreterGenerator::generate_earlyret_entry_for). Substitution, in turns, happens due to overlapping of expression and native stacks. Particularly esp equals to sp because frame with no expression is restored (that is correct).
After that execution of following code sequence replaces TOS with some irrelevant value.
push(tos) // -> puts value on expression stack unlock_object() // -> calls call_VM() and save registers on native stack, in fact, at the same place as esp pop(tos) // -> restores invalid value
The fix restores expression stack to max possible for deoptimized method size. Please consider if this fix covers the case workarounded by [2] (I was not able to trace back and find the reason for this changeset).
[1] http://cr.openjdk.java.net/~snazarki/earlyret_crash/ [2] http://hg.openjdk.java.net/aarch64-port/jdk8u60/hotspot/file/83f5fdfd56ec/sr... I don't quite understand this. If we are the top frame, then we must allow space for max_stack, but in that case temps + extra_args is equal to max_stack anyway, so we don't need to add max_stack to extra_args. If we aren't the top frame, we peel back the native SP at the point of call.
Can you tell me a bit more about this? Andrew. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi Andrew, we are at top frame, but temps and extra_args are 0 #0 AbstractInterpreter::size_activation (max_stack=3, temps=0, extra_args=0, monitors=1, callee_params=0, callee_locals=0, is_top_frame=true) at /media/psf/Home/projects/zulu8-arm64-dev/hotspot/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp:1635 #1 0x0000007fb78c6b10 in vframeArrayElement::on_stack_size (this=0x7f840014b8, callee_parameters=0, callee_locals=0, is_top_frame=true, popframe_extra_stack_expression_els=0) at /media/psf/Home/projects/zulu8-arm64-dev/hotspot/src/share/vm/runtime/vframeArray.cpp:442 Sergey Nazarkin
On 13 Jun 2017, at 17:54, Andrew Haley <aph@redhat.com> wrote:
On 13/06/17 14:24, Sergey Nazarkin wrote:
Hi!
Please review the fix [1] for the crash happens when compiled method is forced to return early. We found this issue when running vm/jvmti/ForceEarlyReturnObject/fero001/fero00103/fero00103.html JCK test on CPU with 47 cores.
The crash happens at the time when test code tries to check returned value and gets invalid oop. Despite correct object was acquired by load_earlyret_value(), it is accidentally substituted on the stack when remove_activation() is executed (see TemplateInterpreterGenerator::generate_earlyret_entry_for). Substitution, in turns, happens due to overlapping of expression and native stacks. Particularly esp equals to sp because frame with no expression is restored (that is correct).
After that execution of following code sequence replaces TOS with some irrelevant value.
push(tos) // -> puts value on expression stack unlock_object() // -> calls call_VM() and save registers on native stack, in fact, at the same place as esp pop(tos) // -> restores invalid value
The fix restores expression stack to max possible for deoptimized method size. Please consider if this fix covers the case workarounded by [2] (I was not able to trace back and find the reason for this changeset).
[1] http://cr.openjdk.java.net/~snazarki/earlyret_crash/ [2] http://hg.openjdk.java.net/aarch64-port/jdk8u60/hotspot/file/83f5fdfd56ec/sr... I don't quite understand this. If we are the top frame, then we must allow space for max_stack, but in that case temps + extra_args is equal to max_stack anyway, so we don't need to add max_stack to extra_args. If we aren't the top frame, we peel back the native SP at the point of call.
Can you tell me a bit more about this?
Andrew.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
On 14/06/17 14:33, Sergey Nazarkin wrote:
Hi Andrew,
we are at top frame, but temps and extra_args are 0
#0 AbstractInterpreter::size_activation (max_stack=3, temps=0, extra_args=0, monitors=1, callee_params=0, callee_locals=0, is_top_frame=true) at /media/psf/Home/projects/zulu8-arm64-dev/hotspot/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp:1635 #1 0x0000007fb78c6b10 in vframeArrayElement::on_stack_size (this=0x7f840014b8, callee_parameters=0, callee_locals=0, is_top_frame=true, popframe_extra_stack_expression_els=0) at /media/psf/Home/projects/zulu8-arm64-dev/hotspot/src/share/vm/runtime/vframeArray.cpp:442
In the method above,CodeEmitInfo::interpreter_frame_size(), int extra_args = state->scope()->method()->max_stack() - state->stack_size(); and int temps = state->stack_size(); int frame_size = BytesPerWord * Interpreter::size_activation(method->max_stack(), temps + callee_parameters, extra_args, locks, callee_parameters, callee_locals, is_top_frame); so, extra_args == max_stack - temps So, it looks right in that case, and it doesn't make sense to add in extra_args twice. In the case of vframeArrayElement::on_stack_size, we need to know whether we are the top frame or not in order to determine the amount of stack we need because we don't allocate max_stack at call sites, only the stack we need. We are passed the information about whether we're a top frame or not. PPC uses const int max_alignment_space = StackAlignmentInBytes / Interpreter::stackElementSize; const int abi_scratch = is_top_frame ? (frame::abi_reg_args_size / Interpreter::stackElementSize) : (frame::abi_minframe_size / Interpreter::stackElementSize); const int size = max_stack + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + max_alignment_space + abi_scratch + frame::ijava_state_size / Interpreter::stackElementSize; // Fixed size of an interpreter frame, align to 16-byte. return (size & -2); which looks reasonable: it must allocate max_stack at every call, but we don't do that on AArch64. I think this might be correct for us: int size = overhead + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + is_top_frame ? max_stack : temps + extra_args; I'm going to try to run the test to see for myself. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi, On 14/06/17 16:58, Andrew Haley wrote:
I think this might be correct for us:
int size = overhead + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + is_top_frame ? max_stack : temps + extra_args;
or rather, int size = overhead + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + (is_top_frame ? max_stack : temps + extra_args); -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi, This is the patch I'd like to go with. It allocates the correct amount of stack space in every case, as far as I can see, and I've carefully stepped through your failing test case. Can you please test it in your environment? Thanks. diff --git a/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp b/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp --- a/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp +++ b/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp @@ -109,9 +109,15 @@ // for the callee's params we only need to account for the extra // locals. int size = overhead + - (callee_locals - callee_params)*Interpreter::stackElementWords + + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + - temps* Interpreter::stackElementWords + extra_args; + // On the top frame, at all times SP <= ESP, and SP is + // 16-aligned. We ensure this by adjusting SP on method + // entry and re-entry to allow room for the maximum size of + // the expression stack. When we call another method we bump + // SP so that no stack space is wasted. So, only on the top + // frame do we need to allow max_stack words. + (is_top_frame ? max_stack : temps + extra_args); // On AArch64 we always keep the stack pointer 16-aligned, so we // must round up here. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi Andrew, thank you, the patch works fine. I’ve updated webrev with correct patch for JDK8 Sergey Nazarkin
On 15 Jun 2017, at 21:12, Andrew Haley <aph@redhat.com> wrote:
Hi,
This is the patch I'd like to go with. It allocates the correct amount of stack space in every case, as far as I can see, and I've carefully stepped through your failing test case.
Can you please test it in your environment? Thanks.
diff --git a/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp b/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp --- a/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp +++ b/src/cpu/aarch64/vm/abstractInterpreter_aarch64.cpp @@ -109,9 +109,15 @@ // for the callee's params we only need to account for the extra // locals. int size = overhead + - (callee_locals - callee_params)*Interpreter::stackElementWords + + (callee_locals - callee_params) + monitors * frame::interpreter_frame_monitor_size() + - temps* Interpreter::stackElementWords + extra_args; + // On the top frame, at all times SP <= ESP, and SP is + // 16-aligned. We ensure this by adjusting SP on method + // entry and re-entry to allow room for the maximum size of + // the expression stack. When we call another method we bump + // SP so that no stack space is wasted. So, only on the top + // frame do we need to allow max_stack words. + (is_top_frame ? max_stack : temps + extra_args);
// On AArch64 we always keep the stack pointer 16-aligned, so we // must round up here.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
On 16/06/17 12:51, Sergey Nazarkin wrote:
thank you, the patch works fine. I’ve updated webrev with correct patch for JDK8
OK for 8, 9, and 10. Thanks. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Looking for kind soul who can updated repos since I’ve no rights to do it. Sergey Nazarkin
On 16 Jun 2017, at 19:04, Andrew Haley <aph@redhat.com> wrote:
On 16/06/17 12:51, Sergey Nazarkin wrote:
thank you, the patch works fine. I’ve updated webrev with correct patch for JDK8
OK for 8, 9, and 10.
Thanks.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi Sergey, I will perform a jtreg test for the new patch and push it if everything is fine. Thanks for reporting the bug. Felix On 19 June 2017 at 19:10, Sergey Nazarkin <snazarkin@azul.com> wrote:
Looking for kind soul who can updated repos since I’ve no rights to do it.
Sergey Nazarkin
On 16 Jun 2017, at 19:04, Andrew Haley <aph@redhat.com> wrote:
On 16/06/17 12:51, Sergey Nazarkin wrote:
thank you, the patch works fine. I’ve updated webrev with correct patch for JDK8
OK for 8, 9, and 10.
Thanks.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
On 19/06/17 14:42, Felix Yang wrote:
I will perform a jtreg test for the new patch and push it if everything is fine. Thanks for reporting the bug.
Do you intend also to do backports? I'm guessing we need 9 and 10 at least, maybe also 8. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Yes, this has been tested and pushed into 8, 9, and 10. On 27 June 2017 at 01:09, Andrew Haley <aph@redhat.com> wrote:
On 19/06/17 14:42, Felix Yang wrote:
I will perform a jtreg test for the new patch and push it if
everything
is fine. Thanks for reporting the bug.
Do you intend also to do backports? I'm guessing we need 9 and 10 at least, maybe also 8.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
participants (3)
-
Andrew Haley
-
Felix Yang
-
Sergey Nazarkin