RFR: 8256757: Incorrect MachCallRuntimeNode::ret_addr_offset() for CallLeafNoFP on x86_32

Jie Fu jiefu at openjdk.java.net
Thu Nov 26 14:05:53 UTC 2020


On Thu, 26 Nov 2020 12:22:31 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> JDK-8254231 added new assert in `output.cpp`:
> 
>     assert(!is_mcall || (call_returns[block->_pre_order] <= (uint) current_offset))
> 
> Which verifies that the offset returned by MachCallNode::ret_addr_offset() (and sub-types) at matches the emitted code, to avoid potential conflicts between oop maps of different calls.
> 
> It caught the failure running `compiler/intrinsics/string/TestStringLatin1IndexOfChar.java` on Linux x86_32. I think that happened because the test forced lower SSE settings. But more tests fail if you force with lower SSE settings with e.g. `TEST_VM_OPTS=-XX:UseSSE=1`. The real issue is `MachCallRuntimeNode::ret_addr_offset()` computing the offset incorrectly for `CallLeafNoFP`: the match rule for it does not include `FFree_Float_Stack_All`. With `UseSSE < 2`, `FFree_Float_Stack_All` becomes non-zero.
> 
> See the definitions in `x86_32.ad`:
> 
>   enc_class FFree_Float_Stack_All %{    // Free_Float_Stack_All
>     MacroAssembler masm(&cbuf);
>     int start = masm.offset();
>     if (UseSSE >= 2) {
>       if (VerifyFPU) {
>         masm.verify_FPU(0, "must be empty in SSE2+ mode");
>       }
>     } else { // <---- taking this with UseSSE < 2
>       // External c_calling_convention expects the FPU stack to be 'clean'.
>       // Compiled code leaves it dirty.  Do cleanup now.
>       masm.empty_FPU_stack();  
>     }
>     if (sizeof_FFree_Float_Stack_All == -1) {
>       sizeof_FFree_Float_Stack_All = masm.offset() - start;
>     } else {
>       assert(masm.offset() - start == sizeof_FFree_Float_Stack_All, "wrong size");
>     }
>   %}
> 
> // Call runtime without safepoint
> instruct CallLeafDirect(method meth) %{
>   match(CallLeaf);
>   effect(USE meth);
> 
>   ins_cost(300);
>   format %{ "CALL_LEAF,runtime " %}
>   opcode(0xE8); /* E8 cd */
>   ins_encode( pre_call_resets,
>               FFree_Float_Stack_All,
>               Java_To_Runtime( meth ),
>               Verify_FPU_For_Leaf, post_call_FPU );
>   ins_pipe( pipe_slow );
> %}
> 
> instruct CallLeafNoFPDirect(method meth) %{
>   match(CallLeafNoFP);
>   effect(USE meth);
> 
>   ins_cost(300);
>   format %{ "CALL_LEAF_NOFP,runtime " %}
>   opcode(0xE8); /* E8 cd */
>   ins_encode(pre_call_resets, Java_To_Runtime(meth));
>   ins_pipe( pipe_slow );
> %}
> 
> This does not affect `x86_64`, AFAICS.
> 
> Additional testing:
>  - [x] A few known failing tests on Linux x86_32
>  - [x] Linux x86_32 `tier1`
>  - [x] Linux x86_32 `tier1` with `-XX:UseSSE=1`

Looks good.
And thanks for fixing it.

How about replacing '5' with 'NativeCall::instruction_size' and also keeping the code generation order like this:
  return pre_call_resets_size() + (_leaf_no_fp ? 0 : sizeof_FFree_Float_Stack_All) + NativeCall::instruction_size;

-------------

Marked as reviewed by jiefu (Committer).

PR: https://git.openjdk.java.net/jdk/pull/1452


More information about the hotspot-compiler-dev mailing list