[9] RFR(S): 8068945: Use RBP register as proper frame pointer in JIT compiled code on x86

Zoltán Majó zoltan.majo at oracle.com
Wed Apr 22 16:26:22 UTC 2015


Hi Vladimir,


I managed to do some more work on this enhancement. Please see details 
below.

On 04/01/2015 03:38 PM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> On 03/30/2015 11:39 PM, Vladimir Kozlov wrote:
>> On 3/30/15 2:13 PM, Zoltán Majó wrote:
>>> Hi Vladimir,
>>>
>>>
>>> thank you for the feedback!
>>>
>>> On 03/30/2015 10:27 PM, Vladimir Kozlov wrote:
>>>> How about PreserveFramePointer instead of simple FramePointer?
>>>>
>>>> PreserveFramePointer will mean that compiled (or other) code will use
>>>> that register only as Frame pointer.
>>>
>>> I will change the flag's name to PreserveFramePointer and will also
>>> update the description.

I changed the flag's name to PreserveFramePointer, just as you suggested.

>>>
>>>> Zoltan, x86 flags setting should be in general globals_x86.hpp. You
>>>> can #ifdef _LP64 there too. I don't understand why you only set it to
>>>> true on linux-x64.
>>>
>>> I remembered that the original discussion with Brendan Gregg mentioned
>>> only Linux's perf tool as a possible use case for "proper" frame
>>> pointers. So I was unsure whether to enable proper frame pointers by
>>> default on other x64 platforms as well.
>>>
>>> But if you think it would be better to have proper frame pointers on 
>>> all
>>> x64 platforms, I will change the code to set PreserveFramePointer to
>>> true for all x64 platforms. Just please let me know.

The current webrev sets the PreserveFramePointer flag to to true on all 
x86_64 platforms and to false on all other platforms.

>>
>> Currently compiled code for all x86 platforms is almost the same 
>> (win64 has difference in registers usage) and we should keep it that 
>> way.
>>
>> Also the original request was to have flag to enable such behavior 
>> (use RBP only as FP). So to have it off by default is acceptable. If 
>> performance group or someone find a regression (or bug) due to this 
>> change we can switch the flag off by default before jdk9 release.
>>
>> Try to run pstack on Solaris and jstack on OSX to make sure they 
>> report correct call stack with compiled java methods. And JFR.
>> Also it would be nice to run SunStudio analyzer to verify that it works.
>
> I ran all tools you've suggested. JFR and jstack is unaffected, pstack 
> produces nice stack traces (it did not always do so before).

I tested the current webrev with the following setup: I used two tests, 
one that generates a long chain of lambda form invocations and an other 
one that generates a long chain of "regular" method invocations. Both 
tests were executed on an x64 machine in four configurations: with +/- 
Xcomp and with +/- PreserveFramePointer.

Just as before, JFR and jstack stack traces are unaffected for both 
tests, pstack can now produce stack traces with both tests if 
PreserveFramePointer is enabled.

> However, I've encountered a problem with SunStudio: Two asserts fail 
> in the fastdebug build. Both of them  "soft" failures, as neither the 
> VM nor SunStudio crash with the product build. I worked on the problem 
> today and have a partial understanding of the issue, but more 
> investigation is needed to have a patch that preserves the correct 
> behavior of SunStudio as well.

I was able to track down the problems with SunStudio. I had to change 
the code at two places.


Change #1 (in src/cpu/x86/vm/frame_x86.cpp):

*** 222,232 ****
       }

       if (sender_blob->is_nmethod()) {
           nmethod* nm = sender_blob->as_nmethod_or_null();
           if (nm != NULL) {
!             if (nm->is_deopt_mh_entry(sender_pc) || 
nm->is_deopt_entry(sender_pc)) {
                   return false;
               }
           }
       }

--- 222,233 ----
       }

       if (sender_blob->is_nmethod()) {
           nmethod* nm = sender_blob->as_nmethod_or_null();
           if (nm != NULL) {
!             if (nm->is_deopt_mh_entry(sender_pc) || 
nm->is_deopt_entry(sender_pc) ||
!                 nm->method()->is_method_handle_intrinsic()) {
                   return false;
               }
           }
       }

The reason for this change is the following. Method handle intrinsics 
(i.e., the intrinsics _invokeBasic, _linkToVirtual,_linkToStatic, 
_linkToSpecial, and _linkToInterface) do not allocate stack space when 
invoked, but they can extend the stack space of their caller "temporarily".

For example, if VerifyMethodHandles is enabled, some stack space is used 
during verification. The temporarily used stack space is released before 
the intrinsic jumps to its target. As a result, the target of a method 
handle intrinsic will have a correct SP when it returns and the 
program's control flow is correct.

Moreover, if the stack is walked synchronously (e.g., at safepoints), no 
problems appear either, because the synchronous interruption can happen 
while execution is within the method handle intrinsic.

The problem is that the SunStudio analyzer can interrupt the VM 
asynchonously and walk the stack. If execution of a thread is 
interrupted while the thread is in a method handle intrinsic, the SP 
might contain an invalid value.

The new webrev adds a check that marks the current frame unsafe for 
sender if the frame belongs to a method handle intrinsic 
(frame::safe_for_sender returns false in this case).


Change #2 (in src/share/vm/prims/forte.cpp):

*** 425,435 ****

       RegisterMap map(thd, false);
       initial_Java_frame = initial_Java_frame.sender(&map);
     }

!   vframeStreamForte st(thd, initial_Java_frame, false);

     for (; !st.at_end() && count < depth; st.forte_next(), count++) {
       bci = st.bci();
       method = st.method();

--- 425,435 ----

       RegisterMap map(thd, false);
       initial_Java_frame = initial_Java_frame.sender(&map);
     }

!   vframeStreamForte st(thd, initial_Java_frame, true);

     for (; !st.at_end() && count < depth; st.forte_next(), count++) {
       bci = st.bci();
       method = st.method();

The problem is that the following assert in forte.cpp on line 103

assert(filled_in, "invariant");

fails. The problem appears if we have a stack trace like:

V  [libjvm.so+0x1c98c4a]  void VMError::report(outputStream*)+0xb1a
V  [libjvm.so+0x1c9a3e8]  void VMError::report_and_die()+0x748
V  [libjvm.so+0x1003c8e]  void report_vm_error(const char*,int,const 
char*,const char*)+0x7e
V  [libjvm.so+0x10efa22]  vframeStreamForte::vframeStreamForte #Nvariant 
1(JavaThread*,frame,bool)+0xe2
--> (Frame #5) V  [libjvm.so+0x10f0bb9]  void 
forte_fill_call_trace_given_top(JavaThread*,ASGCT_CallTrace*,int,frame)+0x789
V  [libjvm.so+0x10f1436]  AsyncGetCallTrace+0x246
C  [libcollector.so+0x272a8]  __collector_ext_jstack_unwind+0xb8
C  [libcollector.so+0x277df]  __collector_get_frame_info+0x27f
C  [libcollector.so+0x2f093]  __collector_getUserCtx+0x13
C  [libcollector.so+0x1abc7]  __collector_ext_profile_handler+0x127
C  [libcollector.so+0x17535]  collector_sigprof_dispatcher+0x85
C  [libc.so.1+0x122476]  __sighndlr+0x6
C  [libc.so.1+0x115972]  call_user_handler+0x2ce
C  [libc.so.1+0x115e1b]  sigacthandler+0xdb
C  0xffffffffffffffff
V  [libjvm.so+0x1959089]  void os::PlatformEvent::park()+0xd9
V  [libjvm.so+0x18a6b34]  int ParkCommon(ParkEvent*,long)+0x34
V  [libjvm.so+0x18a7657]  int Monitor::IWait(Thread*,long)+0xb7
V  [libjvm.so+0x18a8b86]  bool Monitor::wait(bool,long,bool)+0x346
V  [libjvm.so+0xf7073d]  void 
CompileBroker::wait_for_completion(CompileTask*)+0xad
V  [libjvm.so+0xf6f6b6]  void 
CompileBroker::compile_method_base(methodHandle,int,int,methodHandle,int,const 
char*,Thread*)+0x406
V  [libjvm.so+0xf6fd96] 
nmethod*CompileBroker::compile_method(methodHandle,int,int,methodHandle,int,const 
char*,Thread*)+0x586
V  [libjvm.so+0xbbad72]  void 
AdvancedThresholdPolicy::submit_compile(methodHandle,int,CompLevel,JavaThread*)+0xb2
V  [libjvm.so+0x1aef92f]  void 
SimpleThresholdPolicy::compile(methodHandle,int,CompLevel,JavaThread*)+0x14f
V  [libjvm.so+0xbbb00f]  void 
AdvancedThresholdPolicy::method_invocation_event(methodHandle,methodHandle,CompLevel,nmethod*,JavaThread*)+0x1ff
V  [libjvm.so+0x1aef765] 
nmethod*SimpleThresholdPolicy::event(methodHandle,methodHandle,int,int,CompLevel,nmethod*,JavaThread*)+0x2e5
V  [libjvm.so+0xdb93bc]  unsigned 
char*Runtime1::counter_overflow(JavaThread*,int,Method*)+0x31c
v  ~RuntimeStub::counter_overflow Runtime1 stub
--> (Frame #29) J 143 C1 
java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @ 
0xffff80ffacc6962a [0xffff80ffacc69580+0xaa]
--> (Frame #30) v  ~StubRoutines::call_stub
V  [libjvm.so+0x13ca50b]  void 
JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x41b
V  [libjvm.so+0x152a111]  JVM_DoPrivileged+0xfb1
C  [libjava.so+0x12f42] 
Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x12
J 142 
java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object; 
(0 bytes) @ 0xffff80ffb400c57c [0xffff80ffb400c420+0x15c\
]
J 134 C1 
java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class; 
(47 bytes) @ 0xffff80ffacc66014 [0xffff80ffacc65ec0+0x154]
... more stack frames

The forte_fill_call_trace_given_top() method (Frame #5) first checks if 
the first Java frame found is fully decipherable (line 395 in 
forte.cpp). In our case the first Java frame is Frame #29 (the 
C1-compiled version of java.net.URLClassLoader$1.run).

In our case Frame #29 is not decipherable, because 
java.net.URLClassLoader$1.run has been made "not entrant" (a C2-compiled 
version of the same method has been produced shortly before).

Afterwards, forte_fill_call_trace_given_top() checks if the method is 
"safe for sender" (line 424 in forte.cpp). The caller of the 
java.net.URLClassLoader$1.run method is ~StubRoutines::call_stub, which 
is considered "safe for sender" by the VM.

Then, initial_Java_frame is set to the ~StubRoutines::call_stub stub 
(line 430). This does not seem to be correct because the stub is not a 
Java method and causes the assert(filled_in, "invariant") in the 
constructor of vframeStreamForte (line 103 in forte.cpp) to fail 
(because the frame cannot be filled from a stub).

To avoid this failure, I propose to call the constructor of 
vframeStreamForte with parameter stop_at_java_call_stub set to true 
(instead of false) so that the VM stops walking the stack if a call stub 
has been reached.


Here is the updated webrev:

http://cr.openjdk.java.net/~zmajo/8068945/webrev.02/

In addition to testing the changeset with the tools mentioned before, I 
executed
- all JPRT tests, all pass;
- all java/lang/invoke and compiler JTREG tests; all tests that pass 
with the unmodified source trace pass with the changes as well.

Thank you very much in advance!

Best regards,


Zoltan

>
> So that will put this RFR on hold for a while, unfortunately.
>
> Thank you for the feedback and suggestions so far!
>
> Best regards,
>
>
> Zoltan
>
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Thank you!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 3/30/15 8:30 AM, Zoltán Majó wrote:
>>>>> Hi Ed,
>>>>>
>>>>>
>>>>> thank you for your feedback! Please see comments below.
>>>>>
>>>>> On 03/30/2015 11:11 AM, Edward Nevill wrote:
>>>>>> Hi Zoltán,
>>>>>>
>>>>>> On Fri, 2015-03-27 at 15:34 +0100, Zoltán Majó wrote:
>>>>>>> Full JPRT run, all tests pass. I also ran all hotspot compiler
>>>>>>> tests and
>>>>>>> the jdk tests in java/lang/invoke on both x86_64 and x86_32. All 
>>>>>>> tests
>>>>>>> that pass without the patch pass also with the patch.
>>>>>>>
>>>>>>> I ran the SPEC JVM 2008 benchmarks on our performance
>>>>>>> infrastructure for
>>>>>>> x86_64. The performance evaluation suggests that there is no
>>>>>>> statistically significant performance degradation due to having 
>>>>>>> proper
>>>>>>> frame pointers. Therefore I propose to have OmitFramePointer set to
>>>>>>> false by default on x86_64 (and set to true on all other 
>>>>>>> platforms).
>>>>>> This patch looks good, however I think there is a problem with the
>>>>>> logic of OmitFramePointer.
>>>>>>
>>>>>> Here is my test case.
>>>>>>
>>>>>> --- CUT HERE ---
>>>>>> // $Id: fibo.java,v 1.2 2000/12/24 19:10:50 doug Exp $
>>>>>> // http://www.bagley.org/~doug/shootout/
>>>>>>
>>>>>> public class fibo {
>>>>>>      public static void main(String args[]) {
>>>>>>     int N = Integer.parseInt(args[0]);
>>>>>>     System.out.println(fib(N));
>>>>>>      }
>>>>>>      public static int fib(int n) {
>>>>>>     if (n < 2) return(1);
>>>>>>     return( fib(n-2) + fib(n-1) );
>>>>>>      }
>>>>>> }
>>>>>> --- CUT HERE ---
>>>>>>
>>>>>> If I run it as follows on my x86 64 bit linux.
>>>>>>
>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation 
>>>>>> -XX:+PrintCompilation
>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>> -XX:-OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>
>>>>>> I get
>>>>>>
>>>>>>    # {method} {0x00007fc62c97f388} 'fib' '(I)I' in 'fibo'
>>>>>>    # parm0:    rsi       = int
>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>    0x00007fc625071100: mov    %eax,-0x14000(%rsp)
>>>>>>    0x00007fc625071107: push   %rbp
>>>>>>    0x00007fc625071108: mov    %rsp,%rbp
>>>>>>    0x00007f836907110b: sub    $0x20,%rsp ;*synchronization entry
>>>>>>
>>>>>> which is correct, it is NOT(-) OmitFramePointer, therefore it is 
>>>>>> using
>>>>>> the frame pointer
>>>>>>
>>>>>> Now if I try just changing -XX:-OmitFramePointer to
>>>>>> -XX:+OmitFramePointer in the above I get
>>>>>>
>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation 
>>>>>> -XX:+PrintCompilation
>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>> -XX:+OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>
>>>>>> I get
>>>>>>
>>>>>>    # {method} {0x00007f14d3c00388} 'fib' '(I)I' in 'fibo'
>>>>>>    # parm0:    rsi       = int
>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>    0x00007f14e1071100: mov    %eax,-0x14000(%rsp)
>>>>>>    0x00007f14e1071107: push   %rbp
>>>>>>    0x00007f14e1071108: sub    $0x20,%rsp ;*synchronization entry
>>>>>>
>>>>>> which is correct, it is ID(+) OmitFramePointer, therefore it does 
>>>>>> not
>>>>>> use a frame pointer.
>>>>>>
>>>>>> However, if I now delete the -XX:+/-OmitFramePointer altogether, IE
>>>>>>
>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation 
>>>>>> -XX:+PrintCompilation
>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>> -XX:+PrintAssembly fibo 43
>>>>>>
>>>>>> I get
>>>>>>
>>>>>>    # {method} {0x00007f0c4b730388} 'fib' '(I)I' in 'fibo'
>>>>>>    # parm0:    rsi       = int
>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>    0x00007f0c75071100: mov    %eax,-0x14000(%rsp)
>>>>>>    0x00007f0c75071107: push   %rbp
>>>>>>    0x00007f0c75071108: sub    $0x20,%rsp ;*synchronization entry
>>>>>>
>>>>>> It is not using a frame pointer which is the equivalent of
>>>>>> -XX:+OmitFramePointer. However in your description above you say
>>>>>>
>>>>>>> Therefore I propose to have OmitFramePointer set to false by 
>>>>>>> default
>>>>>>> on x86_64 (and set to true on all other platforms).
>>>>>> whereas OmitFramePointer actually seems to be set to true on x86_64
>>>>>>
>>>>>> I think the problem may be with the declaration and definition of
>>>>>> OmitFramePointer in globals.hpp and globals_x86.hpp
>>>>>>
>>>>>> In globals.hpp it does
>>>>>>
>>>>>> product(bool, OmitFramePointer, true,
>>>>>>
>>>>>> In globals_x86.hpp it does
>>>>>>
>>>>>> LP64_ONLY(define_pd_global(bool, OmitFramePointer, false););
>>>>>>
>>>>>> I am not sure that you can mix product(...) and product_pd(...) like
>>>>>> this, so I think it just ends up getting the default from the
>>>>>> product(...).
>>>>>
>>>>> You are right, mixing product and product_pd does not make sense 
>>>>> at all.
>>>>> Thank you for doing additional testing and for drawing attention 
>>>>> to the
>>>>> problem.
>>>>>
>>>>> I updated the code to use product_pd and define_pd_global on all
>>>>> relevant platforms.
>>>>>
>>>>>> Aside: In general, I do not like options which include a negative in
>>>>>> them because I have to do a double think when I see something like,
>>>>>> -XX:-OmitFramePointer, as in, it is omitting the frame pointer,
>>>>>> therefore it is using a frame pointer. How about FramePointer so we
>>>>>> have -XX:+FramePointer to say I want frame pointers and
>>>>>> -XX:-FramePointer to say I don't.
>>>>>
>>>>> That is a good idea. Double negation is an unnecessary 
>>>>> complication, so
>>>>> I changed the name of the flag to FramePointer, just as you 
>>>>> suggested.
>>>>>
>>>>>>
>>>>>> I did some timing on the above 'fibo' test
>>>>>>
>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>> -XX:-OmitFramePointer fibo 43
>>>>>> 701408733
>>>>>>
>>>>>> real    0m1.545s
>>>>>> user    0m1.571s
>>>>>> sys    0m0.015s
>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>> -XX:+OmitFramePointer fibo 43
>>>>>> 701408733
>>>>>>
>>>>>> real    0m1.504s
>>>>>> user    0m1.527s
>>>>>> sys    0m0.019s
>>>>>>
>>>>>> which is ~3% difference on this test case. On aarch64, I see ~7%
>>>>>> difference on this test case.
>>>>>
>>>>> Thank you for the performance measurements!
>>>>>
>>>>>> With the above change to fix the logic of OmitFramePointer (and
>>>>>> possible change its name) the patch looks good to me.
>>>>>
>>>>> Here is the updated webrev (the same webrev that was already included
>>>>> into my reply to Roland):
>>>>>
>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.01/
>>>>>
>>>>>> I will prepare a mirror patch for aarch64.
>>>>>
>>>>> That would be great!
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>>
>>>>> Zoltán
>>>>>
>>>>>>
>>>>>> All the best,
>>>>>> Ed.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>



More information about the hotspot-compiler-dev mailing list