[9] RFR(S): 8068945: Use RBP register as proper frame pointer in JIT compiled code on x86

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri Apr 24 17:52:25 UTC 2015


 > - (2) the assert is there only for SunStudio (according to the comments
 > on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
 > where forte_fill_call_trace_given_top() is called) does not expect debug
 > information to available for all frames, it just needs the jmethodID for
 > every frame.

This is incorrect interpretation of comments. The assert is in general 
code to catch incorrect bci or pcoffset during vframe construction. The 
comment said that instead of having several asserts we have only one for 
all these case to simplify disabling it when running performance analyzer.

So you can only skip it when you run with performance analyzer but not 
during general execution.

May be you can set or path flag when vframe construction is called from 
forte and skip assert in such case. I assume you hit it only with 
performance analyzer. Right?

Thanks,
Vladimir

On 4/24/15 10:21 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> On 04/24/2015 06:26 PM, Vladimir Kozlov wrote:
>> Yes, this looks good. Nice work and thank you for testing.
>
> Thank you for all the feedback you've provided while I was working on
> this issue!
>
> I would have one final question: Can I disable the assert in
> vframeStreamCommon::found_bad_method_frame() on line 443 in vframe.cpp?
> I would prefer that for the reasons I mentioned in my previous message
> (please see also below).
>
> Here is an updated webrev (with the assert disabled):
> http://cr.openjdk.java.net/~zmajo/8068945/webrev.04/
>
> Could you please let me know which webrev I can push, webrev.03 with the
> assert left enabled or webrev.04 with the assert disabled?
>
> Thank you and best regards,
>
>
> Zoltan
>
>>> I ran the experiments *with the assert enabled* in
>>> vframeStreamCommon::found_bad_method_frame(). The assert was not
>>> triggered.
>>>
>>> I would be, however, inclined to disable that assert because:
>>> - (1) it is better to have some information in stack traces than no
>>> information and a crash (even though the crash happens in a fastdebug
>>> build);
>>> - (2) the assert is there only for SunStudio (according to the comments
>>> on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
>>> where forte_fill_call_trace_given_top() is called) does not expect debug
>>> information to available for all frames, it just needs the jmethodID for
>>> every frame.
>>>
>>>> Do you mean "can NOT happen"?:
>>>> > Moreover, if the stack is walked synchronously (e.g., at
>>>> safepoints), no
>>>> > problems appear either, because the synchronous interruption can
>>>> happen
>>>> > while execution is within the method handle intrinsic.
>>>
>>> Yes, I meant that it can *not* happen. Sorry for the confusion.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> Vladimir
>>>>
>>>> On 4/22/15 9:26 AM, Zoltán Majó wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>>
>>>>> I managed to do some more work on this enhancement. Please see details
>>>>> below.
>>>>>
>>>>> On 04/01/2015 03:38 PM, Zoltán Majó wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>>
>>>>>> On 03/30/2015 11:39 PM, Vladimir Kozlov wrote:
>>>>>>> On 3/30/15 2:13 PM, Zoltán Majó wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>>
>>>>>>>> thank you for the feedback!
>>>>>>>>
>>>>>>>> On 03/30/2015 10:27 PM, Vladimir Kozlov wrote:
>>>>>>>>> How about PreserveFramePointer instead of simple FramePointer?
>>>>>>>>>
>>>>>>>>> PreserveFramePointer will mean that compiled (or other) code will
>>>>>>>>> use
>>>>>>>>> that register only as Frame pointer.
>>>>>>>>
>>>>>>>> I will change the flag's name to PreserveFramePointer and will also
>>>>>>>> update the description.
>>>>>
>>>>> I changed the flag's name to PreserveFramePointer, just as you
>>>>> suggested.
>>>>>
>>>>>>>>
>>>>>>>>> Zoltan, x86 flags setting should be in general globals_x86.hpp.
>>>>>>>>> You
>>>>>>>>> can #ifdef _LP64 there too. I don't understand why you only set
>>>>>>>>> it to
>>>>>>>>> true on linux-x64.
>>>>>>>>
>>>>>>>> I remembered that the original discussion with Brendan Gregg
>>>>>>>> mentioned
>>>>>>>> only Linux's perf tool as a possible use case for "proper" frame
>>>>>>>> pointers. So I was unsure whether to enable proper frame
>>>>>>>> pointers by
>>>>>>>> default on other x64 platforms as well.
>>>>>>>>
>>>>>>>> But if you think it would be better to have proper frame
>>>>>>>> pointers on
>>>>>>>> all
>>>>>>>> x64 platforms, I will change the code to set
>>>>>>>> PreserveFramePointer to
>>>>>>>> true for all x64 platforms. Just please let me know.
>>>>>
>>>>> The current webrev sets the PreserveFramePointer flag to to true on
>>>>> all
>>>>> x86_64 platforms and to false on all other platforms.
>>>>>
>>>>>>>
>>>>>>> Currently compiled code for all x86 platforms is almost the same
>>>>>>> (win64 has difference in registers usage) and we should keep it that
>>>>>>> way.
>>>>>>>
>>>>>>> Also the original request was to have flag to enable such behavior
>>>>>>> (use RBP only as FP). So to have it off by default is acceptable. If
>>>>>>> performance group or someone find a regression (or bug) due to this
>>>>>>> change we can switch the flag off by default before jdk9 release.
>>>>>>>
>>>>>>> Try to run pstack on Solaris and jstack on OSX to make sure they
>>>>>>> report correct call stack with compiled java methods. And JFR.
>>>>>>> Also it would be nice to run SunStudio analyzer to verify that it
>>>>>>> works.
>>>>>>
>>>>>> I ran all tools you've suggested. JFR and jstack is unaffected,
>>>>>> pstack
>>>>>> produces nice stack traces (it did not always do so before).
>>>>>
>>>>> I tested the current webrev with the following setup: I used two
>>>>> tests,
>>>>> one that generates a long chain of lambda form invocations and an
>>>>> other
>>>>> one that generates a long chain of "regular" method invocations. Both
>>>>> tests were executed on an x64 machine in four configurations: with +/-
>>>>> Xcomp and with +/- PreserveFramePointer.
>>>>>
>>>>> Just as before, JFR and jstack stack traces are unaffected for both
>>>>> tests, pstack can now produce stack traces with both tests if
>>>>> PreserveFramePointer is enabled.
>>>>>
>>>>>> However, I've encountered a problem with SunStudio: Two asserts fail
>>>>>> in the fastdebug build. Both of them  "soft" failures, as neither the
>>>>>> VM nor SunStudio crash with the product build. I worked on the
>>>>>> problem
>>>>>> today and have a partial understanding of the issue, but more
>>>>>> investigation is needed to have a patch that preserves the correct
>>>>>> behavior of SunStudio as well.
>>>>>
>>>>> I was able to track down the problems with SunStudio. I had to change
>>>>> the code at two places.
>>>>>
>>>>>
>>>>> Change #1 (in src/cpu/x86/vm/frame_x86.cpp):
>>>>>
>>>>> *** 222,232 ****
>>>>>        }
>>>>>
>>>>>        if (sender_blob->is_nmethod()) {
>>>>>            nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>            if (nm != NULL) {
>>>>> !             if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>> nm->is_deopt_entry(sender_pc)) {
>>>>>                    return false;
>>>>>                }
>>>>>            }
>>>>>        }
>>>>>
>>>>> --- 222,233 ----
>>>>>        }
>>>>>
>>>>>        if (sender_blob->is_nmethod()) {
>>>>>            nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>            if (nm != NULL) {
>>>>> !             if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>> nm->is_deopt_entry(sender_pc) ||
>>>>> ! nm->method()->is_method_handle_intrinsic()) {
>>>>>                    return false;
>>>>>                }
>>>>>            }
>>>>>        }
>>>>>
>>>>> The reason for this change is the following. Method handle intrinsics
>>>>> (i.e., the intrinsics _invokeBasic, _linkToVirtual,_linkToStatic,
>>>>> _linkToSpecial, and _linkToInterface) do not allocate stack space when
>>>>> invoked, but they can extend the stack space of their caller
>>>>> "temporarily".
>>>>>
>>>>> For example, if VerifyMethodHandles is enabled, some stack space is
>>>>> used
>>>>> during verification. The temporarily used stack space is released
>>>>> before
>>>>> the intrinsic jumps to its target. As a result, the target of a method
>>>>> handle intrinsic will have a correct SP when it returns and the
>>>>> program's control flow is correct.
>>>>>
>>>>> Moreover, if the stack is walked synchronously (e.g., at
>>>>> safepoints), no
>>>>> problems appear either, because the synchronous interruption can
>>>>> happen
>>>>> while execution is within the method handle intrinsic.
>>>>>
>>>>> The problem is that the SunStudio analyzer can interrupt the VM
>>>>> asynchonously and walk the stack. If execution of a thread is
>>>>> interrupted while the thread is in a method handle intrinsic, the SP
>>>>> might contain an invalid value.
>>>>>
>>>>> The new webrev adds a check that marks the current frame unsafe for
>>>>> sender if the frame belongs to a method handle intrinsic
>>>>> (frame::safe_for_sender returns false in this case).
>>>>>
>>>>>
>>>>> Change #2 (in src/share/vm/prims/forte.cpp):
>>>>>
>>>>> *** 425,435 ****
>>>>>
>>>>>        RegisterMap map(thd, false);
>>>>>        initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>      }
>>>>>
>>>>> !   vframeStreamForte st(thd, initial_Java_frame, false);
>>>>>
>>>>>      for (; !st.at_end() && count < depth; st.forte_next(), count++) {
>>>>>        bci = st.bci();
>>>>>        method = st.method();
>>>>>
>>>>> --- 425,435 ----
>>>>>
>>>>>        RegisterMap map(thd, false);
>>>>>        initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>      }
>>>>>
>>>>> !   vframeStreamForte st(thd, initial_Java_frame, true);
>>>>>
>>>>>      for (; !st.at_end() && count < depth; st.forte_next(), count++) {
>>>>>        bci = st.bci();
>>>>>        method = st.method();
>>>>>
>>>>> The problem is that the following assert in forte.cpp on line 103
>>>>>
>>>>> assert(filled_in, "invariant");
>>>>>
>>>>> fails. The problem appears if we have a stack trace like:
>>>>>
>>>>> V  [libjvm.so+0x1c98c4a]  void VMError::report(outputStream*)+0xb1a
>>>>> V  [libjvm.so+0x1c9a3e8]  void VMError::report_and_die()+0x748
>>>>> V  [libjvm.so+0x1003c8e]  void report_vm_error(const char*,int,const
>>>>> char*,const char*)+0x7e
>>>>> V  [libjvm.so+0x10efa22] vframeStreamForte::vframeStreamForte
>>>>> #Nvariant
>>>>> 1(JavaThread*,frame,bool)+0xe2
>>>>> --> (Frame #5) V  [libjvm.so+0x10f0bb9]  void
>>>>> forte_fill_call_trace_given_top(JavaThread*,ASGCT_CallTrace*,int,frame)+0x789
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0x10f1436]  AsyncGetCallTrace+0x246
>>>>> C  [libcollector.so+0x272a8] __collector_ext_jstack_unwind+0xb8
>>>>> C  [libcollector.so+0x277df] __collector_get_frame_info+0x27f
>>>>> C  [libcollector.so+0x2f093]  __collector_getUserCtx+0x13
>>>>> C  [libcollector.so+0x1abc7] __collector_ext_profile_handler+0x127
>>>>> C  [libcollector.so+0x17535] collector_sigprof_dispatcher+0x85
>>>>> C  [libc.so.1+0x122476]  __sighndlr+0x6
>>>>> C  [libc.so.1+0x115972]  call_user_handler+0x2ce
>>>>> C  [libc.so.1+0x115e1b]  sigacthandler+0xdb
>>>>> C  0xffffffffffffffff
>>>>> V  [libjvm.so+0x1959089]  void os::PlatformEvent::park()+0xd9
>>>>> V  [libjvm.so+0x18a6b34]  int ParkCommon(ParkEvent*,long)+0x34
>>>>> V  [libjvm.so+0x18a7657]  int Monitor::IWait(Thread*,long)+0xb7
>>>>> V  [libjvm.so+0x18a8b86]  bool Monitor::wait(bool,long,bool)+0x346
>>>>> V  [libjvm.so+0xf7073d]  void
>>>>> CompileBroker::wait_for_completion(CompileTask*)+0xad
>>>>> V  [libjvm.so+0xf6f6b6]  void
>>>>> CompileBroker::compile_method_base(methodHandle,int,int,methodHandle,int,const
>>>>>
>>>>>
>>>>> char*,Thread*)+0x406
>>>>> V  [libjvm.so+0xf6fd96]
>>>>> nmethod*CompileBroker::compile_method(methodHandle,int,int,methodHandle,int,const
>>>>>
>>>>>
>>>>> char*,Thread*)+0x586
>>>>> V  [libjvm.so+0xbbad72]  void
>>>>> AdvancedThresholdPolicy::submit_compile(methodHandle,int,CompLevel,JavaThread*)+0xb2
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0x1aef92f]  void
>>>>> SimpleThresholdPolicy::compile(methodHandle,int,CompLevel,JavaThread*)+0x14f
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0xbbb00f]  void
>>>>> AdvancedThresholdPolicy::method_invocation_event(methodHandle,methodHandle,CompLevel,nmethod*,JavaThread*)+0x1ff
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0x1aef765]
>>>>> nmethod*SimpleThresholdPolicy::event(methodHandle,methodHandle,int,int,CompLevel,nmethod*,JavaThread*)+0x2e5
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0xdb93bc]  unsigned
>>>>> char*Runtime1::counter_overflow(JavaThread*,int,Method*)+0x31c
>>>>> v  ~RuntimeStub::counter_overflow Runtime1 stub
>>>>> --> (Frame #29) J 143 C1
>>>>> java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @
>>>>> 0xffff80ffacc6962a [0xffff80ffacc69580+0xaa]
>>>>> --> (Frame #30) v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x13ca50b]  void
>>>>> JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x41b
>>>>>
>>>>>
>>>>>
>>>>> V  [libjvm.so+0x152a111]  JVM_DoPrivileged+0xfb1
>>>>> C  [libjava.so+0x12f42]
>>>>> Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x12
>>>>>
>>>>>
>>>>>
>>>>> J 142
>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;
>>>>>
>>>>>
>>>>> (0 bytes) @ 0xffff80ffb400c57c [0xffff80ffb400c420+0x15c\
>>>>> ]
>>>>> J 134 C1
>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;
>>>>> (47 bytes) @ 0xffff80ffacc66014 [0xffff80ffacc65ec0+0x154]
>>>>> ... more stack frames
>>>>>
>>>>> The forte_fill_call_trace_given_top() method (Frame #5) first
>>>>> checks if
>>>>> the first Java frame found is fully decipherable (line 395 in
>>>>> forte.cpp). In our case the first Java frame is Frame #29 (the
>>>>> C1-compiled version of java.net.URLClassLoader$1.run).
>>>>>
>>>>> In our case Frame #29 is not decipherable, because
>>>>> java.net.URLClassLoader$1.run has been made "not entrant" (a
>>>>> C2-compiled
>>>>> version of the same method has been produced shortly before).
>>>>>
>>>>> Afterwards, forte_fill_call_trace_given_top() checks if the method is
>>>>> "safe for sender" (line 424 in forte.cpp). The caller of the
>>>>> java.net.URLClassLoader$1.run method is ~StubRoutines::call_stub,
>>>>> which
>>>>> is considered "safe for sender" by the VM.
>>>>>
>>>>> Then, initial_Java_frame is set to the ~StubRoutines::call_stub stub
>>>>> (line 430). This does not seem to be correct because the stub is not a
>>>>> Java method and causes the assert(filled_in, "invariant") in the
>>>>> constructor of vframeStreamForte (line 103 in forte.cpp) to fail
>>>>> (because the frame cannot be filled from a stub).
>>>>>
>>>>> To avoid this failure, I propose to call the constructor of
>>>>> vframeStreamForte with parameter stop_at_java_call_stub set to true
>>>>> (instead of false) so that the VM stops walking the stack if a call
>>>>> stub
>>>>> has been reached.
>>>>>
>>>>>
>>>>> Here is the updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.02/
>>>>>
>>>>> In addition to testing the changeset with the tools mentioned
>>>>> before, I
>>>>> executed
>>>>> - all JPRT tests, all pass;
>>>>> - all java/lang/invoke and compiler JTREG tests; all tests that pass
>>>>> with the unmodified source trace pass with the changes as well.
>>>>>
>>>>> Thank you very much in advance!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> So that will put this RFR on hold for a while, unfortunately.
>>>>>>
>>>>>> Thank you for the feedback and suggestions so far!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>>
>>>>>>>> Zoltan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>> On 3/30/15 8:30 AM, Zoltán Majó wrote:
>>>>>>>>>> Hi Ed,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> thank you for your feedback! Please see comments below.
>>>>>>>>>>
>>>>>>>>>> On 03/30/2015 11:11 AM, Edward Nevill wrote:
>>>>>>>>>>> Hi Zoltán,
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 2015-03-27 at 15:34 +0100, Zoltán Majó wrote:
>>>>>>>>>>>> Full JPRT run, all tests pass. I also ran all hotspot compiler
>>>>>>>>>>>> tests and
>>>>>>>>>>>> the jdk tests in java/lang/invoke on both x86_64 and x86_32.
>>>>>>>>>>>> All
>>>>>>>>>>>> tests
>>>>>>>>>>>> that pass without the patch pass also with the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> I ran the SPEC JVM 2008 benchmarks on our performance
>>>>>>>>>>>> infrastructure for
>>>>>>>>>>>> x86_64. The performance evaluation suggests that there is no
>>>>>>>>>>>> statistically significant performance degradation due to having
>>>>>>>>>>>> proper
>>>>>>>>>>>> frame pointers. Therefore I propose to have OmitFramePointer
>>>>>>>>>>>> set to
>>>>>>>>>>>> false by default on x86_64 (and set to true on all other
>>>>>>>>>>>> platforms).
>>>>>>>>>>> This patch looks good, however I think there is a problem
>>>>>>>>>>> with the
>>>>>>>>>>> logic of OmitFramePointer.
>>>>>>>>>>>
>>>>>>>>>>> Here is my test case.
>>>>>>>>>>>
>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>> // $Id: fibo.java,v 1.2 2000/12/24 19:10:50 doug Exp $
>>>>>>>>>>> // http://www.bagley.org/~doug/shootout/
>>>>>>>>>>>
>>>>>>>>>>> public class fibo {
>>>>>>>>>>>      public static void main(String args[]) {
>>>>>>>>>>>     int N = Integer.parseInt(args[0]);
>>>>>>>>>>>     System.out.println(fib(N));
>>>>>>>>>>>      }
>>>>>>>>>>>      public static int fib(int n) {
>>>>>>>>>>>     if (n < 2) return(1);
>>>>>>>>>>>     return( fib(n-2) + fib(n-1) );
>>>>>>>>>>>      }
>>>>>>>>>>> }
>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>>
>>>>>>>>>>> If I run it as follows on my x86 64 bit linux.
>>>>>>>>>>>
>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:-OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>
>>>>>>>>>>> I get
>>>>>>>>>>>
>>>>>>>>>>>    # {method} {0x00007fc62c97f388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>    0x00007fc625071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>    0x00007fc625071107: push   %rbp
>>>>>>>>>>>    0x00007fc625071108: mov    %rsp,%rbp
>>>>>>>>>>>    0x00007f836907110b: sub    $0x20,%rsp ;*synchronization entry
>>>>>>>>>>>
>>>>>>>>>>> which is correct, it is NOT(-) OmitFramePointer, therefore it is
>>>>>>>>>>> using
>>>>>>>>>>> the frame pointer
>>>>>>>>>>>
>>>>>>>>>>> Now if I try just changing -XX:-OmitFramePointer to
>>>>>>>>>>> -XX:+OmitFramePointer in the above I get
>>>>>>>>>>>
>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:+OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>
>>>>>>>>>>> I get
>>>>>>>>>>>
>>>>>>>>>>>    # {method} {0x00007f14d3c00388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>    0x00007f14e1071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>    0x00007f14e1071107: push   %rbp
>>>>>>>>>>>    0x00007f14e1071108: sub    $0x20,%rsp ;*synchronization entry
>>>>>>>>>>>
>>>>>>>>>>> which is correct, it is ID(+) OmitFramePointer, therefore it
>>>>>>>>>>> does
>>>>>>>>>>> not
>>>>>>>>>>> use a frame pointer.
>>>>>>>>>>>
>>>>>>>>>>> However, if I now delete the -XX:+/-OmitFramePointer
>>>>>>>>>>> altogether, IE
>>>>>>>>>>>
>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:+PrintAssembly fibo 43
>>>>>>>>>>>
>>>>>>>>>>> I get
>>>>>>>>>>>
>>>>>>>>>>>    # {method} {0x00007f0c4b730388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>    0x00007f0c75071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>    0x00007f0c75071107: push   %rbp
>>>>>>>>>>>    0x00007f0c75071108: sub    $0x20,%rsp ;*synchronization entry
>>>>>>>>>>>
>>>>>>>>>>> It is not using a frame pointer which is the equivalent of
>>>>>>>>>>> -XX:+OmitFramePointer. However in your description above you say
>>>>>>>>>>>
>>>>>>>>>>>> Therefore I propose to have OmitFramePointer set to false by
>>>>>>>>>>>> default
>>>>>>>>>>>> on x86_64 (and set to true on all other platforms).
>>>>>>>>>>> whereas OmitFramePointer actually seems to be set to true on
>>>>>>>>>>> x86_64
>>>>>>>>>>>
>>>>>>>>>>> I think the problem may be with the declaration and
>>>>>>>>>>> definition of
>>>>>>>>>>> OmitFramePointer in globals.hpp and globals_x86.hpp
>>>>>>>>>>>
>>>>>>>>>>> In globals.hpp it does
>>>>>>>>>>>
>>>>>>>>>>> product(bool, OmitFramePointer, true,
>>>>>>>>>>>
>>>>>>>>>>> In globals_x86.hpp it does
>>>>>>>>>>>
>>>>>>>>>>> LP64_ONLY(define_pd_global(bool, OmitFramePointer, false););
>>>>>>>>>>>
>>>>>>>>>>> I am not sure that you can mix product(...) and product_pd(...)
>>>>>>>>>>> like
>>>>>>>>>>> this, so I think it just ends up getting the default from the
>>>>>>>>>>> product(...).
>>>>>>>>>>
>>>>>>>>>> You are right, mixing product and product_pd does not make sense
>>>>>>>>>> at all.
>>>>>>>>>> Thank you for doing additional testing and for drawing attention
>>>>>>>>>> to the
>>>>>>>>>> problem.
>>>>>>>>>>
>>>>>>>>>> I updated the code to use product_pd and define_pd_global on all
>>>>>>>>>> relevant platforms.
>>>>>>>>>>
>>>>>>>>>>> Aside: In general, I do not like options which include a
>>>>>>>>>>> negative in
>>>>>>>>>>> them because I have to do a double think when I see something
>>>>>>>>>>> like,
>>>>>>>>>>> -XX:-OmitFramePointer, as in, it is omitting the frame pointer,
>>>>>>>>>>> therefore it is using a frame pointer. How about FramePointer
>>>>>>>>>>> so we
>>>>>>>>>>> have -XX:+FramePointer to say I want frame pointers and
>>>>>>>>>>> -XX:-FramePointer to say I don't.
>>>>>>>>>>
>>>>>>>>>> That is a good idea. Double negation is an unnecessary
>>>>>>>>>> complication, so
>>>>>>>>>> I changed the name of the flag to FramePointer, just as you
>>>>>>>>>> suggested.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I did some timing on the above 'fibo' test
>>>>>>>>>>>
>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>> -XX:-OmitFramePointer fibo 43
>>>>>>>>>>> 701408733
>>>>>>>>>>>
>>>>>>>>>>> real    0m1.545s
>>>>>>>>>>> user    0m1.571s
>>>>>>>>>>> sys    0m0.015s
>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>> -XX:+OmitFramePointer fibo 43
>>>>>>>>>>> 701408733
>>>>>>>>>>>
>>>>>>>>>>> real    0m1.504s
>>>>>>>>>>> user    0m1.527s
>>>>>>>>>>> sys    0m0.019s
>>>>>>>>>>>
>>>>>>>>>>> which is ~3% difference on this test case. On aarch64, I see ~7%
>>>>>>>>>>> difference on this test case.
>>>>>>>>>>
>>>>>>>>>> Thank you for the performance measurements!
>>>>>>>>>>
>>>>>>>>>>> With the above change to fix the logic of OmitFramePointer (and
>>>>>>>>>>> possible change its name) the patch looks good to me.
>>>>>>>>>>
>>>>>>>>>> Here is the updated webrev (the same webrev that was already
>>>>>>>>>> included
>>>>>>>>>> into my reply to Roland):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.01/
>>>>>>>>>>
>>>>>>>>>>> I will prepare a mirror patch for aarch64.
>>>>>>>>>>
>>>>>>>>>> That would be great!
>>>>>>>>>>
>>>>>>>>>> Thank you and best regards,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Zoltán
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> All the best,
>>>>>>>>>>> Ed.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>


More information about the hotspot-compiler-dev mailing list