[9] RFR(S): 8068945: Use RBP register as proper frame pointer in JIT compiled code on x86

Zoltán Majó zoltan.majo at oracle.com
Fri Apr 24 18:30:37 UTC 2015


Hi Vladimir,


On 04/24/2015 07:52 PM, Vladimir Kozlov wrote:
> > - (2) the assert is there only for SunStudio (according to the comments
> > on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
> > where forte_fill_call_trace_given_top() is called) does not expect 
> debug
> > information to available for all frames, it just needs the jmethodID 
> for
> > every frame.
>
> This is incorrect interpretation of comments. The assert is in general 
> code to catch incorrect bci or pcoffset during vframe construction. 
> The comment said that instead of having several asserts we have only 
> one for all these case to simplify disabling it when running 
> performance analyzer.

Thank you for the clarification!

> So you can only skip it when you run with performance analyzer but not 
> during general execution.
>
> May be you can set or path flag when vframe construction is called 
> from forte and skip assert in such case. I assume you hit it only with 
> performance analyzer. Right?

You are right. The assert can be triggered only if vframe construction 
is called from forte, but not otherwise. So setting the command-line 
flag -XX:SuppressErrorAt=vframe.cpp:443 (as suggested in vframe.cpp 
lines 439--442) is sufficient to skip the assert when running a 
fastdebug VM with the performance analyzer.

I'll leave the assert enabled and intend to push webrev.03 then.

Thank you and best regards,


Zoltan

>
> Thanks,
> Vladimir
>
> On 4/24/15 10:21 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> On 04/24/2015 06:26 PM, Vladimir Kozlov wrote:
>>> Yes, this looks good. Nice work and thank you for testing.
>>
>> Thank you for all the feedback you've provided while I was working on
>> this issue!
>>
>> I would have one final question: Can I disable the assert in
>> vframeStreamCommon::found_bad_method_frame() on line 443 in vframe.cpp?
>> I would prefer that for the reasons I mentioned in my previous message
>> (please see also below).
>>
>> Here is an updated webrev (with the assert disabled):
>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.04/
>>
>> Could you please let me know which webrev I can push, webrev.03 with the
>> assert left enabled or webrev.04 with the assert disabled?
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>>> I ran the experiments *with the assert enabled* in
>>>> vframeStreamCommon::found_bad_method_frame(). The assert was not
>>>> triggered.
>>>>
>>>> I would be, however, inclined to disable that assert because:
>>>> - (1) it is better to have some information in stack traces than no
>>>> information and a crash (even though the crash happens in a fastdebug
>>>> build);
>>>> - (2) the assert is there only for SunStudio (according to the 
>>>> comments
>>>> on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
>>>> where forte_fill_call_trace_given_top() is called) does not expect 
>>>> debug
>>>> information to available for all frames, it just needs the 
>>>> jmethodID for
>>>> every frame.
>>>>
>>>>> Do you mean "can NOT happen"?:
>>>>> > Moreover, if the stack is walked synchronously (e.g., at
>>>>> safepoints), no
>>>>> > problems appear either, because the synchronous interruption can
>>>>> happen
>>>>> > while execution is within the method handle intrinsic.
>>>>
>>>> Yes, I meant that it can *not* happen. Sorry for the confusion.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>>
>>>>> Vladimir
>>>>>
>>>>> On 4/22/15 9:26 AM, Zoltán Majó wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>>
>>>>>> I managed to do some more work on this enhancement. Please see 
>>>>>> details
>>>>>> below.
>>>>>>
>>>>>> On 04/01/2015 03:38 PM, Zoltán Majó wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>>
>>>>>>> On 03/30/2015 11:39 PM, Vladimir Kozlov wrote:
>>>>>>>> On 3/30/15 2:13 PM, Zoltán Majó wrote:
>>>>>>>>> Hi Vladimir,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> thank you for the feedback!
>>>>>>>>>
>>>>>>>>> On 03/30/2015 10:27 PM, Vladimir Kozlov wrote:
>>>>>>>>>> How about PreserveFramePointer instead of simple FramePointer?
>>>>>>>>>>
>>>>>>>>>> PreserveFramePointer will mean that compiled (or other) code 
>>>>>>>>>> will
>>>>>>>>>> use
>>>>>>>>>> that register only as Frame pointer.
>>>>>>>>>
>>>>>>>>> I will change the flag's name to PreserveFramePointer and will 
>>>>>>>>> also
>>>>>>>>> update the description.
>>>>>>
>>>>>> I changed the flag's name to PreserveFramePointer, just as you
>>>>>> suggested.
>>>>>>
>>>>>>>>>
>>>>>>>>>> Zoltan, x86 flags setting should be in general globals_x86.hpp.
>>>>>>>>>> You
>>>>>>>>>> can #ifdef _LP64 there too. I don't understand why you only set
>>>>>>>>>> it to
>>>>>>>>>> true on linux-x64.
>>>>>>>>>
>>>>>>>>> I remembered that the original discussion with Brendan Gregg
>>>>>>>>> mentioned
>>>>>>>>> only Linux's perf tool as a possible use case for "proper" frame
>>>>>>>>> pointers. So I was unsure whether to enable proper frame
>>>>>>>>> pointers by
>>>>>>>>> default on other x64 platforms as well.
>>>>>>>>>
>>>>>>>>> But if you think it would be better to have proper frame
>>>>>>>>> pointers on
>>>>>>>>> all
>>>>>>>>> x64 platforms, I will change the code to set
>>>>>>>>> PreserveFramePointer to
>>>>>>>>> true for all x64 platforms. Just please let me know.
>>>>>>
>>>>>> The current webrev sets the PreserveFramePointer flag to to true on
>>>>>> all
>>>>>> x86_64 platforms and to false on all other platforms.
>>>>>>
>>>>>>>>
>>>>>>>> Currently compiled code for all x86 platforms is almost the same
>>>>>>>> (win64 has difference in registers usage) and we should keep it 
>>>>>>>> that
>>>>>>>> way.
>>>>>>>>
>>>>>>>> Also the original request was to have flag to enable such behavior
>>>>>>>> (use RBP only as FP). So to have it off by default is 
>>>>>>>> acceptable. If
>>>>>>>> performance group or someone find a regression (or bug) due to 
>>>>>>>> this
>>>>>>>> change we can switch the flag off by default before jdk9 release.
>>>>>>>>
>>>>>>>> Try to run pstack on Solaris and jstack on OSX to make sure they
>>>>>>>> report correct call stack with compiled java methods. And JFR.
>>>>>>>> Also it would be nice to run SunStudio analyzer to verify that it
>>>>>>>> works.
>>>>>>>
>>>>>>> I ran all tools you've suggested. JFR and jstack is unaffected,
>>>>>>> pstack
>>>>>>> produces nice stack traces (it did not always do so before).
>>>>>>
>>>>>> I tested the current webrev with the following setup: I used two
>>>>>> tests,
>>>>>> one that generates a long chain of lambda form invocations and an
>>>>>> other
>>>>>> one that generates a long chain of "regular" method invocations. 
>>>>>> Both
>>>>>> tests were executed on an x64 machine in four configurations: 
>>>>>> with +/-
>>>>>> Xcomp and with +/- PreserveFramePointer.
>>>>>>
>>>>>> Just as before, JFR and jstack stack traces are unaffected for both
>>>>>> tests, pstack can now produce stack traces with both tests if
>>>>>> PreserveFramePointer is enabled.
>>>>>>
>>>>>>> However, I've encountered a problem with SunStudio: Two asserts 
>>>>>>> fail
>>>>>>> in the fastdebug build. Both of them  "soft" failures, as 
>>>>>>> neither the
>>>>>>> VM nor SunStudio crash with the product build. I worked on the
>>>>>>> problem
>>>>>>> today and have a partial understanding of the issue, but more
>>>>>>> investigation is needed to have a patch that preserves the correct
>>>>>>> behavior of SunStudio as well.
>>>>>>
>>>>>> I was able to track down the problems with SunStudio. I had to 
>>>>>> change
>>>>>> the code at two places.
>>>>>>
>>>>>>
>>>>>> Change #1 (in src/cpu/x86/vm/frame_x86.cpp):
>>>>>>
>>>>>> *** 222,232 ****
>>>>>>        }
>>>>>>
>>>>>>        if (sender_blob->is_nmethod()) {
>>>>>>            nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>>            if (nm != NULL) {
>>>>>> !             if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>>> nm->is_deopt_entry(sender_pc)) {
>>>>>>                    return false;
>>>>>>                }
>>>>>>            }
>>>>>>        }
>>>>>>
>>>>>> --- 222,233 ----
>>>>>>        }
>>>>>>
>>>>>>        if (sender_blob->is_nmethod()) {
>>>>>>            nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>>            if (nm != NULL) {
>>>>>> !             if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>>> nm->is_deopt_entry(sender_pc) ||
>>>>>> ! nm->method()->is_method_handle_intrinsic()) {
>>>>>>                    return false;
>>>>>>                }
>>>>>>            }
>>>>>>        }
>>>>>>
>>>>>> The reason for this change is the following. Method handle 
>>>>>> intrinsics
>>>>>> (i.e., the intrinsics _invokeBasic, _linkToVirtual,_linkToStatic,
>>>>>> _linkToSpecial, and _linkToInterface) do not allocate stack space 
>>>>>> when
>>>>>> invoked, but they can extend the stack space of their caller
>>>>>> "temporarily".
>>>>>>
>>>>>> For example, if VerifyMethodHandles is enabled, some stack space is
>>>>>> used
>>>>>> during verification. The temporarily used stack space is released
>>>>>> before
>>>>>> the intrinsic jumps to its target. As a result, the target of a 
>>>>>> method
>>>>>> handle intrinsic will have a correct SP when it returns and the
>>>>>> program's control flow is correct.
>>>>>>
>>>>>> Moreover, if the stack is walked synchronously (e.g., at
>>>>>> safepoints), no
>>>>>> problems appear either, because the synchronous interruption can
>>>>>> happen
>>>>>> while execution is within the method handle intrinsic.
>>>>>>
>>>>>> The problem is that the SunStudio analyzer can interrupt the VM
>>>>>> asynchonously and walk the stack. If execution of a thread is
>>>>>> interrupted while the thread is in a method handle intrinsic, the SP
>>>>>> might contain an invalid value.
>>>>>>
>>>>>> The new webrev adds a check that marks the current frame unsafe for
>>>>>> sender if the frame belongs to a method handle intrinsic
>>>>>> (frame::safe_for_sender returns false in this case).
>>>>>>
>>>>>>
>>>>>> Change #2 (in src/share/vm/prims/forte.cpp):
>>>>>>
>>>>>> *** 425,435 ****
>>>>>>
>>>>>>        RegisterMap map(thd, false);
>>>>>>        initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>>      }
>>>>>>
>>>>>> !   vframeStreamForte st(thd, initial_Java_frame, false);
>>>>>>
>>>>>>      for (; !st.at_end() && count < depth; st.forte_next(), 
>>>>>> count++) {
>>>>>>        bci = st.bci();
>>>>>>        method = st.method();
>>>>>>
>>>>>> --- 425,435 ----
>>>>>>
>>>>>>        RegisterMap map(thd, false);
>>>>>>        initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>>      }
>>>>>>
>>>>>> !   vframeStreamForte st(thd, initial_Java_frame, true);
>>>>>>
>>>>>>      for (; !st.at_end() && count < depth; st.forte_next(), 
>>>>>> count++) {
>>>>>>        bci = st.bci();
>>>>>>        method = st.method();
>>>>>>
>>>>>> The problem is that the following assert in forte.cpp on line 103
>>>>>>
>>>>>> assert(filled_in, "invariant");
>>>>>>
>>>>>> fails. The problem appears if we have a stack trace like:
>>>>>>
>>>>>> V  [libjvm.so+0x1c98c4a]  void VMError::report(outputStream*)+0xb1a
>>>>>> V  [libjvm.so+0x1c9a3e8]  void VMError::report_and_die()+0x748
>>>>>> V  [libjvm.so+0x1003c8e]  void report_vm_error(const char*,int,const
>>>>>> char*,const char*)+0x7e
>>>>>> V  [libjvm.so+0x10efa22] vframeStreamForte::vframeStreamForte
>>>>>> #Nvariant
>>>>>> 1(JavaThread*,frame,bool)+0xe2
>>>>>> --> (Frame #5) V  [libjvm.so+0x10f0bb9]  void
>>>>>> forte_fill_call_trace_given_top(JavaThread*,ASGCT_CallTrace*,int,frame)+0x789 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0x10f1436]  AsyncGetCallTrace+0x246
>>>>>> C  [libcollector.so+0x272a8] __collector_ext_jstack_unwind+0xb8
>>>>>> C  [libcollector.so+0x277df] __collector_get_frame_info+0x27f
>>>>>> C  [libcollector.so+0x2f093] __collector_getUserCtx+0x13
>>>>>> C  [libcollector.so+0x1abc7] __collector_ext_profile_handler+0x127
>>>>>> C  [libcollector.so+0x17535] collector_sigprof_dispatcher+0x85
>>>>>> C  [libc.so.1+0x122476]  __sighndlr+0x6
>>>>>> C  [libc.so.1+0x115972]  call_user_handler+0x2ce
>>>>>> C  [libc.so.1+0x115e1b]  sigacthandler+0xdb
>>>>>> C  0xffffffffffffffff
>>>>>> V  [libjvm.so+0x1959089]  void os::PlatformEvent::park()+0xd9
>>>>>> V  [libjvm.so+0x18a6b34]  int ParkCommon(ParkEvent*,long)+0x34
>>>>>> V  [libjvm.so+0x18a7657]  int Monitor::IWait(Thread*,long)+0xb7
>>>>>> V  [libjvm.so+0x18a8b86]  bool Monitor::wait(bool,long,bool)+0x346
>>>>>> V  [libjvm.so+0xf7073d]  void
>>>>>> CompileBroker::wait_for_completion(CompileTask*)+0xad
>>>>>> V  [libjvm.so+0xf6f6b6]  void
>>>>>> CompileBroker::compile_method_base(methodHandle,int,int,methodHandle,int,const 
>>>>>>
>>>>>>
>>>>>>
>>>>>> char*,Thread*)+0x406
>>>>>> V  [libjvm.so+0xf6fd96]
>>>>>> nmethod*CompileBroker::compile_method(methodHandle,int,int,methodHandle,int,const 
>>>>>>
>>>>>>
>>>>>>
>>>>>> char*,Thread*)+0x586
>>>>>> V  [libjvm.so+0xbbad72]  void
>>>>>> AdvancedThresholdPolicy::submit_compile(methodHandle,int,CompLevel,JavaThread*)+0xb2 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0x1aef92f]  void
>>>>>> SimpleThresholdPolicy::compile(methodHandle,int,CompLevel,JavaThread*)+0x14f 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0xbbb00f]  void
>>>>>> AdvancedThresholdPolicy::method_invocation_event(methodHandle,methodHandle,CompLevel,nmethod*,JavaThread*)+0x1ff 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0x1aef765]
>>>>>> nmethod*SimpleThresholdPolicy::event(methodHandle,methodHandle,int,int,CompLevel,nmethod*,JavaThread*)+0x2e5 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0xdb93bc]  unsigned
>>>>>> char*Runtime1::counter_overflow(JavaThread*,int,Method*)+0x31c
>>>>>> v  ~RuntimeStub::counter_overflow Runtime1 stub
>>>>>> --> (Frame #29) J 143 C1
>>>>>> java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @
>>>>>> 0xffff80ffacc6962a [0xffff80ffacc69580+0xaa]
>>>>>> --> (Frame #30) v  ~StubRoutines::call_stub
>>>>>> V  [libjvm.so+0x13ca50b]  void
>>>>>> JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x41b 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> V  [libjvm.so+0x152a111]  JVM_DoPrivileged+0xfb1
>>>>>> C  [libjava.so+0x12f42]
>>>>>> Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x12 
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> J 142
>>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object; 
>>>>>>
>>>>>>
>>>>>>
>>>>>> (0 bytes) @ 0xffff80ffb400c57c [0xffff80ffb400c420+0x15c\
>>>>>> ]
>>>>>> J 134 C1
>>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class; 
>>>>>>
>>>>>> (47 bytes) @ 0xffff80ffacc66014 [0xffff80ffacc65ec0+0x154]
>>>>>> ... more stack frames
>>>>>>
>>>>>> The forte_fill_call_trace_given_top() method (Frame #5) first
>>>>>> checks if
>>>>>> the first Java frame found is fully decipherable (line 395 in
>>>>>> forte.cpp). In our case the first Java frame is Frame #29 (the
>>>>>> C1-compiled version of java.net.URLClassLoader$1.run).
>>>>>>
>>>>>> In our case Frame #29 is not decipherable, because
>>>>>> java.net.URLClassLoader$1.run has been made "not entrant" (a
>>>>>> C2-compiled
>>>>>> version of the same method has been produced shortly before).
>>>>>>
>>>>>> Afterwards, forte_fill_call_trace_given_top() checks if the 
>>>>>> method is
>>>>>> "safe for sender" (line 424 in forte.cpp). The caller of the
>>>>>> java.net.URLClassLoader$1.run method is ~StubRoutines::call_stub,
>>>>>> which
>>>>>> is considered "safe for sender" by the VM.
>>>>>>
>>>>>> Then, initial_Java_frame is set to the ~StubRoutines::call_stub stub
>>>>>> (line 430). This does not seem to be correct because the stub is 
>>>>>> not a
>>>>>> Java method and causes the assert(filled_in, "invariant") in the
>>>>>> constructor of vframeStreamForte (line 103 in forte.cpp) to fail
>>>>>> (because the frame cannot be filled from a stub).
>>>>>>
>>>>>> To avoid this failure, I propose to call the constructor of
>>>>>> vframeStreamForte with parameter stop_at_java_call_stub set to true
>>>>>> (instead of false) so that the VM stops walking the stack if a call
>>>>>> stub
>>>>>> has been reached.
>>>>>>
>>>>>>
>>>>>> Here is the updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.02/
>>>>>>
>>>>>> In addition to testing the changeset with the tools mentioned
>>>>>> before, I
>>>>>> executed
>>>>>> - all JPRT tests, all pass;
>>>>>> - all java/lang/invoke and compiler JTREG tests; all tests that pass
>>>>>> with the unmodified source trace pass with the changes as well.
>>>>>>
>>>>>> Thank you very much in advance!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>>>
>>>>>>> So that will put this RFR on hold for a while, unfortunately.
>>>>>>>
>>>>>>> Thank you for the feedback and suggestions so far!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Zoltan
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 3/30/15 8:30 AM, Zoltán Majó wrote:
>>>>>>>>>>> Hi Ed,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> thank you for your feedback! Please see comments below.
>>>>>>>>>>>
>>>>>>>>>>> On 03/30/2015 11:11 AM, Edward Nevill wrote:
>>>>>>>>>>>> Hi Zoltán,
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 2015-03-27 at 15:34 +0100, Zoltán Majó wrote:
>>>>>>>>>>>>> Full JPRT run, all tests pass. I also ran all hotspot 
>>>>>>>>>>>>> compiler
>>>>>>>>>>>>> tests and
>>>>>>>>>>>>> the jdk tests in java/lang/invoke on both x86_64 and x86_32.
>>>>>>>>>>>>> All
>>>>>>>>>>>>> tests
>>>>>>>>>>>>> that pass without the patch pass also with the patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran the SPEC JVM 2008 benchmarks on our performance
>>>>>>>>>>>>> infrastructure for
>>>>>>>>>>>>> x86_64. The performance evaluation suggests that there is no
>>>>>>>>>>>>> statistically significant performance degradation due to 
>>>>>>>>>>>>> having
>>>>>>>>>>>>> proper
>>>>>>>>>>>>> frame pointers. Therefore I propose to have OmitFramePointer
>>>>>>>>>>>>> set to
>>>>>>>>>>>>> false by default on x86_64 (and set to true on all other
>>>>>>>>>>>>> platforms).
>>>>>>>>>>>> This patch looks good, however I think there is a problem
>>>>>>>>>>>> with the
>>>>>>>>>>>> logic of OmitFramePointer.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is my test case.
>>>>>>>>>>>>
>>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>>> // $Id: fibo.java,v 1.2 2000/12/24 19:10:50 doug Exp $
>>>>>>>>>>>> // http://www.bagley.org/~doug/shootout/
>>>>>>>>>>>>
>>>>>>>>>>>> public class fibo {
>>>>>>>>>>>>      public static void main(String args[]) {
>>>>>>>>>>>>     int N = Integer.parseInt(args[0]);
>>>>>>>>>>>>     System.out.println(fib(N));
>>>>>>>>>>>>      }
>>>>>>>>>>>>      public static int fib(int n) {
>>>>>>>>>>>>     if (n < 2) return(1);
>>>>>>>>>>>>     return( fib(n-2) + fib(n-1) );
>>>>>>>>>>>>      }
>>>>>>>>>>>> }
>>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>>>
>>>>>>>>>>>> If I run it as follows on my x86 64 bit linux.
>>>>>>>>>>>>
>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>> -XX:-OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>
>>>>>>>>>>>> I get
>>>>>>>>>>>>
>>>>>>>>>>>>    # {method} {0x00007fc62c97f388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>>    0x00007fc625071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>    0x00007fc625071107: push   %rbp
>>>>>>>>>>>>    0x00007fc625071108: mov    %rsp,%rbp
>>>>>>>>>>>>    0x00007f836907110b: sub    $0x20,%rsp ;*synchronization 
>>>>>>>>>>>> entry
>>>>>>>>>>>>
>>>>>>>>>>>> which is correct, it is NOT(-) OmitFramePointer, therefore 
>>>>>>>>>>>> it is
>>>>>>>>>>>> using
>>>>>>>>>>>> the frame pointer
>>>>>>>>>>>>
>>>>>>>>>>>> Now if I try just changing -XX:-OmitFramePointer to
>>>>>>>>>>>> -XX:+OmitFramePointer in the above I get
>>>>>>>>>>>>
>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>> -XX:+OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>
>>>>>>>>>>>> I get
>>>>>>>>>>>>
>>>>>>>>>>>>    # {method} {0x00007f14d3c00388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>>    0x00007f14e1071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>    0x00007f14e1071107: push   %rbp
>>>>>>>>>>>>    0x00007f14e1071108: sub    $0x20,%rsp ;*synchronization 
>>>>>>>>>>>> entry
>>>>>>>>>>>>
>>>>>>>>>>>> which is correct, it is ID(+) OmitFramePointer, therefore it
>>>>>>>>>>>> does
>>>>>>>>>>>> not
>>>>>>>>>>>> use a frame pointer.
>>>>>>>>>>>>
>>>>>>>>>>>> However, if I now delete the -XX:+/-OmitFramePointer
>>>>>>>>>>>> altogether, IE
>>>>>>>>>>>>
>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>> -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>
>>>>>>>>>>>> I get
>>>>>>>>>>>>
>>>>>>>>>>>>    # {method} {0x00007f0c4b730388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>    # parm0:    rsi       = int
>>>>>>>>>>>>    #           [sp+0x30]  (sp of caller)
>>>>>>>>>>>>    0x00007f0c75071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>    0x00007f0c75071107: push   %rbp
>>>>>>>>>>>>    0x00007f0c75071108: sub    $0x20,%rsp ;*synchronization 
>>>>>>>>>>>> entry
>>>>>>>>>>>>
>>>>>>>>>>>> It is not using a frame pointer which is the equivalent of
>>>>>>>>>>>> -XX:+OmitFramePointer. However in your description above 
>>>>>>>>>>>> you say
>>>>>>>>>>>>
>>>>>>>>>>>>> Therefore I propose to have OmitFramePointer set to false by
>>>>>>>>>>>>> default
>>>>>>>>>>>>> on x86_64 (and set to true on all other platforms).
>>>>>>>>>>>> whereas OmitFramePointer actually seems to be set to true on
>>>>>>>>>>>> x86_64
>>>>>>>>>>>>
>>>>>>>>>>>> I think the problem may be with the declaration and
>>>>>>>>>>>> definition of
>>>>>>>>>>>> OmitFramePointer in globals.hpp and globals_x86.hpp
>>>>>>>>>>>>
>>>>>>>>>>>> In globals.hpp it does
>>>>>>>>>>>>
>>>>>>>>>>>> product(bool, OmitFramePointer, true,
>>>>>>>>>>>>
>>>>>>>>>>>> In globals_x86.hpp it does
>>>>>>>>>>>>
>>>>>>>>>>>> LP64_ONLY(define_pd_global(bool, OmitFramePointer, false););
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure that you can mix product(...) and 
>>>>>>>>>>>> product_pd(...)
>>>>>>>>>>>> like
>>>>>>>>>>>> this, so I think it just ends up getting the default from the
>>>>>>>>>>>> product(...).
>>>>>>>>>>>
>>>>>>>>>>> You are right, mixing product and product_pd does not make 
>>>>>>>>>>> sense
>>>>>>>>>>> at all.
>>>>>>>>>>> Thank you for doing additional testing and for drawing 
>>>>>>>>>>> attention
>>>>>>>>>>> to the
>>>>>>>>>>> problem.
>>>>>>>>>>>
>>>>>>>>>>> I updated the code to use product_pd and define_pd_global on 
>>>>>>>>>>> all
>>>>>>>>>>> relevant platforms.
>>>>>>>>>>>
>>>>>>>>>>>> Aside: In general, I do not like options which include a
>>>>>>>>>>>> negative in
>>>>>>>>>>>> them because I have to do a double think when I see something
>>>>>>>>>>>> like,
>>>>>>>>>>>> -XX:-OmitFramePointer, as in, it is omitting the frame 
>>>>>>>>>>>> pointer,
>>>>>>>>>>>> therefore it is using a frame pointer. How about FramePointer
>>>>>>>>>>>> so we
>>>>>>>>>>>> have -XX:+FramePointer to say I want frame pointers and
>>>>>>>>>>>> -XX:-FramePointer to say I don't.
>>>>>>>>>>>
>>>>>>>>>>> That is a good idea. Double negation is an unnecessary
>>>>>>>>>>> complication, so
>>>>>>>>>>> I changed the name of the flag to FramePointer, just as you
>>>>>>>>>>> suggested.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I did some timing on the above 'fibo' test
>>>>>>>>>>>>
>>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>>> -XX:-OmitFramePointer fibo 43
>>>>>>>>>>>> 701408733
>>>>>>>>>>>>
>>>>>>>>>>>> real    0m1.545s
>>>>>>>>>>>> user    0m1.571s
>>>>>>>>>>>> sys    0m0.015s
>>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>>> -XX:+OmitFramePointer fibo 43
>>>>>>>>>>>> 701408733
>>>>>>>>>>>>
>>>>>>>>>>>> real    0m1.504s
>>>>>>>>>>>> user    0m1.527s
>>>>>>>>>>>> sys    0m0.019s
>>>>>>>>>>>>
>>>>>>>>>>>> which is ~3% difference on this test case. On aarch64, I 
>>>>>>>>>>>> see ~7%
>>>>>>>>>>>> difference on this test case.
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the performance measurements!
>>>>>>>>>>>
>>>>>>>>>>>> With the above change to fix the logic of OmitFramePointer 
>>>>>>>>>>>> (and
>>>>>>>>>>>> possible change its name) the patch looks good to me.
>>>>>>>>>>>
>>>>>>>>>>> Here is the updated webrev (the same webrev that was already
>>>>>>>>>>> included
>>>>>>>>>>> into my reply to Roland):
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>>> I will prepare a mirror patch for aarch64.
>>>>>>>>>>>
>>>>>>>>>>> That would be great!
>>>>>>>>>>>
>>>>>>>>>>> Thank you and best regards,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Zoltán
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> All the best,
>>>>>>>>>>>> Ed.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>



More information about the hotspot-compiler-dev mailing list