[9] RFR(S): 8068945: Use RBP register as proper frame pointer in JIT compiled code on x86
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Apr 24 16:26:54 UTC 2015
Yes, this looks good. Nice work and thank you for testing.
Regards,
Vladimir
On 4/24/15 4:49 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> thank you for the feedback!
>
> On 04/22/2015 11:47 PM, Vladimir Kozlov wrote:
>> Looks like 2 issues left.
>>
>> First, after discussion on mailing list lets set the flag off by
>> default and ask SQE to add it to rotation flags for Nightly testing
>> (after you push changes).
>
> OK, I set the flag to off on all architectures and will notify SQE about
> it once the change has been pushed.
>
>> Second, I am concern about vframeStreamForte() call change. Call stack
>> may have several Call stubs. They should be skipped to get all java
>> frames on stack as we do in other places. Otherwise we can get
>> incorrect profiling information.
>
> I changed vframeStreamForte back to the way as it was (to *not* stop at
> call stubs and return all Java frames). But I removed the if-block for
> the case the first Java frame found is not decipherable (lines 407--427
> in forte.cpp). Here are the reasons for that.
>
> The forte_fill_call_trace_given_top() function returns a list of methods
> to the profiler (the list of Java methods that have been found on the
> stack). For every method, two pieces of information are returned (in an
> ASGCT_CallFrame structure):
>
> - (1) the jmethodID corresponding to that method and
> - (2) debug information (the line number where execution was in the
> method when the profiler interrupted the VM).
>
> (1) The jmethodID is *critical* for forte_fill_call_trace_given_top()
> because it provides information about the method that a frames belongs
> to. If the jmethodID is not available (because no Method* was obtained
> by looking at that frame or the Method* found is invalid), the VM stops
> walking the stack and returns right away.
>
> (2) The debug information is *not critical*: If there is no BCI
> corresponding to a PC stored within a frame, the line number in in
> ASGCT_CallFrame is set to -1 for interpreted/compiled frames and to -3
> for native frames (as described on lines 512--518 in forte.cpp) and the
> stack is walked further.
>
> The stack is walked using vframeStreamForte. The vframeStreamForte
> object is created from the first Java frame available on the stack.
>
> In the original version of the code, if the first Java frame is not
> decipherable (i.e., there is no debug information available for it), the
> first Java frame is *handled separately* (lines 407--427 in forte.cpp)
> and the vframeStreamForte object is created from the *sender* of the
> first Java frame (see reasons below).
>
> The original version of the code assumes that the sender of the first
> Java frame is also a Java frame. But that is not always the case, as the
> calling frame can be also something else, for example a call stub. Using
> a call stub as a Java frame can result in a failure.
>
> The *reason for separately handling* the first Java frame if the frame
> is not decipherable is to avoid triggering the assert in
> found_bad_method_frame() (line 443 in vframe.hpp). The function
> found_bad_method_frame() can be called in two ways:
>
> - vframeStreamForte::vframeStreamForte -> vframeStreamForte::fill_frame
> -> vframeStreamCommon::fill_from_interpreter_frame ->
> found_bad_method_frame()
> - vframeStreamForte::vframeStreamForte -> vframeStreamForte::fill_frame
> -> vframeStreamCommon::fill_from_compiled_frame -> found_bad_method_frame()
>
> In a non-product build, the assert fails if no debug information is
> available for an interpreted/compiled frame. In a product build,
> however, a Method* is returned also for interpreted/compiled frames with
> no debug information and filling the stack trace continues.
>
> I think the approach in product builds is better, because critical
> information (the JmethodID of all stack frames) is returned to
> StunStudio even if non-critical debug information is not available for
> some frames in the trace. The fastdebug build, however, crashes, if it
> encounters a single frame with no debug information.
>
> Here is the newest webrev:
> http://cr.openjdk.java.net/~zmajo/8068945/webrev.03/
>
> Files updated relative to the previous webrev (webrev.02):
> - forte.cpp
> - globals_x86.hpp
>
> Testing:
> - JPRT: all tests pass; tested with PreserveFramePointer enabled on x64
> - Repeated all SunStudio experiments described for webrev.02 (two tests
> w/ and w/o lamba forms, executed +/-PreserveFramePointer, w/ and w/o
> -Xcomp)
> - all java/lang/invoke and compiler tests: all tests that pass with the
> unmodified source tree pass with the changes as well.
>
> I ran the experiments *with the assert enabled* in
> vframeStreamCommon::found_bad_method_frame(). The assert was not triggered.
>
> I would be, however, inclined to disable that assert because:
> - (1) it is better to have some information in stack traces than no
> information and a crash (even though the crash happens in a fastdebug
> build);
> - (2) the assert is there only for SunStudio (according to the comments
> on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
> where forte_fill_call_trace_given_top() is called) does not expect debug
> information to available for all frames, it just needs the jmethodID for
> every frame.
>
>> Do you mean "can NOT happen"?:
>> > Moreover, if the stack is walked synchronously (e.g., at
>> safepoints), no
>> > problems appear either, because the synchronous interruption can happen
>> > while execution is within the method handle intrinsic.
>
> Yes, I meant that it can *not* happen. Sorry for the confusion.
>
> Thank you and best regards,
>
>
> Zoltan
>
>>
>> Vladimir
>>
>> On 4/22/15 9:26 AM, Zoltán Majó wrote:
>>> Hi Vladimir,
>>>
>>>
>>> I managed to do some more work on this enhancement. Please see details
>>> below.
>>>
>>> On 04/01/2015 03:38 PM, Zoltán Majó wrote:
>>>> Hi Vladimir,
>>>>
>>>>
>>>> On 03/30/2015 11:39 PM, Vladimir Kozlov wrote:
>>>>> On 3/30/15 2:13 PM, Zoltán Majó wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>>
>>>>>> thank you for the feedback!
>>>>>>
>>>>>> On 03/30/2015 10:27 PM, Vladimir Kozlov wrote:
>>>>>>> How about PreserveFramePointer instead of simple FramePointer?
>>>>>>>
>>>>>>> PreserveFramePointer will mean that compiled (or other) code will
>>>>>>> use
>>>>>>> that register only as Frame pointer.
>>>>>>
>>>>>> I will change the flag's name to PreserveFramePointer and will also
>>>>>> update the description.
>>>
>>> I changed the flag's name to PreserveFramePointer, just as you
>>> suggested.
>>>
>>>>>>
>>>>>>> Zoltan, x86 flags setting should be in general globals_x86.hpp. You
>>>>>>> can #ifdef _LP64 there too. I don't understand why you only set
>>>>>>> it to
>>>>>>> true on linux-x64.
>>>>>>
>>>>>> I remembered that the original discussion with Brendan Gregg
>>>>>> mentioned
>>>>>> only Linux's perf tool as a possible use case for "proper" frame
>>>>>> pointers. So I was unsure whether to enable proper frame pointers by
>>>>>> default on other x64 platforms as well.
>>>>>>
>>>>>> But if you think it would be better to have proper frame pointers on
>>>>>> all
>>>>>> x64 platforms, I will change the code to set PreserveFramePointer to
>>>>>> true for all x64 platforms. Just please let me know.
>>>
>>> The current webrev sets the PreserveFramePointer flag to to true on all
>>> x86_64 platforms and to false on all other platforms.
>>>
>>>>>
>>>>> Currently compiled code for all x86 platforms is almost the same
>>>>> (win64 has difference in registers usage) and we should keep it that
>>>>> way.
>>>>>
>>>>> Also the original request was to have flag to enable such behavior
>>>>> (use RBP only as FP). So to have it off by default is acceptable. If
>>>>> performance group or someone find a regression (or bug) due to this
>>>>> change we can switch the flag off by default before jdk9 release.
>>>>>
>>>>> Try to run pstack on Solaris and jstack on OSX to make sure they
>>>>> report correct call stack with compiled java methods. And JFR.
>>>>> Also it would be nice to run SunStudio analyzer to verify that it
>>>>> works.
>>>>
>>>> I ran all tools you've suggested. JFR and jstack is unaffected, pstack
>>>> produces nice stack traces (it did not always do so before).
>>>
>>> I tested the current webrev with the following setup: I used two tests,
>>> one that generates a long chain of lambda form invocations and an other
>>> one that generates a long chain of "regular" method invocations. Both
>>> tests were executed on an x64 machine in four configurations: with +/-
>>> Xcomp and with +/- PreserveFramePointer.
>>>
>>> Just as before, JFR and jstack stack traces are unaffected for both
>>> tests, pstack can now produce stack traces with both tests if
>>> PreserveFramePointer is enabled.
>>>
>>>> However, I've encountered a problem with SunStudio: Two asserts fail
>>>> in the fastdebug build. Both of them "soft" failures, as neither the
>>>> VM nor SunStudio crash with the product build. I worked on the problem
>>>> today and have a partial understanding of the issue, but more
>>>> investigation is needed to have a patch that preserves the correct
>>>> behavior of SunStudio as well.
>>>
>>> I was able to track down the problems with SunStudio. I had to change
>>> the code at two places.
>>>
>>>
>>> Change #1 (in src/cpu/x86/vm/frame_x86.cpp):
>>>
>>> *** 222,232 ****
>>> }
>>>
>>> if (sender_blob->is_nmethod()) {
>>> nmethod* nm = sender_blob->as_nmethod_or_null();
>>> if (nm != NULL) {
>>> ! if (nm->is_deopt_mh_entry(sender_pc) ||
>>> nm->is_deopt_entry(sender_pc)) {
>>> return false;
>>> }
>>> }
>>> }
>>>
>>> --- 222,233 ----
>>> }
>>>
>>> if (sender_blob->is_nmethod()) {
>>> nmethod* nm = sender_blob->as_nmethod_or_null();
>>> if (nm != NULL) {
>>> ! if (nm->is_deopt_mh_entry(sender_pc) ||
>>> nm->is_deopt_entry(sender_pc) ||
>>> ! nm->method()->is_method_handle_intrinsic()) {
>>> return false;
>>> }
>>> }
>>> }
>>>
>>> The reason for this change is the following. Method handle intrinsics
>>> (i.e., the intrinsics _invokeBasic, _linkToVirtual,_linkToStatic,
>>> _linkToSpecial, and _linkToInterface) do not allocate stack space when
>>> invoked, but they can extend the stack space of their caller
>>> "temporarily".
>>>
>>> For example, if VerifyMethodHandles is enabled, some stack space is used
>>> during verification. The temporarily used stack space is released before
>>> the intrinsic jumps to its target. As a result, the target of a method
>>> handle intrinsic will have a correct SP when it returns and the
>>> program's control flow is correct.
>>>
>>> Moreover, if the stack is walked synchronously (e.g., at safepoints), no
>>> problems appear either, because the synchronous interruption can happen
>>> while execution is within the method handle intrinsic.
>>>
>>> The problem is that the SunStudio analyzer can interrupt the VM
>>> asynchonously and walk the stack. If execution of a thread is
>>> interrupted while the thread is in a method handle intrinsic, the SP
>>> might contain an invalid value.
>>>
>>> The new webrev adds a check that marks the current frame unsafe for
>>> sender if the frame belongs to a method handle intrinsic
>>> (frame::safe_for_sender returns false in this case).
>>>
>>>
>>> Change #2 (in src/share/vm/prims/forte.cpp):
>>>
>>> *** 425,435 ****
>>>
>>> RegisterMap map(thd, false);
>>> initial_Java_frame = initial_Java_frame.sender(&map);
>>> }
>>>
>>> ! vframeStreamForte st(thd, initial_Java_frame, false);
>>>
>>> for (; !st.at_end() && count < depth; st.forte_next(), count++) {
>>> bci = st.bci();
>>> method = st.method();
>>>
>>> --- 425,435 ----
>>>
>>> RegisterMap map(thd, false);
>>> initial_Java_frame = initial_Java_frame.sender(&map);
>>> }
>>>
>>> ! vframeStreamForte st(thd, initial_Java_frame, true);
>>>
>>> for (; !st.at_end() && count < depth; st.forte_next(), count++) {
>>> bci = st.bci();
>>> method = st.method();
>>>
>>> The problem is that the following assert in forte.cpp on line 103
>>>
>>> assert(filled_in, "invariant");
>>>
>>> fails. The problem appears if we have a stack trace like:
>>>
>>> V [libjvm.so+0x1c98c4a] void VMError::report(outputStream*)+0xb1a
>>> V [libjvm.so+0x1c9a3e8] void VMError::report_and_die()+0x748
>>> V [libjvm.so+0x1003c8e] void report_vm_error(const char*,int,const
>>> char*,const char*)+0x7e
>>> V [libjvm.so+0x10efa22] vframeStreamForte::vframeStreamForte #Nvariant
>>> 1(JavaThread*,frame,bool)+0xe2
>>> --> (Frame #5) V [libjvm.so+0x10f0bb9] void
>>> forte_fill_call_trace_given_top(JavaThread*,ASGCT_CallTrace*,int,frame)+0x789
>>>
>>>
>>> V [libjvm.so+0x10f1436] AsyncGetCallTrace+0x246
>>> C [libcollector.so+0x272a8] __collector_ext_jstack_unwind+0xb8
>>> C [libcollector.so+0x277df] __collector_get_frame_info+0x27f
>>> C [libcollector.so+0x2f093] __collector_getUserCtx+0x13
>>> C [libcollector.so+0x1abc7] __collector_ext_profile_handler+0x127
>>> C [libcollector.so+0x17535] collector_sigprof_dispatcher+0x85
>>> C [libc.so.1+0x122476] __sighndlr+0x6
>>> C [libc.so.1+0x115972] call_user_handler+0x2ce
>>> C [libc.so.1+0x115e1b] sigacthandler+0xdb
>>> C 0xffffffffffffffff
>>> V [libjvm.so+0x1959089] void os::PlatformEvent::park()+0xd9
>>> V [libjvm.so+0x18a6b34] int ParkCommon(ParkEvent*,long)+0x34
>>> V [libjvm.so+0x18a7657] int Monitor::IWait(Thread*,long)+0xb7
>>> V [libjvm.so+0x18a8b86] bool Monitor::wait(bool,long,bool)+0x346
>>> V [libjvm.so+0xf7073d] void
>>> CompileBroker::wait_for_completion(CompileTask*)+0xad
>>> V [libjvm.so+0xf6f6b6] void
>>> CompileBroker::compile_method_base(methodHandle,int,int,methodHandle,int,const
>>>
>>> char*,Thread*)+0x406
>>> V [libjvm.so+0xf6fd96]
>>> nmethod*CompileBroker::compile_method(methodHandle,int,int,methodHandle,int,const
>>>
>>> char*,Thread*)+0x586
>>> V [libjvm.so+0xbbad72] void
>>> AdvancedThresholdPolicy::submit_compile(methodHandle,int,CompLevel,JavaThread*)+0xb2
>>>
>>>
>>> V [libjvm.so+0x1aef92f] void
>>> SimpleThresholdPolicy::compile(methodHandle,int,CompLevel,JavaThread*)+0x14f
>>>
>>>
>>> V [libjvm.so+0xbbb00f] void
>>> AdvancedThresholdPolicy::method_invocation_event(methodHandle,methodHandle,CompLevel,nmethod*,JavaThread*)+0x1ff
>>>
>>>
>>> V [libjvm.so+0x1aef765]
>>> nmethod*SimpleThresholdPolicy::event(methodHandle,methodHandle,int,int,CompLevel,nmethod*,JavaThread*)+0x2e5
>>>
>>>
>>> V [libjvm.so+0xdb93bc] unsigned
>>> char*Runtime1::counter_overflow(JavaThread*,int,Method*)+0x31c
>>> v ~RuntimeStub::counter_overflow Runtime1 stub
>>> --> (Frame #29) J 143 C1
>>> java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @
>>> 0xffff80ffacc6962a [0xffff80ffacc69580+0xaa]
>>> --> (Frame #30) v ~StubRoutines::call_stub
>>> V [libjvm.so+0x13ca50b] void
>>> JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x41b
>>>
>>>
>>> V [libjvm.so+0x152a111] JVM_DoPrivileged+0xfb1
>>> C [libjava.so+0x12f42]
>>> Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x12
>>>
>>>
>>> J 142
>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;
>>>
>>> (0 bytes) @ 0xffff80ffb400c57c [0xffff80ffb400c420+0x15c\
>>> ]
>>> J 134 C1
>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;
>>> (47 bytes) @ 0xffff80ffacc66014 [0xffff80ffacc65ec0+0x154]
>>> ... more stack frames
>>>
>>> The forte_fill_call_trace_given_top() method (Frame #5) first checks if
>>> the first Java frame found is fully decipherable (line 395 in
>>> forte.cpp). In our case the first Java frame is Frame #29 (the
>>> C1-compiled version of java.net.URLClassLoader$1.run).
>>>
>>> In our case Frame #29 is not decipherable, because
>>> java.net.URLClassLoader$1.run has been made "not entrant" (a C2-compiled
>>> version of the same method has been produced shortly before).
>>>
>>> Afterwards, forte_fill_call_trace_given_top() checks if the method is
>>> "safe for sender" (line 424 in forte.cpp). The caller of the
>>> java.net.URLClassLoader$1.run method is ~StubRoutines::call_stub, which
>>> is considered "safe for sender" by the VM.
>>>
>>> Then, initial_Java_frame is set to the ~StubRoutines::call_stub stub
>>> (line 430). This does not seem to be correct because the stub is not a
>>> Java method and causes the assert(filled_in, "invariant") in the
>>> constructor of vframeStreamForte (line 103 in forte.cpp) to fail
>>> (because the frame cannot be filled from a stub).
>>>
>>> To avoid this failure, I propose to call the constructor of
>>> vframeStreamForte with parameter stop_at_java_call_stub set to true
>>> (instead of false) so that the VM stops walking the stack if a call stub
>>> has been reached.
>>>
>>>
>>> Here is the updated webrev:
>>>
>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.02/
>>>
>>> In addition to testing the changeset with the tools mentioned before, I
>>> executed
>>> - all JPRT tests, all pass;
>>> - all java/lang/invoke and compiler JTREG tests; all tests that pass
>>> with the unmodified source trace pass with the changes as well.
>>>
>>> Thank you very much in advance!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> So that will put this RFR on hold for a while, unfortunately.
>>>>
>>>> Thank you for the feedback and suggestions so far!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 3/30/15 8:30 AM, Zoltán Majó wrote:
>>>>>>>> Hi Ed,
>>>>>>>>
>>>>>>>>
>>>>>>>> thank you for your feedback! Please see comments below.
>>>>>>>>
>>>>>>>> On 03/30/2015 11:11 AM, Edward Nevill wrote:
>>>>>>>>> Hi Zoltán,
>>>>>>>>>
>>>>>>>>> On Fri, 2015-03-27 at 15:34 +0100, Zoltán Majó wrote:
>>>>>>>>>> Full JPRT run, all tests pass. I also ran all hotspot compiler
>>>>>>>>>> tests and
>>>>>>>>>> the jdk tests in java/lang/invoke on both x86_64 and x86_32. All
>>>>>>>>>> tests
>>>>>>>>>> that pass without the patch pass also with the patch.
>>>>>>>>>>
>>>>>>>>>> I ran the SPEC JVM 2008 benchmarks on our performance
>>>>>>>>>> infrastructure for
>>>>>>>>>> x86_64. The performance evaluation suggests that there is no
>>>>>>>>>> statistically significant performance degradation due to having
>>>>>>>>>> proper
>>>>>>>>>> frame pointers. Therefore I propose to have OmitFramePointer
>>>>>>>>>> set to
>>>>>>>>>> false by default on x86_64 (and set to true on all other
>>>>>>>>>> platforms).
>>>>>>>>> This patch looks good, however I think there is a problem with the
>>>>>>>>> logic of OmitFramePointer.
>>>>>>>>>
>>>>>>>>> Here is my test case.
>>>>>>>>>
>>>>>>>>> --- CUT HERE ---
>>>>>>>>> // $Id: fibo.java,v 1.2 2000/12/24 19:10:50 doug Exp $
>>>>>>>>> // http://www.bagley.org/~doug/shootout/
>>>>>>>>>
>>>>>>>>> public class fibo {
>>>>>>>>> public static void main(String args[]) {
>>>>>>>>> int N = Integer.parseInt(args[0]);
>>>>>>>>> System.out.println(fib(N));
>>>>>>>>> }
>>>>>>>>> public static int fib(int n) {
>>>>>>>>> if (n < 2) return(1);
>>>>>>>>> return( fib(n-2) + fib(n-1) );
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>> --- CUT HERE ---
>>>>>>>>>
>>>>>>>>> If I run it as follows on my x86 64 bit linux.
>>>>>>>>>
>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:-OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>
>>>>>>>>> I get
>>>>>>>>>
>>>>>>>>> # {method} {0x00007fc62c97f388} 'fib' '(I)I' in 'fibo'
>>>>>>>>> # parm0: rsi = int
>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>> 0x00007fc625071100: mov %eax,-0x14000(%rsp)
>>>>>>>>> 0x00007fc625071107: push %rbp
>>>>>>>>> 0x00007fc625071108: mov %rsp,%rbp
>>>>>>>>> 0x00007f836907110b: sub $0x20,%rsp ;*synchronization entry
>>>>>>>>>
>>>>>>>>> which is correct, it is NOT(-) OmitFramePointer, therefore it is
>>>>>>>>> using
>>>>>>>>> the frame pointer
>>>>>>>>>
>>>>>>>>> Now if I try just changing -XX:-OmitFramePointer to
>>>>>>>>> -XX:+OmitFramePointer in the above I get
>>>>>>>>>
>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:+OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>
>>>>>>>>> I get
>>>>>>>>>
>>>>>>>>> # {method} {0x00007f14d3c00388} 'fib' '(I)I' in 'fibo'
>>>>>>>>> # parm0: rsi = int
>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>> 0x00007f14e1071100: mov %eax,-0x14000(%rsp)
>>>>>>>>> 0x00007f14e1071107: push %rbp
>>>>>>>>> 0x00007f14e1071108: sub $0x20,%rsp ;*synchronization entry
>>>>>>>>>
>>>>>>>>> which is correct, it is ID(+) OmitFramePointer, therefore it does
>>>>>>>>> not
>>>>>>>>> use a frame pointer.
>>>>>>>>>
>>>>>>>>> However, if I now delete the -XX:+/-OmitFramePointer
>>>>>>>>> altogether, IE
>>>>>>>>>
>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:+PrintAssembly fibo 43
>>>>>>>>>
>>>>>>>>> I get
>>>>>>>>>
>>>>>>>>> # {method} {0x00007f0c4b730388} 'fib' '(I)I' in 'fibo'
>>>>>>>>> # parm0: rsi = int
>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>> 0x00007f0c75071100: mov %eax,-0x14000(%rsp)
>>>>>>>>> 0x00007f0c75071107: push %rbp
>>>>>>>>> 0x00007f0c75071108: sub $0x20,%rsp ;*synchronization entry
>>>>>>>>>
>>>>>>>>> It is not using a frame pointer which is the equivalent of
>>>>>>>>> -XX:+OmitFramePointer. However in your description above you say
>>>>>>>>>
>>>>>>>>>> Therefore I propose to have OmitFramePointer set to false by
>>>>>>>>>> default
>>>>>>>>>> on x86_64 (and set to true on all other platforms).
>>>>>>>>> whereas OmitFramePointer actually seems to be set to true on
>>>>>>>>> x86_64
>>>>>>>>>
>>>>>>>>> I think the problem may be with the declaration and definition of
>>>>>>>>> OmitFramePointer in globals.hpp and globals_x86.hpp
>>>>>>>>>
>>>>>>>>> In globals.hpp it does
>>>>>>>>>
>>>>>>>>> product(bool, OmitFramePointer, true,
>>>>>>>>>
>>>>>>>>> In globals_x86.hpp it does
>>>>>>>>>
>>>>>>>>> LP64_ONLY(define_pd_global(bool, OmitFramePointer, false););
>>>>>>>>>
>>>>>>>>> I am not sure that you can mix product(...) and product_pd(...)
>>>>>>>>> like
>>>>>>>>> this, so I think it just ends up getting the default from the
>>>>>>>>> product(...).
>>>>>>>>
>>>>>>>> You are right, mixing product and product_pd does not make sense
>>>>>>>> at all.
>>>>>>>> Thank you for doing additional testing and for drawing attention
>>>>>>>> to the
>>>>>>>> problem.
>>>>>>>>
>>>>>>>> I updated the code to use product_pd and define_pd_global on all
>>>>>>>> relevant platforms.
>>>>>>>>
>>>>>>>>> Aside: In general, I do not like options which include a
>>>>>>>>> negative in
>>>>>>>>> them because I have to do a double think when I see something
>>>>>>>>> like,
>>>>>>>>> -XX:-OmitFramePointer, as in, it is omitting the frame pointer,
>>>>>>>>> therefore it is using a frame pointer. How about FramePointer
>>>>>>>>> so we
>>>>>>>>> have -XX:+FramePointer to say I want frame pointers and
>>>>>>>>> -XX:-FramePointer to say I don't.
>>>>>>>>
>>>>>>>> That is a good idea. Double negation is an unnecessary
>>>>>>>> complication, so
>>>>>>>> I changed the name of the flag to FramePointer, just as you
>>>>>>>> suggested.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I did some timing on the above 'fibo' test
>>>>>>>>>
>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>> -XX:-OmitFramePointer fibo 43
>>>>>>>>> 701408733
>>>>>>>>>
>>>>>>>>> real 0m1.545s
>>>>>>>>> user 0m1.571s
>>>>>>>>> sys 0m0.015s
>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>> -XX:+OmitFramePointer fibo 43
>>>>>>>>> 701408733
>>>>>>>>>
>>>>>>>>> real 0m1.504s
>>>>>>>>> user 0m1.527s
>>>>>>>>> sys 0m0.019s
>>>>>>>>>
>>>>>>>>> which is ~3% difference on this test case. On aarch64, I see ~7%
>>>>>>>>> difference on this test case.
>>>>>>>>
>>>>>>>> Thank you for the performance measurements!
>>>>>>>>
>>>>>>>>> With the above change to fix the logic of OmitFramePointer (and
>>>>>>>>> possible change its name) the patch looks good to me.
>>>>>>>>
>>>>>>>> Here is the updated webrev (the same webrev that was already
>>>>>>>> included
>>>>>>>> into my reply to Roland):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.01/
>>>>>>>>
>>>>>>>>> I will prepare a mirror patch for aarch64.
>>>>>>>>
>>>>>>>> That would be great!
>>>>>>>>
>>>>>>>> Thank you and best regards,
>>>>>>>>
>>>>>>>>
>>>>>>>> Zoltán
>>>>>>>>
>>>>>>>>>
>>>>>>>>> All the best,
>>>>>>>>> Ed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>
More information about the hotspot-compiler-dev
mailing list