[9] RFR(S): 8068945: Use RBP register as proper frame pointer in JIT compiled code on x86
Zoltán Majó
zoltan.majo at oracle.com
Mon Apr 27 08:45:11 UTC 2015
Thank you for the reviews, Vladimir, Roland, Dean, Ed, Vitaly, and Aleksey!
Best regards,
Zoltan
On 04/24/2015 08:30 PM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> On 04/24/2015 07:52 PM, Vladimir Kozlov wrote:
>> > - (2) the assert is there only for SunStudio (according to the
>> comments
>> > on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only place
>> > where forte_fill_call_trace_given_top() is called) does not expect
>> debug
>> > information to available for all frames, it just needs the
>> jmethodID for
>> > every frame.
>>
>> This is incorrect interpretation of comments. The assert is in
>> general code to catch incorrect bci or pcoffset during vframe
>> construction. The comment said that instead of having several asserts
>> we have only one for all these case to simplify disabling it when
>> running performance analyzer.
>
> Thank you for the clarification!
>
>> So you can only skip it when you run with performance analyzer but
>> not during general execution.
>>
>> May be you can set or path flag when vframe construction is called
>> from forte and skip assert in such case. I assume you hit it only
>> with performance analyzer. Right?
>
> You are right. The assert can be triggered only if vframe construction
> is called from forte, but not otherwise. So setting the command-line
> flag -XX:SuppressErrorAt=vframe.cpp:443 (as suggested in vframe.cpp
> lines 439--442) is sufficient to skip the assert when running a
> fastdebug VM with the performance analyzer.
>
> I'll leave the assert enabled and intend to push webrev.03 then.
>
> Thank you and best regards,
>
>
> Zoltan
>
>>
>> Thanks,
>> Vladimir
>>
>> On 4/24/15 10:21 AM, Zoltán Majó wrote:
>>> Hi Vladimir,
>>>
>>>
>>> On 04/24/2015 06:26 PM, Vladimir Kozlov wrote:
>>>> Yes, this looks good. Nice work and thank you for testing.
>>>
>>> Thank you for all the feedback you've provided while I was working on
>>> this issue!
>>>
>>> I would have one final question: Can I disable the assert in
>>> vframeStreamCommon::found_bad_method_frame() on line 443 in vframe.cpp?
>>> I would prefer that for the reasons I mentioned in my previous message
>>> (please see also below).
>>>
>>> Here is an updated webrev (with the assert disabled):
>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.04/
>>>
>>> Could you please let me know which webrev I can push, webrev.03 with
>>> the
>>> assert left enabled or webrev.04 with the assert disabled?
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>> I ran the experiments *with the assert enabled* in
>>>>> vframeStreamCommon::found_bad_method_frame(). The assert was not
>>>>> triggered.
>>>>>
>>>>> I would be, however, inclined to disable that assert because:
>>>>> - (1) it is better to have some information in stack traces than no
>>>>> information and a crash (even though the crash happens in a fastdebug
>>>>> build);
>>>>> - (2) the assert is there only for SunStudio (according to the
>>>>> comments
>>>>> on lines 439--442 in vframe.cpp) and AsyncGetCallTrace (the only
>>>>> place
>>>>> where forte_fill_call_trace_given_top() is called) does not expect
>>>>> debug
>>>>> information to available for all frames, it just needs the
>>>>> jmethodID for
>>>>> every frame.
>>>>>
>>>>>> Do you mean "can NOT happen"?:
>>>>>> > Moreover, if the stack is walked synchronously (e.g., at
>>>>>> safepoints), no
>>>>>> > problems appear either, because the synchronous interruption can
>>>>>> happen
>>>>>> > while execution is within the method handle intrinsic.
>>>>>
>>>>> Yes, I meant that it can *not* happen. Sorry for the confusion.
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>> On 4/22/15 9:26 AM, Zoltán Majó wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>>
>>>>>>> I managed to do some more work on this enhancement. Please see
>>>>>>> details
>>>>>>> below.
>>>>>>>
>>>>>>> On 04/01/2015 03:38 PM, Zoltán Majó wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 03/30/2015 11:39 PM, Vladimir Kozlov wrote:
>>>>>>>>> On 3/30/15 2:13 PM, Zoltán Majó wrote:
>>>>>>>>>> Hi Vladimir,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> thank you for the feedback!
>>>>>>>>>>
>>>>>>>>>> On 03/30/2015 10:27 PM, Vladimir Kozlov wrote:
>>>>>>>>>>> How about PreserveFramePointer instead of simple FramePointer?
>>>>>>>>>>>
>>>>>>>>>>> PreserveFramePointer will mean that compiled (or other) code
>>>>>>>>>>> will
>>>>>>>>>>> use
>>>>>>>>>>> that register only as Frame pointer.
>>>>>>>>>>
>>>>>>>>>> I will change the flag's name to PreserveFramePointer and
>>>>>>>>>> will also
>>>>>>>>>> update the description.
>>>>>>>
>>>>>>> I changed the flag's name to PreserveFramePointer, just as you
>>>>>>> suggested.
>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Zoltan, x86 flags setting should be in general globals_x86.hpp.
>>>>>>>>>>> You
>>>>>>>>>>> can #ifdef _LP64 there too. I don't understand why you only set
>>>>>>>>>>> it to
>>>>>>>>>>> true on linux-x64.
>>>>>>>>>>
>>>>>>>>>> I remembered that the original discussion with Brendan Gregg
>>>>>>>>>> mentioned
>>>>>>>>>> only Linux's perf tool as a possible use case for "proper" frame
>>>>>>>>>> pointers. So I was unsure whether to enable proper frame
>>>>>>>>>> pointers by
>>>>>>>>>> default on other x64 platforms as well.
>>>>>>>>>>
>>>>>>>>>> But if you think it would be better to have proper frame
>>>>>>>>>> pointers on
>>>>>>>>>> all
>>>>>>>>>> x64 platforms, I will change the code to set
>>>>>>>>>> PreserveFramePointer to
>>>>>>>>>> true for all x64 platforms. Just please let me know.
>>>>>>>
>>>>>>> The current webrev sets the PreserveFramePointer flag to to true on
>>>>>>> all
>>>>>>> x86_64 platforms and to false on all other platforms.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Currently compiled code for all x86 platforms is almost the same
>>>>>>>>> (win64 has difference in registers usage) and we should keep
>>>>>>>>> it that
>>>>>>>>> way.
>>>>>>>>>
>>>>>>>>> Also the original request was to have flag to enable such
>>>>>>>>> behavior
>>>>>>>>> (use RBP only as FP). So to have it off by default is
>>>>>>>>> acceptable. If
>>>>>>>>> performance group or someone find a regression (or bug) due to
>>>>>>>>> this
>>>>>>>>> change we can switch the flag off by default before jdk9 release.
>>>>>>>>>
>>>>>>>>> Try to run pstack on Solaris and jstack on OSX to make sure they
>>>>>>>>> report correct call stack with compiled java methods. And JFR.
>>>>>>>>> Also it would be nice to run SunStudio analyzer to verify that it
>>>>>>>>> works.
>>>>>>>>
>>>>>>>> I ran all tools you've suggested. JFR and jstack is unaffected,
>>>>>>>> pstack
>>>>>>>> produces nice stack traces (it did not always do so before).
>>>>>>>
>>>>>>> I tested the current webrev with the following setup: I used two
>>>>>>> tests,
>>>>>>> one that generates a long chain of lambda form invocations and an
>>>>>>> other
>>>>>>> one that generates a long chain of "regular" method invocations.
>>>>>>> Both
>>>>>>> tests were executed on an x64 machine in four configurations:
>>>>>>> with +/-
>>>>>>> Xcomp and with +/- PreserveFramePointer.
>>>>>>>
>>>>>>> Just as before, JFR and jstack stack traces are unaffected for both
>>>>>>> tests, pstack can now produce stack traces with both tests if
>>>>>>> PreserveFramePointer is enabled.
>>>>>>>
>>>>>>>> However, I've encountered a problem with SunStudio: Two asserts
>>>>>>>> fail
>>>>>>>> in the fastdebug build. Both of them "soft" failures, as
>>>>>>>> neither the
>>>>>>>> VM nor SunStudio crash with the product build. I worked on the
>>>>>>>> problem
>>>>>>>> today and have a partial understanding of the issue, but more
>>>>>>>> investigation is needed to have a patch that preserves the correct
>>>>>>>> behavior of SunStudio as well.
>>>>>>>
>>>>>>> I was able to track down the problems with SunStudio. I had to
>>>>>>> change
>>>>>>> the code at two places.
>>>>>>>
>>>>>>>
>>>>>>> Change #1 (in src/cpu/x86/vm/frame_x86.cpp):
>>>>>>>
>>>>>>> *** 222,232 ****
>>>>>>> }
>>>>>>>
>>>>>>> if (sender_blob->is_nmethod()) {
>>>>>>> nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>>> if (nm != NULL) {
>>>>>>> ! if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>>>> nm->is_deopt_entry(sender_pc)) {
>>>>>>> return false;
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> --- 222,233 ----
>>>>>>> }
>>>>>>>
>>>>>>> if (sender_blob->is_nmethod()) {
>>>>>>> nmethod* nm = sender_blob->as_nmethod_or_null();
>>>>>>> if (nm != NULL) {
>>>>>>> ! if (nm->is_deopt_mh_entry(sender_pc) ||
>>>>>>> nm->is_deopt_entry(sender_pc) ||
>>>>>>> ! nm->method()->is_method_handle_intrinsic()) {
>>>>>>> return false;
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> The reason for this change is the following. Method handle
>>>>>>> intrinsics
>>>>>>> (i.e., the intrinsics _invokeBasic, _linkToVirtual,_linkToStatic,
>>>>>>> _linkToSpecial, and _linkToInterface) do not allocate stack
>>>>>>> space when
>>>>>>> invoked, but they can extend the stack space of their caller
>>>>>>> "temporarily".
>>>>>>>
>>>>>>> For example, if VerifyMethodHandles is enabled, some stack space is
>>>>>>> used
>>>>>>> during verification. The temporarily used stack space is released
>>>>>>> before
>>>>>>> the intrinsic jumps to its target. As a result, the target of a
>>>>>>> method
>>>>>>> handle intrinsic will have a correct SP when it returns and the
>>>>>>> program's control flow is correct.
>>>>>>>
>>>>>>> Moreover, if the stack is walked synchronously (e.g., at
>>>>>>> safepoints), no
>>>>>>> problems appear either, because the synchronous interruption can
>>>>>>> happen
>>>>>>> while execution is within the method handle intrinsic.
>>>>>>>
>>>>>>> The problem is that the SunStudio analyzer can interrupt the VM
>>>>>>> asynchonously and walk the stack. If execution of a thread is
>>>>>>> interrupted while the thread is in a method handle intrinsic,
>>>>>>> the SP
>>>>>>> might contain an invalid value.
>>>>>>>
>>>>>>> The new webrev adds a check that marks the current frame unsafe for
>>>>>>> sender if the frame belongs to a method handle intrinsic
>>>>>>> (frame::safe_for_sender returns false in this case).
>>>>>>>
>>>>>>>
>>>>>>> Change #2 (in src/share/vm/prims/forte.cpp):
>>>>>>>
>>>>>>> *** 425,435 ****
>>>>>>>
>>>>>>> RegisterMap map(thd, false);
>>>>>>> initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>>> }
>>>>>>>
>>>>>>> ! vframeStreamForte st(thd, initial_Java_frame, false);
>>>>>>>
>>>>>>> for (; !st.at_end() && count < depth; st.forte_next(),
>>>>>>> count++) {
>>>>>>> bci = st.bci();
>>>>>>> method = st.method();
>>>>>>>
>>>>>>> --- 425,435 ----
>>>>>>>
>>>>>>> RegisterMap map(thd, false);
>>>>>>> initial_Java_frame = initial_Java_frame.sender(&map);
>>>>>>> }
>>>>>>>
>>>>>>> ! vframeStreamForte st(thd, initial_Java_frame, true);
>>>>>>>
>>>>>>> for (; !st.at_end() && count < depth; st.forte_next(),
>>>>>>> count++) {
>>>>>>> bci = st.bci();
>>>>>>> method = st.method();
>>>>>>>
>>>>>>> The problem is that the following assert in forte.cpp on line 103
>>>>>>>
>>>>>>> assert(filled_in, "invariant");
>>>>>>>
>>>>>>> fails. The problem appears if we have a stack trace like:
>>>>>>>
>>>>>>> V [libjvm.so+0x1c98c4a] void VMError::report(outputStream*)+0xb1a
>>>>>>> V [libjvm.so+0x1c9a3e8] void VMError::report_and_die()+0x748
>>>>>>> V [libjvm.so+0x1003c8e] void report_vm_error(const
>>>>>>> char*,int,const
>>>>>>> char*,const char*)+0x7e
>>>>>>> V [libjvm.so+0x10efa22] vframeStreamForte::vframeStreamForte
>>>>>>> #Nvariant
>>>>>>> 1(JavaThread*,frame,bool)+0xe2
>>>>>>> --> (Frame #5) V [libjvm.so+0x10f0bb9] void
>>>>>>> forte_fill_call_trace_given_top(JavaThread*,ASGCT_CallTrace*,int,frame)+0x789
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0x10f1436] AsyncGetCallTrace+0x246
>>>>>>> C [libcollector.so+0x272a8] __collector_ext_jstack_unwind+0xb8
>>>>>>> C [libcollector.so+0x277df] __collector_get_frame_info+0x27f
>>>>>>> C [libcollector.so+0x2f093] __collector_getUserCtx+0x13
>>>>>>> C [libcollector.so+0x1abc7] __collector_ext_profile_handler+0x127
>>>>>>> C [libcollector.so+0x17535] collector_sigprof_dispatcher+0x85
>>>>>>> C [libc.so.1+0x122476] __sighndlr+0x6
>>>>>>> C [libc.so.1+0x115972] call_user_handler+0x2ce
>>>>>>> C [libc.so.1+0x115e1b] sigacthandler+0xdb
>>>>>>> C 0xffffffffffffffff
>>>>>>> V [libjvm.so+0x1959089] void os::PlatformEvent::park()+0xd9
>>>>>>> V [libjvm.so+0x18a6b34] int ParkCommon(ParkEvent*,long)+0x34
>>>>>>> V [libjvm.so+0x18a7657] int Monitor::IWait(Thread*,long)+0xb7
>>>>>>> V [libjvm.so+0x18a8b86] bool Monitor::wait(bool,long,bool)+0x346
>>>>>>> V [libjvm.so+0xf7073d] void
>>>>>>> CompileBroker::wait_for_completion(CompileTask*)+0xad
>>>>>>> V [libjvm.so+0xf6f6b6] void
>>>>>>> CompileBroker::compile_method_base(methodHandle,int,int,methodHandle,int,const
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> char*,Thread*)+0x406
>>>>>>> V [libjvm.so+0xf6fd96]
>>>>>>> nmethod*CompileBroker::compile_method(methodHandle,int,int,methodHandle,int,const
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> char*,Thread*)+0x586
>>>>>>> V [libjvm.so+0xbbad72] void
>>>>>>> AdvancedThresholdPolicy::submit_compile(methodHandle,int,CompLevel,JavaThread*)+0xb2
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0x1aef92f] void
>>>>>>> SimpleThresholdPolicy::compile(methodHandle,int,CompLevel,JavaThread*)+0x14f
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0xbbb00f] void
>>>>>>> AdvancedThresholdPolicy::method_invocation_event(methodHandle,methodHandle,CompLevel,nmethod*,JavaThread*)+0x1ff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0x1aef765]
>>>>>>> nmethod*SimpleThresholdPolicy::event(methodHandle,methodHandle,int,int,CompLevel,nmethod*,JavaThread*)+0x2e5
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0xdb93bc] unsigned
>>>>>>> char*Runtime1::counter_overflow(JavaThread*,int,Method*)+0x31c
>>>>>>> v ~RuntimeStub::counter_overflow Runtime1 stub
>>>>>>> --> (Frame #29) J 143 C1
>>>>>>> java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @
>>>>>>> 0xffff80ffacc6962a [0xffff80ffacc69580+0xaa]
>>>>>>> --> (Frame #30) v ~StubRoutines::call_stub
>>>>>>> V [libjvm.so+0x13ca50b] void
>>>>>>> JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x41b
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> V [libjvm.so+0x152a111] JVM_DoPrivileged+0xfb1
>>>>>>> C [libjava.so+0x12f42]
>>>>>>> Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x12
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> J 142
>>>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (0 bytes) @ 0xffff80ffb400c57c [0xffff80ffb400c420+0x15c\
>>>>>>> ]
>>>>>>> J 134 C1
>>>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;
>>>>>>>
>>>>>>> (47 bytes) @ 0xffff80ffacc66014 [0xffff80ffacc65ec0+0x154]
>>>>>>> ... more stack frames
>>>>>>>
>>>>>>> The forte_fill_call_trace_given_top() method (Frame #5) first
>>>>>>> checks if
>>>>>>> the first Java frame found is fully decipherable (line 395 in
>>>>>>> forte.cpp). In our case the first Java frame is Frame #29 (the
>>>>>>> C1-compiled version of java.net.URLClassLoader$1.run).
>>>>>>>
>>>>>>> In our case Frame #29 is not decipherable, because
>>>>>>> java.net.URLClassLoader$1.run has been made "not entrant" (a
>>>>>>> C2-compiled
>>>>>>> version of the same method has been produced shortly before).
>>>>>>>
>>>>>>> Afterwards, forte_fill_call_trace_given_top() checks if the
>>>>>>> method is
>>>>>>> "safe for sender" (line 424 in forte.cpp). The caller of the
>>>>>>> java.net.URLClassLoader$1.run method is ~StubRoutines::call_stub,
>>>>>>> which
>>>>>>> is considered "safe for sender" by the VM.
>>>>>>>
>>>>>>> Then, initial_Java_frame is set to the ~StubRoutines::call_stub
>>>>>>> stub
>>>>>>> (line 430). This does not seem to be correct because the stub is
>>>>>>> not a
>>>>>>> Java method and causes the assert(filled_in, "invariant") in the
>>>>>>> constructor of vframeStreamForte (line 103 in forte.cpp) to fail
>>>>>>> (because the frame cannot be filled from a stub).
>>>>>>>
>>>>>>> To avoid this failure, I propose to call the constructor of
>>>>>>> vframeStreamForte with parameter stop_at_java_call_stub set to true
>>>>>>> (instead of false) so that the VM stops walking the stack if a call
>>>>>>> stub
>>>>>>> has been reached.
>>>>>>>
>>>>>>>
>>>>>>> Here is the updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.02/
>>>>>>>
>>>>>>> In addition to testing the changeset with the tools mentioned
>>>>>>> before, I
>>>>>>> executed
>>>>>>> - all JPRT tests, all pass;
>>>>>>> - all java/lang/invoke and compiler JTREG tests; all tests that
>>>>>>> pass
>>>>>>> with the unmodified source trace pass with the changes as well.
>>>>>>>
>>>>>>> Thank you very much in advance!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>>>>
>>>>>>>> So that will put this RFR on hold for a while, unfortunately.
>>>>>>>>
>>>>>>>> Thank you for the feedback and suggestions so far!
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>>
>>>>>>>> Zoltan
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Zoltan
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 3/30/15 8:30 AM, Zoltán Majó wrote:
>>>>>>>>>>>> Hi Ed,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> thank you for your feedback! Please see comments below.
>>>>>>>>>>>>
>>>>>>>>>>>> On 03/30/2015 11:11 AM, Edward Nevill wrote:
>>>>>>>>>>>>> Hi Zoltán,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, 2015-03-27 at 15:34 +0100, Zoltán Majó wrote:
>>>>>>>>>>>>>> Full JPRT run, all tests pass. I also ran all hotspot
>>>>>>>>>>>>>> compiler
>>>>>>>>>>>>>> tests and
>>>>>>>>>>>>>> the jdk tests in java/lang/invoke on both x86_64 and x86_32.
>>>>>>>>>>>>>> All
>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>> that pass without the patch pass also with the patch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ran the SPEC JVM 2008 benchmarks on our performance
>>>>>>>>>>>>>> infrastructure for
>>>>>>>>>>>>>> x86_64. The performance evaluation suggests that there is no
>>>>>>>>>>>>>> statistically significant performance degradation due to
>>>>>>>>>>>>>> having
>>>>>>>>>>>>>> proper
>>>>>>>>>>>>>> frame pointers. Therefore I propose to have OmitFramePointer
>>>>>>>>>>>>>> set to
>>>>>>>>>>>>>> false by default on x86_64 (and set to true on all other
>>>>>>>>>>>>>> platforms).
>>>>>>>>>>>>> This patch looks good, however I think there is a problem
>>>>>>>>>>>>> with the
>>>>>>>>>>>>> logic of OmitFramePointer.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is my test case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>>>> // $Id: fibo.java,v 1.2 2000/12/24 19:10:50 doug Exp $
>>>>>>>>>>>>> // http://www.bagley.org/~doug/shootout/
>>>>>>>>>>>>>
>>>>>>>>>>>>> public class fibo {
>>>>>>>>>>>>> public static void main(String args[]) {
>>>>>>>>>>>>> int N = Integer.parseInt(args[0]);
>>>>>>>>>>>>> System.out.println(fib(N));
>>>>>>>>>>>>> }
>>>>>>>>>>>>> public static int fib(int n) {
>>>>>>>>>>>>> if (n < 2) return(1);
>>>>>>>>>>>>> return( fib(n-2) + fib(n-1) );
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> --- CUT HERE ---
>>>>>>>>>>>>>
>>>>>>>>>>>>> If I run it as follows on my x86 64 bit linux.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>>> -XX:-OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>>
>>>>>>>>>>>>> I get
>>>>>>>>>>>>>
>>>>>>>>>>>>> # {method} {0x00007fc62c97f388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>> # parm0: rsi = int
>>>>>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>>>>>> 0x00007fc625071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>> 0x00007fc625071107: push %rbp
>>>>>>>>>>>>> 0x00007fc625071108: mov %rsp,%rbp
>>>>>>>>>>>>> 0x00007f836907110b: sub $0x20,%rsp ;*synchronization
>>>>>>>>>>>>> entry
>>>>>>>>>>>>>
>>>>>>>>>>>>> which is correct, it is NOT(-) OmitFramePointer, therefore
>>>>>>>>>>>>> it is
>>>>>>>>>>>>> using
>>>>>>>>>>>>> the frame pointer
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now if I try just changing -XX:-OmitFramePointer to
>>>>>>>>>>>>> -XX:+OmitFramePointer in the above I get
>>>>>>>>>>>>>
>>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>>> -XX:+OmitFramePointer -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>>
>>>>>>>>>>>>> I get
>>>>>>>>>>>>>
>>>>>>>>>>>>> # {method} {0x00007f14d3c00388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>> # parm0: rsi = int
>>>>>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>>>>>> 0x00007f14e1071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>> 0x00007f14e1071107: push %rbp
>>>>>>>>>>>>> 0x00007f14e1071108: sub $0x20,%rsp ;*synchronization
>>>>>>>>>>>>> entry
>>>>>>>>>>>>>
>>>>>>>>>>>>> which is correct, it is ID(+) OmitFramePointer, therefore it
>>>>>>>>>>>>> does
>>>>>>>>>>>>> not
>>>>>>>>>>>>> use a frame pointer.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, if I now delete the -XX:+/-OmitFramePointer
>>>>>>>>>>>>> altogether, IE
>>>>>>>>>>>>>
>>>>>>>>>>>>> /work/images/jdk/bin/java -XX:-TieredCompilation
>>>>>>>>>>>>> -XX:+PrintCompilation
>>>>>>>>>>>>> -XX:CompileOnly=fibo::fib -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>>> -XX:+PrintAssembly fibo 43
>>>>>>>>>>>>>
>>>>>>>>>>>>> I get
>>>>>>>>>>>>>
>>>>>>>>>>>>> # {method} {0x00007f0c4b730388} 'fib' '(I)I' in 'fibo'
>>>>>>>>>>>>> # parm0: rsi = int
>>>>>>>>>>>>> # [sp+0x30] (sp of caller)
>>>>>>>>>>>>> 0x00007f0c75071100: mov %eax,-0x14000(%rsp)
>>>>>>>>>>>>> 0x00007f0c75071107: push %rbp
>>>>>>>>>>>>> 0x00007f0c75071108: sub $0x20,%rsp ;*synchronization
>>>>>>>>>>>>> entry
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is not using a frame pointer which is the equivalent of
>>>>>>>>>>>>> -XX:+OmitFramePointer. However in your description above
>>>>>>>>>>>>> you say
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Therefore I propose to have OmitFramePointer set to false by
>>>>>>>>>>>>>> default
>>>>>>>>>>>>>> on x86_64 (and set to true on all other platforms).
>>>>>>>>>>>>> whereas OmitFramePointer actually seems to be set to true on
>>>>>>>>>>>>> x86_64
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think the problem may be with the declaration and
>>>>>>>>>>>>> definition of
>>>>>>>>>>>>> OmitFramePointer in globals.hpp and globals_x86.hpp
>>>>>>>>>>>>>
>>>>>>>>>>>>> In globals.hpp it does
>>>>>>>>>>>>>
>>>>>>>>>>>>> product(bool, OmitFramePointer, true,
>>>>>>>>>>>>>
>>>>>>>>>>>>> In globals_x86.hpp it does
>>>>>>>>>>>>>
>>>>>>>>>>>>> LP64_ONLY(define_pd_global(bool, OmitFramePointer, false););
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not sure that you can mix product(...) and
>>>>>>>>>>>>> product_pd(...)
>>>>>>>>>>>>> like
>>>>>>>>>>>>> this, so I think it just ends up getting the default from the
>>>>>>>>>>>>> product(...).
>>>>>>>>>>>>
>>>>>>>>>>>> You are right, mixing product and product_pd does not make
>>>>>>>>>>>> sense
>>>>>>>>>>>> at all.
>>>>>>>>>>>> Thank you for doing additional testing and for drawing
>>>>>>>>>>>> attention
>>>>>>>>>>>> to the
>>>>>>>>>>>> problem.
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the code to use product_pd and define_pd_global
>>>>>>>>>>>> on all
>>>>>>>>>>>> relevant platforms.
>>>>>>>>>>>>
>>>>>>>>>>>>> Aside: In general, I do not like options which include a
>>>>>>>>>>>>> negative in
>>>>>>>>>>>>> them because I have to do a double think when I see something
>>>>>>>>>>>>> like,
>>>>>>>>>>>>> -XX:-OmitFramePointer, as in, it is omitting the frame
>>>>>>>>>>>>> pointer,
>>>>>>>>>>>>> therefore it is using a frame pointer. How about FramePointer
>>>>>>>>>>>>> so we
>>>>>>>>>>>>> have -XX:+FramePointer to say I want frame pointers and
>>>>>>>>>>>>> -XX:-FramePointer to say I don't.
>>>>>>>>>>>>
>>>>>>>>>>>> That is a good idea. Double negation is an unnecessary
>>>>>>>>>>>> complication, so
>>>>>>>>>>>> I changed the name of the flag to FramePointer, just as you
>>>>>>>>>>>> suggested.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I did some timing on the above 'fibo' test
>>>>>>>>>>>>>
>>>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>>>> -XX:-OmitFramePointer fibo 43
>>>>>>>>>>>>> 701408733
>>>>>>>>>>>>>
>>>>>>>>>>>>> real 0m1.545s
>>>>>>>>>>>>> user 0m1.571s
>>>>>>>>>>>>> sys 0m0.015s
>>>>>>>>>>>>> [ed at mylittlepony java]$ time /work/images/jdk/bin/java
>>>>>>>>>>>>> -XX:+OmitFramePointer fibo 43
>>>>>>>>>>>>> 701408733
>>>>>>>>>>>>>
>>>>>>>>>>>>> real 0m1.504s
>>>>>>>>>>>>> user 0m1.527s
>>>>>>>>>>>>> sys 0m0.019s
>>>>>>>>>>>>>
>>>>>>>>>>>>> which is ~3% difference on this test case. On aarch64, I
>>>>>>>>>>>>> see ~7%
>>>>>>>>>>>>> difference on this test case.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the performance measurements!
>>>>>>>>>>>>
>>>>>>>>>>>>> With the above change to fix the logic of OmitFramePointer
>>>>>>>>>>>>> (and
>>>>>>>>>>>>> possible change its name) the patch looks good to me.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the updated webrev (the same webrev that was already
>>>>>>>>>>>> included
>>>>>>>>>>>> into my reply to Roland):
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~zmajo/8068945/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>>> I will prepare a mirror patch for aarch64.
>>>>>>>>>>>>
>>>>>>>>>>>> That would be great!
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you and best regards,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Zoltán
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> All the best,
>>>>>>>>>>>>> Ed.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>
>
More information about the hotspot-compiler-dev
mailing list