A hotspot patch for stack profiling (frame pointer)

Bertrand Delsart bertrand.delsart at oracle.com
Wed Jan 14 14:42:30 UTC 2015


Hi,

I'm surprised not to see any change related to JSR292 in the webrev.

While the fix looks safe (not breaking hotspot), it probably does not 
work as you expect... and this might make it useless.

RBP is used by the JITs to memorize SP before it may be adapted during a 
call sequence (look for instance for method_handle_invoke_SP_save_opr or 
rbp_mh_SP_save in the code).

Hence the value we push on the stack may not be the RBP of the caller. 
It can be its SP just before the call... which would probably confuse 
your profiler, possibly crashing it if you really assume this is the 
beginning of a frame.

IMHO, the RFE does not break hotspot and profiling seems to work... but 
if you try to profile code which uses invoke dynamic, you will hit bugs 
in the profiler.

I would not prevent the JITs from using RBP as long as the changeset is 
not sufficient to guarantee the profiling will work... and IMHO solving 
the JSR292 issue will be much more intrusive (impacting HotSpot stack 
walking code).

Regards,

Bertrand.


On 14/01/2015 03:06, Vladimir Kozlov wrote:
> Filed RFE:
>
> https://bugs.openjdk.java.net/browse/JDK-8068945
>
> Regards,
> Vladimir
>
> On 12/9/14 2:14 AM, Erik Helin wrote:
>> I should also add that I don't have enough knowledge of the compiler
>> internals to review this patch, sorry.
>>
>> Thanks,
>> Erik
>>
>> On 2014-12-09 10:53, Erik Helin wrote:
>>> I applied the patch on top of jdk9/hs-comp and created a webrev:
>>> http://cr.openjdk.java.net/~ehelin/brendan/frame-pointer/webrev/
>>>
>>> I also successfully run the patch through JPRT.
>>>
>>> Thanks,
>>> Erik
>>>
>>> On 2014-12-05 20:57, Brendan Gregg wrote:
>>>>
>>>>
>>>> On Thu, Dec 4, 2014 at 2:55 PM, Brendan Gregg
>>>> <brendan.d.gregg at gmail.com
>>>> <mailto:brendan.d.gregg at gmail.com>> wrote:
>>>>
>>>>     G'Day,
>>>>
>>>>     I've hacked hotspot to return the frame pointer, in part to see
>>>> what
>>>>     this involves, and also to have a working prototype for analysis.
>>>>     Along with an agent to resolve symbols, this has allowed full stack
>>>>     profiling using Linux perf_events. The following flame graphs show
>>>>     the resulting profiles.
>>>>
>>>>     A mixed mode CPU flame graph of a vert.x benchmark (click to zoom):
>>>>
>>>>     http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg
>>>>
>>>>     Same thing, but this time disabling inlining, to show more frames:
>>>>
>>>>
>>>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-flamegraph.svg
>>>>
>>>>     As expected, performance is worse without inlining. You can compare
>>>>     the flame graphs side by side to see why. Less time spent doing
>>>> work
>>>>     / I/O!
>>>>
>>>>
>>>> https://github.com/brendangregg/Misc/blob/master/java/openjdk8_b132-fp.diff
>>>>
>>>>
>>>>
>>>>     is my patch,
>>>>
>>>>     [...]
>>>>
>>>>
>>>> In case there's problems with the patch URL, the patch is:
>>>>
>>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/x86_64.ad <http://x86_64.ad>
>>>>   2014-03-04 02:52:11.000000000 +0000
>>>> +++ openjdk8/hotspot/src/cpu/x86/vm/x86_64.ad <http://x86_64.ad>
>>>>   2014-11-08 01:10:49.686044933 +0000
>>>> @@ -166,10 +166,9 @@
>>>>   // 3) reg_class stack_slots( /* one chunk of stack-based "registers"
>>>> */ )
>>>>   //
>>>>
>>>> -// Class for all pointer registers (including RSP)
>>>> +// Class for all pointer registers (including RSP, excluding RBP)
>>>>   reg_class any_reg(RAX, RAX_H,
>>>>                     RDX, RDX_H,
>>>> -                  RBP, RBP_H,
>>>>                     RDI, RDI_H,
>>>>                     RSI, RSI_H,
>>>>                     RCX, RCX_H,
>>>> @@ -184,10 +183,9 @@
>>>>                     R14, R14_H,
>>>>                     R15, R15_H);
>>>>
>>>> -// Class for all pointer registers except RSP
>>>> +// Class for all pointer registers except RSP and RBP
>>>>   reg_class ptr_reg(RAX, RAX_H,
>>>>                     RDX, RDX_H,
>>>> -                  RBP, RBP_H,
>>>>                     RDI, RDI_H,
>>>>                     RSI, RSI_H,
>>>>                     RCX, RCX_H,
>>>> @@ -199,9 +197,8 @@
>>>>                     R13, R13_H,
>>>>                     R14, R14_H);
>>>>
>>>> -// Class for all pointer registers except RAX and RSP
>>>> +// Class for all pointer registers except RAX, RSP and RBP
>>>>   reg_class ptr_no_rax_reg(RDX, RDX_H,
>>>> -                         RBP, RBP_H,
>>>>                            RDI, RDI_H,
>>>>                            RSI, RSI_H,
>>>>                            RCX, RCX_H,
>>>> @@ -226,9 +223,8 @@
>>>>                            R13, R13_H,
>>>>                            R14, R14_H);
>>>>
>>>> -// Class for all pointer registers except RAX, RBX and RSP
>>>> +// Class for all pointer registers except RAX, RBX, RSP and RBP
>>>>   reg_class ptr_no_rax_rbx_reg(RDX, RDX_H,
>>>> -                             RBP, RBP_H,
>>>>                                RDI, RDI_H,
>>>>                                RSI, RSI_H,
>>>>                                RCX, RCX_H,
>>>> @@ -260,10 +256,9 @@
>>>>   // Singleton class for TLS pointer
>>>>   reg_class ptr_r15_reg(R15, R15_H);
>>>>
>>>> -// Class for all long registers (except RSP)
>>>> +// Class for all long registers (except RSP and RBP)
>>>>   reg_class long_reg(RAX, RAX_H,
>>>>                      RDX, RDX_H,
>>>> -                   RBP, RBP_H,
>>>>                      RDI, RDI_H,
>>>>                      RSI, RSI_H,
>>>>                      RCX, RCX_H,
>>>> @@ -275,9 +270,8 @@
>>>>                      R13, R13_H,
>>>>                      R14, R14_H);
>>>>
>>>> -// Class for all long registers except RAX, RDX (and RSP)
>>>> -reg_class long_no_rax_rdx_reg(RBP, RBP_H,
>>>> -                              RDI, RDI_H,
>>>> +// Class for all long registers except RAX, RDX (and RSP, RBP)
>>>> +reg_class long_no_rax_rdx_reg(RDI, RDI_H,
>>>>                                 RSI, RSI_H,
>>>>                                 RCX, RCX_H,
>>>>                                 RBX, RBX_H,
>>>> @@ -288,9 +282,8 @@
>>>>                                 R13, R13_H,
>>>>                                 R14, R14_H);
>>>>
>>>> -// Class for all long registers except RCX (and RSP)
>>>> -reg_class long_no_rcx_reg(RBP, RBP_H,
>>>> -                          RDI, RDI_H,
>>>> +// Class for all long registers except RCX (and RSP, RBP)
>>>> +reg_class long_no_rcx_reg(RDI, RDI_H,
>>>>                             RSI, RSI_H,
>>>>                             RAX, RAX_H,
>>>>                             RDX, RDX_H,
>>>> @@ -302,9 +295,8 @@
>>>>                             R13, R13_H,
>>>>                             R14, R14_H);
>>>>
>>>> -// Class for all long registers except RAX (and RSP)
>>>> -reg_class long_no_rax_reg(RBP, RBP_H,
>>>> -                          RDX, RDX_H,
>>>> +// Class for all long registers except RAX (and RSP, RBP)
>>>> +reg_class long_no_rax_reg(RDX, RDX_H,
>>>>                             RDI, RDI_H,
>>>>                             RSI, RSI_H,
>>>>                             RCX, RCX_H,
>>>> @@ -325,10 +317,9 @@
>>>>   // Singleton class for RDX long register
>>>>   reg_class long_rdx_reg(RDX, RDX_H);
>>>>
>>>> -// Class for all int registers (except RSP)
>>>> +// Class for all int registers (except RSP and RBP)
>>>>   reg_class int_reg(RAX,
>>>>                     RDX,
>>>> -                  RBP,
>>>>                     RDI,
>>>>                     RSI,
>>>>                     RCX,
>>>> @@ -340,10 +331,9 @@
>>>>                     R13,
>>>>                     R14);
>>>>
>>>> -// Class for all int registers except RCX (and RSP)
>>>> +// Class for all int registers except RCX (and RSP, RBP)
>>>>   reg_class int_no_rcx_reg(RAX,
>>>>                            RDX,
>>>> -                         RBP,
>>>>                            RDI,
>>>>                            RSI,
>>>>                            RBX,
>>>> @@ -355,8 +345,7 @@
>>>>                            R14);
>>>>
>>>>   // Class for all int registers except RAX, RDX (and RSP)
>>>> -reg_class int_no_rax_rdx_reg(RBP,
>>>> -                             RDI,
>>>> +reg_class int_no_rax_rdx_reg(RDI,
>>>>                                RSI,
>>>>                                RCX,
>>>>                                RBX,
>>>> @@ -718,6 +707,7 @@
>>>>       st->print("# stack bang");
>>>>       st->print("\n\t");
>>>>       st->print("pushq   rbp\t# Save rbp");
>>>> +    // BDG consider: st->print("movq    rbp, rsp\t# ");
>>>>       if (framesize) {
>>>>         st->print("\n\t");
>>>>         st->print("subq    rsp, #%d\t# Create frame",framesize);
>>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>   2014-03-04 02:52:11.000000000 +0000
>>>> +++ openjdk8/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp
>>>> 2014-11-07
>>>> 23:57:11.589593723 +0000
>>>> @@ -5236,6 +5236,7 @@
>>>>       // We always push rbp, so that on return to interpreter rbp,
>>>> will be
>>>>       // restored correctly and we can correct the stack.
>>>>       push(rbp);
>>>> +    mov(rbp, rsp);
>>>>       // Remove word for ebp
>>>>       framesize -= wordSize;
>>>>
>>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp
>>>>   2014-03-04 02:52:10.000000000 +0000
>>>> +++ openjdk8/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp
>>>>   2014-11-07 23:57:21.933257882 +0000
>>>> @@ -358,6 +358,7 @@
>>>>     generate_stack_overflow_check(frame_size_in_bytes);
>>>>
>>>>     push(rbp);
>>>> +  mov(rbp, rsp);
>>>>   #ifdef TIERED
>>>>     // c2 leaves fpu stack dirty. Clean it on entry
>>>>     if (UseSSE < 2 ) {
>>>>
>>>>
>>>> Brendan
>>


-- 
Bertrand Delsart,                     Grenoble Engineering Center
Oracle,         180 av. de l'Europe,          ZIRST de Montbonnot
38330 Montbonnot Saint Martin,                             FRANCE
bertrand.delsart at oracle.com             Phone : +33 4 76 18 81 23

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: This email message is for the sole use of the intended
recipient(s) and may contain confidential and privileged
information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of
the original message.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


More information about the serviceability-dev mailing list