Bug in os::current_frame()/_get_previous_fp() or why parenthesis somtimes really matter

Volker Simonis volker.simonis at gmail.com
Fri Feb 8 10:08:16 PST 2008


Hi,

the implementation of "_get_previous_fp()" which is used by
"os::current_frame()" seems to be buggy on Solaris/x86_64.
"_get_previous_fp()" is implemented as assembler inline function in
"hotspot/src/os_cpu/solaris_amd64/vm/solaris_amd64.il" as follows:

  // Get the frame pointer from previous frame.
      .inline _get_previous_fp,0
      movq     %rbp, %rax
      movq     %rax, %rax
      .end

Although I'm not an x86 expert I think that after locking at
"generate_get_previous_fp()" in
hotspot/src/cpu/amd64/vm/stubGenerator_amd64.cpp which generated a
stub that was used previously in  "os::current_frame()" to query the
frame pointer and looks as follows:

address generate_get_previous_fp() {
  StubCodeMark mark(this, "StubRoutines", "get_previous_fp");
  const Address old_fp(rbp, 0);
  const Address older_fp(rax, 0);
  address start = __ pc();

  __ enter();
  __ movq(rax, old_fp); // callers fp
  __ movq(rax, older_fp); // the frame for ps()
  __ popq(rbp);
  __ ret(0);

  return start;
}

the correct implementation of  "_get_previous_fp()" should read as follows:

  // Get the frame pointer from previous frame.
      .inline _get_previous_fp,0
      movq     (%rbp), %rax
      movq     (%rax), %rax
      .end

The funny thing is that although the current implementation is clearly
wrong, it still worked in the debug build. I think this has to do with
how CC treats inline assembly under optimization. In the debug build,
it generated the following code:

           185. frame os::current_frame() {
                <Function: os::current_frame()>
                [ 185]      780:  pushq   %rbp
                [ 185]      781:  movq    %rsp,%rbp
                [ 185]      784:  subq    $0xc0,%rsp
                [ 185]      78b:  movq    %r12,-0xb8(%rbp)
                [ 185]      792:  movq    %rdi,-8(%rbp)
           186.   intptr_t* fp = _get_previous_fp();
                [ 186]      796:  movl    $0,%eax
                [ 186]      79b:  movq    %rbp,%rax
                [ 186]      79e:  movq    %rax,%rax
                [ 186]      7a1:  movq    %rax,%r8
                [ 186]      7a4:  movq    %r8,-0x40(%rbp)
           187.   frame myframe((intptr_t*)os::current_stack_pointer(),

This code returned the wrong, but still a valid fp which is just good
enough for "os::current_frame()". However if compiled with '-xO1' as
done in the opt build, CC generates the following code:

                <Function: os::current_frame()>
                [  ?]        0:  pushq   %rbp
                [  ?]        1:  movq    %rsp,%rbp
                [  ?]        4:  subq    $0x60,%rsp
                [  ?]        8:  pushq   %r12
                [  ?]        a:  pushq   %r13
                [  ?]        c:  movq    %rdi,%r12
                [  ?]        f:  movq    %rax,%r13
                [  ?]       12:  call    os::current_stack_pointer() [
0x17, .+5 ]

You can see that it elliminated the code from "_get_previous_fp()" and
just used the uninitialized value of %rax as frame pointer, which is
usually wrong.

Notice that this problem exists in the latest JDK 6 build as well as
in OpenJDK (Java 7). It probably hasn't showed up until now because
"os::current_frame()" is only called in the following three places:

1.oop::register_oop() in /hotspot/src/share/vm/oops/oopsHierarchy.cpp
2.ps() in hotspot/src/share/vm/utilities/debug.cpp
3. VMError::report(outputStream* st) in
/hotspot/src/share/vm/utilities/vmError.cpp

The first one is usually disabled, the second one is only called by
developers in the debugger (and probably for a long time nobody called
"ps()" in the debugger for an opt-build) and the last one is during
the creation of an hs_error file where probably nobody noticed the
problem.

Regards,
Volker



More information about the hotspot-dev mailing list