getPCDescNearDbg returns incorrect PCDesc

David Griffiths david.griffiths at gmail.com
Thu Jan 17 11:35:57 UTC 2019


Here is example as promised. First the test program:

  1 public class PcDescTest
  2 {
  3     public static void main(String[] args)
  4     {
  5         new PcDescTest();
  6     }
  7
  8     PcDescTest()
  9     {
 10         for (int i = 0; ; i++) {
 11             run(i);
 12         }
 13     }
 14
 15     int run(int i)
 16     {
 17         int j = i - 1;
 18         int k = j + 23;
 19         if ((j % 1000000) == 0) {
 20             sleep();
 21         }
 22         return k;
 23     }
 24
 25     void sleep()
 26     {
 27         System.out.println("about to sleep...");
 28         try {
 29             Thread.sleep(10000);
 30         } catch (InterruptedException e) {}
 31         System.out.println("...back from sleep");
 32     }
 33 }

run like this:

java -Xmx8m -Xcomp -XX:CompileCommand=dontinline,PcDescTest.run
-XX:CompileCommand=dontinline,PcDescTest.sleep
-XX:CompileCommand=print,PcDescTest.run PcDescTest

and it will print the assembler out which includes:

[Verified Entry Point]
  0x00007fc3a93bd880: mov    %eax,-0x14000(%rsp)
  0x00007fc3a93bd887: push   %rbp
  0x00007fc3a93bd888: sub    $0x10,%rsp         ;*synchronization entry
                                                ; - PcDescTest::run at -1 (line 17)

  0x00007fc3a93bd88c: mov    %edx,%r11d
  0x00007fc3a93bd88f: dec    %r11d              ;*isub
                                                ; - PcDescTest::run at 2 (line 17)

  0x00007fc3a93bd892: mov    %edx,%ebp
  0x00007fc3a93bd894: add    $0x16,%ebp         ;*iadd
                                                ; - PcDescTest::run at 7 (line 18)

  0x00007fc3a93bd897: movslq %r11d,%r10

In another window, attach gdb (gdb --pid nnnn) and then set a
breakpoint at the iadd line which I think we can agree is definitely
the code for line 18:

(gdb) b *0x00007fc3a93bd894

then continue until the breakpoint hits and then take a core dump:

(gdb) c
Continuing.
[Switching to Thread 0x7fc3bfe1a700 (LWP 8354)]
Thread 2 "java" hit Breakpoint 1, 0x00007fc3a93bd894 in ?? ()
(gdb) gcore
warning: target file /proc/8353/cmdline contained unexpected null characters
Saved corefile core.8353
(gdb) quit

finally, use jstack to see the stack:

$ jstack `command -v java` core.8353

and you will see:

 - PcDescTest.run(int) @bci=2, line=17 (Compiled frame; information
may be imprecise)
 - PcDescTest.<init>() @bci=8, line=11 (Compiled frame)
 - PcDescTest.main(java.lang.String[]) @bci=4, line=5 (Interpreted frame)

which is incorrect (admittedly it warns you!)

if I make the change I suggested:

  public PCDesc getPCDescNearDbg(Address pc) {
    PCDesc bestGuessPCDesc = null;
    //long bestDistance = 0;
    long bestDistance = Long.MAX_VALUE;
    for (Address p = scopesPCsBegin(); p.lessThan(scopesPCsEnd()); p =
p.addOffsetTo(pcDescSize)) {
      PCDesc pcDesc = new PCDesc(p);
      // In case pc is null
      //long distance = -pcDesc.getRealPC(this).minus(pc);
      long distance = pcDesc.getRealPC(this).minus(pc) - 1;
      //if ((bestGuessPCDesc == null) ||
          //((distance >= 0) && (distance < bestDistance))) {
      if (distance >= 0 && distance < bestDistance) {
        bestGuessPCDesc = pcDesc;
        bestDistance    = distance;
      }

then it gives:

 - PcDescTest.run(int) @bci=7, line=18 (Compiled frame; information
may be imprecise)
 - PcDescTest.<init>() @bci=12, line=10 (Compiled frame)
 - PcDescTest.main(java.lang.String[]) @bci=4, line=5 (Interpreted frame)

but as you can see the caller line is now incorrect.

Cheers,

David


On Thu, 17 Jan 2019 at 09:12, David Griffiths <david.griffiths at gmail.com> wrote:
>
> Hi Jc, ok thanks, I'll see if I can come up with a simple example.
>
> Cheers,
>
> David
>
> On Wed, 16 Jan 2019 at 17:30, JC Beyler <jcbeyler at google.com> wrote:
>>
>> Hi David,
>>
>> The explanation you are providing is clear to me, though I'm not sure at all what the right fix would be in this case. I would agree that there might be a bug here but it would be easier to see if you could provide an easy reproducer that shows how the initial line is off by 1 and then how it messes up higher in the stack if you try to fix it by your -1.
>>
>> My best guess is that there is a difference between code paths as you are saying and we might have to differentiate top frame and other frames for this calculation but without a reproducer to see it in action, it is hard to tell.
>>
>> Thanks,
>> Jc
>>
>> On Wed, Jan 16, 2019 at 7:58 AM Doerr, Martin <martin.doerr at sap.com> wrote:
>>>
>>> Hi David,
>>>
>>> re-posting to serviceability mailing list since you're referring to a method from
>>> https://docs.oracle.com/javase/jp/8/docs/serviceabilityagent/sun/jvm/hotspot/code/NMethod.html
>>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: hotspot-dev <hotspot-dev-bounces at openjdk.java.net> On Behalf Of David Griffiths
>>> Sent: Mittwoch, 16. Januar 2019 12:08
>>> To: hotspot-dev at openjdk.java.net
>>> Subject: getPCDescNearDbg returns incorrect PCDesc
>>>
>>> Hi, I'd like some help please understanding what appears to be a bug in
>>> getPCDescNearDbg. The problem is caused by the fact that the _pc_offset
>>> value stored in a PcDesc is actually the offset of the code contained in
>>> the _next_ PcDesc rather than the current one. I assume it's done like this
>>> so that obtaining stack traces works correctly. At least on x86, the last
>>> instruction in a PcDesc chunk is the callq which means that the return
>>> address pc points to the next PcDesc. Therefore associating the PcDesc
>>> containing the callq with the address of the next PcDesc chunk means that
>>> the matching works in getPCDescAt.
>>>
>>> But this causes "off by one" errors in getPCDescNearDbg which appears to
>>> expect the PcDesc getRealPC address to be that of the PcDesc itself rather
>>> than the following one. So you sometimes see incorrect top of stack line
>>> numbers when debugging. (And this would presumably also affect profilers).
>>>
>>> I can fix the top of stack issue by changing distance to:
>>>
>>> long distance = pcDesc.getRealPC(this).minus(pc) - 1
>>>
>>> but this then messes up the line numbers further down the stack because
>>> they are trying to match against return pcs.
>>>
>>> Anybody come across this before, is my analysis correct?
>>>
>>> Cheers,
>>>
>>> David
>>
>>
>>
>> --
>>
>> Thanks,
>> Jc


More information about the serviceability-dev mailing list