getPCDescNearDbg returns incorrect PCDesc
David Griffiths
david.griffiths at gmail.com
Mon Jan 21 09:36:08 UTC 2019
Hi, I would be changing the PcDesc returned, yes. I can't comment on
the PcDesc having been mapped incorrectly but seems unsafe to change
that because then the line numbers for callers further down the stack
won't match? But whilst browsing that code I came across
nmethod.match_desc:
if (!approximate)
return pc->pc_offset() == pc_offset;
else
return (pc-1)->pc_offset() < pc_offset && pc_offset <= pc->pc_offset();
and that logic would match "PcDescTest::run at 7 (line 18)" because its
offset gives 0x00007fc3a93bd897 which is >= 0x00007fc3a93bd894 which
in turn is > the previous PcDesc.
(Not so sure about the situation where pc == 0x00007fc3a93bd892. My
version makes it match line 18 but that only works for the imprecise
top of stack case. That's because there's a difference between
matching a return address further down the stack where you are still
executing the line before and matching the same address at the top of
the stack where the call has returned and you are about to execute the
next line.)
Cheers,
David
On Fri, 18 Jan 2019 at 18:42, Jini George <jini.george at oracle.com> wrote:
>
> Thank you for the example and the detailed explanation, David.
>
> I am not sure if the PcDesc being found in getPCDescNearDbg() is
> incorrect or if the PcDesc is being mapped to the incorrect bci and line
> number (or scope?). In the example, the PcDesc returned for
> 0x00007fc3a93bd894 will be the one with real pc = 0x00007fc3a93bd892.
> This seems to be correct since the instruction at 0x00007fc3a93bd892 --
> which is:
>
> mov %edx,%ebp
>
> is technically a part of the 'iadd'. (since, if i understand right, this
> is moving 'i' into 'ebp' before adding 22 (instead of adding i - 1 to
> 23)). But the bci and the line number attributed to 0x00007fc3a93bd892
> should be '7' and '18' respectively, instead of '2' and '17' which
> belong to the 'isub'. Would you be changing the PcDesc returned with
> your proposed fix ?
>
> I have not been able to delve deeper into this yet. Your fix for the
> callers (calling getScopeDescAt() instead of getScopeDescNearDbg())
> seems correct to me.
>
> Thanks,
> Jini.
>
> On 1/17/2019 5:05 PM, David Griffiths wrote:
> > Here is example as promised. First the test program:
> >
> > 1 public class PcDescTest
> > 2 {
> > 3 public static void main(String[] args)
> > 4 {
> > 5 new PcDescTest();
> > 6 }
> > 7
> > 8 PcDescTest()
> > 9 {
> > 10 for (int i = 0; ; i++) {
> > 11 run(i);
> > 12 }
> > 13 }
> > 14
> > 15 int run(int i)
> > 16 {
> > 17 int j = i - 1;
> > 18 int k = j + 23;
> > 19 if ((j % 1000000) == 0) {
> > 20 sleep();
> > 21 }
> > 22 return k;
> > 23 }
> > 24
> > 25 void sleep()
> > 26 {
> > 27 System.out.println("about to sleep...");
> > 28 try {
> > 29 Thread.sleep(10000);
> > 30 } catch (InterruptedException e) {}
> > 31 System.out.println("...back from sleep");
> > 32 }
> > 33 }
> >
> > run like this:
> >
> > java -Xmx8m -Xcomp -XX:CompileCommand=dontinline,PcDescTest.run
> > -XX:CompileCommand=dontinline,PcDescTest.sleep
> > -XX:CompileCommand=print,PcDescTest.run PcDescTest
> >
> > and it will print the assembler out which includes:
> >
> > [Verified Entry Point]
> > 0x00007fc3a93bd880: mov %eax,-0x14000(%rsp)
> > 0x00007fc3a93bd887: push %rbp
> > 0x00007fc3a93bd888: sub $0x10,%rsp ;*synchronization entry
> > ; - PcDescTest::run at -1 (line 17)
> >
> > 0x00007fc3a93bd88c: mov %edx,%r11d
> > 0x00007fc3a93bd88f: dec %r11d ;*isub
> > ; - PcDescTest::run at 2 (line 17)
> >
> > 0x00007fc3a93bd892: mov %edx,%ebp
> > 0x00007fc3a93bd894: add $0x16,%ebp ;*iadd
> > ; - PcDescTest::run at 7 (line 18)
> >
> > 0x00007fc3a93bd897: movslq %r11d,%r10
> >
> > In another window, attach gdb (gdb --pid nnnn) and then set a
> > breakpoint at the iadd line which I think we can agree is definitely
> > the code for line 18:
> >
> > (gdb) b *0x00007fc3a93bd894
> >
> > then continue until the breakpoint hits and then take a core dump:
> >
> > (gdb) c
> > Continuing.
> > [Switching to Thread 0x7fc3bfe1a700 (LWP 8354)]
> > Thread 2 "java" hit Breakpoint 1, 0x00007fc3a93bd894 in ?? ()
> > (gdb) gcore
> > warning: target file /proc/8353/cmdline contained unexpected null characters
> > Saved corefile core.8353
> > (gdb) quit
> >
> > finally, use jstack to see the stack:
> >
> > $ jstack `command -v java` core.8353
> >
> > and you will see:
> >
> > - PcDescTest.run(int) @bci=2, line=17 (Compiled frame; information
> > may be imprecise)
> > - PcDescTest.<init>() @bci=8, line=11 (Compiled frame)
> > - PcDescTest.main(java.lang.String[]) @bci=4, line=5 (Interpreted frame)
> >
> > which is incorrect (admittedly it warns you!)
> >
> > if I make the change I suggested:
> >
> > public PCDesc getPCDescNearDbg(Address pc) {
> > PCDesc bestGuessPCDesc = null;
> > //long bestDistance = 0;
> > long bestDistance = Long.MAX_VALUE;
> > for (Address p = scopesPCsBegin(); p.lessThan(scopesPCsEnd()); p =
> > p.addOffsetTo(pcDescSize)) {
> > PCDesc pcDesc = new PCDesc(p);
> > // In case pc is null
> > //long distance = -pcDesc.getRealPC(this).minus(pc);
> > long distance = pcDesc.getRealPC(this).minus(pc) - 1;
> > //if ((bestGuessPCDesc == null) ||
> > //((distance >= 0) && (distance < bestDistance))) {
> > if (distance >= 0 && distance < bestDistance) {
> > bestGuessPCDesc = pcDesc;
> > bestDistance = distance;
> > }
> >
> > then it gives:
> >
> > - PcDescTest.run(int) @bci=7, line=18 (Compiled frame; information
> > may be imprecise)
> > - PcDescTest.<init>() @bci=12, line=10 (Compiled frame)
> > - PcDescTest.main(java.lang.String[]) @bci=4, line=5 (Interpreted frame)
> >
> > but as you can see the caller line is now incorrect.
> >
> > Cheers,
> >
> > David
> >
> >
> > On Thu, 17 Jan 2019 at 09:12, David Griffiths <david.griffiths at gmail.com> wrote:
> >>
> >> Hi Jc, ok thanks, I'll see if I can come up with a simple example.
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> On Wed, 16 Jan 2019 at 17:30, JC Beyler <jcbeyler at google.com> wrote:
> >>>
> >>> Hi David,
> >>>
> >>> The explanation you are providing is clear to me, though I'm not sure at all what the right fix would be in this case. I would agree that there might be a bug here but it would be easier to see if you could provide an easy reproducer that shows how the initial line is off by 1 and then how it messes up higher in the stack if you try to fix it by your -1.
> >>>
> >>> My best guess is that there is a difference between code paths as you are saying and we might have to differentiate top frame and other frames for this calculation but without a reproducer to see it in action, it is hard to tell.
> >>>
> >>> Thanks,
> >>> Jc
> >>>
> >>> On Wed, Jan 16, 2019 at 7:58 AM Doerr, Martin <martin.doerr at sap.com> wrote:
> >>>>
> >>>> Hi David,
> >>>>
> >>>> re-posting to serviceability mailing list since you're referring to a method from
> >>>> https://docs.oracle.com/javase/jp/8/docs/serviceabilityagent/sun/jvm/hotspot/code/NMethod.html
> >>>>
> >>>> Best regards,
> >>>> Martin
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: hotspot-dev <hotspot-dev-bounces at openjdk.java.net> On Behalf Of David Griffiths
> >>>> Sent: Mittwoch, 16. Januar 2019 12:08
> >>>> To: hotspot-dev at openjdk.java.net
> >>>> Subject: getPCDescNearDbg returns incorrect PCDesc
> >>>>
> >>>> Hi, I'd like some help please understanding what appears to be a bug in
> >>>> getPCDescNearDbg. The problem is caused by the fact that the _pc_offset
> >>>> value stored in a PcDesc is actually the offset of the code contained in
> >>>> the _next_ PcDesc rather than the current one. I assume it's done like this
> >>>> so that obtaining stack traces works correctly. At least on x86, the last
> >>>> instruction in a PcDesc chunk is the callq which means that the return
> >>>> address pc points to the next PcDesc. Therefore associating the PcDesc
> >>>> containing the callq with the address of the next PcDesc chunk means that
> >>>> the matching works in getPCDescAt.
> >>>>
> >>>> But this causes "off by one" errors in getPCDescNearDbg which appears to
> >>>> expect the PcDesc getRealPC address to be that of the PcDesc itself rather
> >>>> than the following one. So you sometimes see incorrect top of stack line
> >>>> numbers when debugging. (And this would presumably also affect profilers).
> >>>>
> >>>> I can fix the top of stack issue by changing distance to:
> >>>>
> >>>> long distance = pcDesc.getRealPC(this).minus(pc) - 1
> >>>>
> >>>> but this then messes up the line numbers further down the stack because
> >>>> they are trying to match against return pcs.
> >>>>
> >>>> Anybody come across this before, is my analysis correct?
> >>>>
> >>>> Cheers,
> >>>>
> >>>> David
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Thanks,
> >>> Jc
More information about the serviceability-dev
mailing list