gdb and OpenJDK

Mon Feb 16 10:43:41 UTC 2015

Hi everybody,

I really don't want to prevent the "Good Enough" solution and as far
as I understand, this solution doesn't require any code changes to
HotSpot, right? It will just add an additional Python artifact to the
OpenJDK delivery which will be used by gdb.

But in general I have to agree with Erik and Staffan. Getting a mixed
stack trace of a running Java process or a core file is notoriously
hard. The best we have today is the native, built-in stack walking
code in HotSpot which is used for hs_err files and which can also be
called from within gdb (see print_native_stack() in
src/share/vm/utilities/debug.cpp). If you look at that code (and at
the functions it calls like frame.sender(),
os::get_sender_for_C_frame(&fr), frame.is_java_frame(),
frame.is_native_frame(), frame.is_runtime_frame(), etc...) you will
see that there are a lot of special cases to handle. And even that
code is not perfect. I can easily show you examples where it doesn't
work (mostly at the beginning of methods/stubs when the new frame is
being set up but still not complete).
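
To make the incomplete-frame case concrete, here is a toy sketch in
Python. It is not HotSpot code and it is far simpler than
frame.sender(): it assumes an x86-64-style convention where [fp] holds
the caller's saved frame pointer and [fp + 8] holds the return address.
The addresses, the memory dict and the helper names are all made up for
illustration.

```python
# Toy model, not HotSpot code. Assumed convention: [fp] holds the caller's
# saved frame pointer and [fp + 8] holds the return address.

def walk_fp_chain(memory, pc, fp, max_frames=16):
    """Naive walk that trusts the frame pointer at every PC."""
    pcs = [pc]
    while fp != 0 and len(pcs) < max_frames:
        ret = memory.get(fp + 8)
        if ret is None:
            break
        pcs.append(ret)
        fp = memory.get(fp, 0)
    return pcs

def names(pcs):
    # Hypothetical code layout: each function occupies one 0x100-byte block.
    layout = {1: "_start", 2: "main", 3: "a", 4: "b"}
    return [layout[pc >> 8] for pc in pcs]

# Stack contents for the call chain _start -> main -> a -> b.
memory = {
    0x7000: 0x0,    0x7008: 0x100,  # main's frame: end of chain, return into _start
    0x6000: 0x7000, 0x6008: 0x210,  # a's frame: saved fp of main, return into main
    0x5000: 0x6000, 0x5008: 0x310,  # b's frame: saved fp of a, return into a
}

# b's frame is complete: fp points at b's own frame.
complete = names(walk_fp_chain(memory, pc=0x404, fp=0x5000))

# First instruction of b: the prologue has not run, fp still points at a's frame.
at_entry = names(walk_fp_chain(memory, pc=0x400, fp=0x6000))

print(complete)  # ['b', 'a', 'main', '_start']
print(at_entry)  # ['b', 'main', '_start']
```

In the second walk the caller "a" silently disappears from the trace,
because the walker reads a's frame as if it belonged to b.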

All this complicated (and platform-dependent) code is replicated in
Java in the Serviceability Agent (SA). You can easily verify that it
isn't 100% perfect by running "jstack -m -F <java_pid>" against a
running Java VM. Besides the problem of frames being in an inconsistent
state, another big problem is that we can only reliably unwind inlined
Java frames from a native frame at safepoints. But that's of course not
guaranteed if we use "jstack -F" or if we ask for a stack trace in gdb
at an arbitrary PC.

Now if we replicate this SA code one more time in a Python library for
gdb, you'll probably agree that it can't work more reliably than the
original SA code. This may be good enough for some use cases, but it
won't be perfect. I'm not a gdb/DWARF expert but I think what we
really need is to generate debug information for all the generated
code. We need to know for every single PC of generated code the
corresponding frame information and how to get to the previous frame.
I know it's possible, and I know that gdb has callbacks to consume
debug information which is generated at runtime (see [1]), although
I've never programmed it myself. LLVM seems to use this technique and
has some documentation available ([2,3]). I suppose this is the
direction Erik wanted to go and I think that would be the right way.
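
As a rough illustration of what "frame information for every single PC"
buys us, here is a toy sketch in Python. It is not the gdb JIT interface
from [1] and not the real DWARF format; it only models the idea of a
table that tells the unwinder, per PC range, how to find the caller's
frame. The Rule fields, the addresses and the code layout are
assumptions made up for this example (x86-64-style prologue: "push fp"
followed by "mov fp, sp").

```python
from collections import namedtuple

# One row per PC range. cfa_reg/cfa_off say how to compute the canonical
# frame address (CFA, the caller's sp right before the call instruction).
# fp_saved_at is the offset from the CFA where the caller's fp was saved,
# or None if fp has not been touched yet.
Rule = namedtuple("Rule", "start end cfa_reg cfa_off fp_saved_at")

def rules_for(base):
    """Hypothetical layout: entry, after 'push fp', after 'mov fp, sp'."""
    return [
        Rule(base,     base + 1,     "sp", 8,  None),
        Rule(base + 1, base + 4,     "sp", 16, -16),
        Rule(base + 4, base + 0x100, "fp", 16, -16),
    ]

def find_rule(table, pc):
    for rule in table:
        if rule.start <= pc < rule.end:
            return rule
    return None

def unwind(table, memory, pc, sp, fp, max_frames=16):
    pcs = [pc]
    while len(pcs) < max_frames:
        rule = find_rule(table, pc)
        if rule is None:          # no unwind info for this PC: stop here
            break
        cfa = (sp if rule.cfa_reg == "sp" else fp) + rule.cfa_off
        ret = memory[cfa - 8]     # return address sits just below the CFA
        if rule.fp_saved_at is not None:
            fp = memory[cfa + rule.fp_saved_at]
        pc, sp = ret, cfa
        pcs.append(pc)
    return pcs

# Functions main, a and b at 0x200, 0x300 and 0x400; _start has no rules.
table = rules_for(0x200) + rules_for(0x300) + rules_for(0x400)

# Stack contents for the call chain _start -> main -> a -> b.
memory = {
    0x7000: 0x0,    0x7008: 0x100,  # main: saved fp (none), return into _start
    0x6000: 0x7000, 0x6008: 0x210,  # a: saved fp of main, return into main
    0x5000: 0x6000, 0x5008: 0x310,  # b: saved fp of a (once pushed), return into a
}

# First instruction of b: only the return address has been pushed.
at_entry = unwind(table, memory, pc=0x400, sp=0x5008, fp=0x6000)

# After 'push fp' but before 'mov fp, sp'.
mid_prologue = unwind(table, memory, pc=0x401, sp=0x5000, fp=0x6000)

print([hex(pc) for pc in at_entry])
print([hex(pc) for pc in mid_prologue])
```

Both walks report b, a, main and _start, even though b's frame is not
set up yet, because the table (and not a guess about the frame pointer)
says where the return address lives at each PC.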

Regards,
Volker

[1] https://sourceware.org/gdb/onlinedocs/gdb/JIT-Interface.html
[2] http://llvm.org/docs/DebuggingJITedCode.html
[3] http://llvm.org/releases/2.9/docs/DebuggingJITedCode.html

On Mon, Feb 16, 2015 at 10:40 AM, Andrew Haley <aph at redhat.com> wrote:
> On 15/02/15 19:55, Staffan Larsen wrote:
>
>> I think what Erik suggested was if there was some way the JVM could
>> expose data in a format that is easy to interpret by other tools
>> (such as the python gdb plugin, but also plugins for other
>> debuggers, or SA). Of course this would have to be data, not code,
>> so that it would be available in core files as well. I haven’t seen
>> the python module you have written so I don’t know how complex it
>> is, but we should think of ways to make such code even simpler if
>> possible. If we had data exposed in an easy-to-read format it would
>> perhaps make maintenance of these tools simpler. We have a problem
>> with SA today that it is way too dependent on the code in the JVM - a
>> small change in data structures in the JVM will break SA, something
>> we are looking for solutions to.
>
> I'm sure that's true, but let's not allow the Best to be the enemy of
> the Good Enough; this is a contribution that we can use today.
>
> Andrew.