gdb and OpenJDK

Wed Feb 18 04:15:04 UTC 2015

I considered and even implemented something similar to some of the proposals
discussed (see https://sourceware.org/ml/gdb-patches/2013-12/msg00964.html
for the attempt based on a JIT reader; the discussion lasted until the
following June and the patch was eventually rejected).
It is definitely doable to generate DWARF unwind info, although it will cost you
memory (about 7% of the size of the emitted code), and DWARF symbol info, which
is going to cost you a lot more memory (for an estimate, look at the size of
the .debuginfo for libjvm.so).
Note also that although Python has a number of drawbacks, verbosity is not one
of them, so an implementation in C++ will require at least as much code. And
since this code is not essential for running HotSpot, it will eventually
receive about as much love as SA; IMHO a better way to address this problem is
to have unit tests, and this requires the same effort for Python as for SA.
I will get back tomorrow with the answer about the required license.

On Mon, Feb 16, 2015 at 4:48 AM, Erik Helin <erik.helin at oracle.com> wrote:
> On 2015-02-16, Andrew Haley wrote:
>> On 02/16/2015 12:06 PM, Erik Helin wrote:
>> > On 2015-02-16, Andrew Haley wrote:
>> >> On 02/16/2015 10:43 AM, Volker Simonis wrote:
>> >>> Now if we replicate this SA code one more time in a Python library for
>> >>> GDB, you'll probably agree that it can't work more reliably than the
>> >>> original SA code. This may be good enough for some use cases, but it
>> >>> won't be perfect. I'm not a gdb/DWARF expert but I think what we
>> >>> really need is to generate debug information for all the generated
>> >>> code. We need to know for every single PC of generated code the
>> >>> corresponding frame information and how to get to the previous frame.
>> >>
>> >> It would be nice.  We don't actually need it, given that we've done
>> >> without for years, and generating e.g. full DWARF unwinder data for
>> >> every instruction is something that even GCC doesn't always attempt to
>> >> do.  (And, of course, there's a lot of hand-written assembly code in
>> >> HotSpot.  Annotating this is a significant effort.)
>> >
>> > Do we really need to use DWARF though? The gdbjit interface seems to
>> > support a custom debug format if you also implement a reader for
>> > your custom debug format. I've never done this, so I can't say if
>> > there is something missing from the gdbjit API that HotSpot requires.
>>
>> Well, it would have to be able to convey the same information as DWARF
>> unwinder data; the GDB people tell me that generating some DWARF is
>> the right way to do it.  But of course I'm not wedded to any
>> particular format.
>
> I agree that DWARF would be a very nice thing to have, it would (most
> likely) allow us to print names of variables, arguments etc in a frame.
> However, as you mentioned, making HotSpot output DWARF in-memory for the
> assembly it produces would be a massive effort.
>
> I guess what I wonder is, how little debug information can we get away
> with if we only want to traverse the stack and print the name of each
> frame? This is why I was interested in the support from gdbjit for a
> custom debug format.
>
> An alternative to using gdbjit, as mentioned earlier in this thread,
> would be to generate data structures (structs) at a well-known
> symbol/address that can easily be consumed from various plugins/tools.
> The reason for using such an approach is to try to keep the maintenance
> work for each plugin/tool as low as possible.
>
> Thanks,
> Erik
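
To make the "structs at a well-known symbol" alternative above more concrete,
here is a minimal sketch in C of what such a table could look like. Everything
in it is an assumption made for illustration: the symbol name
hypothetical_jvm_frame_table, the field layout, the fixed capacity and the
idea of a fixed per-method frame size are all hypothetical and do not exist in
HotSpot. It only records what is needed to map a PC to a frame name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; none of these names exist in HotSpot. */
#define FRAME_TABLE_MAGIC    0x4a464d54u  /* arbitrary tag chosen for this sketch */
#define FRAME_TABLE_CAPACITY 64

struct frame_entry {
    uint64_t code_begin;  /* first address of the generated code */
    uint64_t code_end;    /* one past the last address */
    uint32_t frame_size;  /* assumed fixed frame size in bytes */
    char     name[52];    /* NUL-terminated name, truncated if too long */
};

struct frame_table {
    uint32_t magic;       /* lets a tool check it found the right object */
    uint32_t version;     /* bumped whenever the layout changes */
    uint32_t entry_size;  /* sizeof(struct frame_entry) as seen by the VM */
    uint32_t count;       /* number of valid entries */
    struct frame_entry entries[FRAME_TABLE_CAPACITY];
};

/* Global with a fixed name, so a tool can look it up by symbol. */
struct frame_table hypothetical_jvm_frame_table = {
    .magic      = FRAME_TABLE_MAGIC,
    .version    = 1,
    .entry_size = sizeof(struct frame_entry),
    .count      = 0
};

/* Append one entry; returns 0 on success, -1 if full or the range is empty. */
int frame_table_add(struct frame_table *t, uint64_t begin, uint64_t end,
                    uint32_t frame_size, const char *name)
{
    assert(t != NULL && name != NULL);
    if (t->count >= FRAME_TABLE_CAPACITY || begin >= end)
        return -1;

    struct frame_entry *e = &t->entries[t->count];
    e->code_begin = begin;
    e->code_end   = end;
    e->frame_size = frame_size;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';

    t->count++;  /* count the entry only after its fields are filled in */
    return 0;
}

/* What a plugin would do: find the entry whose code range contains pc. */
const struct frame_entry *frame_table_find(const struct frame_table *t,
                                           uint64_t pc)
{
    for (uint32_t i = 0; i < t->count; i++) {
        const struct frame_entry *e = &t->entries[i];
        if (pc >= e->code_begin && pc < e->code_end)
            return e;
    }
    return NULL;
}
```

A tool would read the table out of the target process and perform the
equivalent of frame_table_find() on each PC it encounters. The version and
entry_size fields are there so that a plugin can detect a layout it does not
understand instead of misreading it, which is the maintenance concern raised
above.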