RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2]
Christian Hagedorn
chagedorn at openjdk.java.net
Fri Jan 28 09:22:14 UTC 2022
On Thu, 27 Jan 2022 09:26:50 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> Hi Christian, this is very nice and useful!
Thanks Thomas!
> Two general remarks. One concern I have is that the new functionality should be super stable, since nothing is more annoying than to crash during stack dumping in an hs-err file; I would much rather have a call stack without bells and whistles than an abridged one. Maybe we could, in hs-err printing, if we got secondary crashes during callstack dumping, repeat the step with all optional features (also name demangling) disabled? This could also be done in a separate RFE. We'll know when this happens, we can react then.
I absolutely agree - stability should be the primary concern. An incomplete hs-err file should be avoided at any cost. An additional "catch and repeat without optional features" step sounds like an interesting way to get more safety. Would such a thing be easy to add? And yes, it might be better to do that in a separate RFE.
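To make the "catch and repeat without optional features" idea concrete, here is a minimal Python sketch of the control flow only. It assumes that a secondary crash can be modelled as an exception, which is a simplification: the real thing would live in the VM's C++ error reporting code and would have to deal with signals instead. All names (`print_stack`, `print_stack_with_fallback`, `decorate`) are hypothetical and not taken from the PR.

```python
def print_stack(frames, decorate, out):
    """Append one line per frame to `out`.

    `decorate` is an optional callback that adds extra information
    (for example source file and line) and may raise if its lookup fails.
    """
    for pc in frames:
        extra = decorate(pc) if decorate is not None else ""
        out.append(f"pc={pc:#x}{extra}")


def print_stack_with_fallback(frames, decorate):
    """Print the stack with decorations; on failure, repeat the step without them."""
    out = []
    try:
        print_stack(frames, decorate, out)
    except Exception:
        # Stand-in for a secondary crash: keep what was already written,
        # leave a note, and redo the whole step with optional features off.
        out.append("[error during decorated stack printing, retrying plain]")
        print_stack(frames, None, out)
    return out


def failing_decorate(pc):
    """Hypothetical decorator that fails on the second frame."""
    if pc == 0x2000:
        raise RuntimeError("simulated failure while reading debug info")
    return " (file.cpp:1)"


result = print_stack_with_fallback([0x1000, 0x2000], failing_decorate)
for line in result:
    print(line)
```

The point of the design is that the plain pass never calls into the optional code, so a bug there can cost at most one truncated attempt and not the whole call stack.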
> Another small concern, we parse the ELF file while dumping the stack, right? I remember having a lot of problems on Solaris when dumping callstacks, because there parsing the ELF file was really slow. And that delayed call stack printing by a lot, so much that the ErrorLogTimeout often kicked in and spoiled the crash logs for us.
Yes, the pc of a frame is parsed right when the corresponding frame is printed. The additional parsing takes some more time, but not that much. These are the timestamps from a quick `-XX:CICrashAt=1` run with `-Xlog:dwarf=info` on my local machine on `Ubuntu 20.04` with a `fastdebug` build:
[1.862s][info][dwarf] Open DWARF file: /home/christian/Downloads/test/jdk-19/fastdebug/lib/server/libjvm.debuginfo
[1.867s][info][dwarf] pc: 0x00007ffa35c8a9cf, offset: 0x007749cf, filename: c1_Compiler.cpp, line: 250
[1.871s][info][dwarf] pc: 0x00007ffa35fbfb28, offset: 0x00aa9b28, filename: compileBroker.cpp, line: 2291
[1.876s][info][dwarf] pc: 0x00007ffa35fc08e8, offset: 0x00aaa8e8, filename: compileBroker.cpp, line: 1966
[1.881s][info][dwarf] pc: 0x00007ffa36e50cca, offset: 0x0193acca, filename: thread.cpp, line: 1297
[1.890s][info][dwarf] pc: 0x00007ffa36e59010, offset: 0x01943010, filename: thread.cpp, line: 358
[1.897s][info][dwarf] pc: 0x00007ffa36b3c524, offset: 0x01626524, filename: os_linux.cpp, line: 705
Going by the gaps between consecutive log lines (4 ms to 9 ms), the parsing of a single pc takes a little less than 0.01s. Of course, this is not a valid performance test: the result highly depends on the source files themselves, the machine setup etc. But still, I think these numbers can give us some indication of the order of magnitude. Compared to the current `ErrorLogTimeout` default value of 2min this looks promising.
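The per-pc estimate can be recomputed from the log timestamps. The following is a minimal Python sketch, assuming that the time between two consecutive `dwarf` log lines approximates the cost of resolving one pc (it also includes whatever else happens in between). The helper names `to_millis` and `gaps_ms` are made up for this illustration.

```python
import re

# Timestamps and source locations copied from the log output above;
# addresses and the debuginfo path are left out for brevity.
LOG_LINES = [
    "[1.862s][info][dwarf] Open DWARF file",
    "[1.867s][info][dwarf] filename: c1_Compiler.cpp, line: 250",
    "[1.871s][info][dwarf] filename: compileBroker.cpp, line: 2291",
    "[1.876s][info][dwarf] filename: compileBroker.cpp, line: 1966",
    "[1.881s][info][dwarf] filename: thread.cpp, line: 1297",
    "[1.890s][info][dwarf] filename: thread.cpp, line: 358",
    "[1.897s][info][dwarf] filename: os_linux.cpp, line: 705",
]

TIMESTAMP = re.compile(r"^\[(\d+)\.(\d{3})s\]")


def to_millis(line):
    """Return the leading log timestamp of a line as integer milliseconds."""
    match = TIMESTAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return int(match.group(1)) * 1000 + int(match.group(2))


def gaps_ms(lines):
    """Return the gaps between consecutive log lines in milliseconds."""
    stamps = [to_millis(line) for line in lines]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


gaps = gaps_ms(LOG_LINES)
print("per-pc gaps (ms):", gaps)   # [5, 4, 5, 5, 9, 7]
print("total (ms):", sum(gaps))    # 35
```

Six lookups in about 35 ms is more than three orders of magnitude below a timeout of 2 minutes, at least for this one run.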
-------------
PR: https://git.openjdk.java.net/jdk/pull/7126