RFR(M): 8247272: SA ELF file support has never worked for 64-bit causing address to symbol name mapping to fail

Chris Plummer chris.plummer at oracle.com
Wed Jul 8 06:20:59 UTC 2020


Hello,

Please help review the following:

http://cr.openjdk.java.net/~cjplummer/8247272/webrev.00/index.html
https://bugs.openjdk.java.net/browse/JDK-8247272

The short story is that SA address to native symbol name mapping/lookup 
has never worked on 64-bit, and this is due to the java level ELF file 
support only supporting 32-bit. This CR fixes that, and I believe also 
maintains 32-bit compatibility, although I have no way of testing that.

There is more to the story however on how we got here. Before going into 
the gory detail below, I just want to point out that currently nothing 
is using this support, and therefore it is technically not fixing 
anything, although I did verify that the fixes work (see details below). 
Also, I intend to remove all the java level ELF file support as part of 
JDK-8247516 [1]. The only reason I want to push these changes first is 
because I already did the work to get it working with 64-bit, and would 
like to get it archived before removing it in case for some reason it is 
revived in the future.

Now for the ugly details on how we got here (and you really don't need 
to read this unless you have any concerns with what I stated above). It 
starts with the clhsdb "whatis" command, which was the only (indirect) 
user of this java level ELF file support. Its implementation is in 
javascript, so we have not had access to it ever since JDK9 module 
support broke the SA javascript support (and javascript support is now 
removed). I started the process of converting "whatis" to java. It is 
basically the same as the clhsdb "findpc" command, except it also checks 
for native symbols, which it does with the following code:

   var dso = loadObjectContainingPC(addr);
   var sym = dso.closestSymbolToPC(addr);
   return sym.name + '+' + sym.offset;
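
For illustration, a lookup of this shape might look as follows in Java. This
is only a sketch under assumptions: SymbolTableSketch, add() and the
"name+offset" formatting are hypothetical and are not the SA's DSO or
PointerFinder code. It keeps symbols sorted by start address and picks the
nearest one at or below the pc.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the SA sources.
final class SymbolTableSketch {
    // Symbols keyed by start address, ordered as unsigned 64-bit values.
    private final TreeMap<Long, String> symbolsByAddress =
        new TreeMap<>(Long::compareUnsigned);

    void add(long address, String name) {
        symbolsByAddress.put(address, name);
    }

    // Returns "name+offset" for the nearest symbol at or below pc,
    // or null when no symbol starts at or below pc.
    String closestSymbolToPC(long pc) {
        Map.Entry<Long, String> entry = symbolsByAddress.floorEntry(pc);
        if (entry == null) {
            return null;
        }
        long offset = pc - entry.getKey();
        return entry.getValue() + "+" + offset;
    }
}
```

A sorted map keeps each lookup logarithmic in the number of symbols. Without
symbol sizes, a pc that falls in a gap after a symbol is still attributed to
that symbol.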

Converting this to java was trivial. I just stuck support for it in the 
PointerFinder class, which is what findpc relies on. However, it always 
failed to look up a symbol. I found that 
DSO.closestSymbolToPC() called into the java level ELF support, and that 
was failing badly. After some debugging I noticed that the values read 
in for various ELF headers were mostly garbage. It then occurred to me 
that it was reading in 32-bit values that probably needed to be 64-bit. 
Sure enough, this code was never converted to 64-bit support. I then 
went and tried "whatis" on JDK8, the last version where it was 
available, and it failed there also with 64-bit binaries. So this is why 
I initially fixed it to work with 64-bit, and also how I tested it 
(using the modified findpc on a native symbol). But the story continues...
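
To make the 32-bit versus 64-bit difference concrete, here is a minimal
sketch of reading a few ELF header fields with the width chosen from
e_ident[EI_CLASS]. The field offsets follow the ELF specification, but the
class ElfHeaderSketch and its field names are hypothetical and are not taken
from the webrev.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not the SA's ELF file support.
final class ElfHeaderSketch {
    static final int EI_CLASS = 4;      // index of the file class byte
    static final int EI_DATA = 5;       // index of the byte order byte
    static final int ELFCLASS32 = 1;
    static final int ELFCLASS64 = 2;
    static final int ELFDATA2MSB = 2;   // big-endian encoding

    final boolean is64Bit;
    final long entry;     // e_entry
    final long phoff;     // e_phoff
    final long shoff;     // e_shoff
    final int shentsize;  // e_shentsize
    final int shnum;      // e_shnum

    ElfHeaderSketch(byte[] image) {
        if (image.length < 52 || image[0] != 0x7f || image[1] != 'E'
                || image[2] != 'L' || image[3] != 'F') {
            throw new IllegalArgumentException("not an ELF image");
        }
        int elfClass = image[EI_CLASS];
        if (elfClass != ELFCLASS32 && elfClass != ELFCLASS64) {
            throw new IllegalArgumentException("unknown ELF class");
        }
        boolean wide = elfClass == ELFCLASS64;
        if (wide && image.length < 64) {
            throw new IllegalArgumentException("truncated ELF64 header");
        }
        ByteBuffer buf = ByteBuffer.wrap(image).order(
            image[EI_DATA] == ELFDATA2MSB ? ByteOrder.BIG_ENDIAN
                                          : ByteOrder.LITTLE_ENDIAN);
        buf.position(24);            // skip e_ident, e_type, e_machine, e_version
        is64Bit = wide;
        entry = readWord(buf, wide); // 4 bytes in ELF32, 8 bytes in ELF64
        phoff = readWord(buf, wide);
        shoff = readWord(buf, wide);
        buf.getInt();                // e_flags
        buf.getShort();              // e_ehsize
        buf.getShort();              // e_phentsize
        buf.getShort();              // e_phnum
        shentsize = buf.getShort() & 0xffff;
        shnum = buf.getShort() & 0xffff;
    }

    // Address and offset fields are the ones whose width depends on the class.
    private static long readWord(ByteBuffer buf, boolean wide) {
        return wide ? buf.getLong() : (buf.getInt() & 0xffffffffL);
    }
}
```

Reading e_entry, e_phoff and e_shoff as 4-byte values from a 64-bit file
shifts every later field, which is consistent with the mostly garbage header
values described above.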

DSO.java, and as a consequence the java ELF file support, is used by all 
our posix ports to do address to symbol lookups. So I figured that after 
fixing the java level ELF file support for 64-bit, my improved findpc 
would start working on OSX also. No such luck, and for obvious reasons. 
OSX uses mach-o files. This ELF code should never have been used for it, 
and of course has never worked.
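
As an aside, the two formats are easy to tell apart from the first four
bytes of the file. The sketch below is hypothetical (ObjectFormatSketch is
not an SA class) and only checks the ELF magic and the thin Mach-O magics,
not fat/universal binaries.

```java
// Hypothetical illustration; not part of the SA sources.
final class ObjectFormatSketch {
    enum Format { ELF, MACH_O, UNKNOWN }

    // Classifies a file from its first four bytes.
    static Format detect(byte[] head) {
        if (head.length < 4) {
            return Format.UNKNOWN;
        }
        if (head[0] == 0x7f && head[1] == 'E' && head[2] == 'L' && head[3] == 'F') {
            return Format.ELF;
        }
        // Assemble the bytes in file order as one big-endian 32-bit value.
        int magic = ((head[0] & 0xff) << 24) | ((head[1] & 0xff) << 16)
                  | ((head[2] & 0xff) << 8) | (head[3] & 0xff);
        switch (magic) {
            case 0xfeedface: // MH_MAGIC, 32-bit, stored big-endian
            case 0xfeedfacf: // MH_MAGIC_64, stored big-endian
            case 0xcefaedfe: // MH_MAGIC as stored by a little-endian writer
            case 0xcffaedfe: // MH_MAGIC_64 as stored by a little-endian writer
                return Format.MACH_O;
            default:
                return Format.UNKNOWN;
        }
    }
}
```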

So I was left trying to figure out how to do OSX address to native 
symbol lookups. I then recalled that there was a 
CFrame.closestSymbolToPC() API that did address to native symbol lookups 
for native stack traces, and wondered how it was ever working (even on 
linux with the broken ELF 64-bit support). It turns out this takes a 
very different path to do the lookups, ending up in native code in 
libsaproc, where we also have ELF file support. I then converted 
DSO.closestSymbolToPC(addr) to use this libsaproc code instead, and it 
worked fine. So now there was no need for the java level ELF file 
support since its only user was DSO.closestSymbolToPC(addr). I should 
also add that this is the approach that has always been used on windows, 
with both CFrame.closestSymbolToPC() and DSO.closestSymbolToPC(addr) 
using the same libsaproc support.

There is still a bit more to the story. After diverting 
DSO.closestSymbolToPC(addr) to the libsaproc lookup code, it still 
didn't work for OSX. I thought it would just work since the native 
BsdDebuggerLocal.lookupByName0() is implemented, and it seems to trickle 
down to the proper lower level APIs to find the symbol, but there were 
two issues. The first is that for processes there is no support for 
looking up all the libraries and populating the list of ps_prochandle 
structures that are used to do the symbol lookups. This was just never 
implemented (which is also why PMap does not work for OSX processes). For core 
files the ps_prochandle structs are there, but the lookup code was badly 
broken. That has now been fixed by JDK-8247515 [2], currently out for 
review. So the end result is we'll have address to native symbol lookup 
for everything but OSX processes.

If you're still here, thanks for listening!

Chris

[1] https://bugs.openjdk.java.net/browse/JDK-8247516
[2] https://bugs.openjdk.java.net/browse/JDK-8247515





