RFR: 8369994: Mixed mode jhsdb jstack cannot resolve symbol with cold attribute [v2]
Kevin Walls
kevinw at openjdk.org
Thu Oct 30 12:53:27 UTC 2025
On Mon, 20 Oct 2025 01:23:45 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:
>> `jhsdb jstack --mixed` with coredump cannot resolve function symbol which has `.cold` attribute.
>>
>>
>> ----------------- 120485 -----------------
>> "Thread-0" #24 prio=5 tid=0x00007f50dc1aa7c0 nid=120485 waiting on condition [0x00007f50c0d1a000]
>> java.lang.Thread.State: TIMED_WAITING (sleeping)
>> JavaThread state: _thread_blocked
>> 0x00007f50e4710735 __GI_abort + 0x8b
>> 0x00007f50e1e01f33 ????????
>>
>>
>> 0x7f50e1e01f33 was `os::abort(bool, void const*, void const*) [clone .cold]`, and I could see it in GDB. However, it has a `.cold` suffix, which means the code has been relocated as a ["cold" function](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute). In GDB, we can see that the code lives in a different area from the function body, as follows:
>>
>>
>> (gdb) disas 0x7f50e1e01f2e, 0x7f50e1e01f34
>> Dump of assembler code from 0x7f50e1e01f2e to 0x7f50e1e01f34:
>> 0x00007f50e1e01f2e <_ZN2os5abortEbPKvS1_.cold+0>: call 0x7f50e1e01010 <abort@plt>
>> => 0x00007f50e1e01f33: nop
>> End of assembler dump.
>>
>>
>> libsaproc.so resolves a symbol by checking whether the address lies between `start` and `start + size - 1`. As you can see in the assembler dump, the code in the `.cold` section is a `call` instruction, so the IP points to the following `nop`; thus we should allow the address range between `start` and `start + size`.
>>
>> After this PR, you can see the right symbol as following:
>>
>>
>> ----------------- 120485 -----------------
>> "Thread-0" #24 prio=5 tid=0x00007f50dc1aa7c0 nid=120485 waiting on condition [0x00007f50c0d1a000]
>> java.lang.Thread.State: TIMED_WAITING (sleeping)
>> JavaThread state: _thread_blocked
>> 0x00007f50e4710735 __GI_abort + 0x8b
>> 0x00007f50e1e01f33 os::abort(bool, void const*, void const*) [clone .cold] + 0x5
>
> Yasumasa Suenaga has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add fallback code to process DWARF with RIP-1 in Linux AMD64
> - Revert "8369994: Mixed mode jhsdb jstack cannot resolve symbol with cold attribute"
>
> This reverts commit 570b65c6b56ba3378d4f532fa0874ff08ff18451.
Sorry, when I wrote: "If the DWARF lookup works at RIP-1, make the closestSymbol call always use RIP-1."
I was just thinking out loud and summarising that the change tries RIP first, and on failure tries DWARF using RIP-1; if that works, it then always and only uses RIP-1 for the symbol.
This is why I was wondering whether there is a danger of DWARF resolution and symbol resolution being too closely tied together,
i.e. maybe the closest-symbol lookup should do its own fallback, independent of how DWARF was resolved.
This is probably not a big deal.
Yes, it's essential to look up with the real RIP first: we really should get DWARF and the symbol resolved using that pc address if it's anywhere earlier in the function, including the prologue.
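
To make the "independent fallback" idea concrete, here is a minimal sketch of a symbol lookup that tries the real pc first and only then retries at pc-1, without consulting any DWARF result. Everything in it is hypothetical: `SymbolTableSketch`, `Sym`, `add` and `describe` are illustration-only names, not the SA's or libsaproc's actual API, and the addresses are made up to mirror the `.cold` case from the PR description (a 5-byte `call` followed by the return address).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names are the SA's real classes.
final class SymbolTableSketch {
    // One symbol covering the half-open address range [start, start + size).
    static final class Sym {
        final String name;
        final long start;
        final long size;

        Sym(String name, long start, long size) {
            this.name = name;
            this.start = start;
            this.size = size;
        }

        boolean contains(long pc) {
            return pc >= start && pc < start + size;
        }
    }

    private final TreeMap<Long, Sym> byStart = new TreeMap<>();

    void add(String name, long start, long size) {
        byStart.put(start, new Sym(name, start, size));
    }

    // Strict lookup: the symbol whose range contains pc, or null.
    Sym closestSymbol(long pc) {
        Map.Entry<Long, Sym> e = byStart.floorEntry(pc);
        return (e != null && e.getValue().contains(pc)) ? e.getValue() : null;
    }

    // Try the real pc first; fall back to pc - 1 only if that fails.
    // The offset is always reported relative to the real pc.
    String describe(long pc) {
        Sym s = closestSymbol(pc);
        if (s == null) {
            s = closestSymbol(pc - 1);
        }
        if (s == null) {
            return "????????";
        }
        return s.name + " + 0x" + Long.toHexString(pc - s.start);
    }
}
```

With a symbol registered at `0x1f2e` with size 5, `describe(0x1f33L)` misses on the strict lookup and succeeds at pc-1, reporting an offset of `0x5`. If another symbol starts exactly at `0x1f33`, the strict lookup wins and the fallback is never reached, which is the reason for trying the real pc first.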
-------------
PR Comment: https://git.openjdk.org/jdk/pull/27846#issuecomment-3467835129