RFR: 8350111: [PPC] AsyncGetCallTrace crashes when called while handling SIGTRAP [v3]
Andrei Pangin
apangin at openjdk.org
Thu Feb 27 01:55:00 UTC 2025
On Wed, 26 Feb 2025 14:06:20 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>> With this change `JavaThread::pd_get_top_frame_for_profiling()` fails if the current thread is found to be `_thread_in_Java` but the CodeCache does not contain its pc.
>>
>> This will prevent crashes as described by the JBS item.
>>
>> The fix might be too conservative for situations where a thread doesn't change its thread state when calling native code, e.g. using the Foreign Function & Memory API. The difficulty in finding a less defensive fix is that one must detect whether a valid pc can be found in the caller's ABI before constructing that frame.
>>
>> Testing:
>>
>> * DaCapo Tomcat with async-profiler on a fastdebug build.
>> * Tier 1-4 of hotspot and jdk on the main platforms and also on Linux/PPC64le and AIX.
>
> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision:
>
> Improve whitespace
A couple of comments for the record.
Detecting another signal handler on the stack or blocking SIGPROF inside a handler is not a solution: the signal number that a profiler uses is configurable, and there may be multiple profilers working at the same time, or one profiler working in dual mode (cpu + wall clock).
In any case, the problem is not specific to signal handlers: it may happen with any frame that does not store the frame pointer at a known location. A typical example is the `clock_gettime` function called from `System.currentTimeMillis` and `System.nanoTime`. If libc is compiled without frame pointers, the JVM fails to unwind `clock_gettime`. Note that `currentTimeMillis` and `nanoTime` are JVM intrinsics: they do not perform a regular state transition, so a thread remains `in_Java` while executing `clock_gettime`. A signal trampoline is just another example of code with an uncommon frame layout (not only on PPC).
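For illustration, a minimal Java sketch of the kind of workload where this matters: a tight loop around `System.nanoTime`. The class and method names (`NanoTimeLoop`, `spin`) and the default iteration count are made up for this example. Once the loop is JIT-compiled, a sampling signal that arrives during the time query would be expected to find the thread still `in_Java` with a pc outside the CodeCache, though whether the call is actually intrinsified depends on the JIT and the platform.

```java
public class NanoTimeLoop {
    // Written on every iteration so the JIT cannot discard the nanoTime calls.
    static volatile long sink;

    // Calls System.nanoTime() `iterations` times and returns how many calls were made.
    static long spin(long iterations) {
        long calls = 0;
        for (long i = 0; i < iterations; i++) {
            sink = System.nanoTime(); // the call discussed above (clock_gettime in native code)
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Hypothetical default; choose a count large enough for the loop to be JIT-compiled.
        long n = args.length > 0 ? Long.parseLong(args[0]) : 50_000_000L;
        System.out.println("nanoTime calls: " + spin(n));
    }
}
```

Running this under a profiler that samples via AsyncGetCallTrace should make samples landing inside `clock_gettime` fairly frequent, which is what makes it a convenient stress case.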
I'm OK with the proposed fix as long as it reduces the possibility of crashes, but it's likely not a bullet-proof solution. Any native frame that does not belong to `libjvm.so` is potentially dangerous to walk.
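As a toy model only (this is not the HotSpot source, and every name here is hypothetical), the condition described in the PR can be written as a small Java predicate: a sample is rejected when the thread state is "in Java" but the interrupted pc lies outside the code cache. The code cache is modelled as a single address range, and all other thread states are simply let through because they are outside the scope of this check.

```java
public class TopFrameGuard {
    // Simplified stand-ins for the JVM thread states mentioned in the thread.
    enum ThreadState { IN_JAVA, IN_NATIVE, IN_VM }

    // Hypothetical model of the code cache as one address range [low, high).
    static final class CodeCache {
        final long low;
        final long high;

        CodeCache(long low, long high) {
            this.low = low;
            this.high = high;
        }

        boolean contains(long pc) {
            return pc >= low && pc < high;
        }
    }

    // Returns false for the one case the fix rejects: thread in Java, pc not in the code cache.
    // Every other combination passes this check (it may still fail elsewhere in a real JVM).
    static boolean passesPcCheck(ThreadState state, long pc, CodeCache cache) {
        if (state == ThreadState.IN_JAVA && !cache.contains(pc)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        CodeCache cache = new CodeCache(0x1000L, 0x2000L);
        System.out.println(passesPcCheck(ThreadState.IN_JAVA, 0x1800L, cache)); // pc inside the range
        System.out.println(passesPcCheck(ThreadState.IN_JAVA, 0x3000L, cache)); // pc outside the range
    }
}
```

The model also shows why the check is conservative: an intrinsic or FFM call that leaves the thread `in_Java` while it runs native code is rejected, even if the frame could in principle have been walked.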
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23641#issuecomment-2686603796