RFR: 8293166: jdk/jfr/jvm/TestDumpOnCrash.java fails on Linux ppc64le and Linux aarch64 [v2]

Ralf Schmelter rschmelter at openjdk.org
Wed Nov 9 19:28:31 UTC 2022


On Wed, 9 Nov 2022 14:42:11 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> IIUC the problem is that JFR dumper attempts a SafePoint while in fatal error reporting.

It doesn't attempt a safepoint; it is stopped at the native to "in VM" transition because a safepoint is pending. And the safepoint will never begin, since the sampler thread has crashed, leaving the thread in the "in VM" state, which is not safepoint safe.
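
To make the stall concrete, here is a toy model of the situation described above. It is only a sketch, not HotSpot code: the state names, the set of safepoint-safe states, and the helper functions are assumptions made for illustration, and "the thread" is read as the crashed sampler thread.

```python
from enum import Enum


class State(Enum):
    IN_JAVA = "in Java"
    IN_NATIVE = "in native"
    IN_VM = "in VM"
    BLOCKED = "blocked"


# Assumption for this toy model: only these states count as safepoint safe.
SAFEPOINT_SAFE = {State.IN_NATIVE, State.BLOCKED}


def safepoint_can_begin(threads):
    """A safepoint begins only once every thread is in a safepoint-safe state."""
    return all(state in SAFEPOINT_SAFE for state in threads.values())


def may_enter_vm(safepoint_pending):
    """A native to "in VM" transition is held back while a safepoint is pending."""
    return not safepoint_pending


threads = {
    "sampler": State.IN_VM,     # crashed while "in VM" and never leaves that state
    "dumper": State.IN_NATIVE,  # wants to transition to "in VM" to write the dump
}
safepoint_pending = True

dumper_proceeds = may_enter_vm(safepoint_pending)
safepoint_begins = safepoint_can_begin(threads)
print(dumper_proceeds, safepoint_begins)
```

In this model neither condition can ever change: the dumper waits for the pending safepoint to finish, and the safepoint waits for a thread that will never leave the "in VM" state.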

> Another thread (the JFR sampler thread) happens to be in Java, attempts to enter the safepoint, but gets stuck because we switched the signal handler and nobody is there to handle SIGTRAP.

The sampler thread crashes because it calls a tier 1 compiled method for which we have already created a higher-tier one. To make the switch to the higher-tier method, a trap instruction is written to the verified entry point of the tier 1 method on aarch64 and ppc64. (Technically, on ppc64 a trap is used in debug VMs and when the jump offset is too large; on aarch64 it is used only if the jump offset is too large, where the limit is 128 M in a product build and only 2 M in a debug build.)

> The underlying problem is that we don't handle SafePoint faults in the crash handler.

We also don't handle implicit null checks and similar cases that rely on the signal handler.

Cheers,
Ralf

-------------

PR: https://git.openjdk.org/jdk/pull/10943

