RFR: 8371014: Dump JFR recording on CrashOnOutOfMemoryError is incorrectly implemented
The jtreg test TestEmergencyDumpAtOOM.java runs into the following error on ppc64 platforms.

JFR emergency dump would be kicked at `VMError::report_and_die()`, then Java thread for JFR would not work due to secondary signal handler for error reporting.

Passed all of jdk_jfr tests on Linux AMD64.

-------------

Commit messages:
 - Fix typo
 - Delete TestEmergencyDumpAtOOM.java from ProblemList
 - 8371014: Dump JFR recording on CrashOnOutOfMemoryError is incorrectly implemented

Changes: https://git.openjdk.org/jdk/pull/28563/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28563&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8371014
  Stats: 31 lines in 8 files changed: 23 ins; 3 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/28563.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28563/head:pull/28563

PR: https://git.openjdk.org/jdk/pull/28563
On Sat, 29 Nov 2025 06:06:16 GMT, Yasumasa Suenaga <ysuenaga@openjdk.org> wrote:
The jtreg test TestEmergencyDumpAtOOM.java runs into the following error on ppc64 platforms.
JFR emergency dump would be kicked at `VMError::report_and_die()`, then Java thread for JFR would not work due to secondary signal handler for error reporting.
Passed all of jdk_jfr tests on Linux AMD64.
With your PR added, we do not observe the error in test TestEmergencyDumpAtOOM any more.

src/hotspot/share/jfr/jfr.cpp line 159:

157:
158: void Jfr::on_vm_error_report(outputStream* st) {
159:   assert(!JfrRecorder::is_recording(), "JFR should be stopped at erorr reporting");

'erorr' - please fix the little typo!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3605855107
PR Review Comment: https://git.openjdk.org/jdk/pull/28563#discussion_r2584283966
I think this makes sense, but should also be reviewed by JFR folks.

-------------

Marked as reviewed by mdoerr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28563#pullrequestreview-3537010567
I'm still waiting for a second reviewer. @mgronlun Can you take a look?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3703460604
Marked as reviewed by mbaesken (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/28563#pullrequestreview-3622449994
This will not work because there is still a race against the JFR Recorder Thread flushing concurrently with LeakProfiler::emit_events().

This can place the checkpoints and events in a segment before the corresponding classes and methods that were tagged as part of emit_events(). This will break the parser, since constant artifacts will not be resolvable (an invariant is that a flushed segment is self-contained).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3709513877
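
The invariant mentioned in the comment above (a flushed segment is self-contained) can be illustrated with a small model. This is only a sketch under stated assumptions: `Event`, `Segment`, `is_self_contained`, `racy_flush` and `ordered_flush` are hypothetical names invented for illustration and are not JFR's data structures or APIs. It shows why an event that lands in an earlier segment than its class constant cannot be resolved by a parser that reads segments independently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified model of a flushed recording segment.
// These types are illustrative only; they are not taken from HotSpot.
struct Event {
  std::string name;
  std::uint64_t class_id;  // reference into the segment's constant pool
};

struct Segment {
  std::unordered_set<std::uint64_t> constant_ids;  // constants written in this segment
  std::vector<Event> events;
};

// Returns true if every event resolves against constants stored in the
// same segment, i.e. the segment is self-contained.
bool is_self_contained(const Segment& segment) {
  for (const Event& e : segment.events) {
    if (segment.constant_ids.count(e.class_id) == 0) {
      return false;
    }
  }
  return true;
}

// Models the problematic interleaving: a concurrent flush closes the first
// segment after the event was written but before its constant was serialized.
std::vector<Segment> racy_flush() {
  Segment first;
  first.events.push_back(Event{"leak-sample", 42});
  Segment second;
  second.constant_ids.insert(42);
  return {first, second};
}

// Models the intended ordering: event and constant end up in one segment.
std::vector<Segment> ordered_flush() {
  Segment only;
  only.constant_ids.insert(42);
  only.events.push_back(Event{"leak-sample", 42});
  return {only};
}
```

In this model, `is_self_contained(racy_flush()[0])` is false while `is_self_contained(ordered_flush()[0])` is true, which corresponds to the "constant artifacts will not be resolvable" failure described in the comment.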
This is a very tricky problem to solve correctly, because a VM operation has been introduced as part of error reporting and the VM shutdown sequence.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3709521319
Would it be a better solution to avoid replacing the signal handler? We could keep the Java compatible handler and change it such that it calls `crash_handler` only for the thread which is reporting the error.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3709875107
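
The per-thread dispatch idea in the comment above could look roughly like the following. This is a minimal sketch, not HotSpot code: `ErrorReporter`, `Handler`, `try_claim` and `select_handler` are hypothetical names, thread ids are modeled as plain integers with 0 meaning "no reporter", and no real signal handler is installed. It only shows the decision a shared handler would make: take the crash path for the thread that is reporting the error and the normal, Java compatible path for every other thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Which path a shared signal handler would take (illustrative only).
enum class Handler { JavaCompatible, Crash };

// Assumption for this sketch: thread ids are non-zero integers.
constexpr std::uint64_t kNoReporter = 0;

class ErrorReporter {
 public:
  // Called by a thread that starts error reporting.
  // Returns true only for the first thread that claims the role.
  bool try_claim(std::uint64_t tid) {
    std::uint64_t expected = kNoReporter;
    return reporter_.compare_exchange_strong(expected, tid);
  }

  // Decision made inside the shared handler: only the reporting thread
  // is routed to the crash path, all other threads keep normal handling.
  Handler select_handler(std::uint64_t tid) const {
    const bool is_reporter = tid != kNoReporter && reporter_.load() == tid;
    return is_reporter ? Handler::Crash : Handler::JavaCompatible;
  }

 private:
  std::atomic<std::uint64_t> reporter_{kNoReporter};
};
```

With this shape, a thread that still has work to do during error reporting would continue to see the normal handler, while a secondary fault on the reporting thread would be routed to the crash path. Whether that is sufficient for the emergency dump is exactly what the thread leaves open.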
On Mon, 5 Jan 2026 10:37:51 GMT, Martin Doerr <mdoerr@openjdk.org> wrote:
Would it be a better solution to avoid replacing the signal handler? We could keep the Java compatible handler and change it such that it calls `crash_handler` only for the thread which is reporting the error.
I am thinking about some alternatives.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3709957825
Alternative implementation suggestion PR (in draft state): https://github.com/openjdk/jdk/pull/29094

It also includes a solution to [JDK-8373257](https://bugs.openjdk.org/browse/JDK-8373257).

@tstuefe Please take a look, and also if you can, submit for testing on your platforms.

Markus

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3719105625
On Wed, 7 Jan 2026 14:20:34 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Alternative implementation suggestion PR (in draft state) https://github.com/openjdk/jdk/pull/29094
Includes also a solution to [JDK-8373257](https://bugs.openjdk.org/browse/JDK-8373257) @tstuefe
Please take a look, and also if you can, submit for testing on your platforms.
Markus
Thanks a lot @mgronlun! I think JDK-8371014 (and JDK-8373257) should be tackled in #29094. So should I close this PR?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3721527637
Yes, I think we should do it in https://github.com/openjdk/jdk/pull/29094. You can close this one, and I will officially publish https://github.com/openjdk/jdk/pull/29094.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28563#issuecomment-3723622934
This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/28563
participants (4)
- Markus Grönlund
- Martin Doerr
- Matthias Baesken
- Yasumasa Suenaga