From inakonechnyy at openjdk.org Mon Aug 1 13:45:53 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Mon, 1 Aug 2022 13:45:53 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v17]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with three additional commits since the last revision:

 - Skip an assert in LinuxAttachOperation::effectively_complete if it was called from crac checkpoint handling
 - corrected initialization ordering
 - review notes

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/ccf84467..fe6fab9e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=16
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=15-16

  Stats: 28 lines in 4 files changed: 9 ins; 4 del; 15 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Tue Aug 2 22:04:18 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Tue, 2 Aug 2022 22:04:18 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v18]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with one additional commit since the last revision:

  corrections for ThreadBlockInVM in effectively_complete

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/fe6fab9e..5978c933

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=17
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=16-17

  Stats: 18 lines in 3 files changed: 7 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Wed Aug 3 05:36:16 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Wed, 3 Aug 2022 05:36:16 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v19]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with one additional commit since the last revision:

  corrections

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/5978c933..c096d94d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=18
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=17-18

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Fri Aug 5 11:47:47 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Fri, 5 Aug 2022 11:47:47 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v20]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with one additional commit since the last revision:

  java core crac corections

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/c096d94d..6a2b03b7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=19
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=18-19

  Stats: 54 lines in 1 file changed: 16 ins; 36 del; 2 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Wed Aug 10 15:04:03 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Wed, 10 Aug 2022 15:04:03 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:

 - implement checkpointRestoreLocked
 - Revert "java core crac corections"

   This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/6a2b03b7..c58902f3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=20
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=19-20

  Stats: 43 lines in 1 file changed: 25 ins; 16 del; 2 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From akozlov at openjdk.org Thu Aug 11 15:47:04 2022
From: akozlov at openjdk.org (Anton Kozlov)
Date: Thu, 11 Aug 2022 15:47:04 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Aug 2022 15:04:03 GMT, Ilarion Nakonechnyy wrote:

>> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()
>
> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
>
> - implement checkpointRestoreLocked
> - Revert "java core crac corections"
>
> This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.

Changes requested by akozlov (Lead).

src/hotspot/os/linux/attachListener_linux.cpp line 68:

> 66: volatile int LinuxAttachListener::_listener = -1;
> 67: bool LinuxAttachListener::_atexit_registered = false;
> 68: AttachOperation* LinuxAttachListener::_jcmdOperation = NULL;

This is not necessarily jcmd. Let's call it `_attach_op` and rename the corresponding getter/setter.

src/hotspot/os/linux/attachListener_linux.cpp line 359:

> 357:
> 358: if (_effectively_completed) {
> 359: return;

An assert here would catch lost output that won't be reported anywhere.
Suggestion:

    assert(st->size() == 0, "no lost output");
    return;

src/hotspot/os/linux/os_linux.cpp line 5731:

> 5729:
> 5730: static void print_resources(outputStream * ostream, const char* msg, ... ) {
> 5731: outputStream * ou = (ostream == NULL) ? tty : ostream;

You can probably move this check into VM_Crac, then get rid of these helper functions as trivial ones

src/hotspot/os/linux/os_linux.cpp line 6213:

> 6211:
> 6212: if (_vm_inited_fds.get_state(i, FdsInfo::CLOSED) != FdsInfo::CLOSED) {
> 6213: print_resources(ostream, "OK: inherited from process env");

There is an extra newline between these outputs. This code used to produce a single line for each FD; now there are two lines.

src/hotspot/os/linux/os_linux.cpp line 6243:

> 6241: }
> 6242: details = sock_details(details, detailsbuf, sizeof(detailsbuf));
> 6243: print_resources(ostream, "issock, details2=\"%s\" ", details);

"issock" is redundant: we have already printed "type=socket".

src/hotspot/os/linux/os_linux.cpp line 6272:

> 6270: trace_cr(ostream, "Checkpoint ...");
> 6271: // If execution comes here, assumme that further all be ok.
> 6272: report_ok_to_jcmd();

Starting from this point all output should go somewhere, e.g. `tty`.

src/hotspot/share/services/diagnosticCommand.cpp line 1047:

> 1045: JavaCallArguments args;
> 1046: args.push_long((jlong )output());
> 1047: args.push_long((jlong )LinuxAttachListener::get_jcmdOperation());

You apparently don't need to pass the attach operation through Java code to JVM_Checkpoint and eventually os::Linux::checkpoint: you can get the operation right there with the same call.

src/hotspot/share/services/diagnosticCommand.cpp line 1052:

> 1050: vmSymbols::checkpointRestereInternal_signature(), &args, CHECK);
> 1051: jvalue* jv = (jvalue*) result.get_value_addr();
> 1052: oop str = cast_to_oop(jv->l);

    oop str = result.get_oop()

src/hotspot/share/services/diagnosticCommand.cpp line 1055:

> 1053: if (str != NULL) {
> 1054: char* out = java_lang_String::as_utf8_string(str);
> 1055: if (out) {

I believe these conditions are always true. When I run with CRAllowToSkipCheckpoint, I get

    $ jcmd Main JDK.checkpoint
    787055:
    CR: Skip Checkpoint
    An exception during a checkpoint operation:

although no exceptions are thrown.

src/java.base/share/classes/jdk/crac/Core.java line 266:

> 264: try {
> 265: checkpointRestoreLocked(outputStream_p, jcmd_p);
> 266: } catch (CheckpointException | RestoreException e) {

RestoreException may appear only after restore, when the jcmd channel does not exist anymore. Let's report all RestoreExceptions to the console.

This can be checked with:

    class TestResource implements Resource {
        @Override
        public void afterRestore(Context context) {
            throw new RuntimeException("restore");
        }
    }

-------------

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Fri Aug 12 11:46:16 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Fri, 12 Aug 2022 11:46:16 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Aug 2022 18:33:32 GMT, Anton Kozlov wrote:

>> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - implement checkpointRestoreLocked
>> - Revert "java core crac corections"
>>
>> This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.
>
> src/hotspot/os/linux/os_linux.cpp line 6272:
>
>> 6270: trace_cr(ostream, "Checkpoint ...");
>> 6271: // If execution comes here, assumme that further all be ok.
>> 6272: report_ok_to_jcmd();
>
> Starting from this point all output should go somewhere, e.g. `tty`.

Further it goes to `tty` now:

    static int checkpoint_restore(int *shmid) {
    ...
      if (CRTraceStartupTime) {
        tty->print_cr("STARTUPTIME " JLONG_FORMAT " restore-native", os::javaTimeNanos());
      }

      if (info.si_code != SI_QUEUE || info.si_int < 0) {
        tty->print("JVM: invalid info for restore provided: %s", info.si_code == SI_QUEUE ? "queued" : "not queued");
        if (info.si_code == SI_QUEUE) {
          tty->print(" code %d", info.si_int);
    ...

-------------

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Fri Aug 12 13:47:44 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Fri, 12 Aug 2022 13:47:44 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Aug 2022 17:36:38 GMT, Anton Kozlov wrote:

>> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - implement checkpointRestoreLocked
>> - Revert "java core crac corections"
>>
>> This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.
>
> src/hotspot/os/linux/os_linux.cpp line 5731:
>
>> 5729:
>> 5730: static void print_resources(outputStream * ostream, const char* msg, ... ) {
>> 5731: outputStream * ou = (ostream == NULL) ? tty : ostream;
>
> You can probably move this check into VM_Crac, then get rid of these helper functions as trivial ones

Moved the output stream selection into the VM_Crac constructor, but I propose to keep these functions, since they hide the checks on `CRPrintResourcesOnCheckpoint` and `CRTrace`.

-------------

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Fri Aug 12 16:56:44 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Fri, 12 Aug 2022 16:56:44 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

On Thu, 11 Aug 2022 15:06:25 GMT, Anton Kozlov wrote:

>> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - implement checkpointRestoreLocked
>> - Revert "java core crac corections"
>>
>> This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.
>
> src/hotspot/os/linux/attachListener_linux.cpp line 359:
>
>> 357:
>> 358: if (_effectively_completed) {
>> 359: return;
>
> An assert here would catch lost output that won't be reported anywhere.
> Suggestion:
>
>     assert(st->size() == 0, "no lost output");
>     return;

A straightforward implementation of this assertion triggers it at restore.
It seems that `LinuxAttachListener::write_fully()` doesn't reset the buffer_pos in `bufferedStream`, so `bufferedStream->size()` is non-zero.

It is probably worth adding `bufferedStream->reset()` to `LinuxAttachOperation::write_operation_result()`.

-------------

PR: https://git.openjdk.org/crac/pull/10

From akozlov at openjdk.org Mon Aug 15 11:35:12 2022
From: akozlov at openjdk.org (Anton Kozlov)
Date: Mon, 15 Aug 2022 11:35:12 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v21]
In-Reply-To:
References:
Message-ID:

On Fri, 12 Aug 2022 16:53:12 GMT, Ilarion Nakonechnyy wrote:

>> src/hotspot/os/linux/attachListener_linux.cpp line 359:
>>
>>> 357:
>>> 358: if (_effectively_completed) {
>>> 359: return;
>>
>> An assert here would catch lost output that won't be reported anywhere.
>> Suggestion:
>>
>>     assert(st->size() == 0, "no lost output");
>>     return;
>
> A straightforward implementation of this assertion triggers it at restore.
> It seems that `LinuxAttachListener::write_fully()` doesn't reset the buffer_pos in `bufferedStream`, so `bufferedStream->size()` is non-zero.
>
> It is probably worth adding `bufferedStream->reset()` to `LinuxAttachOperation::write_operation_result()`.

I believe the data there is the text I observe with CRAllowToSkipCheckpoint. So the root cause should be fixed (the text that won't be shown).

>> src/hotspot/os/linux/os_linux.cpp line 6272:
>>
>>> 6270: trace_cr(ostream, "Checkpoint ...");
>>> 6271: // If execution comes here, assumme that further all be ok.
>>> 6272: report_ok_to_jcmd();
>>
>> Starting from this point all output should go somewhere, e.g. `tty`.
>
> Further it goes to `tty` now:
>
>
>     static int checkpoint_restore(int *shmid) {
>     ...
>       if (CRTraceStartupTime) {
>         tty->print_cr("STARTUPTIME " JLONG_FORMAT " restore-native", os::javaTimeNanos());
>       }
>
>       if (info.si_code != SI_QUEUE || info.si_int < 0) {
>         tty->print("JVM: invalid info for restore provided: %s", info.si_code == SI_QUEUE ? "queued" : "not queued");
>         if (info.si_code == SI_QUEUE) {
>           tty->print(" code %d", info.si_int);
>     ...

It's easy to mess up and call `trace_cr(ostream, ...)` accidentally. Let's do this here:

    ostream = tty

-------------

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Tue Aug 16 12:16:30 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Tue, 16 Aug 2022 12:16:30 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v22]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:

 - review notes address
 - style corrections

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/c58902f3..ba53bdb6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=21
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=20-21

  Stats: 82 lines in 10 files changed: 10 ins; 5 del; 67 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Tue Aug 16 13:01:40 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Tue, 16 Aug 2022 13:01:40 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v23]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with one additional commit since the last revision:

  corrections

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/ba53bdb6..3ec73c92

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=22
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=21-22

  Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10

From inakonechnyy at openjdk.org Thu Aug 18 16:56:52 2022
From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy)
Date: Thu, 18 Aug 2022 16:56:52 GMT
Subject: [crac] RFR: Report checkpoint processing to jcmd [v24]
In-Reply-To:
References:
Message-ID:

> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()

Ilarion Nakonechnyy has updated the pull request incrementally with one additional commit since the last revision:

  replaced strlen with termination character check

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/10/files
  - new: https://git.openjdk.org/crac/pull/10/files/3ec73c92..f64b661e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=10&range=23
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=10&range=22-23

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/crac/pull/10.diff
  Fetch: git fetch https://git.openjdk.org/crac pull/10/head:pull/10

PR: https://git.openjdk.org/crac/pull/10
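
The stream handling discussed in this thread (select the output stream once in the VM operation's constructor, deliver the jcmd reply exactly once, assert that nothing is left buffered afterwards, then switch to `tty`) can be sketched roughly as below. This is a minimal standalone illustration, not the HotSpot code: `AttachOp`, `CheckpointOp`, and the use of `std::ostringstream` in place of `bufferedStream` are hypothetical stand-ins chosen for the example, and resetting the buffer after delivery follows the proposal made in the thread rather than a confirmed final implementation.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the attach operation that owns the jcmd reply.
class AttachOp {
 public:
  explicit AttachOp(std::string& client) : client_(client) {}

  // Delivers the buffered reply to the client exactly once.
  void effectively_complete(std::ostringstream& st) {
    if (completed_) {
      // Anything still buffered here would never reach the client.
      assert(st.str().empty() && "no lost output");
      return;
    }
    client_ += st.str();
    st.str("");  // reset, so delivered text is not counted as pending
    completed_ = true;
  }

 private:
  std::string& client_;
  bool completed_ = false;
};

// Hypothetical stand-in for the VM operation performing the checkpoint.
class CheckpointOp {
 public:
  // Stream selection happens once, in the constructor: fall back to the
  // console stream when no jcmd stream was supplied.
  CheckpointOp(std::ostringstream* jcmd_out, std::ostream& tty, AttachOp* op)
      : jcmd_out_(jcmd_out), tty_(tty), op_(op),
        out_(jcmd_out != nullptr ? static_cast<std::ostream*>(jcmd_out)
                                 : &tty) {}

  void trace(const std::string& msg) { *out_ << msg << '\n'; }

  void doit() {
    trace("Checkpoint ...");
    if (op_ != nullptr && jcmd_out_ != nullptr) {
      op_->effectively_complete(*jcmd_out_);  // reply is sent to jcmd here
    }
    out_ = &tty_;  // the jcmd channel is gone from here on; use the console
    trace("restore-native");
  }

 private:
  std::ostringstream* jcmd_out_;
  std::ostream& tty_;
  AttachOp* op_;
  std::ostream* out_;
};
```

Reassigning the stream pointer right after the reply is sent means a later `trace(...)` call cannot accidentally write into the already-delivered jcmd buffer, which is the failure mode the assert is meant to catch.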