[crac] RFR: Report checkpoint processing to jcmd [v21]

Anton Kozlov akozlov at openjdk.org
Mon Aug 15 11:35:12 UTC 2022


On Fri, 12 Aug 2022 16:53:12 GMT, Ilarion Nakonechnyy <inakonechnyy at openjdk.org> wrote:

>> src/hotspot/os/linux/attachListener_linux.cpp line 359:
>> 
>>> 357: 
>>> 358:   if (_effectively_completed) {
>>> 359:     return;
>> 
>> An assert here would catch lost output that won't be reported anywhere.
>> Suggestion:
>> 
>>     assert(st->size() == 0, "no lost output");
>>     return;
>
> A straightforward implementation of this assertion triggers it at restore.
> It seems that `LinuxAttachListener::write_fully()` doesn't reset the buffer_pos in `bufferedStream`, so `bufferedStream->size()` is non-zero.
> 
> It is probably worth adding `bufferedStream->reset()` to `LinuxAttachOperation::write_operation_result()`

I believe the data left there is the text I observe with `CRAllowToSkipCheckpoint`. So the root cause (text that is written but never shown) should be fixed.
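
To make the two readings of a non-zero size concrete, here is a minimal standalone sketch. All names in it (`SketchBufferedStream`, `sketch_write_result`, `sketch_finish`) are hypothetical stand-ins and not the HotSpot classes; it only assumes a buffer that keeps its contents after being copied out unless it is reset explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a buffered stream; NOT the HotSpot bufferedStream.
class SketchBufferedStream {
 public:
  void print(const std::string& s) { _buf += s; }
  std::size_t size() const { return _buf.size(); }
  const std::string& contents() const { return _buf; }
  void reset() { _buf.clear(); }
 private:
  std::string _buf;
};

// Hypothetical "send the result" step: copies the buffer to the sink.
// Without reset_after, the already-sent bytes stay in the buffer.
inline void sketch_write_result(SketchBufferedStream& st, std::string& sink,
                                bool reset_after) {
  sink += st.contents();
  if (reset_after) {
    st.reset();
  }
}

// Hypothetical early-return path with the suggested assert.
inline void sketch_finish(const SketchBufferedStream& st) {
  assert(st.size() == 0 && "no lost output");
}
```

In this sketch, a reset after sending makes the assert meaningful: anything still in the buffer at `sketch_finish` was printed after the result went out and would be lost. Whether the real leftover is stale sent data or genuinely lost text is exactly what has to be checked in the patch.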

>> src/hotspot/os/linux/os_linux.cpp line 6272:
>> 
>>> 6270:     trace_cr(ostream, "Checkpoint ...");
>>> 6271:     // If execution comes here, assumme that further all be ok.
>>> 6272:     report_ok_to_jcmd();
>> 
>> Starting from this point all output should go somewhere, e.g. `tty`.
>
> From that point on it goes to `tty` now:
> 
> 
> static int checkpoint_restore(int *shmid) {
> ...
>   if (CRTraceStartupTime) {
>     tty->print_cr("STARTUPTIME " JLONG_FORMAT " restore-native", os::javaTimeNanos());
>   }
> 
>   if (info.si_code != SI_QUEUE || info.si_int < 0) {
>     tty->print("JVM: invalid info for restore provided: %s", info.si_code == SI_QUEUE ? "queued" : "not queued");
>     if (info.si_code == SI_QUEUE) {
>       tty->print(" code %d", info.si_int);
> ...

It's easy to slip and call `trace_cr(ostream, ...)` accidentally. Let's do this here:

ostream = tty
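
A minimal standalone sketch of that idea follows. It uses `std::ostream` and hypothetical helpers (`sketch_trace`, `sketch_checkpoint`) in place of the HotSpot `outputStream`, `trace_cr` and `tty`, so it shows only the pattern: once the result is reported, the local pointer is redirected, and a later accidental trace through it still ends up somewhere visible.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for trace_cr(): writes one line to the given stream.
inline void sketch_trace(std::ostream* os, const char* msg) {
  *os << msg << '\n';
}

// Hypothetical checkpoint flow; jcmd_out and tty_out are assumptions
// standing in for the jcmd reply stream and tty.
inline void sketch_checkpoint(std::ostream* jcmd_out, std::ostream* tty_out) {
  assert(jcmd_out != nullptr && tty_out != nullptr);
  std::ostream* ostream = jcmd_out;
  sketch_trace(ostream, "Checkpoint ...");
  // The result is reported to jcmd at this point; further writes to
  // jcmd_out would not be delivered, so redirect the local pointer.
  ostream = tty_out;
  // A later call through the same variable now goes to the tty stand-in.
  sketch_trace(ostream, "restore-native");
}
```

With the redirect in place, no call site after the report has to remember which stream is still valid.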

-------------

PR: https://git.openjdk.org/crac/pull/10

