[crac] RFR: Report checkpoint processing to jcmd [v21]

Anton Kozlov akozlov at openjdk.org
Thu Aug 11 15:47:04 UTC 2022


On Wed, 10 Aug 2022 15:04:03 GMT, Ilarion Nakonechnyy <inakonechnyy at openjdk.org> wrote:

>> pass output stream from diagnosticCommand.cpp through java code into os_linux.cpp::VM_crac::doit()
>
> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - implement checkpointRestoreLocked
>  - Revert "java core crac corections"
>    
>    This reverts commit 6a2b03b77f1ee4e6d48a120a14694d8dff9dbf39.

Changes requested by akozlov (Lead).

src/hotspot/os/linux/attachListener_linux.cpp line 68:

> 66: volatile int LinuxAttachListener::_listener = -1;
> 67: bool LinuxAttachListener::_atexit_registered = false;
> 68: AttachOperation* LinuxAttachListener::_jcmdOperation = NULL;

This is not necessarily jcmd. Let's call it `_attach_op` and rename the corresponding getter/setter.

src/hotspot/os/linux/attachListener_linux.cpp line 359:

> 357: 
> 358:   if (_effectively_completed) {
> 359:     return;

An assert here would catch lost output that won't be reported anywhere.
Suggestion:

    assert(st->size() == 0, "no lost output");
    return;

src/hotspot/os/linux/os_linux.cpp line 5731:

> 5729: 
> 5730: static void print_resources(outputStream * ostream, const char* msg, ... ) {
> 5731:   outputStream * ou = (ostream == NULL) ? tty : ostream;

You can probably move this check into VM_Crac and then get rid of these helper functions, since they become trivial.
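
A minimal sketch of that idea, under loud assumptions: `CracOp`, `report_fd` and the use of `std::ostream` are hypothetical stand-ins for VM_Crac, its callers and HotSpot's `outputStream`/`tty`; none of them come from the patch. The NULL fallback is resolved once at construction, so callers can print without a wrapper.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for VM_Crac; std::ostream replaces HotSpot's outputStream.
class CracOp {
  std::ostream* _out;  // never null after construction
public:
  // Resolve the fallback once: a null request means "use the console".
  explicit CracOp(std::ostream* requested, std::ostream* fallback = &std::cerr)
    : _out(requested != nullptr ? requested : fallback) {
    assert(_out != nullptr && "fallback stream must exist");
  }

  std::ostream& out() const { return *_out; }
};

// Callers print directly, with no per-call null check.
inline void report_fd(const CracOp& op, int fd, const std::string& state) {
  op.out() << "fd=" << fd << " " << state << "\n";
}
```

With two `std::ostringstream` objects passed as the requested and fallback streams, output goes to the first when it is non-null and to the second otherwise.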

src/hotspot/os/linux/os_linux.cpp line 6213:

> 6211: 
> 6212:     if (_vm_inited_fds.get_state(i, FdsInfo::CLOSED) != FdsInfo::CLOSED) {
> 6213:       print_resources(ostream, "OK: inherited from process env");

There is an extra newline between these outputs. This code used to produce a single line for each FD; now there are two lines.
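
To illustrate the expected shape, here is a sketch that assumes the cause is a helper ending the line on every call. `print_fd_line` and its field layout are hypothetical, and `std::ostream` stands in for the HotSpot stream. The pieces are appended without a newline and the line is terminated exactly once.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: emits exactly one line per file descriptor.
inline void print_fd_line(std::ostream& out, int fd, const std::string& type,
                          const std::string& verdict) {
  assert(fd >= 0 && "valid descriptor expected");
  out << "fd=" << fd << " type=" << type << ": ";  // no newline yet
  out << verdict;                                   // still the same line
  out << '\n';                                      // end the line once
}
```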

src/hotspot/os/linux/os_linux.cpp line 6243:

> 6241:       }
> 6242:       details = sock_details(details, detailsbuf, sizeof(detailsbuf));
> 6243:       print_resources(ostream, "issock, details2=\"%s\" ", details);

"issock" is redundant, we have already printed "type=socket"

src/hotspot/os/linux/os_linux.cpp line 6272:

> 6270:     trace_cr(ostream, "Checkpoint ...");
> 6271:     // If execution comes here, assumme that further all be ok.
> 6272:     report_ok_to_jcmd();

Starting from this point all output should go somewhere, e.g. `tty`.
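
One possible shape for this, as a sketch only: `CheckpointOutput` and `complete_client` are hypothetical names, and `std::ostream` stands in for both the jcmd reply stream and `tty`. The destination is chosen at each write, so anything printed after the client has been answered goes to the console instead of being lost.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical sink; std::ostream stands in for the jcmd reply and for tty.
class CheckpointOutput {
  std::ostream* _client;   // reply channel to the attached client, may be null
  std::ostream* _console;  // always-available fallback
  bool _client_done;       // true once the client must not be written to
public:
  CheckpointOutput(std::ostream* client, std::ostream* console)
    : _client(client), _console(console), _client_done(client == nullptr) {
    assert(console != nullptr && "console stream must exist");
  }

  // Called once the client has been told the result.
  void complete_client() { _client_done = true; }

  // Picks the destination at the time of each write.
  std::ostream& out() { return _client_done ? *_console : *_client; }
};
```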

src/hotspot/share/services/diagnosticCommand.cpp line 1047:

> 1045:   JavaCallArguments args;
> 1046:   args.push_long((jlong )output());
> 1047:   args.push_long((jlong )LinuxAttachListener::get_jcmdOperation());

You apparently don't need to pass the attach operation through java code to JVM_Checkpoint and eventually os::Linux::checkpoint -- you can get the operation right there with the same call.

src/hotspot/share/services/diagnosticCommand.cpp line 1052:

> 1050:                          vmSymbols::checkpointRestereInternal_signature(), &args, CHECK);
> 1051:   jvalue* jv = (jvalue*) result.get_value_addr();
> 1052:   oop str = cast_to_oop(jv->l);

oop str = result.get_oop()

src/hotspot/share/services/diagnosticCommand.cpp line 1055:

> 1053:   if (str != NULL) {
> 1054:       char* out = java_lang_String::as_utf8_string(str);
> 1055:       if (out) {

I believe these conditions are always true. When I run with CRAllowToSkipCheckpoint, I get

$ jcmd Main JDK.checkpoint
787055:
CR: Skip Checkpoint
An exception during a checkpoint operation: 



Although no exceptions are thrown.

src/java.base/share/classes/jdk/crac/Core.java line 266:

> 264:         try {
> 265:             checkpointRestoreLocked(outputStream_p, jcmd_p);
> 266:         } catch (CheckpointException | RestoreException e) {

RestoreException may appear only after restore, when the jcmd channel does not exist anymore. Let's report all RestoreExceptions to the console. 

This can be checked with:

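import jdk.crac.Context;
import jdk.crac.Resource;

class TestResource implements Resource {
    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) {} // no-op
    @Override
    public void afterRestore(Context<? extends Resource> context) {
        throw new RuntimeException("restore"); // thrown only after restore
    }
}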

-------------

PR: https://git.openjdk.org/crac/pull/10

