[crac] RFR: Support repeated checkpoint and restore operations
Anton Kozlov
akozlov at openjdk.org
Thu Apr 13 15:12:12 UTC 2023
On Thu, 6 Apr 2023 11:51:31 GMT, Radim Vansa <duke at openjdk.org> wrote:
> * VM option CRaCCheckpointTo is recognized when restoring the application (destination can be changed)
> * The main problem for checkpoint after restore was old checkpoint image mmapped to files (CRaC-specific CRIU optimization for faster boot). Before performing checkpoint we transparently swap this with memory using anonymous mapping.
src/hotspot/os/linux/os_linux.cpp line 392:
> 390: next_checkpoint = "";
> 391: }
> 392: return write_check_error(fd, next_checkpoint, strlen(next_checkpoint) + 1);
As CRaCCheckpointTo is not the only option that may require this, we probably want a more generic approach.
For example, a new option attribute (RESTORE_UPDATEABLE?):
https://github.com/openjdk/crac/blob/master/src/hotspot/share/runtime/globals.hpp#L57
The implementation should check that only UPDATEABLE options are provided along with -XX:CRaCRestoreFrom.
This would be better done in a separate PR.
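To make the suggestion concrete, here is a minimal standalone sketch of such a check. It is not HotSpot code: the `FlagAttr` enum, the `FlagInfo` struct, the flag table, and the `rejected_on_restore` helper are all hypothetical, and RESTORE_UPDATEABLE is only the name floated above, not an existing attribute.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical attribute bits; RESTORE_UPDATEABLE is a proposed name only.
enum FlagAttr : std::uint32_t {
  DEFAULT_ATTR       = 0,
  RESTORE_UPDATEABLE = 1u << 0,
};

// Hypothetical per-flag metadata (stands in for the real flag table).
struct FlagInfo {
  std::uint32_t attrs;
};

// Returns the names of options that must not accompany CRaCRestoreFrom:
// anything unknown, or known but not marked RESTORE_UPDATEABLE.
std::vector<std::string> rejected_on_restore(
    const std::unordered_map<std::string, FlagInfo>& table,
    const std::vector<std::string>& given) {
  std::vector<std::string> bad;
  for (const std::string& name : given) {
    if (name == "CRaCRestoreFrom") {
      continue;  // the restore option itself is always allowed
    }
    auto it = table.find(name);
    if (it == table.end() || (it->second.attrs & RESTORE_UPDATEABLE) == 0) {
      bad.push_back(name);
    }
  }
  return bad;
}
```

With a table that marks only CRaCCheckpointTo as updateable, passing CRaCCheckpointTo next to CRaCRestoreFrom yields an empty list, while any other option is reported back to the caller.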
src/hotspot/os/linux/os_linux.cpp line 6383:
> 6381: bool ok = !_dry_run;
> 6382:
> 6383: remap_old_imagedir();
Until now, the VM did not need to care how CREngine saved the memory content. The mmapping is an implementation detail of the C/R mechanism.
Have you considered switching off the mmapping in CRIU for this repeated checkpoint-restore sequence? That assumes we would be able to communicate this to CREngine (in CRIU, mmapping is an option).
Semantically, this patch proposes to handle a mapping twice: once in CRIU with mmapping and again in the VM. There are some benefits to doing everything in the VM and having better control over the process. So it would be cleaner to do most of the memory management in the VM and leave only bootstrapping to CRIU.
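For reference, swapping a mapping for anonymous memory in-process can be sketched as below. This is an illustration under assumptions, not the code in the patch: `replace_with_anonymous` is a hypothetical helper, it is Linux-specific (it relies on `mremap` with `MREMAP_FIXED`), it requires a page-aligned and readable region, and it leaves the new pages read-write instead of restoring the original protection.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstring>

// Replace whatever is mapped at [addr, addr + len) with private anonymous
// memory holding the same bytes. addr and len must be page-aligned and the
// region must be readable. Returns true on success.
bool replace_with_anonymous(void* addr, std::size_t len) {
  // 1. Allocate a scratch anonymous region of the same size.
  void* tmp = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (tmp == MAP_FAILED) {
    return false;
  }

  // 2. Copy the current contents (e.g. file-backed pages) into it.
  std::memcpy(tmp, addr, len);

  // 3. Move the scratch region over the old address. With MREMAP_FIXED the
  //    kernel unmaps the previous mapping at addr as part of the move.
  void* res = mremap(tmp, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, addr);
  if (res == MAP_FAILED) {
    munmap(tmp, len);
    return false;
  }
  // A real implementation would also re-apply the original protection
  // with mprotect(); this sketch does not.
  return true;
}
```

After a successful call the address range no longer references the image file, so the file can be overwritten by the next checkpoint without affecting the running process.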
src/hotspot/os/linux/os_linux.cpp line 6707:
> 6705: }
> 6706:
> 6707: // Since putenv does not do its own copy of the strings we need to keep
What is the point of these putenv changes?
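As background for the quoted comment: `putenv` places the caller's string into the environment without copying it, so the buffer has to outlive the environment entry. The small sketch below only demonstrates that aliasing; the function name and the `CRAC_DEMO_VAR` variable are made up for illustration and are unrelated to the patch.

```cpp
#include <cstdlib>
#include <cstring>

// Demonstrates that putenv() keeps a pointer to the caller's buffer:
// editing the buffer afterwards changes what getenv() returns.
bool putenv_aliases_caller_buffer() {
  // Static storage so the buffer stays valid for the process lifetime.
  static char entry[] = "CRAC_DEMO_VAR=first";
  if (putenv(entry) != 0) {
    return false;
  }
  // Overwrite the value part in place ("first" -> "other", same length).
  const std::size_t value_offset = std::strlen("CRAC_DEMO_VAR=");
  std::memcpy(entry + value_offset, "other", 5);

  const char* value = std::getenv("CRAC_DEMO_VAR");
  return value != nullptr && std::strcmp(value, "other") == 0;
}
```

If the buffer were a stack array or were freed after the call, the environment would be left pointing at dead memory, which is the situation the quoted comment refers to.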
src/java.base/unix/native/criuengine/criuengine.c line 304:
> 302: if (WIFEXITED(status)) {
> 303: return WEXITSTATUS(status);
> 304: } else if (WIFSIGNALED(status)) {
WIFSIGNALED is also handled on line 306 (310) below. One of the two checks looks unnecessary.
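A sketch of decoding a wait status with each case handled exactly once is shown below. It is not the criuengine.c code: `decode_wait_status` and `run_child_and_decode` are hypothetical helpers, and mapping a fatal signal to 128 + signal number is the shell convention, assumed here only for illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>

// Map a waitpid() status to a single integer result.
int decode_wait_status(int status) {
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status);      // normal exit: the child's exit code
  }
  if (WIFSIGNALED(status)) {
    return 128 + WTERMSIG(status);   // killed by a signal (shell convention)
  }
  return -1;                         // stopped/continued: not a termination
}

// Demo helper: fork a child that either exits with `value` or raises
// signal `value`, then decode how it terminated.
int run_child_and_decode(bool die_by_signal, int value) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    if (die_by_signal) {
      raise(value);                  // does not return for a fatal signal
    }
    _exit(value);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) {
    return -1;
  }
  return decode_wait_status(status);
}
```

For example, `run_child_and_decode(false, 3)` returns 3, and `run_child_and_decode(true, SIGKILL)` returns 128 + SIGKILL.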
-------------
PR Review Comment: https://git.openjdk.org/crac/pull/57#discussion_r1165535817
PR Review Comment: https://git.openjdk.org/crac/pull/57#discussion_r1165677722
PR Review Comment: https://git.openjdk.org/crac/pull/57#discussion_r1165527689
PR Review Comment: https://git.openjdk.org/crac/pull/57#discussion_r1165525555