[crac] RFR: Portable C/R [v2]
Timofei Pushkin
duke at openjdk.org
Fri Jun 7 14:17:45 UTC 2024
On Tue, 4 Jun 2024 17:20:59 GMT, Timofei Pushkin <duke at openjdk.org> wrote:
>> Implements a proof-of-concept "portable mode" for CRaC: a checkpoint-restore mechanism that does not rely on platform-dependent tools like CRIU and instead saves VM state in terms of the Java specification (with some HotSpot specifics). This makes it possible to restore the saved state on machines with a different CPU architecture and OS. A demo is available [here](https://github.com/TimPushkin/portable-crac-demo).
>>
>> Expected downsides compared to traditional CRaC are restrictions on the use of platform-dependent code (e.g. at the moment of checkpoint no native methods can be executing, and off-heap memory obtained via `sun.misc.Unsafe` must be released) and somewhat slower restoration (because platform-dependent state, including JIT-compiled code, has to be re-created). In the future, Project Leyden may help with the latter.
>>
>> The mechanism is implemented as an internal part of HotSpot and is activated when an empty `CREngine` VM option is passed (i.e. `-XX:CREngine=""`; this is a temporary solution). The main implementation details are described in [this doc](https://github.com/TimPushkin/crac/blob/portable-cr/trimmed/doc/portable-cr.md).
>>
>> Since this is a proof-of-concept implementation, it currently lacks some important features. For example, some early-initialized classes are not yet restored, most JDK classes have not yet been properly adapted, checkpointing via `jcmd` is not fully supported, and additional tests and optimizations are needed.
>
> Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision:
>
> Update full name
Thanks for the recommendations.
Regarding the problems encountered with Spring: these are caused not by the approach itself but by some features still being unfinished. For example, the states of some system classes used very early in the VM initialization process, such as `java.lang.System`, are not yet restored (i.e. they get a new state). This does not cause issues for many programs, including Jetty, but in general it is of course incorrect, and it breaks Spring in particular. This is what I plan to fix first, but I wanted to share the work in its current state since a lot has already been done.
-------------
PR Comment: https://git.openjdk.org/crac/pull/155#issuecomment-2154937106