Call for Discussion: New Project: CRaC

Ruslan Synytsky rs at jelastic.com
Wed Jul 28 08:16:53 UTC 2021


Hi Anton, thank you for bringing this discussion up. At Jelastic, we have
been using CRIU technology for many years in combination with various
application runtimes, mainly for live migration. The ability to speed up
the startup time of the Java runtime is certainly an interesting feature
for cloud service providers and their customers. We would like to
participate in the development and testing of this improvement. I have also
forwarded this thread to Virtuozzo, the team that invented CRIU, so that we
can get their support if needed.

Tech question: what do you think about the need to adjust the heap size
after restoring from a checkpointed runtime? As I understand it, in some
cases a restored runtime may need a different heap size than the initial
runtime from which the state was saved. There is a JEP,
https://openjdk.java.net/jeps/8204088, that might be relevant to this
discussion.
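
To make the heap-size question concrete, here is a minimal sketch of the check a
restored runtime might want to perform. It only uses the standard
Runtime.maxMemory() call; the HeapAfterRestore class, the needsResize method,
the 512 MB target and the 10% tolerance are all assumptions made up for
illustration and are not part of any JDK or CRaC API.

```java
/** Hypothetical helper for illustration only; not part of any JDK or CRaC API. */
final class HeapAfterRestore {
    private HeapAfterRestore() {}

    /**
     * Returns true when the heap limit carried over from the checkpoint differs
     * from the limit wanted in the restore environment by more than the given
     * relative tolerance (for example 0.10 for 10%).
     */
    static boolean needsResize(long checkpointMaxHeap, long desiredMaxHeap, double tolerance) {
        if (checkpointMaxHeap <= 0 || desiredMaxHeap <= 0) {
            throw new IllegalArgumentException("heap sizes must be positive");
        }
        double relativeDiff =
                Math.abs((double) checkpointMaxHeap - desiredMaxHeap) / desiredMaxHeap;
        return relativeDiff > tolerance;
    }

    public static void main(String[] args) {
        // Limit this JVM was started with; the assumption here is that a restored
        // process would still see the value fixed before the checkpoint.
        long checkpointLimit = Runtime.getRuntime().maxMemory();
        // Assumed target for the restore environment (512 MB, chosen arbitrarily).
        long desiredLimit = 512L * 1024 * 1024;
        System.out.printf("checkpoint limit=%d, desired=%d, resize needed=%b%n",
                checkpointLimit, desiredLimit,
                needsResize(checkpointLimit, desiredLimit, 0.10));
    }
}
```

The sketch only detects the mismatch. Actually changing the limit of a running
JVM is the open part of the question above.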

Regards
--
Ruslan Synytsky
CEO @ Jelastic Multi-Cloud PaaS


On 7/18/21 7:48 AM, Anton Kozlov wrote:
> Hi,
>
> It's been a while since we presented Coordinated Restore at Checkpoint for the
> first time [0].  We are still committed to the idea and researching this topic.
>
> Java applications can avoid the long start-up and warm-up by saving the state
> of the Java runtime (snapshot, checkpoint).  The saved state is then used to
> start instances fast (restored).  But after the state was saved, the execution
> environment could change.  Also, if multiple instances are started from the
> saved state simultaneously, they should obtain some uniqueness, and their
> executions should diverge at some point.
>
> We believe that the practical way to solve these problems is to make Java
> applications aware of when the state is saved and restored.  Then an
> application will be able to handle environmental changes.  The application will
> also be able to obtain uniqueness from the environment.
>
> The CRaC project aims to research Java API for coordination between application
> and runtime to save and restore the state.  Runtime should support multiple
> ways to save the state: virtual machine snapshot, container snapshot, CRIU
> project on Linux, etc.  We hope to come with an API that is general enough for
> any underlying mechanism.  We also plan to explore safety checks in the API and
> runtime, which prevent saving the state if it may not be restored or work
> correctly after the restore.
>
> I propose myself as a Project Lead of the CRaC Project.  If you're interested
> or want to be the committer, please drop me a message.
>
> A fork of JDK [1] would be a starting point of this project.
>
> Thanks,
> Anton
>
> [0] https://mail.openjdk.java.net/pipermail/discuss/2020-September/005594.html
> [1] https://github.com/CRaC/jdk
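
The coordination idea in the quoted proposal can be sketched roughly as follows.
This is an in-process simulation under stated assumptions, not the project's
API: the CheckpointAware interface, the Coordinator class and the
InstanceIdentity component are hypothetical names invented for this example,
and notifying listeners in reverse order on restore is a design choice of the
sketch rather than something the proposal specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical callback interface; the names are illustrative only. */
interface CheckpointAware {
    void beforeCheckpoint();
    void afterRestore();
}

/** Hypothetical stand-in for the runtime side of the coordination. */
final class Coordinator {
    private final List<CheckpointAware> listeners = new ArrayList<>();

    void register(CheckpointAware listener) {
        listeners.add(listener);
    }

    /** Simulates saving the state: notifies listeners in registration order. */
    void checkpoint() {
        for (CheckpointAware listener : listeners) {
            listener.beforeCheckpoint();
        }
    }

    /** Simulates a restore: notifies listeners in reverse registration order. */
    void restore() {
        for (int i = listeners.size() - 1; i >= 0; i--) {
            listeners.get(i).afterRestore();
        }
    }
}

/** Example component that needs a fresh identity in every restored instance. */
final class InstanceIdentity implements CheckpointAware {
    private String id = UUID.randomUUID().toString();
    private boolean open = true;

    String id() { return id; }
    boolean isOpen() { return open; }

    @Override
    public void beforeCheckpoint() {
        open = false; // stands in for releasing environment-bound state
    }

    @Override
    public void afterRestore() {
        id = UUID.randomUUID().toString(); // obtain uniqueness after restore
        open = true;                       // re-acquire environment-bound state
    }
}

final class CoordinationDemo {
    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        InstanceIdentity identity = new InstanceIdentity();
        coordinator.register(identity);

        String idBefore = identity.id();
        coordinator.checkpoint();
        coordinator.restore();
        System.out.println("id changed after restore: " + !idBefore.equals(identity.id()));
    }
}
```

Each instance started from the same saved state would run afterRestore on its
own, which is how the executions could diverge as the proposal describes.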

