[crac] RFR: Add Checkpoint timeout

KIRIYAMA Takuya duke at openjdk.org
Fri Dec 8 13:57:52 UTC 2023


On Fri, 8 Dec 2023 09:14:19 GMT, KIRIYAMA Takuya <duke at openjdk.org> wrote:

> The Java process sometimes hangs during a checkpoint.
> For example, this problem occurs if you specify certain options in CRAC_CRIU_OPTS.
> 
> 
> # export CRAC_CRIU_OPTS=-V
> # java -XX:CRaCCheckpointTo=/work/cp CRACTest
> CR: Checkpoint ...
> 
> The CRACTest process is not killed and keeps waiting for the checkpoint.
> 
> 
> # ls /work/cp
> cppath  perfdata
> 
> 
> To avoid this problem, I want to add a checkpoint timeout.
> Can I submit a pull request to this repository? I would like you to review this change.

Thank you for your reply. 
As you commented, there are cases in which the JVM being checkpointed waits indefinitely. In that state, the JVM does not respond to jstack and other tools, so it is difficult to determine its status from the outside. For a typical checkpoint failure, criu returns a non-zero exit code and Java throws an exception, so Java applications can handle the failure by catching that exception. However, the reported case is difficult to handle in the application, and I think it should be avoided.
Other than specifying CRAC_CRIU_OPTS, I have not found any cases where this problem occurs. I'll check whether it can also happen in more practical use.
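
To illustrate the idea of a timeout, here is a minimal sketch in plain Java. It is not the proposed patch and not how the JVM drives criu internally. The class name CheckpointTimeoutSketch, the Outcome enum, and runWithTimeout are hypothetical names made up for this example, and "sleep" (assumed to exist, as on a Unix-like system) only stands in for a helper process that never reports back. The sketch separates the three cases discussed above: the helper exits with zero, the helper exits with non-zero (the case that already surfaces as an exception), and the helper does not exit within the time limit.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class CheckpointTimeoutSketch {

    /** Hypothetical outcome of waiting for the helper process. */
    enum Outcome { SUCCEEDED, FAILED, TIMED_OUT }

    /**
     * Starts the given command and waits at most timeoutMillis for it to exit.
     * A helper that is still running after the timeout is killed, so the
     * caller is never blocked indefinitely.
     */
    static Outcome runWithTimeout(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        Process helper = new ProcessBuilder(command)
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // output is not needed here
                .start();

        // waitFor(timeout, unit) returns false if the process is still running.
        if (!helper.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            helper.destroyForcibly();
            helper.waitFor();                                     // reap the killed process
            return Outcome.TIMED_OUT;
        }
        // A non-zero exit code corresponds to the "typical" failure case.
        return helper.exitValue() == 0 ? Outcome.SUCCEEDED : Outcome.FAILED;
    }

    public static void main(String[] args) throws Exception {
        // "sleep 30" is a stand-in for a helper that hangs; the limit is 500 ms.
        System.out.println(runWithTimeout(List.of("sleep", "30"), 500));
    }
}
```

With a limit like this, the hang becomes an ordinary failure that can be reported, instead of a state that can only be resolved by killing the process from outside.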

-------------

PR Comment: https://git.openjdk.org/crac/pull/147#issuecomment-1847198767

