Call for Discussion: New Project: CRaC
mbien42 at gmail.com
Thu Jul 22 20:51:53 UTC 2021
On 22.07.21 21:17, Anton Kozlov wrote:
>> - How to make the JVM/JDK behave gracefully after "time-jumps".
> I assume there should be no correctness problems, as the time-jump does not
> substantially differ from a time spent off-CPU due to OS scheduling.
> internal counters could overflow, but this does not look more than just a bug
> that needs fixing.
This might certainly cause some interesting issues, e.g. GC ergonomics
getting confused after thinking the last pause lasted 5 days :)
That is another reason why I believe the only way to implement this
properly is with the cooperation of the JVM. CRIU via Panama was nice for
experiments, but it would never be reliable.
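
To make the ergonomics concern concrete, here is a minimal sketch of how a
component that averages measured intervals could keep a restore-induced
time jump out of its statistics. This is not how HotSpot handles it; the
class IntervalSampler, its methods, and the 10-second threshold are all
hypothetical and only illustrate the idea.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: averages intervals but discards implausibly long ones. */
public final class IntervalSampler {
    // Assumed threshold: anything longer is treated as a checkpoint/restore gap.
    private static final long MAX_PLAUSIBLE_NANOS = TimeUnit.SECONDS.toNanos(10);

    private long sumNanos;
    private long samples;

    /** Returns true if the interval was accepted into the average. */
    public boolean record(long startNanos, long endNanos) {
        // Subtraction is the overflow-safe way to compare System.nanoTime() values.
        long elapsed = endNanos - startNanos;
        if (elapsed < 0 || elapsed > MAX_PLAUSIBLE_NANOS) {
            return false; // likely a time jump, so the sample is ignored
        }
        sumNanos += elapsed;
        samples++;
        return true;
    }

    /** Average of the accepted intervals, or 0 if none were accepted. */
    public long averageNanos() {
        return samples == 0 ? 0 : sumNanos / samples;
    }
}
```

A 5-day "pause" would be rejected by record() and leave the average
untouched. Code that needs the real elapsed time would still have to be
told about the restore, which is where JVM cooperation comes in.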
> However, I saw cases when CRIU did restore monotonic clock that broke
> waits, causing 100% of CPU loaded with an improper time limit. After not
> restoring the clock completely, the issue has gone away. That brought us again
> to the time jump, which was correctly handled.
If we are thinking of the same bug, this was fixed in Linux 5.10
(https://lkml.org/lkml/2020/10/15/582) - possibly also backported.
Since 5.10 I have not encountered 100% load after restoring JVMs.
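
For illustration, a timed wait written against a deadline tolerates a
forward jump of the monotonic clock: the remaining time just clamps to zero
and the wait ends instead of spinning. The sketch below is an
application-level example under that assumption, not the JVM's or the
kernel's implementation; DeadlineWait and its methods are hypothetical
names.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical deadline-based wait that survives a forward clock jump. */
public final class DeadlineWait {
    /** Nanoseconds left until the deadline, clamped so it is never negative. */
    public static long remainingNanos(long deadlineNanos, long nowNanos) {
        return Math.max(deadlineNanos - nowNanos, 0L);
    }

    /**
     * Waits on the lock until done is true or the timeout expires.
     * Returns false on timeout or interruption.
     */
    public static boolean await(Object lock, BooleanSupplier done, long timeoutNanos) {
        long deadline = System.nanoTime() + timeoutNanos;
        synchronized (lock) {
            while (!done.getAsBoolean()) {
                long remaining = remainingNanos(deadline, System.nanoTime());
                if (remaining == 0) {
                    return false; // deadline passed, possibly because the clock jumped
                }
                try {
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    return false;
                }
            }
            return true;
        }
    }
}
```

Because the remaining time is recomputed from the deadline on every
iteration, a jump can end the wait early, but it cannot produce a negative
or stale timeout that is retried in a tight loop.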