[crac] RFR: Correct System.nanotime() value after restore

Radim Vansa duke at openjdk.org
Tue Mar 28 07:17:04 UTC 2023


On Mon, 27 Mar 2023 19:18:19 GMT, Ashutosh Mehra <duke at openjdk.org> wrote:

>> There are various places both inside the JDK and in libraries that rely on the monotonicity of `System.nanoTime()`. When the process is restored on a different machine the value will likely differ, as the implementation provides the time since machine boot. This PR records wall clock time before checkpoint and after restore and tries to adjust the value returned by `System.nanoTime()` to a reasonably correct value.
>
> I understand this change is trying to adjust the return value of any calls made to System.nanoTime() after restore to take into account the elapsed time between checkpoint and restore.
> In principle this idea is very similar to CLOCK_BOOTTIME [0], which takes into account the time the system has spent in the suspended state.
> I came across an issue [1] in golang which suggested replacing CLOCK_MONOTONIC with CLOCK_BOOTTIME, but it was considered ill-advised and was closed. There was even a Linux kernel patch [2] to make CLOCK_MONOTONIC behave as CLOCK_BOOTTIME, which was reverted [3] immediately because it broke a lot of existing userland software.
> Considering this precedent, I think we should also consider the impact of this change on existing frameworks and libraries, that is, can this change break the existing code patterns that use System.nanoTime()?
> 
> [0] https://man7.org/linux/man-pages/man3/clock_gettime.3.html
> [1] https://github.com/golang/go/issues/24595
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d6ed449afdb38f89a7b38ec50e367559e1b8f71f
> [3] https://www.spinics.net/lists/linux-tip-commits/msg43709.html

@ashu-mehra The main point of this change is *not* whether the time spent suspended should be observable; I am rather worried about moving the process to another system and getting completely nonsensical results from `System.nanoTime()` differences, and broken code as a consequence.

I understand that whether the suspended time should be observable is open to further discussion, though I lean towards making that interval visible, as implemented here. Since this fixes some use cases and does not break anything that was not already broken (on a single system, the paused time would be observed anyway while the system keeps running, unless the whole machine was suspended), I suggest merging this as-is without considering the topic resolved forever.
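
To make the idea concrete, here is a minimal sketch of the kind of adjustment described in the PR summary quoted above. It is not the code from the PR: the class `NanoTimeAdjuster` and its methods are hypothetical names, and the raw monotonic and wall clock readings are passed in as parameters so the arithmetic can be followed without a real checkpoint. It assumes the wall clocks of both machines are reasonably in sync.

```java
// Hypothetical illustration only; not the CRaC implementation.
final class NanoTimeAdjuster {
    private long checkpointNanos;   // adjusted monotonic reading taken just before checkpoint
    private long checkpointMillis;  // wall clock reading taken just before checkpoint
    private long offsetNanos;       // correction added to raw monotonic readings

    // Record both clocks right before the checkpoint.
    void beforeCheckpoint(long rawNanos, long wallMillis) {
        checkpointNanos = rawNanos + offsetNanos;
        checkpointMillis = wallMillis;
    }

    // After restore, possibly on another machine with an unrelated monotonic clock.
    void afterRestore(long rawNanos, long wallMillis) {
        // If the wall clock appears to have gone backwards, assume no time elapsed
        // so that the adjusted value never drops below the checkpointed one.
        long elapsedMillis = Math.max(0L, wallMillis - checkpointMillis);
        long expectedNanos = checkpointNanos + elapsedMillis * 1_000_000L;
        offsetNanos = expectedNanos - rawNanos;
    }

    // Value the application would observe instead of the raw monotonic reading.
    long nanoTime(long rawNanos) {
        return rawNanos + offsetNanos;
    }
}
```

With this sketch, a process checkpointed at a raw reading of 5 s and restored 2 s of wall clock time later on a freshly booted machine continues from 7 s rather than jumping back to a value near zero, so differences taken across the restore stay non-negative.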

The fact that some timers use this as the time source rather than wall clock time is an implementation detail. Applications performing checkpoint and restore will require some tweaks to perform correctly and I intend to work on ways to deal with timers.

-------------

PR Comment: https://git.openjdk.org/crac/pull/53#issuecomment-1486336186

