[crac] RFR: 8362837: [CRaC] jdk/crac/MXBean.java can fail on macOS
Timofei Pushkin
tpushkin at openjdk.org
Mon Jul 21 09:41:13 UTC 2025
On Mon, 21 Jul 2025 09:12:21 GMT, Radim Vansa <rvansa at openjdk.org> wrote:
>> GitHub Actions should run in your repo just like on PRs here, so you should not need to create PRs to test CI. Except for whitespace checking...
>
> @TimPushkin I don't really get the comment
>
>> But looking at the test I would expect it to be fragile: it measures the time from the start of the checkpointed process to the start of its restore and wants it to be 0, and since this is not a reasonable thing to expect it sets a huge tolerance of 10 seconds.
>
> On platforms that don't propagate the 'restore time', it is expected that the current time would be used (which is somewhat later than 'restore initiation'). The test asserts that from the point where we really invoke the restore up to this point it's less than 10 seconds - so asking a trivial process to be restored within 10 seconds. I thought that this is a reasonable time, though we can make it even 60 seconds if we know that CI can be extra sluggish.
@rvansa In our downstream CI, 10 seconds appears not to be enough on macOS. That is why I proposed reworking the test to use an engine that can propagate the restore time on platforms where such an engine exists (Linux), and to not use `TIME_TOLERANCE` at all for `getRestoreTime()` on platforms where there is no such engine (macOS, Windows). To me this looks like a better solution than just increasing the tolerance, because a larger tolerance would make the `getUptimeSinceRestore()` check on all platforms and the `getRestoreTime()` check on non-macOS platforms more tolerant than necessary (even 10 seconds is too tolerant for Linux, in my opinion).
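A rough sketch of the per-platform check described above. It is not the actual test code. The class name, the `TIME_TOLERANCE_MS` value, and the OS-name matching are all hypothetical. It also assumes that, without restore-time propagation, the only thing worth asserting about `getRestoreTime()` is that it is not earlier than the moment the restore was initiated.

```java
import java.util.Locale;

// Hypothetical helper, not part of jdk/crac/MXBean.java.
final class RestoreTimeCheck {
    // Assumed tolerance for platforms whose engine propagates the restore time.
    static final long TIME_TOLERANCE_MS = 1_000;

    // Assumption for this sketch: only Linux has an engine that propagates restore time.
    static boolean propagatesRestoreTime(String osName) {
        return osName.toLowerCase(Locale.ROOT).contains("linux");
    }

    // restoreInitiatedMs: wall-clock time at which the test launched the restore.
    // reportedRestoreTimeMs: value the restored process reports as its restore time.
    static boolean restoreTimeOk(String osName, long restoreInitiatedMs, long reportedRestoreTimeMs) {
        long delta = reportedRestoreTimeMs - restoreInitiatedMs;
        if (propagatesRestoreTime(osName)) {
            // Propagated restore time should be close to the initiation time.
            return Math.abs(delta) <= TIME_TOLERANCE_MS;
        }
        // No propagation: the current time is used, so only a lower bound is checked
        // and no tolerance is applied.
        return delta >= 0;
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        long initiated = System.currentTimeMillis();
        long reported = initiated + 200; // stand-in for a value read from the restored process
        System.out.println(os + ": restore time check passed = " + restoreTimeOk(os, initiated, reported));
    }
}
```

With this split, a slow macOS CI runner cannot fail the restore time check by being late, while the Linux check keeps a tight bound.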
-------------
PR Comment: https://git.openjdk.org/crac/pull/246#issuecomment-3095922923