[crac] RFR: 8362837: [CRaC] jdk/crac/MXBean.java can fail on macOS [v2]
Timofei Pushkin
tpushkin at openjdk.org
Sun Jul 20 17:31:01 UTC 2025
On Sat, 19 Jul 2025 21:06:38 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:
>> The CI runs fine but locally (Linux Fedora 42 x86_64) it is failing for me - for any testcase trying to use OutputAnalyzer for the restore-only part. Such testcases failing for me are for example:
>> - test/jdk/jdk/crac/fileDescriptors/CheckpointWithOpenFdsTest.java
>> - test/jdk/jdk/crac/fileDescriptors/LoggingFileOpenTest.java
>> - test/jdk/jdk/crac/SecureRandom/ReseedTest.java
>>
>> It does not implement the part
>>
>>> close to 0
>>
>> as that is the problem being fixed.
>
> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:
>
> update copyright year
Maybe I do not understand the changes but to me it looks like they won't function right.
<details>
<summary>Here is how I envisioned it in my original suggestion</summary>
I reduced the tolerance since the measured time should be less now, but maybe it is too bold of a reduction.
diff --git a/test/jdk/jdk/crac/MXBean.java b/test/jdk/jdk/crac/MXBean.java
index 336e3eecd12..0bb6605432e 100644
--- a/test/jdk/jdk/crac/MXBean.java
+++ b/test/jdk/jdk/crac/MXBean.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2022, Azul Systems, Inc. All rights reserved.
+ * Copyright (c) 2022, 2025, Azul Systems, Inc. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -26,6 +26,7 @@
import jdk.crac.*;
import jdk.crac.management.*;
+import jdk.test.lib.Platform;
import jdk.test.lib.crac.CracBuilder;
import jdk.test.lib.crac.CracEngine;
import jdk.test.lib.crac.CracTest;
@@ -35,7 +36,7 @@
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
-import static jdk.test.lib.Asserts.assertLT;
+import static jdk.test.lib.Asserts.*;
/**
* @test
@@ -44,7 +45,12 @@
* @run driver jdk.test.lib.crac.CracTest
*/
public class MXBean implements CracTest {
- static final long TIME_TOLERANCE = 10_000; // ms
+ static final long TIME_TOLERANCE = 1000; // ms
+
+ private static String formatTime(long t) {
+ return DateTimeFormatter.ofPattern("E dd LLL yyyy HH:mm:ss.n").format(
+ Instant.ofEpochMilli(t).atZone(ZoneId.systemDefault()));
+ }
@Override
public void exec() throws CheckpointException, RestoreException {
@@ -55,28 +61,36 @@ public void exec() throws CheckpointException, RestoreException {
System.out.println("UptimeSinceRestore " + cracMXBean.getUptimeSinceRestore());
long restoreTime = cracMXBean.getRestoreTime();
- System.out.println("RestoreTime " + restoreTime + " " +
- DateTimeFormatter.ofPattern("E dd LLL yyyy HH:mm:ss.n").format(
- Instant.ofEpochMilli(restoreTime)
- .atZone(ZoneId.systemDefault())));
+ System.out.println("RestoreTime " + restoreTime + " " + formatTime(restoreTime));
}
@Override
public void test() throws Exception {
- long start = System.currentTimeMillis();
+ final var builder = new CracBuilder().captureOutput(true);
- OutputAnalyzer output = new CracBuilder().engine(CracEngine.SIMULATE)
- .captureOutput(true)
- .startCheckpoint().waitForSuccess().outputAnalyzer();
-
- long restoreUptime = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
- if (restoreUptime < 0 || TIME_TOLERANCE < restoreUptime) {
- throw new Error("bad UptimeSinceRestore: " + restoreUptime);
+ final OutputAnalyzer output;
+ final long restoreStartTime;
+ if (Platform.isLinux()) { // Linux is currently the only platform supporting non-immediate restore
+ final var process = builder.engine(CracEngine.PAUSE).startCheckpoint();
+ output = process.outputAnalyzer();
+ process.waitForPausePid();
+ restoreStartTime = System.currentTimeMillis();
+ builder.doRestore();
+ System.out.println("RestoreStartTime " + restoreStartTime + " " + formatTime(restoreStartTime));
+ } else {
+ output = builder.engine(CracEngine.SIMULATE).startCheckpoint().waitForSuccess().outputAnalyzer();
+ restoreStartTime = -1; // Unknown
}
- long restoreTime = Long.parseLong(output.firstMatch("RestoreTime ([0-9-]+)", 1));
- restoreTime -= start;
+ final long uptimeSinceRestore = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
+ assertGTE(uptimeSinceRestore, 0L, "Bad UptimeSinceRestore");
+ assertLT(uptimeSinceRestore, TIME_TOLERANCE, "UptimeSinceRestore should be close to 0");
- assertLT(Math.abs(restoreTime), TIME_TOLERANCE, "bad RestoreTime: " + restoreTime);
+ final long restoreTime = Long.parseLong(output.firstMatch("RestoreTime ([0-9-]+)", 1));
</details>
test/jdk/jdk/crac/MXBean.java line 79:
> 77: long restoreTimePassed = System.currentTimeMillis() - restoreStart;
> 78: System.err.println("restoreTimePassed="+restoreTimePassed);
> 79: if (restoreTimePassed < 0 || TIME_TOLERANCE < restoreTimePassed) {
What is the purpose of this check? As I understand it, it checks that restore + restored code execution takes time close to zero which I don't think is a reasonable expectation (it passes because the tolerance is high).
Also the test is supposed to be about `CRaCMXBean` which is not involved here.
test/jdk/jdk/crac/MXBean.java line 90:
> 88: long restoreUptime = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
> 89: System.err.println("restoreUptime=" + restoreUptime);
> 90: if (restoreUptime < 0) {
AFAIK we have not had problems with this check, only with the below one. "UptimeSinceRestore" should be close to 0 on all platforms
-------------
PR Review: https://git.openjdk.org/crac/pull/246#pullrequestreview-3036087307
PR Review Comment: https://git.openjdk.org/crac/pull/246#discussion_r2217875547
PR Review Comment: https://git.openjdk.org/crac/pull/246#discussion_r2217876080
More information about the crac-dev
mailing list