[crac] RFR: 8362837: [CRaC] jdk/crac/MXBean.java can fail on macOS [v2]

Timofei Pushkin tpushkin at openjdk.org
Sun Jul 20 17:31:01 UTC 2025


On Sat, 19 Jul 2025 21:06:38 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:

>> The CI runs fine but locally (Linux Fedora 42 x86_64) it is failing for me - for any testcase trying to use OutputAnalyzer for the restore-only part. Such testcases failing for me are for example:
>> - test/jdk/jdk/crac/fileDescriptors/CheckpointWithOpenFdsTest.java
>> - test/jdk/jdk/crac/fileDescriptors/LoggingFileOpenTest.java
>> - test/jdk/jdk/crac/SecureRandom/ReseedTest.java
>> 
>> It does not implement the part
>> 
>>> close to 0
>> 
>> as that is the problem being fixed.
>
> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update copyright year

Maybe I do not understand the changes but to me it looks like they won't function right.

<details>

<summary>Here is how I envisioned it in my original suggestion</summary>

I reduced the tolerance since the measured time should be less now, but maybe it is too bold of a reduction.


diff --git a/test/jdk/jdk/crac/MXBean.java b/test/jdk/jdk/crac/MXBean.java
index 336e3eecd12..0bb6605432e 100644
--- a/test/jdk/jdk/crac/MXBean.java
+++ b/test/jdk/jdk/crac/MXBean.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2022, Azul Systems, Inc. All rights reserved.
+ * Copyright (c) 2022, 2025, Azul Systems, Inc. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -26,6 +26,7 @@
 import jdk.crac.*;
 import jdk.crac.management.*;
 
+import jdk.test.lib.Platform;
 import jdk.test.lib.crac.CracBuilder;
 import jdk.test.lib.crac.CracEngine;
 import jdk.test.lib.crac.CracTest;
@@ -35,7 +36,7 @@
 import java.time.ZoneId;
 import java.time.format.DateTimeFormatter;
 
-import static jdk.test.lib.Asserts.assertLT;
+import static jdk.test.lib.Asserts.*;
 
 /**
  * @test
@@ -44,7 +45,12 @@
  * @run driver jdk.test.lib.crac.CracTest
  */
 public class MXBean implements CracTest {
-    static final long TIME_TOLERANCE = 10_000; // ms
+    static final long TIME_TOLERANCE = 1000; // ms
+
+    private static String formatTime(long t) {
+        return DateTimeFormatter.ofPattern("E dd LLL yyyy HH:mm:ss.n").format(
+                Instant.ofEpochMilli(t).atZone(ZoneId.systemDefault()));
+    }
 
     @Override
     public void exec() throws CheckpointException, RestoreException {
@@ -55,28 +61,36 @@ public void exec() throws CheckpointException, RestoreException {
         System.out.println("UptimeSinceRestore " + cracMXBean.getUptimeSinceRestore());
 
         long restoreTime = cracMXBean.getRestoreTime();
-        System.out.println("RestoreTime " + restoreTime + " " +
-            DateTimeFormatter.ofPattern("E dd LLL yyyy HH:mm:ss.n").format(
-                Instant.ofEpochMilli(restoreTime)
-                    .atZone(ZoneId.systemDefault())));
+        System.out.println("RestoreTime " + restoreTime + " " + formatTime(restoreTime));
     }
 
     @Override
     public void test() throws Exception {
-        long start = System.currentTimeMillis();
+        final var builder = new CracBuilder().captureOutput(true);
 
-        OutputAnalyzer output = new CracBuilder().engine(CracEngine.SIMULATE)
-                .captureOutput(true)
-                .startCheckpoint().waitForSuccess().outputAnalyzer();
-
-        long restoreUptime = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
-        if (restoreUptime < 0 || TIME_TOLERANCE < restoreUptime) {
-            throw new Error("bad UptimeSinceRestore: " + restoreUptime);
+        final OutputAnalyzer output;
+        final long restoreStartTime;
+        if (Platform.isLinux()) { // Linux is currently the only platform supporting non-immediate restore
+            final var process = builder.engine(CracEngine.PAUSE).startCheckpoint();
+            output = process.outputAnalyzer();
+            process.waitForPausePid();
+            restoreStartTime = System.currentTimeMillis();
+            builder.doRestore();
+            System.out.println("RestoreStartTime " + restoreStartTime + " " + formatTime(restoreStartTime));
+        } else {
+            output = builder.engine(CracEngine.SIMULATE).startCheckpoint().waitForSuccess().outputAnalyzer();
+            restoreStartTime = -1; // Unknown
         }
 
-        long restoreTime = Long.parseLong(output.firstMatch("RestoreTime ([0-9-]+)", 1));
-        restoreTime -= start;
+        final long uptimeSinceRestore = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
+        assertGTE(uptimeSinceRestore, 0L, "Bad UptimeSinceRestore");
+        assertLT(uptimeSinceRestore, TIME_TOLERANCE, "UptimeSinceRestore should be close to 0");
 
-        assertLT(Math.abs(restoreTime), TIME_TOLERANCE, "bad RestoreTime: " + restoreTime);
+        final long restoreTime = Long.parseLong(output.firstMatch("RestoreTime ([0-9-]+)", 1));



</details>

test/jdk/jdk/crac/MXBean.java line 79:

> 77:             long restoreTimePassed = System.currentTimeMillis() - restoreStart;
> 78:             System.err.println("restoreTimePassed="+restoreTimePassed);
> 79:             if (restoreTimePassed < 0 || TIME_TOLERANCE < restoreTimePassed) {

What is the purpose of this check? As I understand it, it checks that restore + restored code execution takes time close to zero which I don't think is a reasonable expectation (it passes because the tolerance is high).

Also the test is supposed to be about `CRaCMXBean` which is not involved here.

test/jdk/jdk/crac/MXBean.java line 90:

> 88:         long restoreUptime = Long.parseLong(output.firstMatch("UptimeSinceRestore ([0-9-]+)", 1));
> 89:         System.err.println("restoreUptime=" + restoreUptime);
> 90:         if (restoreUptime < 0) {

AFAIK we have not had problems with this check, only with the below one. "UptimeSinceRestore" should be close to 0 on all platforms

-------------

PR Review: https://git.openjdk.org/crac/pull/246#pullrequestreview-3036087307
PR Review Comment: https://git.openjdk.org/crac/pull/246#discussion_r2217875547
PR Review Comment: https://git.openjdk.org/crac/pull/246#discussion_r2217876080


More information about the crac-dev mailing list