[crac] RFR: 8367975: [CRaC] Pattern in CRaCCheckpointTo

Timofei Pushkin tpushkin at openjdk.org
Tue Sep 23 10:08:47 UTC 2025


On Thu, 18 Sep 2025 09:22:03 GMT, Radim Vansa <rvansa at openjdk.org> wrote:

> Add support for pattern in `CRaCCheckpointTo` VM option.

The patterns should be documented in `java.md` and maybe mentioned in `globals.hpp`.

Also since the patterns are handled in shared code it would make sense to have OS pattern too.

src/hotspot/cpu/x86/vm_version_x86.hpp line 615:

> 613:     }
> 614: 
> 615:     int print_numbers_hexonly(char *buf_orig, size_t buflen) const {

Can be merged into `print_numbers` with a parameter like `bool hex_only` (false by default, true where this was used), but I don't insist

src/hotspot/share/runtime/crac.cpp line 107:

> 105:   if (iso8601) {
> 106:     if (width >= 0 || zero_pad) {
> 107:       log_warning(crac)("CRaCCheckpointTo=%s cannot set zero_pad or width for ISO-8601 time", CRaCCheckpointTo);

`zero_pad` is a name of a variable not known to the user.
Suggestion:

      log_warning(crac)("CRaCCheckpointTo=%s cannot zero-pad or set width for ISO-8601 time", CRaCCheckpointTo);


I would also write "Cannot <do something> in CRaCCheckpointTo=%s" here and in other similar places below but that is a personal preference.

src/hotspot/share/runtime/crac.cpp line 112:

> 110:     time_t time = timeMillis / 1000;
> 111:     struct tm tms;
> 112:     if (gmtime_r(&time, &tms) == NULL) {

There is `os::gmtime_pd` which is `gmtime_r` on POSIX and `gmtime` on Windows. Not sure why it exists but I would suggest to use it just in case.

src/hotspot/share/runtime/crac.cpp line 112:

> 110:     time_t time = timeMillis / 1000;
> 111:     struct tm tms;
> 112:     if (gmtime_r(&time, &tms) == NULL) {

Suggestion:

    if (gmtime_r(&time, &tms) == nullptr) {

src/hotspot/share/runtime/crac.cpp line 129:

> 127:     return snprintf(buf, buflen, "%*zu", width, size);
> 128:   } else {
> 129:     const char *suffixes[] = { "k", "M", "G" };

I believe without  `static` this is put on stack
Suggestion:

    static constexpr const char *suffixes[] = { "k", "M", "G" };

src/hotspot/share/runtime/crac.cpp line 139:

> 137: }
> 138: 
> 139: #define check_no_width() do { \

Suggestion:

// Also implies no zero-padding
#define check_no_width() do { \

src/hotspot/share/runtime/crac.cpp line 148:

> 146:     int ret = statement; \
> 147:     if (ret < 0 || (size_t) ret > buflen) { \
> 148:       log_error(crac)("Error interpolating CRaCCheckpointTo=%s (@%zu, buffer size %zu)", CRaCCheckpointTo, si, buflen); \

To me it is not completely obvious what @ means here, if it is the position in the interpreted string then I would prefer "position"/"location"

src/hotspot/share/runtime/crac.cpp line 155:

> 153:   } while (false)
> 154: 
> 155: bool crac::resolve_image_location(char *buf, size_t buflen, bool *fixed) {

I expect "resolve" to access the filesystem somehow. Also it is only meant for the checkpoint location. So how about  `interpolate_checkpoint_image_location`?

src/hotspot/share/runtime/crac.cpp line 155:

> 153:   } while (false)
> 154: 
> 155: bool crac::resolve_image_location(char *buf, size_t buflen, bool *fixed) {

`buflen` is not checked when we write to the buffer directly, not via stdlib functions

src/hotspot/share/runtime/crac.cpp line 159:

> 157:   for (size_t si = 0; ; si++) {
> 158:     char c = CRaCCheckpointTo[si];
> 159:     if (c == '%') {

I would suggest handling the simple case first with continue to reduce the nesting:


    if (c != '%') {
      *(buf++) = c;
      --buflen;
      if (c == '\0') {
        break;
      } else {
        continue;
      }
    }
    // handle % with reduce nesting

src/hotspot/share/runtime/crac.cpp line 168:

> 166:       }
> 167:       size_t width_start = si;
> 168:       while (CRaCCheckpointTo[si] >= '1' && CRaCCheckpointTo[si] <= '9') {

Since 0 is not allowed, I believe currently it is not possible to specify width 10, for example?

src/hotspot/share/runtime/crac.cpp line 177:

> 175:       } else if (si > width_start) {
> 176:         width = atoi(&CRaCCheckpointTo[width_start]);
> 177:       }

Since width is not set in the first case:
Suggestion:

      if (zero_pad && width_start == si) {
        log_error(crac)("CRaCCheckpointTo=%s contains a pattern with zero padding but no length", CRaCCheckpointTo);
        return false;
      }
      const int width = si > width_start ? atoi(&CRaCCheckpointTo[width_start]) : -1;

src/hotspot/share/runtime/crac.cpp line 188:

> 186:           check_no_width();
> 187: #ifndef ARCHPROPNAME // defined by build scripts
> 188: # define ARCHPROPNAME "unknown"

Why not assert it is defined?
Suggestion:

#error ARCHPROPNAME macro must be defined

src/hotspot/share/runtime/crac.cpp line 192:

> 190:         check_retval(snprintf(buf, buflen, "%s", ARCHPROPNAME));
> 191:         break;
> 192:       case 'f': { // CPU features

Should check no width/padding

src/hotspot/share/runtime/crac.cpp line 198:

> 196:             buf += ret;
> 197:             buflen -= ret;
> 198:           } // otherwise just empty string

I think we should print a warning otherwise

src/hotspot/share/runtime/crac.cpp line 203:

> 201:       case 'u': { // Random UUID (v4)
> 202:           check_no_width();
> 203:           *fixed = false; // FIXME?

Looks correct to me, what would you want to fix?

src/hotspot/share/runtime/crac.cpp line 208:

> 206:           check_retval(snprintf(buf, buflen, "%08x-%04x-4%03x-a%03x-%04x%08x",
> 207:             static_cast<u4>(os::random()), time_mid_high >> 16, time_mid_high & 0xFFF,
> 208:             seq_and_node_low & 0xFFF, seq_and_node_low >> 16, static_cast<u4>(os::random())));

Not that it matters much, but I believe the 4th part does not always have to start with "a" according to the spec

Suggestion:

          check_retval(snprintf(buf, buflen, "%08x-%04x-4%03x-%04x-%04x%08x",
            static_cast<u4>(os::random()), time_mid_high >> 16, time_mid_high & 0xFFF,
            (seq_and_node_low & 0x3FFF | 0x8000), seq_and_node_low >> 16, static_cast<u4>(os::random())));

src/hotspot/share/runtime/crac.cpp line 225:

> 223:         check_retval(append_time(buf, buflen, c == 'r', zero_pad, width,
> 224:             os::javaTimeMillis() - 1000 * (os::elapsed_counter_since_restore() / os::elapsed_frequency())));
> 225:         break;

Shouldn't we use the native implementations of `RuntimeMXBean.getStartTime()` and `CRaCMXBean.getRestoreTime()` respectively? I think it is reasonable to expect these to be consistent. But if there are reasons not to do that it is ok with me to leave this as is, we just won't document them to be consistent in such case.

src/hotspot/share/runtime/crac.cpp line 257:

> 255: #undef check_retval
> 256: 
> 257: static bool ensure_image_location(const char *path, bool rm) {

Since this is checkpoint-specific:
Suggestion:

static bool ensure_checkpoint_image_location(const char *path, bool rm) {


Or maybe make it also handle the restore case have this in one place.

src/hotspot/share/runtime/crac.cpp line 294:

> 292:   if (!_engine->configure_image_location(image_location)) {
> 293:     return JVM_CHECKPOINT_ERROR;
> 294:   }

Is there a reason we do this and CPU features recording below so late instead of at the beginning `crac::checkpoint`, for example? The only reason I see is that the timestamps in `crac::record_time_before_checkpoint` and `%t` would be further away in that case (BTW, if we leave this as is we could use the recorded time for the pattern for consistency).

If we did not clear the recorded user data on checkpoint, CPU features could be recorded just once on VM initialization instead of on every checkpoint (though not in `prepare_checkpoint` — that is too early, I believe).

And just to have it more compact since the calls are very closely related
Suggestion:

  // Note that CRaCEngine and CRaCEngineOptions are not updated (as documented)
  // so we don't need to re-init the whole engine handle.
  if (!resolve_image_location(image_location, sizeof(image_location), &ignored) ||
      !ensure_image_location(image_location, false) ||
      !_engine->configure_image_location(image_location)) {
    return JVM_CHECKPOINT_ERROR;
}

src/hotspot/share/runtime/crac_engine.hpp line 42:

> 40: class CracEngine : public CHeapObj<mtInternal> {
> 41: public:
> 42:   explicit CracEngine();

`explicit` is not needed now. It was there to adhere to https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rc-explicit.

test/jdk/jdk/crac/PathPatternTest.java line 78:

> 76:         long pid = runCheckpoints("foo/cr_%%_%a_%p_%05c_%3m_%m_%g", false);
> 77:         File f = new File(String.format("foo/cr_%%_%s_%d_%05d_%d_1G_1",
> 78:                 Platform.getOsArch(), pid, Runtime.getRuntime().availableProcessors(), 1024 * 1024 * 1024));

FYI there's `jdk.test.lib.Unit.G` which has `size == 1024 * 1024 * 1024`, and there's other common units

test/jdk/jdk/crac/PathPatternTest.java line 79:

> 77:         File f = new File(String.format("foo/cr_%%_%s_%d_%05d_%d_1G_1",
> 78:                 Platform.getOsArch(), pid, Runtime.getRuntime().availableProcessors(), 1024 * 1024 * 1024));
> 79:         assertTrue(f.exists(), "Expect" + f);

Suggestion:

        assertTrue(f.exists(), "Expect " + f);

test/jdk/jdk/crac/PathPatternTest.java line 144:

> 142:     }
> 143: 
> 144:     private static void deleteDir(String dir) throws IOException {

`clearDir` would fit better. Also it would be simpler to use `jdk.test.lib.util.FileUtils.deleteFileTreeWithRetry(Path)` and re-create the directory itself.

test/jdk/jdk/crac/PathPatternTest.java line 170:

> 168: 
> 169: 
> 170: }

Suggestion:

    }
}

-------------

PR Review: https://git.openjdk.org/crac/pull/264#pullrequestreview-3252522721
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368185546
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368563445
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368381612
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368382210
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368494050
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371467010
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368611060
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368636997
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368724681
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368709528
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368833005
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368769679
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368871268
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368888060
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368883542
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371543305
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371537367
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371599384
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371634172
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371661130
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2368240041
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371769977
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371772080
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371776032
PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2371797448


More information about the crac-dev mailing list