[crac] RFR: Fix CPUFeatures crash on new->old CPU

Radim Vansa rvansa at openjdk.org
Fri Oct 27 13:20:06 UTC 2023


On Wed, 25 Oct 2023 15:09:58 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:

> - reproducible on i7-1165G7 "qemu-kvm -cpu host" for a checkpoint and "qemu-kvm -cpu SandyBridge" for its restore

src/hotspot/share/runtime/crac.cpp line 526:

> 524:   // crac_restore_finalize() may terminate the process if we run on (older) CPU where glibc string functions may crash.
> 525:   // The flag is stored separately as all the code of this function below is difficult to implement without the string functions.
> 526:   bool IgnoreCPUFeatures_local;

In general, please use snake_case for local variables, even though it refers to VM option.

Rather than individual 'bits', can we make this at least part of the `header` structure? We could read that one without using any string functions, and then the variable-size rest of the data (which requires string funcs). The size being read can be part of the `header` structure rather than stat-ing the file.

Also, there is `read_all` function in `crac_linux.cpp` (I am moving that to a shared place in some PRs) that deals better with `::read` returning only partial result.

src/hotspot/share/runtime/crac.cpp line 532:

> 530:   }
> 531:   if (!IgnoreCPUFeatures_local) {
> 532:     VM_Version::crac_restore_finalize();

Is this function idempotent? It's called from `VM_Crac::doit()` as well (and it should be since reading the shm is optional), but could calling it twice cause any trouble?

src/hotspot/share/runtime/crac_structs.hpp line 225:

> 223:     assert(id > 0, "id is expected to be a PID and therefore > 0");
> 224:     char *d = _path;
> 225:     const char prefix[] = "/tmp/cracshm.";

Have you investigated any other option for transferring the data? While both `/tmp/` and `/dev/shm` are temp files, using `shm_open` looks like the go-to approach for sharing data. I dislike hardcoding the `/tmp` path, though it might be viable since perfMemory also has it hardcoded? In any case, this code is not linux-specific (this should be implemented in `crac_posix.cpp`, I suppose).

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/134#discussion_r1374550270
PR Review Comment: https://git.openjdk.org/crac/pull/134#discussion_r1374553073
PR Review Comment: https://git.openjdk.org/crac/pull/134#discussion_r1374526685


More information about the crac-dev mailing list