[crac] RFR: Print better diagnostics for spawning CRIU

Jan Kratochvil jkratochvil at openjdk.org
Fri Aug 4 06:38:02 UTC 2023

On Wed, 2 Aug 2023 17:33:28 GMT, Anton Kozlov <akozlov at openjdk.org> wrote:

>> It was always a mystery why it does not work this time as there are no error messages printed.
>> Still may system call errors are currently not reported but those errors at least do not happen commonly.
>> In fact `os::exec_child_process_and_wait` should be more verbose but that is already a part of OpenJDK so I did not modify it.
>> It would be much easier if `criuengine.c` was in C++. I was considering to switch it but then I wrote it already in C.
>> I understand this code will become obsolete after planned integration of CRIU into JVM but it may not yet be soon enough.
> src/java.base/unix/native/criuengine/criuengine.c line 115:
>> 113: }
>> 114: 
>> 115: static char *path_abs(const char *rel) {
> Is not this exactly `realpath()`?

Not really as `realpath()` is resolving symlinks while this function does not. I agree the code is ugly, I originally expected it will be more nice in C++ but then I realized [std::filesystem::path](https://en.cppreference.com/w/cpp/filesystem/path) is `C++17` only.

> src/java.base/unix/native/criuengine/criuengine.c line 131:
>> 129:     free(rel1_abs);
>> 130:     return retval;
>> 131: }
> This function is closer to path_from, rather than to abs. But by using just path_join/path_from, we'll provide a shorter yet still correct path to the log (the only use of the function), so I propose to drop it.

It became now simple enough in C++, is it OK?

> src/java.base/unix/native/criuengine/criuengine.c line 215:
>> 213:     if (child != wait(&status)) {
>> 214:         criu_cmdline = join_args(args);
>> 215:         fprintf(stderr, "Error waiting for spawned CRIU \"%s\": %m\n", criu_cmdline);
> Calling fprintf repeatedly multiple time would save us from too complex join_args(), right? Could you do this?

With C++ `std::string` it has been solved, hasn't it?

> src/java.base/unix/native/criuengine/criuengine.c line 227:
>> 225:             criu_log = path_abs2(imagedir, log_local);
>> 226:             fprintf(stderr, "Spawned CRIU \"%s\" has not properly exited: exit code %d - check %s\n", criu_cmdline, WEXITSTATUS(status), criu_log);
>> 227:         }
> Could you unify these handlings? There is a slight difference in each code path.

Either it got fixed by C++ or describe more the difference, please.


PR Review Comment: https://git.openjdk.org/crac/pull/97#discussion_r1284036451
PR Review Comment: https://git.openjdk.org/crac/pull/97#discussion_r1284036765
PR Review Comment: https://git.openjdk.org/crac/pull/97#discussion_r1284037517
PR Review Comment: https://git.openjdk.org/crac/pull/97#discussion_r1284038029

More information about the crac-dev mailing list