[crac] RFR: PID adjustment on checkpoint [v8]
Sergey Nazarkin
snazarki at openjdk.org
Tue Jun 27 12:09:32 UTC 2023
On Tue, 27 Jun 2023 11:45:56 GMT, Roman Marchenko <rmarchenko at openjdk.org> wrote:
>> On restore, there might be PID value conflicts because of small PID values, if it was checkpoint'ed in a container. Therefore, when checkpointing in a container, we need to move PID value for new processes to a particular value to avoid conflicts on restore.
>>
>> See https://github.com/CRaC/example-lambda/blob/master/checkpoint.cmd.sh#L8 for example.
>>
>> This PR contains implemented functionality similar to the example above, making this work out of the box. By default, if checkpointing, PID is adjusted only if Java's PID is 1 that means Java is run in a container. To adjust PID manually for a checkpoint'ed process, `-XX:CRaCMinPid=<value>` option should be used along with `CRaCCheckpointTo`. Min `CRaCMinPid` value is 1, max `CRaCMinPid` value is `UINT_MAX`, but it is actually limited by OS's pid_max.
>
> Roman Marchenko has updated the pull request incrementally with two additional commits since the last revision:
>
> - Fixing review comments
> - Revert "Now CracMinPid option must be set explicitly to adjust PID"
>
> This reverts commit b3d66800d6ea441fb86498fdbb229400747eb44f.
Changes requested by snazarki (no project role).
src/java.base/share/native/launcher/main.c line 122:
> 120: const int len = strlen(checkpoint_arg);
> 121: if (0 == strncmp(arg, checkpoint_arg, len)) {
> 122: crac_min_pid = atoi(arg + len);
atoi is not recommended to use anymore as it returns 0 on error.
"It is recommended to instead use the strtol() and strtoul() family of functions in new programs."
src/java.base/share/native/launcher/main.c line 195:
> 193: }
> 194: const char *last_pid_filename = "/proc/sys/kernel/ns_last_pid";
> 195: const int last_pid_file = open(last_pid_filename, O_WRONLY|O_CREAT|O_TRUNC, 0666);
O_CREAT looks redundant.
And this file requires special capability for the process. Shouldn't we address this in the doc?
src/java.base/share/native/launcher/main.c line 200:
> 198: }
> 199: int res = 0;
> 200: if (0 > write(last_pid_file, buf, len)) {
I'd compare with len, just to handle all "write" return values
-------------
PR Review: https://git.openjdk.org/crac/pull/86#pullrequestreview-1500668309
PR Review Comment: https://git.openjdk.org/crac/pull/86#discussion_r1243612548
PR Review Comment: https://git.openjdk.org/crac/pull/86#discussion_r1243617425
PR Review Comment: https://git.openjdk.org/crac/pull/86#discussion_r1243620761
More information about the crac-dev
mailing list