[crac] RFR: 8367975: [CRaC] Pattern in CRaCCheckpointTo

Timofei Pushkin tpushkin at openjdk.org
Tue Sep 23 14:15:29 UTC 2025


On Tue, 23 Sep 2025 12:47:27 GMT, Radim Vansa <rvansa at openjdk.org> wrote:

>> src/hotspot/share/runtime/crac.cpp line 294:
>> 
>>> 292:   if (!_engine->configure_image_location(image_location)) {
>>> 293:     return JVM_CHECKPOINT_ERROR;
>>> 294:   }
>> 
>> Is there a reason we do this and CPU features recording below so late instead of at the beginning `crac::checkpoint`, for example? The only reason I see is that the timestamps in `crac::record_time_before_checkpoint` and `%t` would be further away in that case (BTW, if we leave this as is we could use the recorded time for the pattern for consistency).
>> 
>> If we did not clear the recorded user data on checkpoint, CPU features could be recorded just once on VM initialization instead of on every checkpoint (though not in `prepare_checkpoint` — that is too early, I believe).
>> 
>> And just to have it more compact since the calls are very closely related
>> Suggestion:
>> 
>>   // Note that CRaCEngine and CRaCEngineOptions are not updated (as documented)
>>   // so we don't need to re-init the whole engine handle.
>>   if (!resolve_image_location(image_location, sizeof(image_location), &ignored) ||
>>       !ensure_image_location(image_location, false) ||
>>       !_engine->configure_image_location(image_location)) {
>>     return JVM_CHECKPOINT_ERROR;
>> }
>
> In this PR I have merely reused the place; I don't see much incentive to move it earlier on. I think this is up to preference between
> 1) doing some (reversible) work on checkpoint and then (possibly) failing due to no access to the directory, vs. 
> 2) creating the directory, (possibly) failing during checkpoint and leaving the filesystem tainted (or implementing cleanup)
> I don't have a strong preference, but I agree that we can collapse the two `if`s.

Even with the current code we can still fail after creating the directory, without writing anything and AFAIK without cleaning up, so I do not think moving it earlier will make this much worse.

But yes, for now let's leave it here.

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/264#discussion_r2372478553


More information about the crac-dev mailing list