[crac] RFR: Fix ordering of invocation on Resources

Anton Kozlov akozlov at openjdk.org
Mon Apr 24 09:48:16 UTC 2023


On Mon, 24 Apr 2023 08:59:54 GMT, Radim Vansa <duke at openjdk.org> wrote:

>> I admit that Resource is usually registered implicitly, and the current state (silently ignore newly registered Resources when checkpoint is in progress) is not adequate and may leads to some state leaking to the image, while that is not supposed to (if Resource would be registered some time before).
>> 
>> With unchecked ISE, the user will have to somehow ensure that registration likley being implicit, changing the code to break the abstraction above that Resource/registration.
>> 
>> With the check for new Resources, we'll detect exactly the situtaion user will likely like to know (if everything in the state ready to be stored to the image), and if not, a recovery will be trying again. Every Resource may throw  CheckpointException for their own reasons, and CheckpointException("timeout for waiting some critical clean up, try again") is expected to be something very possible and is expected.
>
> Generally speaking, your proposal to try performing the checkpoint until it succeeds is more permissive, as users can back-off from a bad/fragile design through retrying, rather than being forced to correct the dependencies between components. In the past it seemed to me that you were more on the "good design required" rather than "practical" side of things - please don't take this as any form of "ad hominem" argument, I would just like to understand why you opt for such solution here. In my view a timeout/IO error are a totally different category of error.

The CheckpointException is a CRaC-specific, deliberately thrown exception to communicate the problem with the state to the requester of the checkpoint. The exception may suggest rewriting the code, or another attempt. We don't care a lot about the preceise meaning, we just need to deliver the exception. Although lesser number of exceptions is better of course.

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1175040797


More information about the crac-dev mailing list