[crac] RFR: Fix ordering of invocation on Resources [v3]
Radim Vansa
duke at openjdk.org
Tue May 2 12:31:51 UTC 2023
On Tue, 2 May 2023 10:54:51 GMT, Anton Kozlov <akozlov at openjdk.org> wrote:
>> Does that matter, though? The registration itself will be done for next C/R in all cases. The situation where the registrar entered the `register()` method, added the resource and recorded an exception but this has been silently discarded is equivalent to the situation where it entered the `register()` just after restore was completed. If these are independent you cannot force it to fail reliably, unless there is something in running the resource notifications that depends on the registration (but it's not as all resources have been notified).
>>
>> If the intend was to do something after the current restore the resource should have been registered in a blocking way before the checkpoint or as part of the blocking beforeCheckpoint, not absolutely independently.
>
>> Does that matter, though? The registration itself will be done for next C/R in all cases. The situation where the registrar entered the `register()` method, added the resource and recorded an exception but this has been silently discarded is equivalent to the situation where it entered the `register()` just after restore was completed.
>
> Suppose aR() has some really important side-effect, it's totally necessarily to run that on restore. Then it falls to the category of problems this PR tries to solve (silently ignoring registered resources).
>
>> If these are independent you cannot force it to fail reliably, unless there is something in running the resource notifications that depends on the registration (but it's not as all resources have been notified).
>>
>> If the intend was to do something after the current restore the resource should have been registered in a blocking way before the checkpoint or as part of the blocking beforeCheckpoint, not absolutely independently.
>
> Some form of the guarantee will be good. The blocking registration, I assume it something like register() to finish only when the argument is successfully registered? This looks like a viable approach, e.g. it does specify the behavior around possible race and do not affect "normal" workflow when registration happens way before the checkpoint. Do you see any problem with blocking registration?
I don't think we understand each other. Let's say you have a code like this:
new Thread(() -> {
Resource another = /* ... */;
Core.getGlobalContext().register(another);
}).start();
Core.checkpointRestore();
You insisted on `register()` that does not throw. What implementation could ensure that something eventually makes `Core.checkpointRestore()` throw? There's no guarantee that this will run before the checkpoint completes; the code does not order these in any way. The only result the user can expect is that the resource will be registered, eventually.
Had you added a `CountDownLatch` triggered after calling `register()` and waited for at last somewhere in one of the `beforeCheckpoint` methods the race would not happen.
-------------
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1182481551
More information about the crac-dev
mailing list