[crac] RFR: Fix ordering of invocation on Resources
Anton Kozlov
akozlov at openjdk.org
Fri Apr 21 17:17:14 UTC 2023
On Fri, 21 Apr 2023 09:01:07 GMT, Radim Vansa <duke at openjdk.org> wrote:
> * When Context.beforeCheckpoint throws, invoke Context.afterRestore anyway (otherwise some resources stay in suspended state).
> * Handle Resource.beforeCheckpoint triggering a registration of another resource ** Do not cause deadlock when registering from another thread ** Global resource can register JDKResource
> ** JDKResource can register resource with higher priority ** Other registrations are prohibited
src/java.base/share/classes/jdk/crac/LoggerContainer.java line 9:
> 7: * Therefore, we isolate the logger into a subclass and initialize lazily.
> 8: */
> 9: public class LoggerContainer {
Technically, this is not a part of Coordintation API, but an implementation of the logging for JDK needs. So still does not look like it should be `public`. Having this public in `jdk.internal.crac`, etc is totally fine.
src/java.base/share/classes/jdk/crac/Resource.java line 41:
> 39: * Invoked by a {@code Context} as a notification about checkpoint.
> 40: * Order of checkpoint notification is the reverse order of
> 41: * {@link Context#register(Resource) registration}.
This is correct for the Global Context, but the order can be different in other Contexts.
I don't think Resource javadoc should describe the ordering (otherwise, a Context that uses a different ordering should refuse registration, but the Context does not have any mean to know the resource assumes some particular ordering).
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 54:
> 52: locked = true;
> 53: // This is important for the case of recursive registration
> 54: throwIfCheckpointInProgress(priority);
What is expected handling of this exception? And in the current form the exception is not checked, so in most cases that exception won't be expected. Having that in most cases resources are registered during class or object initialization, those entities will be caught in partially constructed states, likely leaving the whole system stuck.
Would it not better to allow registration, but throw CheckpointException at the end of Context.beforeCheckpoint? That is pretty legit exception that means the checkpoint can be attempted again, and another attempt can be pretty well successful, if all Resources do not throw and no new Resources are registered.
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 117:
> 115: restoreQ.add(r);
> 116: } catch (CheckpointException e) {
> 117: enqueueIfContext(r);
Ohh, I see the problem
> When Context.beforeCheckpoint throws, invoke Context.afterRestore anyway (otherwise some resources stay in suspended state).
So this is proposed to be handled in the parent context? Have you considered fixing that in the child context, run afterRestore for successfully checkpointed resources, before throwing CheckpointException to the parent context?
-------------
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1173903740
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1173905238
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1174003392
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1173995614
More information about the crac-dev
mailing list