[crac] RFR: Fix ordering of invocation on Resources [v3]
Anton Kozlov
akozlov at openjdk.org
Fri Apr 28 14:47:23 UTC 2023
On Wed, 26 Apr 2023 15:22:23 GMT, Radim Vansa <duke at openjdk.org> wrote:
>> * keeps the original handling of exceptions: afterRestore is called even if beforeCheckpoint throws
>> * allows to register a resource in a context that did not start beforeCheckpoint invocations yet
>> * registering resource in previous/running context fails the checkpoint but does not trigger exception immediately
>> * instead this will be one of the recorded exceptions and the resource has a chance to fire next time
>> * allowed registration of resources can be invoked from other thread without deadlock; illegal registration can deadlock, though
>
> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision:
>
> - More fine-grained synchronization
> - Rework context ordering (round 2)
>
> * call afterRestore even if beforeCheckpoint throws
> * registering resource in previous/running context does not trigger exception immediatelly
> ** instead this will be one of the recorded exceptions and the resource has a chance to fire next time
> * we don't guarantee threads not deadlocking when trying to register a resource, though
src/java.base/share/classes/jdk/crac/Core.java line 104:
> 102: * Order of invoking {@link Resource#afterRestore(Context)} is the reverse
> 103: * of the order of {@link Resource#beforeCheckpoint(Context) checkpoint notification},
> 104: * hence the same as the order of {@link Context#register(Resource) registration}.
How about moving the Global Context description from the package level here (removing there). In javax.crac it should be fine to link to here IMO.
src/java.base/share/classes/jdk/crac/Core.java line 118:
> 116: public static synchronized boolean isRestoring() {
> 117: return restoring;
> 118: }
Undocumented public API, I'm not sure it should be public.
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 42:
> 40: for (Throwable t : suppressed) {
> 41: Core.recordException(t);
> 42: }
Unwrap Checkpoint/RestoreException only?
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 55:
> 53: restoreQ.add(resource);
> 54: try {
> 55: resource.beforeCheckpoint(semanticContext());
Does this mean a Resource may get another Context and not the one to which it has been registered? This may be very unexpected for the Resource implementation.
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 59:
> 57: recordExceptions(e);
> 58: } catch (Exception e) {
> 59: Core.recordException(e);
Why is there is the distinction? I think we should throw all exceptions from the context, rather than publishing them to a central store, otherwise the parent Context (if any), won't be able to do anything about those.
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 75:
> 73: restoreQ = new ArrayList<>();
> 74: runBeforeCheckpoint();
> 75: Collections.reverse(restoreQ);
Smelly code, restoreQ should be maintained either here or in runBeforeCheckpoint()
src/java.base/share/classes/jdk/crac/impl/AbstractContextImpl.java line 78:
> 76: }
> 77:
> 78: protected abstract void runBeforeCheckpoint();
This is intended to be overwritten (becomes a part of the class interface). The intent behind the separate method is not evident. Corresponding runAfterRestore is private though.
After AbstractContexImpl has lost parameter P and comparator, a distinction between AbstractContexImpl and OrderedContext has been lost. Merging AbstractContexImpl into OrderedContext likely will provided clearer code.
src/java.base/share/classes/jdk/crac/impl/OrderedContext.java line 60:
> 58: // It is possible that something registers to us during restore but before
> 59: // this context's afterRestore was called.
> 60: if (checkpointing && !Core.isRestoring()) {
There is a small window between all beforeCheckpoint() are finished and checkpoint. In this window we'll call setModified(). An there is another window between restore and afterRestore() processing is started, where we'll won't call setModified(). Getting the exception or not will be a result of a race between checkpoint/restore (actual event with near-zero duration, without calling Resources) and registration.
A Resource may also have an empty beforeCheckpoint() and some afterRestore() clean up. We'll register the resource for the next round of checkpoint/restore and will be silence about newly registered Resource. But since beforeCheckpoint() is empty, the original intent could be to do something useful on restore, which won't be done.
src/java.base/share/classes/jdk/crac/impl/PriorityContext.java line 21:
> 19: // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> 20: // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> 21: // POSSIBILITY OF SUCH DAMAGE.
Use standard copyright please
-------------
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180390202
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180303950
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180311461
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180326193
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180320313
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180318502
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180322259
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180374418
PR Review Comment: https://git.openjdk.org/crac/pull/60#discussion_r1180313813
More information about the crac-dev
mailing list