From duke at openjdk.java.net Thu Jan 6 14:58:48 2022
From: duke at openjdk.java.net (Dan Heidinga)
Date: Thu, 6 Jan 2022 14:58:48 GMT
Subject: [crac] RFR: JDKResource priorities [v3]
In-Reply-To:
References: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
Message-ID:

On Fri, 24 Dec 2021 16:15:07 GMT, Alexey Bakhtin wrote:

>> Added priority enumeration for the JDK resources
>> It will allow better handling the order of checkpoint notifications for dependent resources
>
> Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Changed priority order in the enum

Marked as reviewed by DanHeidinga at github.com (no known OpenJDK username).

src/java.base/share/classes/jdk/internal/crac/JDKContext.java line 38:

> 36:     @Override
> 37:     public int compare(Map.Entry o1, Map.Entry o2) {
> 38:         return o1.getKey().getPriority().ordinal() - o2.getKey().getPriority().ordinal();

Enums are already comparable so we can use the built in compareTo method here but this isn't critical and shouldn't block merging

Suggestion:

        return o1.getKey().getPriority().compareTo(o2.getKey().getPriority());

-------------

PR: https://git.openjdk.java.net/crac/pull/8

From abakhtin at openjdk.java.net Thu Jan 6 18:07:25 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Thu, 6 Jan 2022 18:07:25 GMT
Subject: [crac] RFR: JDKResource priorities [v4]
In-Reply-To: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
References: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
Message-ID: <9lgHXrOee3NxnoameSe_p4pp-rEHoimKRtUEo6ngJAk=.50caaa54-be80-4519-a2ae-896d7d2152d3@github.com>

> Added priority enumeration for the JDK resources
> It will allow better handling the order of checkpoint notifications for dependent resources

Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision:

  Use compareTo() to compare enums

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/8/files
  - new: https://git.openjdk.java.net/crac/pull/8/files/f1140138..72441c8e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=8&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=8&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/crac/pull/8.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/8/head:pull/8

PR: https://git.openjdk.java.net/crac/pull/8

From abakhtin at openjdk.java.net Thu Jan 6 18:07:26 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Thu, 6 Jan 2022 18:07:26 GMT
Subject: [crac] RFR: JDKResource priorities [v3]
In-Reply-To:
References: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
Message-ID:

On Thu, 6 Jan 2022 14:53:17 GMT, Dan Heidinga wrote:

>> Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Changed priority order in the enum
>
> src/java.base/share/classes/jdk/internal/crac/JDKContext.java line 38:
>
>> 36:     @Override
>> 37:     public int compare(Map.Entry o1, Map.Entry o2) {
>> 38:         return o1.getKey().getPriority().ordinal() - o2.getKey().getPriority().ordinal();
>
> Enums are already comparable so we can use the built in compareTo method here but this isn't critical and shouldn't block merging
>
> Suggestion:
>
>         return o1.getKey().getPriority().compareTo(o2.getKey().getPriority());

Hi @DanHeidinga ! Thank you for review.
Yes, compareTo looks better. Updated.
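
For reference, the two comparator bodies quoted above produce the same ordering, since an enum's natural order is its declaration order. The following is only a minimal sketch of that equivalence. `PriorityOrderDemo`, its nested `Priority` enum, the `Res` class and the constant names are hypothetical stand-ins, not the types or values from the patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PriorityOrderDemo {
    // Hypothetical stand-in for JDKResource.Priority; constant names are illustrative.
    enum Priority { FIRST, MIDDLE, LAST }

    // Hypothetical resource: only its priority matters for ordering.
    static final class Res {
        final String name;
        final Priority priority;

        Res(String name, Priority priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Form from the first revision: subtract ordinals.
    static final Comparator<Res> BY_ORDINAL =
            (a, b) -> a.priority.ordinal() - b.priority.ordinal();

    // Form from the review suggestion: enums are Comparable in declaration order.
    static final Comparator<Res> BY_COMPARE_TO =
            (a, b) -> a.priority.compareTo(b.priority);

    // Returns the resource names sorted by the given comparator; the input is not modified.
    static List<String> sortedNames(List<Res> input, Comparator<Res> order) {
        List<Res> copy = new ArrayList<>(input);
        copy.sort(order);
        List<String> names = new ArrayList<>();
        for (Res r : copy) {
            names.add(r.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Res> resources = List.of(
                new Res("c", Priority.LAST),
                new Res("a", Priority.FIRST),
                new Res("b", Priority.MIDDLE));
        System.out.println(sortedNames(resources, BY_ORDINAL));
        System.out.println(sortedNames(resources, BY_COMPARE_TO));
    }
}
```

Both comparators sort by declaration order, so the choice between them is about readability rather than behavior.
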
-------------

PR: https://git.openjdk.java.net/crac/pull/8

From heidinga at openjdk.java.net Thu Jan 6 18:18:33 2022
From: heidinga at openjdk.java.net (Dan Heidinga)
Date: Thu, 6 Jan 2022 18:18:33 GMT
Subject: [crac] RFR: JDKResource priorities [v4]
In-Reply-To: <9lgHXrOee3NxnoameSe_p4pp-rEHoimKRtUEo6ngJAk=.50caaa54-be80-4519-a2ae-896d7d2152d3@github.com>
References: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com> <9lgHXrOee3NxnoameSe_p4pp-rEHoimKRtUEo6ngJAk=.50caaa54-be80-4519-a2ae-896d7d2152d3@github.com>
Message-ID:

On Thu, 6 Jan 2022 18:07:25 GMT, Alexey Bakhtin wrote:

>> Added priority enumeration for the JDK resources
>> It will allow better handling the order of checkpoint notifications for dependent resources
>
> Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Use compareTo() to compare enums

lgtm

-------------

Marked as reviewed by heidinga (Committer).

PR: https://git.openjdk.java.net/crac/pull/8

From abakhtin at openjdk.java.net Sat Jan 8 21:05:57 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Sat, 8 Jan 2022 21:05:57 GMT
Subject: [crac] Integrated: JDKResource priorities
In-Reply-To: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
References: <5BBQwRf02HUlyGIHpv28NMFt_Vpaf4S53zofrdrlTAM=.f1016910-052b-40cc-b91a-0cb644075923@github.com>
Message-ID:

On Mon, 20 Dec 2021 15:39:03 GMT, Alexey Bakhtin wrote:

> Added priority enumeration for the JDK resources
> It will allow better handling the order of checkpoint notifications for dependent resources

This pull request has now been integrated.

Changeset: c9fe73ee
Author: Alexey Bakhtin
Committer: Dan Heidinga
URL: https://git.openjdk.java.net/crac/commit/c9fe73eec187d436bdd1efb1bcc7448ccd140e1f
Stats: 41 lines in 6 files changed: 31 ins; 0 del; 10 mod

JDKResource priorities

Reviewed-by: akozlov, heidinga

-------------

PR: https://git.openjdk.java.net/crac/pull/8

From abakhtin at openjdk.java.net Tue Jan 11 09:44:04 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Tue, 11 Jan 2022 09:44:04 GMT
Subject: [crac] Integrated: Disable recursive checkpoint
In-Reply-To:
References:
Message-ID:

On Tue, 7 Dec 2021 19:41:49 GMT, Alexey Bakhtin wrote:

> This patch proposes restriction of the checkpoint/restore behavior: parallel or recursive checkpoint should be disabled.
> CheckpointException will be thrown in case of checkpoint is requested from the beforeCheckpoint/afterRestore methods.
> Checkpoint/restore will be suspended in case of another checkpoint already started by another thread.
>
> This is a prerequisite for https://github.com/openjdk/crac/pull/5

This pull request has now been integrated.

Changeset: 43aba3d5
Author: Alexey Bakhtin
Committer: Anton Kozlov
URL: https://git.openjdk.java.net/crac/commit/43aba3d502832a5a3d2e9712558f62e0cf93dbbb
Stats: 213 lines in 5 files changed: 208 ins; 0 del; 5 mod

Disable recursive checkpoint

Reviewed-by: akozlov

-------------

PR: https://git.openjdk.java.net/crac/pull/6

From abakhtin at openjdk.java.net Wed Jan 12 14:09:34 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Wed, 12 Jan 2022 14:09:34 GMT
Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID:

> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore.
> Two jtreg tests added:
> - verify if no deadlocks introduced by checkpoint/restore
> - verify if SHA1PRNG is reseeded if created with default embedded seed generator

Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:

 - Set JDKResource priorities for SecureRandom
 - Merge branch 'crac' of https://github.com/openjdk/crac into SecureRandom
 - Add separate JDKResorce for seeder
 - Reseed secure random on checkpoint restore

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/7/files
  - new: https://git.openjdk.java.net/crac/pull/7/files/e26d0b81..8e054933

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=01-02

  Stats: 272 lines in 13 files changed: 253 ins; 0 del; 19 mod
  Patch: https://git.openjdk.java.net/crac/pull/7.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7

PR: https://git.openjdk.java.net/crac/pull/7

From akozlov at openjdk.java.net Thu Jan 13 04:50:57 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 13 Jan 2022 04:50:57 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 24 Dec 2021 14:38:33 GMT, Alexey Bakhtin wrote:

>> src/java.base/unix/classes/sun/security/provider/NativePRNG.java line 595:
>>
>>> 593:     for(int i=0; i
>>> 594:         nextBuffer[i] = 0;
>>> 595:     }
>>
>> I assume this clean-up serves two purposes: 1) to clear the state that was already used to generate other values, so to prevent guessing them; 2) to force next request for random values will force filling of the buffer with real random data and fresh initialization of mixRandom.
>>
>> For 2, what if another `beforeCheckpoint` inadvertently calls `NativePRNG.engineNextBytes`? `crLock` (reentrant) will not prevent reading random values from OS and setting an instance for `mixRandom` before the checkpoint, but they will live later forever on restore. So I assume a similar clean-up is required in `afterRestore` (we may happen to store some state for NativePRNG, but this state won't be related to the previous state before the checkpoint and the one created later after restore).
>
> For 2. I think it is just a matter of JDKResource.Priority. The use case you described is about dependent resources (see https://github.com/openjdk/crac/pull/8). In this case, the priority of the NativePRNG should be adjusted in the JDKResource.Priority.
> Also, I would suggest adding a debug option that enables detection of the incorrect usage of the JDKResource A inside beforeCheckpoint of JDKResource B (priority A > priority B)

Without the debug option, the image may be compromised -- so there should be no option of running without the checks. The currently employed ReentrantReadWriteLock does not block taking Read lock (access to PRNG) while holding Write lock (that is supposed to block access to the PRNG during checkpoint/restore -- but not from the same thread). It would be much more useful to throw an exception "the image is going to be wrong", or at worst to deadlock. But now the implementation does not guarantee the security of the image.
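
The reentrancy behavior described here can be reproduced in isolation. A thread that holds the write lock of a ReentrantReadWriteLock is still granted the read lock, so a read-side guard alone does not keep the checkpointing thread away from the generator. The sketch below is not the NativePRNG code. `GuardedSource`, its method names and the use of IllegalStateException are hypothetical, chosen only so the example runs on a stock JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of a generator guarded by a checkpoint lock.
// Not the NativePRNG implementation; names and exception type are illustrative.
class GuardedSource {
    private final ReentrantReadWriteLock crLock = new ReentrantReadWriteLock();
    private final AtomicInteger counter = new AtomicInteger();

    // Models beforeCheckpoint(): the write lock keeps other threads out.
    void beforeCheckpoint() {
        crLock.writeLock().lock();
    }

    // Models afterRestore(): release the write lock taken before the checkpoint.
    void afterRestore() {
        crLock.writeLock().unlock();
    }

    // Unchecked access: the read lock is granted even when the calling
    // thread already holds the write lock (lock reentrancy).
    int nextUnchecked() {
        crLock.readLock().lock();
        try {
            return counter.incrementAndGet();
        } finally {
            crLock.readLock().unlock();
        }
    }

    // Checked access: refuse to run on the thread that holds the write lock.
    int nextChecked() {
        if (crLock.isWriteLockedByCurrentThread()) {
            throw new IllegalStateException("generator used during checkpoint/restore");
        }
        crLock.readLock().lock();
        try {
            return counter.incrementAndGet();
        } finally {
            crLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedSource source = new GuardedSource();
        source.beforeCheckpoint();
        System.out.println("unchecked during checkpoint: " + source.nextUnchecked());
        try {
            source.nextChecked();
        } catch (IllegalStateException e) {
            System.out.println("checked during checkpoint: " + e.getMessage());
        }
        source.afterRestore();
        System.out.println("checked after restore: " + source.nextChecked());
    }
}
```

The explicit `isWriteLockedByCurrentThread()` check is one way to turn silent same-thread access into a failure; whether an exception or some other reaction fits best is a separate design question.
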
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/locks/ReentrantReadWriteLock.html

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From akozlov at openjdk.java.net Thu Jan 13 05:30:53 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 13 Jan 2022 05:30:53 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v2]
In-Reply-To:
References:
Message-ID: <7qdiH3C5eV986ZcnmjkmsEjeoQqTibXHWg070rX7R6I=.069e07c6-cc46-4dd4-96da-2e8404146931@github.com>

On Fri, 24 Dec 2021 21:00:22 GMT, Alexey Bakhtin wrote:

>> src/java.base/unix/classes/sun/security/provider/NativePRNG.java line 589:
>>
>>> 587:     @Override
>>> 588:     public void beforeCheckpoint(Context context) throws Exception {
>>> 589:         crLock.lock();
>>
>> The new lock is very related to `LOCK_GET_BYTES`, however, this code does not acquire `LOCK_GET_BYTES`. Now all `synchronized (LOCK_GET_BYTES)` blocks are executed under `crLock`. But I would prefer at least `assert crLock.isHeldByCurrentThread()` near those synchronized blocks, or any other way to ensure / document locks relation. Or use the LOCK_GET_BYTES, considering the other problem with premature buffer and mixRandom initialization.
>
> I do not see any issues between `crLock` and `LOCK_GET_BYTES.` `beforeCheckpoint` does not acquire any additional locks, so it should not cause deadlocks.
> It is normal if different threads try to acquire `crLock.` In this case, one of them waits for the completion of another. E.g. `beforeCheckpoint` waits for completion `implNextBytes()` and vice versa. So, assert `crLock.isHeldByCurrentThread()` is not required.
> In case of `implNextBytes` called from the `beforeCheckpoint` and `crLock` already acquired by `RandomIO.beforeCheckpoint` will have improper priorities of dependent `JDKResources.` It should be properly adjusted in the `JDKResource.Priority` (see #8)
> The only problem is performance. `INSTANCE` is a single object. So, locking the whole `implNextBytes` method will affect performance dramatically. I think It can be fixed by `ReentrantReadWriteLock`. Will update implementation

crLock and LOCK_GET_BYTES should be acquired in the particular order to ensure no deadlock can happen. The order is correct now AFAICS, but in the code these locks are used in distant places, checking the locks relation is nontrivial. Someone in the future may add another use of LOCK_GET_BYTES that would go out-of-sync with crLock.

I support using a ReadWriteLock. I'm not sure about how bad the performance was without one (the LOCK_GET_BYTES is there anyway). But some ReadWriteLock is certainly better suited for our task. I'm still not sure the ReentrantReadWriteLock is a perfect one, see another thread.

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From abakhtin at openjdk.java.net Thu Jan 13 10:47:36 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Thu, 13 Jan 2022 10:47:36 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID:

> NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore
> Also, the internal buffer should be cleared at the checkpoint.

Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision:

 - Prevent NativePRNG usage in the beforeCheckpoint/afterRestore
 - Merge branch 'crac' of https://github.com/openjdk/crac into NativePRNG
 - Use ReentrantReadWriteLock
 - Reseed NativePRNG on checkpoint restore

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/9/files
  - new: https://git.openjdk.java.net/crac/pull/9/files/196edc75..dd461601

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=9&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=9&range=01-02

  Stats: 274 lines in 13 files changed: 255 ins; 0 del; 19 mod
  Patch: https://git.openjdk.java.net/crac/pull/9.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/9/head:pull/9

PR: https://git.openjdk.java.net/crac/pull/9

From abakhtin at openjdk.java.net Thu Jan 13 10:47:36 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Thu, 13 Jan 2022 10:47:36 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jan 2022 04:47:55 GMT, Anton Kozlov wrote:

>> For 2. I think it is just a matter of JDKResource.Priority. The use case you described is about dependent resources (see https://github.com/openjdk/crac/pull/8). In this case, the priority of the NativePRNG should be adjusted in the JDKResource.Priority.
>> Also, I would suggest adding a debug option that enables detection of the incorrect usage of the JDKResource A inside beforeCheckpoint of JDKResource B (priority A > priority B)
>
> Without the debug option, the image may be compromised -- so there should be no option of running without the checks. The currently employed ReentrantReadWriteLock does not block taking Read lock (access to PRNG) while holding Write lock (that is supposed to block access to the PRNG during checkpoint/restore -- but not from the same thread). It would be much more useful to throw an exception "the image is going to be wrong", or at worst to deadlock. But now the implementation does not guarantee the security of the image.
>
> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/locks/ReentrantReadWriteLock.html

Thank you. Added the code to verify if NativePRNG is used from the beforeCheckpoint/afterRestore. In such a case, CheckpointException will be thrown.

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From abakhtin at openjdk.java.net Thu Jan 13 10:52:59 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Thu, 13 Jan 2022 10:52:59 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v3]
In-Reply-To: <7qdiH3C5eV986ZcnmjkmsEjeoQqTibXHWg070rX7R6I=.069e07c6-cc46-4dd4-96da-2e8404146931@github.com>
References: <7qdiH3C5eV986ZcnmjkmsEjeoQqTibXHWg070rX7R6I=.069e07c6-cc46-4dd4-96da-2e8404146931@github.com>
Message-ID: <0rbBJY5jj4INKKO93WPoxmRlAu5jSpVcF1yfFUqR-JM=.90988f09-9887-4625-b13e-76ca64c23931@github.com>

On Thu, 13 Jan 2022 05:26:59 GMT, Anton Kozlov wrote:

>> I do not see any issues between `crLock` and `LOCK_GET_BYTES.` `beforeCheckpoint` does not acquire any additional locks, so it should not cause deadlocks.
>> It is normal if different threads try to acquire `crLock.` In this case, one of them waits for the completion of another. E.g. `beforeCheckpoint` waits for completion `implNextBytes()` and vice versa. So, assert `crLock.isHeldByCurrentThread()` is not required.
>> In case of `implNextBytes` called from the `beforeCheckpoint` and `crLock` already acquired by `RandomIO.beforeCheckpoint` will have improper priorities of dependent `JDKResources.` It should be properly adjusted in the `JDKResource.Priority` (see #8)
>> The only problem is performance. `INSTANCE` is a single object. So, locking the whole `implNextBytes` method will affect performance dramatically. I think It can be fixed by `ReentrantReadWriteLock`. Will update implementation
>
> crLock and LOCK_GET_BYTES should be acquired in the particular order to ensure no deadlock can happen. The order is correct now AFAICS, but in the code these locks are used in distant places, checking the locks relation is nontrivial. Someone in the future may add another use of LOCK_GET_BYTES that would go out-of-sync with crLock.
>
> I support using a ReadWriteLock. I'm not sure about how bad the performance was without one (the LOCK_GET_BYTES is there anyway). But some ReadWriteLock is certainly better suited for our task. I'm still not sure the ReentrantReadWriteLock is a perfect one, see another thread.

I do not like to replace LOCK_GET_BYTES with crLock. It will affect performance because NativePRNG is a singleton and the original LOCK_GET_BYTES synchronizes much smaller pieces of code.
The only standard implementation of ReadWriteLock is ReentrantReadWriteLock. So I think it is OK to use ReentrantReadWriteLock with additional checks I just proposed

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From akozlov at openjdk.java.net Thu Jan 13 16:13:05 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 13 Jan 2022 16:13:05 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID: <55NIXCn0uSfSAaEzwExO4T6zExFVURxNlLoVwguu8ME=.6e7244a4-1714-4aac-b544-328ae3983b0a@github.com>

On Thu, 13 Jan 2022 10:47:36 GMT, Alexey Bakhtin wrote:

>> NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore
>> Also, the internal buffer should be cleared at the checkpoint.
>
> Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
>  - Prevent NativePRNG usage in the beforeCheckpoint/afterRestore
>  - Merge branch 'crac' of https://github.com/openjdk/crac into NativePRNG
>  - Use ReentrantReadWriteLock
>  - Reseed NativePRNG on checkpoint restore

Marked as reviewed by akozlov (Lead).

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From akozlov at openjdk.java.net Thu Jan 13 16:13:06 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 13 Jan 2022 16:13:06 GMT
Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v3]
In-Reply-To: <0rbBJY5jj4INKKO93WPoxmRlAu5jSpVcF1yfFUqR-JM=.90988f09-9887-4625-b13e-76ca64c23931@github.com>
References: <7qdiH3C5eV986ZcnmjkmsEjeoQqTibXHWg070rX7R6I=.069e07c6-cc46-4dd4-96da-2e8404146931@github.com> <0rbBJY5jj4INKKO93WPoxmRlAu5jSpVcF1yfFUqR-JM=.90988f09-9887-4625-b13e-76ca64c23931@github.com>
Message-ID:

On Thu, 13 Jan 2022 10:49:46 GMT, Alexey Bakhtin wrote:

>> crLock and LOCK_GET_BYTES should be acquired in the particular order to ensure no deadlock can happen. The order is correct now AFAICS, but in the code these locks are used in distant places, checking the locks relation is nontrivial. Someone in the future may add another use of LOCK_GET_BYTES that would go out-of-sync with crLock.
>>
>> I support using a ReadWriteLock. I'm not sure about how bad the performance was without one (the LOCK_GET_BYTES is there anyway). But some ReadWriteLock is certainly better suited for our task. I'm still not sure the ReentrantReadWriteLock is a perfect one, see another thread.
>
> I do not like to replace LOCK_GET_BYTES with crLock. It will affect performance because NativePRNG is a singleton and the original LOCK_GET_BYTES synchronizes much smaller pieces of code.
> The only standard implementation of ReadWriteLock is ReentrantReadWriteLock. So I think it is OK to use ReentrantReadWriteLock with additional checks I just proposed

OK, I don't insist.
Two related locks complicate the code, but crLock intent and the difference from LOCK_GET_BYTES is clearer now. I think the current state is good.

>> Without the debug option, the image may be compromised -- so there should be no option of running without the checks. The currently employed ReentrantReadWriteLock does not block taking Read lock (access to PRNG) while holding Write lock (that is supposed to block access to the PRNG during checkpoint/restore -- but not from the same thread). It would be much more useful to throw an exception "the image is going to be wrong", or at worst to deadlock. But now the implementation does not guarantee the security of the image.
>>
>> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/locks/ReentrantReadWriteLock.html
>
> Thank you. Added the code to verify if NativePRNG is used from the beforeCheckpoint/afterRestore. In such a case, CheckpointException will be thrown.

Thanks, now it looks good.

Just as a note, in the similar case I've used IllegalSelectorException [1]. I think IllegalStateException may fit here as well -- we probably need to converge what exception(s) to use. But it is not necessary now.

[1] https://github.com/openjdk/crac/blob/43aba3d502832a5a3d2e9712558f62e0cf93dbbb/src/java.base/linux/classes/sun/nio/ch/EPollSelectorImpl.java#L384

-------------

PR: https://git.openjdk.java.net/crac/pull/9

From akozlov at openjdk.java.net Fri Jan 14 11:27:05 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Fri, 14 Jan 2022 11:27:05 GMT
Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 12 Jan 2022 14:09:34 GMT, Alexey Bakhtin wrote:

>> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore.
>> Two jtreg tests added:
>> - verify if no deadlocks introduced by checkpoint/restore
>> - verify if SHA1PRNG is reseeded if created with default embedded seed generator
>
> Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
>  - Set JDKResource priorities for SecureRandom
>  - Merge branch 'crac' of https://github.com/openjdk/crac into SecureRandom
>  - Add separate JDKResorce for seeder
>  - Reseed secure random on checkpoint restore

src/java.base/share/classes/sun/security/provider/SecureRandom.java line 256:

> 254:     SeedGenerator.generateSeed(b);
> 255:     seeder.engineSetSeed(b);
> 256:     jdk.internal.crac.Core.getJDKContext().register(new SeederHolder());

Resources are weakly referenced [1], so this SeederHolder object will likely be collected very soon

[1] https://github.com/openjdk/crac/blob/43aba3d502832a5a3d2e9712558f62e0cf93dbbb/src/java.base/share/classes/jdk/crac/package-info.java#L74

src/java.base/share/classes/sun/security/provider/SecureRandom.java line 261:

> 259:     @Override
> 260:     public void beforeCheckpoint(Context context) throws Exception {
> 261:         objLock.lock();

Do we assume a special state of seeder? A comment is really needed.

This lock is not acquired anywhere else, i.e. it does not guard access to the static `seeder` field. The field can be referenced by a SecureRandom object that was created after Priority.SECURE_RANDOM was handled (the new SecureRandom was not notified about checkpoint -- there is no problem unless the new SecureRandom catches some state). If seeder has built-in protection and will block the new SecureRandom, the lock is not necessary. If not, seeder needs to be guarded.
In the latter case, it should probably be a ReadWriteLock as in #9.

-------------

PR: https://git.openjdk.java.net/crac/pull/7

From abakhtin at openjdk.java.net Tue Jan 18 15:42:38 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Tue, 18 Jan 2022 15:42:38 GMT
Subject: [crac] RFR: Reseed secure random on checkpoint restore [v4]
In-Reply-To:
References:
Message-ID:

> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore.
> Two jtreg tests added:
> - verify if no deadlocks introduced by checkpoint/restore
> - verify if SHA1PRNG is reseeded if created with default embedded seed generator

Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision:

  Update object lock during checkpoint/restore

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/7/files
  - new: https://git.openjdk.java.net/crac/pull/7/files/8e054933..3ade9580

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=02-03

  Stats: 43 lines in 2 files changed: 21 ins; 9 del; 13 mod
  Patch: https://git.openjdk.java.net/crac/pull/7.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7

PR: https://git.openjdk.java.net/crac/pull/7

From abakhtin at openjdk.java.net Tue Jan 18 15:42:40 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Tue, 18 Jan 2022 15:42:40 GMT
Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID: <2TPvqUkRYCeL9XLZFCPYjwXKH_LrKa1e-qIuAz1YJsc=.cb6c1c9f-2f66-4511-ae39-9f48dab5a4e2@github.com>

On Thu, 13 Jan 2022 16:32:55 GMT, Anton Kozlov wrote:

>> Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>>  - Set JDKResource priorities for SecureRandom
>>  - Merge branch 'crac' of https://github.com/openjdk/crac into SecureRandom
>>  - Add separate JDKResorce for seeder
>>  - Reseed secure random on checkpoint restore
>
> src/java.base/share/classes/sun/security/provider/SecureRandom.java line 256:
>
>> 254:     SeedGenerator.generateSeed(b);
>> 255:     seeder.engineSetSeed(b);
>> 256:     jdk.internal.crac.Core.getJDKContext().register(new SeederHolder());
>
> Resources are weakly referenced [1], so this SeederHolder object will likely be collected very soon
>
> [1] https://github.com/openjdk/crac/blob/43aba3d502832a5a3d2e9712558f62e0cf93dbbb/src/java.base/share/classes/jdk/crac/package-info.java#L74

Thank you. Fixed in new version

-------------

PR: https://git.openjdk.java.net/crac/pull/7

From abakhtin at openjdk.java.net Tue Jan 18 15:49:58 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Tue, 18 Jan 2022 15:49:58 GMT
Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jan 2022 16:30:08 GMT, Anton Kozlov wrote:

>> Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>>  - Set JDKResource priorities for SecureRandom
>>  - Merge branch 'crac' of https://github.com/openjdk/crac into SecureRandom
>>  - Add separate JDKResorce for seeder
>>  - Reseed secure random on checkpoint restore
>
> src/java.base/share/classes/sun/security/provider/SecureRandom.java line 261:
>
>> 259:     @Override
>> 260:     public void beforeCheckpoint(Context context) throws Exception {
>> 261:         objLock.lock();
>
> Do we assume a special state of seeder? A comment is really needed.
>
> This lock is not acquired anywhere else, i.e. it does not guard access to the static `seeder` field. The field can be referenced by a SecureRandom object that was created after Priority.SECURE_RANDOM was handled (the new SecureRandom was not notified about checkpoint -- there is no problem unless the new SecureRandom catches some state). If seeder has built-in protection and will block the new SecureRandom, the lock is not necessary. If not, seeder needs to be guarded. In the latter case, it should probably be a ReadWriteLock as in #9.

Thank you. I've added verification if PRNG object is already locked during checkpoint/restore. In this case, CheckpointException will be thrown.
In other cases the object will be locked until checkpoint/restore is completed ------------- PR: https://git.openjdk.java.net/crac/pull/7 From asmehra at redhat.com Thu Jan 20 16:04:43 2022 From: asmehra at redhat.com (Ashutosh Mehra) Date: Thu, 20 Jan 2022 11:04:43 -0500 Subject: CheckpointOpenFileException for /var/lib/sss/mc/passwd Message-ID: While trying C/R using CRaC build on my linux system (RHEL 8 based), I encountered this exception: $ sudo ./build/linux-x86_64-server-slowdebug/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:CRaCCheckpointTo=cr -XX:+CRPrintResourcesOnCheckpoint HelloWorld Before checkpoint JVM: FD fd=0 type=character: details1="/dev/pts/4" OK: inherited from process env JVM: FD fd=1 type=character: details1="/dev/pts/4" OK: inherited from process env JVM: FD fd=2 type=character: details1="/dev/pts/4" OK: inherited from process env JVM: FD fd=3 type=regular: details1="/home/asmehra/data/ashu-mehra/crac/build/linux-x86_64-server-slowdebug/images/jdk/lib/modules" OK: inherited from process env JVM: FD fd=4 type=regular: details1="/var/lib/sss/mc/passwd" BAD: opened by application JVM: FD fd=5 type=socket: details1="socket:[2248020]" BAD: opened by application details2="socket:[2248020]" Exception in thread "main" jdk.crac.CheckpointException at java.base/jdk.crac.Core.checkpointRestore1(Core.java:142) at java.base/jdk.crac.Core.checkpointRestore(Core.java:193) at HelloWorld.main(HelloWorld.java:9) Suppressed: jdk.crac.impl.CheckpointOpenFileException: /var/lib/sss/mc/passwd at java.base/jdk.crac.Core.translateJVMExceptions(Core.java:84) at java.base/jdk.crac.Core.checkpointRestore1(Core.java:145) ... 2 more Suppressed: jdk.crac.impl.CheckpointOpenSocketException: socket:[2248020] at java.base/jdk.crac.Core.translateJVMExceptions(Core.java:88) at java.base/jdk.crac.Core.checkpointRestore1(Core.java:145) ... 
2 more Notice this message in the above output: JVM: FD fd=4 type=regular: details1="/var/lib/sss/mc/passwd" BAD: opened by application Here is the HelloWorld application used in the above example: public class HelloWorld { public static void main(String args[]) throws Exception { System.out.println("Before checkpoint"); jdk.crac.Core.checkpointRestore(); System.out.println("After checkpoint"); } } Clearly the application is not trying to open /var/lib/sss/mc/passwd. I tried to figure out what causes the process to open /var/lib/sss/mc/passwd file. Turns out it originates from libc when JVM tries to get user name to create mmap based shared memory using user name as the location: (gdb) bt #0 0x00007ffff72af550 in open64 () from /lib64/libc.so.6 #1 0x00007ffff44b4314 in sss_open_cloexec () from /lib64/libnss_sss.so.2 #2 0x00007ffff44b3fc9 in sss_nss_mc_get_ctx () from /lib64/libnss_sss.so.2 #3 0x00007ffff44b4770 in sss_nss_mc_getpwuid () from /lib64/libnss_sss.so.2 #4 0x00007ffff44b061e in _nss_sss_getpwuid_r () from /lib64/libnss_sss.so.2 #5 0x00007ffff728a41d in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6 #6 0x00007ffff5e960eb in get_user_name (uid=0) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/os/posix/perfMemory_posix.cpp:470 #7 0x00007ffff5e97026 in mmap_create_shared (size=32768) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/os/posix/perfMemory_posix.cpp:972 #8 0x00007ffff5e972e8 in create_shared_memory (size=32768) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/os/posix/perfMemory_posix.cpp:1049 #9 0x00007ffff5e97ac1 in PerfMemory::create_memory_region (size=32768) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/os/posix/perfMemory_posix.cpp:1232 #10 0x00007ffff5e94e7c in PerfMemory::initialize () at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/runtime/perfMemory.cpp:107 #11 0x00007ffff5e94d9f in perfMemory_init () at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/runtime/perfMemory.cpp:62 #12 0x00007ffff5935d03 in 
vm_init_globals () at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/runtime/init.cpp:108 #13 0x00007ffff610bfeb in Threads::create_vm (args=0x7ffff7fd3df0, canTryAgain=0x7ffff7fd3ce3) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/runtime/thread.cpp:2813 #14 0x00007ffff5a37ddf in JNI_CreateJavaVM_inner (vm=0x7ffff7fd3e48, penv=0x7ffff7fd3e50, args=0x7ffff7fd3df0) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/prims/jni.cpp:3621 #15 0x00007ffff5a38138 in JNI_CreateJavaVM (vm=0x7ffff7fd3e48, penv=0x7ffff7fd3e50, args=0x7ffff7fd3df0) at /home/asmehra/data/ashu-mehra/crac/src/hotspot/share/prims/jni.cpp:3709 #16 0x00007ffff79b04ce in InitializeJVM (pvm=0x7ffff7fd3e48, penv=0x7ffff7fd3e50, ifn=0x7ffff7fd3ea0) at /home/asmehra/data/ashu-mehra/crac/src/java.base/share/native/libjli/java.c:1541 #17 0x00007ffff79ad042 in JavaMain (_args=0x7fffffffb040) at /home/asmehra/data/ashu-mehra/crac/src/java.base/share/native/libjli/java.c:415 #18 0x00007ffff79b3e16 in ThreadJavaMain (args=0x7fffffffb040) at /home/asmehra/data/ashu-mehra/crac/src/java.base/unix/native/libjli/java_md.c:651 #19 0x00007ffff779115a in start_thread () from /lib64/libpthread.so.0 #20 0x00007ffff72bef73 in clone () from /lib64/libc.so.6 libnss_sss.so.2 comes into picture because this system is configured to use SSSD using NSS as the provider for password and groups map. This is something outside the control of the JVM. Ideally this fd should have been treated as opened by JVM, not the application. During startup the JVM caches the fds in _vm_inited_fds to be able to segregate the fds opened by the application and the JVM, but this happens before it creates the shared memory. This is the reason why the fd for /var/lib/sss/mc/passwd is not included in the set of _vm_inited_fds and results in the exception at the time of checkpoint. 
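To illustrate the ordering problem described above, here is a minimal Java model. It is only a sketch and not HotSpot code: file descriptors are plain integers, and the hypothetical `FdSnapshot` class plays the role of `_vm_inited_fds`. It assumes fd 4 is the only descriptor opened between the two candidate snapshot points.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the snapshot; the class and method names are
// illustrative and do not exist in the JVM sources.
final class FdSnapshot {
    private final Set<Integer> known;

    FdSnapshot(Set<Integer> openFdsAtSnapshot) {
        this.known = new TreeSet<>(openFdsAtSnapshot);
    }

    // Descriptors open at checkpoint time that the snapshot does not know
    // about are the ones that would be reported as "opened by application".
    Set<Integer> unexpected(Set<Integer> openFdsAtCheckpoint) {
        Set<Integer> result = new TreeSet<>(openFdsAtCheckpoint);
        result.removeAll(known);
        return result;
    }
}

final class FdSnapshotDemo {
    public static void main(String[] args) {
        // Snapshot taken before perfMemory_init() runs: fd 4 is not open yet.
        Set<Integer> early = Set.of(0, 1, 2, 3);
        // Snapshot taken after VM initialization: fd 4 is already open.
        Set<Integer> late = Set.of(0, 1, 2, 3, 4);
        Set<Integer> atCheckpoint = Set.of(0, 1, 2, 3, 4);

        System.out.println(new FdSnapshot(early).unexpected(atCheckpoint)); // [4]
        System.out.println(new FdSnapshot(late).unexpected(atCheckpoint));  // []
    }
}
```

With the early snapshot, fd 4 is flagged even though the application never opened it; with the later snapshot it is not. Whether silently accepting such a descriptor is safe for the checkpoint image is a separate question that this model does not address.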
I think this issue points out an important observation that we should try to do any kind of segregation of resources between JVM-owned and application-owned as late as possible, probably just before we start executing the application code. In this case delaying the initialization of _vm_inited_fds should help. For now, I can workaround this issue by avoiding SSSD by updating /etc/nsswitch.conf: Current order of entries in /etc/nsswitch.conf is: passwd: sss files systemd group: sss files systemd To workaround, change the order to: passwd: files sss systemd group: files sss systemd Regards, Ashutosh Mehra From akozlov at openjdk.java.net Fri Jan 21 15:00:22 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 21 Jan 2022 15:00:22 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:42:38 GMT, Alexey Bakhtin wrote: >> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore. >> Two jtreg tests added: >> - verify if no deadlocks introduced by checkpoint/restore >> - verify if SHA1PRNG is reseeded if created with default embedded seed generator > > Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: > > Update object lock during checkpoint/restore src/java.base/share/classes/jdk/crac/CheckpointException.java line 32: > 30: * Suppresses exceptions thrown during checkpoint notification. > 31: */ > 32: public class CheckpointException extends RuntimeException { This is a severe change and it makes CheckpointException unchecked. 
We want users to provide explicit handling of CheckpointException, please revert. src/java.base/share/classes/sun/security/provider/SecureRandom.java line 169: > 167: objLock.lock(); > 168: try { > 169: // verify if objLock is already acquired in beforeCheckpoint Probably "check if objLock has not been already acquired in beforeCheckpoint" ? src/java.base/share/classes/sun/security/provider/SecureRandom.java line 216: > 214: } > 215: > 216: private void invalidate() { I would like to have "assert objLock.isHeldByCurrentThread()" or another "objLock.lock()" here. This method is called from SeederHolder.beforeCheckpoint. There is no race here only if that executes after seeder's beforeCheckpoint and in the same thread. While it is so, additional safety would not harm. ------------- PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at openjdk.java.net Fri Jan 21 15:00:25 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 21 Jan 2022 15:00:25 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3] In-Reply-To: References: Message-ID: <-476Z5xfrx9oV0XLS2kH-ysRCsBMD3k5V7pgfcweghg=.4ef281d4-1041-4dfb-a0ea-c4a228367ec9@github.com> On Tue, 18 Jan 2022 15:46:41 GMT, Alexey Bakhtin wrote: >> src/java.base/share/classes/sun/security/provider/SecureRandom.java line 261: >> >>> 259: @Override >>> 260: public void beforeCheckpoint(Context context) throws Exception { >>> 261: objLock.lock(); >> >> Do we assume a special state of seeder? A comment is really needed. This lock is not acquired anywhere else, i.e. it does not guard access to the static `seeder` field. The field can be referenced by a SecureRandom object that was created after Priority.SECURE_RANDOM was handled (the new SecureRandom was not notified about checkpoint -- there is no problem unless the new SecureRandom catches some state). If seeder has built-in protection and will block the new SecureRandom, the lock is not necessary. If not, seeder needs to be guarded. 
In the latter case, it should probably be a ReadWriteLock as in #9. > > Thank you. I've added verification if PRNG object is already locked during checkpoint/restore. In this case, CheckpointException will be thrown. In other cases the object will be locked until checkpoint/restore is completed The CheckpointException does not look like a perfect exception here. To avoid controversial requirements, could it be RuntimeException or another unchecked exception, e.g. IllegalStateException? ------------- PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at azul.com Tue Jan 25 10:30:48 2022 From: akozlov at azul.com (Anton Kozlov) Date: Tue, 25 Jan 2022 13:30:48 +0300 Subject: CheckpointOpenFileException for /var/lib/sss/mc/passwd In-Reply-To: References: Message-ID: <3a98f4e9-7375-a4d2-86b0-b3bfd313b6ea@azul.com> On 1/20/22 19:04, Ashutosh Mehra wrote: > While trying C/R using CRaC build on my linux system (RHEL 8 based), I > encountered this exception: Thanks for sharing. We ran into this problem some time ago and used the same workaround -- so it looks legit to me. The message is misleading; it should probably be "opened during process run time". _vm_inited_fds's purpose is to track descriptors provided from the "outside environment", like stdin/out/err -- the JVM has no chance to handle them, they are taken as given. The /var/.../passwd on the other hand appears after the JVM/JDK platform has initialized, so something in its execution made the descriptor appear. Is it correct that the file is generated dynamically and may change over time? If so, allowing the fd will render images of CRIU-based implementations unusable, so the check acts as a safety guard for the JVM itself. This happened for every file descriptor opened by the JVM, like the jcmd channel or the perfdata file -- before the CRaC-specific code was added for them.
In theory /var/.../passwd could be worked around on the JVM side as well -- for example, by avoiding perfdata, avoiding getpwuid entirely, or using a custom /etc/passwd scan. We don't want the JVM to cache the result, because the cache may become outdated after the image is created. The /etc/nsswitch.conf is a direct configuration, so it looks good. The downside is that it is a global config. Thanks, Anton From abakhtin at openjdk.java.net Thu Jan 27 09:00:44 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Thu, 27 Jan 2022 09:00:44 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v5] In-Reply-To: References: Message-ID: > Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore.
> Two jtreg tests added: > - verify if no deadlocks introduced by checkpoint/restore > - verify if SHA1PRNG is reseeded if created with default embedded seed generator Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: Added assert in SecureRandom.invalidate() ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/7/files - new: https://git.openjdk.java.net/crac/pull/7/files/3ade9580..8e7944f9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=04 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=03-04 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/crac/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7 PR: https://git.openjdk.java.net/crac/pull/7 From abakhtin at openjdk.java.net Thu Jan 27 09:00:46 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Thu, 27 Jan 2022 09:00:46 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 09:21:19 GMT, Anton Kozlov wrote: >> Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: >> >> Update object lock during checkpoint/restore > > src/java.base/share/classes/jdk/crac/CheckpointException.java line 32: > >> 30: * Suppresses exceptions thrown during checkpoint notification. >> 31: */ >> 32: public class CheckpointException extends RuntimeException { > > This is a severe change and it makes CheckpointException unchecked. We want users to provide explicit handling of CheckpointException, please revert. This is an internal jdk.crac Exception that is not visible to users. It is used for JVM resources only and handled explicitly in jdk.crac.Core. 
This exception will be thrown from existing JDK classes, so it was changed to unchecked to make it possible to throw exception without changing signature of the existing public API (e.g. https://github.com/openjdk/crac/blob/3ade9580452ab2db193e9c2c1b458a2ff17a8597/src/java.base/share/classes/sun/security/provider/SecureRandom.java#L296 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L556 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L490) Also, it was approved already in the https://github.com/openjdk/crac/pull/9 > src/java.base/share/classes/sun/security/provider/SecureRandom.java line 169: > >> 167: objLock.lock(); >> 168: try { >> 169: // verify if objLock is already acquired in beforeCheckpoint > > Probably "check if objLock has not been already acquired in beforeCheckpoint" ? Thank you, changed. > src/java.base/share/classes/sun/security/provider/SecureRandom.java line 216: > >> 214: } >> 215: >> 216: private void invalidate() { > > I would like to have "assert objLock.isHeldByCurrentThread()" or another "objLock.lock()" here. This method is called from SeederHolder.beforeCheckpoint. There is no race here only if that executes after seeder's beforeCheckpoint and in the same thread. While it is so, additional safety would not harm. Thank you. 
Added ------------- PR: https://git.openjdk.java.net/crac/pull/7 From abakhtin at openjdk.java.net Thu Jan 27 09:00:46 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Thu, 27 Jan 2022 09:00:46 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v3] In-Reply-To: <-476Z5xfrx9oV0XLS2kH-ysRCsBMD3k5V7pgfcweghg=.4ef281d4-1041-4dfb-a0ea-c4a228367ec9@github.com> References: <-476Z5xfrx9oV0XLS2kH-ysRCsBMD3k5V7pgfcweghg=.4ef281d4-1041-4dfb-a0ea-c4a228367ec9@github.com> Message-ID: On Fri, 21 Jan 2022 14:57:07 GMT, Anton Kozlov wrote: >> Thank you. I've added verification if PRNG object is already locked during checkpoint/restore. In this case, CheckpointException will be thrown. In other cases the object will be locked until checkpoint/restore is completed > > The CheckpointException does not look like a perfect exception here. To avoid controversial requirements, could it be RuntimeException or another unchecked exception, like e.g. IllegalStateException? CheckpointException is CheckpointException now. We can not use IllegalStateException because we should catch this exception and correctly roll back the checkpoint. Also, see my comments in CheckpointException https://github.com/openjdk/crac/pull/7#discussion_r793379918 ------------- PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at openjdk.java.net Fri Jan 28 12:07:18 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 28 Jan 2022 12:07:18 GMT Subject: [crac] RFR: Run native CRaC checks after failed beforeCheckpoint Message-ID: After checkpoint failed at the Java level, it's worth to make a "dry-run" checkpoint at the native state: check file descriptors, process -XX:+CRHeapDumpOnCheckpointException, etc. 
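As a rough illustration of the control flow described above, the following Java sketch runs the native checks in dry-run mode when the Java-level notification has already failed, and reports everything together. All names here (`DryRunCheckpoint`, `JavaPhase`, `NativeChecks`, `CheckpointFailed`) are hypothetical stand-ins, not the classes touched by the patch.

```java
import java.util.List;

final class DryRunCheckpoint {
    // Hypothetical stand-in for the Java-level beforeCheckpoint notification.
    interface JavaPhase { void beforeCheckpoint() throws Exception; }

    // Hypothetical stand-in for the native checks (file descriptors, etc.);
    // returns a description of each problem found.
    interface NativeChecks { List<String> run(boolean dryRun); }

    static final class CheckpointFailed extends Exception {
        CheckpointFailed(String message) { super(message); }
    }

    static void checkpoint(JavaPhase javaPhase, NativeChecks nativeChecks)
            throws CheckpointFailed {
        Exception javaFailure = null;
        try {
            javaPhase.beforeCheckpoint();
        } catch (Exception e) {
            javaFailure = e;
        }
        // If the Java level already failed, the native side only checks
        // and reports; no image would be written.
        boolean dryRun = javaFailure != null;
        List<String> problems = nativeChecks.run(dryRun);
        if (javaFailure == null && problems.isEmpty()) {
            return; // a real checkpoint would proceed here
        }
        CheckpointFailed failure = new CheckpointFailed("checkpoint failed");
        if (javaFailure != null) {
            failure.addSuppressed(javaFailure);
        }
        for (String problem : problems) {
            failure.addSuppressed(new IllegalStateException(problem));
        }
        throw failure;
    }
}
```

The point of the sketch is only that a Java-level failure does not hide native-level findings: both end up as suppressed exceptions on the single failure reported to the caller.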
The patch also removes unused parameter of `checkpoint_restore(FdsInfo* fds)` ------------- Commit messages: - Run native checks after failed beforeCheckpoint Changes: https://git.openjdk.java.net/crac/pull/11/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=11&range=00 Stats: 134 lines in 7 files changed: 90 ins; 11 del; 33 mod Patch: https://git.openjdk.java.net/crac/pull/11.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/11/head:pull/11 PR: https://git.openjdk.java.net/crac/pull/11 From abakhtin at openjdk.java.net Fri Jan 28 12:11:25 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 12:11:25 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v6] In-Reply-To: References: Message-ID: > Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore. 
> Two jtreg tests added: > - verify if no deadlocks introduced by checkpoint/restore > - verify if SHA1PRNG is reseeded if created with default embedded seed generator Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: Revert CheckpointException changes ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/7/files - new: https://git.openjdk.java.net/crac/pull/7/files/8e7944f9..eb6111ee Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=05 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=04-05 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/crac/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7 PR: https://git.openjdk.java.net/crac/pull/7 From abakhtin at openjdk.java.net Fri Jan 28 12:54:27 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 12:54:27 GMT Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v4] In-Reply-To: References: Message-ID: > NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore > Also, the internal buffer should be cleared at the checkpoint. 
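The two steps named in the description above (clear the internal buffer at checkpoint, re-seed on restore) can be sketched as follows. This is a simplified model, not the NativePRNG code: `CheckpointAware` is a hypothetical stand-in for the CRaC resource callbacks, and `BufferedGenerator` is an invented class that "re-seeds" by replacing its `java.security.SecureRandom` instance with a freshly constructed one.

```java
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical stand-in for the checkpoint/restore callbacks.
interface CheckpointAware {
    void beforeCheckpoint();
    void afterRestore();
}

// Hypothetical buffered generator; names and sizes are illustrative only.
final class BufferedGenerator implements CheckpointAware {
    private final byte[] buffer = new byte[32];
    private int buffered;                 // number of unread bytes in buffer
    private SecureRandom mixer = new SecureRandom();
    private int reseedCount;

    synchronized byte nextByte() {
        if (buffered == 0) {
            mixer.nextBytes(buffer);
            buffered = buffer.length;
        }
        return buffer[--buffered];
    }

    @Override
    public synchronized void beforeCheckpoint() {
        // Do not let pre-generated random bytes end up in the image.
        Arrays.fill(buffer, (byte) 0);
        buffered = 0;
    }

    @Override
    public synchronized void afterRestore() {
        // A new instance seeds itself again, so output after restore does
        // not depend on the state captured in the image.
        mixer = new SecureRandom();
        reseedCount++;
    }

    synchronized int bufferedBytes() { return buffered; }

    synchronized int reseedCount() { return reseedCount; }

    synchronized boolean bufferIsZeroed() {
        for (byte b : buffer) {
            if (b != 0) return false;
        }
        return true;
    }
}
```

Plain `synchronized` methods are used here only to keep the sketch short; the locking strategy in the actual patch is discussed separately in the review.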
Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: Revert Checkpoint/RestoreException changes ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/9/files - new: https://git.openjdk.java.net/crac/pull/9/files/dd461601..637460a7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=9&range=03 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=9&range=02-03 Stats: 5 lines in 3 files changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.java.net/crac/pull/9.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/9/head:pull/9 PR: https://git.openjdk.java.net/crac/pull/9 From abakhtin at openjdk.java.net Fri Jan 28 12:55:12 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 12:55:12 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v7] In-Reply-To: References: Message-ID: > Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore. 
> Two jtreg tests added: > - verify if no deadlocks introduced by checkpoint/restore > - verify if SHA1PRNG is reseeded if created with default embedded seed generator Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: Exclude CheckpointException from import ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/7/files - new: https://git.openjdk.java.net/crac/pull/7/files/eb6111ee..f33c6ab0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=06 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/crac/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7 PR: https://git.openjdk.java.net/crac/pull/7 From abakhtin at openjdk.java.net Fri Jan 28 12:59:35 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 12:59:35 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 08:51:34 GMT, Alexey Bakhtin wrote: >> src/java.base/share/classes/jdk/crac/CheckpointException.java line 32: >> >>> 30: * Suppresses exceptions thrown during checkpoint notification. >>> 31: */ >>> 32: public class CheckpointException extends RuntimeException { >> >> This is a severe change and it makes CheckpointException unchecked. We want users to provide explicit handling of CheckpointException, please revert. > > This is an internal jdk.crac Exception that is not visible to users. It is used for JVM resources only and handled explicitly in jdk.crac.Core. This exception will be thrown from existing JDK classes, so it was changed to unchecked to make it possible to throw exception without changing signature of the existing public API (e.g. 
https://github.com/openjdk/crac/blob/3ade9580452ab2db193e9c2c1b458a2ff17a8597/src/java.base/share/classes/sun/security/provider/SecureRandom.java#L296 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L556 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L490) > Also, it was approved already in the https://github.com/openjdk/crac/pull/9 After additional discussions, it was decided to revert these changes and use checked CheckpointException. JDKResources can throw other RuntimeExceptions, such as IllegalStateException, if required to indicate a failure during checkpoint/restore. All exceptions are processed in AbstractContextImpl.beforeCheckpoint(), which restores the resources and throws CheckpointException to the user with the real cause. ------------- PR: https://git.openjdk.java.net/crac/pull/7 From abakhtin at openjdk.java.net Fri Jan 28 13:00:36 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 13:00:36 GMT Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 12:54:27 GMT, Alexey Bakhtin wrote: >> NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore >> Also, the internal buffer should be cleared at the checkpoint. > > Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: > > Revert Checkpoint/RestoreException changes After additional discussions, it was decided to revert these changes and use checked CheckpointException. JDKResources can throw other RuntimeExceptions, such as IllegalStateException, if required to indicate a failure during checkpoint/restore.
All exceptions are processed in AbstractContextImpl.beforeCheckpoint(), which restores the resources and throws CheckpointException to the user with the real cause. ------------- PR: https://git.openjdk.java.net/crac/pull/9 From akozlov at openjdk.java.net Fri Jan 28 13:45:45 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 28 Jan 2022 13:45:45 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v7] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 12:55:12 GMT, Alexey Bakhtin wrote: >> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore. >> Two jtreg tests added: >> - verify if no deadlocks introduced by checkpoint/restore >> - verify if SHA1PRNG is reseeded if created with default embedded seed generator > > Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: > > Exclude CheckpointException from import Looks good to me! ------------- Marked as reviewed by akozlov (Lead). PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at openjdk.java.net Fri Jan 28 13:45:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 28 Jan 2022 13:45:47 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 12:56:39 GMT, Alexey Bakhtin wrote: >> This is an internal jdk.crac Exception that is not visible to users. It is used for JVM resources only and handled explicitly in jdk.crac.Core. This exception will be thrown from existing JDK classes, so it was changed to unchecked to make it possible to throw exception without changing signature of the existing public API (e.g.
https://github.com/openjdk/crac/blob/3ade9580452ab2db193e9c2c1b458a2ff17a8597/src/java.base/share/classes/sun/security/provider/SecureRandom.java#L296 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L556 or https://github.com/openjdk/crac/blob/dd46160142a3ec490a400f56738d0251d128494a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java#L490) >> Also, it was approved already in the https://github.com/openjdk/crac/pull/9 > > After additional discussions, it was decided to revert these changes and use checked CheckpointException. > JDKResources can throw other RuntimeExceptions, such as IllegalStateException, if required to indicate a failure during checkpoint/restore. All exceptions are processed in AbstractContextImpl.beforeCheckpoint(), which restores the resources and throws CheckpointException to the user with the real cause. Thanks for fixing this and keeping #9 consistent ------------- PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at openjdk.java.net Fri Jan 28 13:46:36 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 28 Jan 2022 13:46:36 GMT Subject: [crac] RFR: Reseed NativePRNG on checkpoint restore [v4] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 12:54:27 GMT, Alexey Bakhtin wrote: >> NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore >> Also, the internal buffer should be cleared at the checkpoint. > > Alexey Bakhtin has updated the pull request incrementally with one additional commit since the last revision: > > Revert Checkpoint/RestoreException changes Marked as reviewed by akozlov (Lead).
------------- PR: https://git.openjdk.java.net/crac/pull/9 From akozlov at openjdk.java.net Fri Jan 28 16:17:57 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 28 Jan 2022 16:17:57 GMT Subject: [crac] RFR: Get tracing properties lazily Message-ID: A minor change to make tracing flags lazily initialized. Without this, reading j.l.System's properties during jdk.crac.Core initialization could lead to NPE if happens too early, i.e. before j.l.System's properties are initialized. ------------- Commit messages: - Get tracing properties lazily Changes: https://git.openjdk.java.net/crac/pull/12/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=12&range=00 Stats: 86 lines in 3 files changed: 60 ins; 16 del; 10 mod Patch: https://git.openjdk.java.net/crac/pull/12.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/12/head:pull/12 PR: https://git.openjdk.java.net/crac/pull/12 From abakhtin at openjdk.java.net Fri Jan 28 16:41:45 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 16:41:45 GMT Subject: [crac] Integrated: Reseed NativePRNG on checkpoint restore In-Reply-To: References: Message-ID: On Thu, 23 Dec 2021 11:30:13 GMT, Alexey Bakhtin wrote: > NativePRNG should be re-seeded during checkpoint/restore because it uses SHA1PRNG secure random for additional seed. It is seeded at initialization, so it is not re-seeded automatically during checkpoint/restore > Also, the internal buffer should be cleared at the checkpoint. This pull request has now been integrated. 
Changeset: 11e1037b Author: Alexey Bakhtin Committer: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/11e1037b099710e0819c0dc111ca56b8e7872a6e Stats: 255 lines in 4 files changed: 252 ins; 0 del; 3 mod Reseed NativePRNG on checkpoint restore Reviewed-by: akozlov ------------- PR: https://git.openjdk.java.net/crac/pull/9 From heidinga at openjdk.java.net Fri Jan 28 17:09:33 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 28 Jan 2022 17:09:33 GMT Subject: [crac] RFR: Get tracing properties lazily In-Reply-To: References: Message-ID: <6X97aort42aCwHCu8vO4SPNGDtjwmcSI3Q-33ebGRZ4=.bf4407bf-605e-4b8c-87b3-f046dbd9ffab@github.com> On Fri, 28 Jan 2022 16:11:13 GMT, Anton Kozlov wrote: > A minor change to make tracing flags lazily initialized. > > Without this, reading j.l.System's properties during jdk.crac.Core initialization could lead to NPE if happens too early, i.e. before j.l.System's properties are initialized. looks reasonable to me ------------- Marked as reviewed by heidinga (Committer). PR: https://git.openjdk.java.net/crac/pull/12 From abakhtin at openjdk.java.net Fri Jan 28 17:16:11 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Fri, 28 Jan 2022 17:16:11 GMT Subject: [crac] RFR: Reseed secure random on checkpoint restore [v8] In-Reply-To: References: Message-ID: > Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore. > Two jtreg tests added: > - verify if no deadlocks introduced by checkpoint/restore > - verify if SHA1PRNG is reseeded if created with default embedded seed generator Alexey Bakhtin has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: - Merge branch 'crac' into SecureRandom - Exclude CheckpointException from import - Revert CheckpointException changes - Added assert in SecureRandom.invalidate() - Update object lock during checkpoint/restore - Set JDKResource priorities for SecureRandom - Merge branch 'crac' of https://github.com/openjdk/crac into SecureRandom - Add separate JDKResorce for seeder - Reseed secure random on checkpoint restore ------------- Changes: https://git.openjdk.java.net/crac/pull/7/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=7&range=07 Stats: 258 lines in 4 files changed: 209 ins; 5 del; 44 mod Patch: https://git.openjdk.java.net/crac/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/7/head:pull/7 PR: https://git.openjdk.java.net/crac/pull/7 From akozlov at openjdk.java.net Mon Jan 31 09:29:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 31 Jan 2022 09:29:47 GMT Subject: [crac] RFR: Get tracing properties lazily In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 16:11:13 GMT, Anton Kozlov wrote: > A minor change to make tracing flags lazily initialized. > > Without this, reading j.l.System's properties during jdk.crac.Core initialization could lead to NPE if happens too early, i.e. before j.l.System's properties are initialized. Thanks! ------------- PR: https://git.openjdk.java.net/crac/pull/12 From akozlov at openjdk.java.net Mon Jan 31 09:29:48 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 31 Jan 2022 09:29:48 GMT Subject: [crac] Integrated: Get tracing properties lazily In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 16:11:13 GMT, Anton Kozlov wrote: > A minor change to make tracing flags lazily initialized. > > Without this, reading j.l.System's properties during jdk.crac.Core initialization could lead to NPE if happens too early, i.e. before j.l.System's properties are initialized. This pull request has now been integrated. 
Changeset: 51e564a5
Author:    Anton Kozlov
URL:       https://git.openjdk.java.net/crac/commit/51e564a58cbdbf593f61fbbc8eb4adbb5cd19737
Stats:     86 lines in 3 files changed: 60 ins; 16 del; 10 mod

Get tracing properties lazily

Reviewed-by: heidinga

-------------

PR: https://git.openjdk.java.net/crac/pull/12

From abakhtin at openjdk.java.net Mon Jan 31 09:31:49 2022
From: abakhtin at openjdk.java.net (Alexey Bakhtin)
Date: Mon, 31 Jan 2022 09:31:49 GMT
Subject: [crac] Integrated: Reseed secure random on checkpoint restore
In-Reply-To:
References:
Message-ID: <27rn50_kEHngbMcoHxmLtLJbxQLfsGkYAYjOg0pj0Cs=.1491a628-2475-48ed-816d-c5a726df9886@github.com>

On Fri, 17 Dec 2021 13:38:18 GMT, Alexey Bakhtin wrote:

> Proposed changes in the SecureRandom implementation allow invalidating and reseeding SHA1PRNG secure random during checkpoint/restore. SHA1PRNG can be invalidated and reseeded in case of being created with a default embedded seed generator. Also, SHA1PRNG is used as an additional seed generator to the SUN NativePRNG implementation, so it is desirable to have reseeded SHA1PRNG after restore.
> Two jtreg tests added:
> - verify if no deadlocks introduced by checkpoint/restore
> - verify if SHA1PRNG is reseeded if created with default embedded seed generator

This pull request has now been integrated.

Changeset: b990928d
Author:    Alexey Bakhtin
Committer: Anton Kozlov
URL:       https://git.openjdk.java.net/crac/commit/b990928db22278f580a78aef93966ca03024e728
Stats:     258 lines in 4 files changed: 209 ins; 5 del; 44 mod

Reseed secure random on checkpoint restore

Reviewed-by: akozlov

-------------

PR: https://git.openjdk.java.net/crac/pull/7

From akozlov at openjdk.java.net Mon Jan 31 11:51:52 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Mon, 31 Jan 2022 11:51:52 GMT
Subject: [crac] RFR: Ensure empty Reference Handler and Cleaners queues
Message-ID:

At the time of checkpoint, a set of References may need handling.
This change ensures no References pending in ReferenceHandler and in Cleaners.

System.gc() is a best effort attempt to make GC to look for References. Default VM flags (-DisableExplicitGC, -ExplicitGCInvokesConcurrent) should not block the call, but additional investigation is needed to make sure GC found all references.

-------------

Commit messages:
 - Ensure empty Reference Handler and Cleaners queues

Changes: https://git.openjdk.java.net/crac/pull/13/files
 Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=13&range=00
  Stats: 126 lines in 5 files changed: 124 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/crac/pull/13.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/13/head:pull/13

PR: https://git.openjdk.java.net/crac/pull/13

From akozlov at openjdk.java.net Mon Jan 31 12:51:22 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Mon, 31 Jan 2022 12:51:22 GMT
Subject: [crac] RFR: Ensure empty Reference Handler and Cleaners queues [v2]
In-Reply-To:
References:
Message-ID:

> At the time of checkpoint, a set of References may need handling. This change ensures no References pending in ReferenceHandler and in Cleaners.
>
> System.gc() is a best effort attempt to make GC to look for References. Default VM flags (-DisableExplicitGC, -ExplicitGCInvokesConcurrent) should not block the call, but additional investigation is needed to make sure GC found all references.

Anton Kozlov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains one new commit since the last revision:

  Ensure empty Reference Handler and Cleaners queues

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/13/files
  - new: https://git.openjdk.java.net/crac/pull/13/files/79096270..d26b2335

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=13&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=13&range=00-01

  Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/crac/pull/13.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/13/head:pull/13

PR: https://git.openjdk.java.net/crac/pull/13

From heidinga at openjdk.java.net Mon Jan 31 20:36:43 2022
From: heidinga at openjdk.java.net (Dan Heidinga)
Date: Mon, 31 Jan 2022 20:36:43 GMT
Subject: [crac] RFR: Ensure empty Reference Handler and Cleaners queues [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 31 Jan 2022 12:51:22 GMT, Anton Kozlov wrote:

>> At the time of checkpoint, a set of References may need handling. This change ensures no References pending in ReferenceHandler and in Cleaners.
>>
>> System.gc() is a best effort attempt to make GC to look for References. Default VM flags (-DisableExplicitGC, -ExplicitGCInvokesConcurrent) should not block the call, but additional investigation is needed to make sure GC found all references.
>
> Anton Kozlov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
>   Ensure empty Reference Handler and Cleaners queues

This seems like a reasonable approach to try out. I'm not sold on the extra `notifyAll` in ReferenceQueue::remove but don't have a better suggestion.
src/java.base/share/classes/java/lang/ref/Reference.java line 344:

> 342:             @Override
> 343:             public void beforeCheckpoint(Context context) throws Exception {
> 344:                 System.gc();

Is a single `System.gc()` sufficient for the Hotspot collectors? With OpenJ9, we used to treat back to back System.gc() calls specially as requiring extra effort. Does Hotspot do something similar?

-------------

Marked as reviewed by heidinga (Committer).

PR: https://git.openjdk.java.net/crac/pull/13
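
For readers following the `System.gc()` question above, here is a minimal standalone sketch of the general idea: request a best-effort GC a bounded number of times, then drain whatever references have reached a queue. It is not the code from the pull request. The class `DrainBeforeCheckpoint`, the helper `drain`, and the `gcAttempts` parameter are hypothetical names, and the static `beforeCheckpoint` method here only mimics the shape of a checkpoint hook without using the jdk.crac API. The demo enqueues a reference by hand because `System.gc()` gives no guarantee that a collector will discover and enqueue anything.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class DrainBeforeCheckpoint {

    // Removes every reference currently sitting in the queue; returns how many were removed.
    public static int drain(ReferenceQueue<Object> queue) {
        int drained = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            ref.clear(); // a real handler would run cleanup for the referent here
            drained++;
        }
        return drained;
    }

    // Hypothetical stand-in for a checkpoint hook (assumption: not the jdk.crac interface).
    // Each round asks for a GC, which is only a hint, and then empties the queue.
    public static int beforeCheckpoint(ReferenceQueue<Object> queue, int gcAttempts) {
        int total = 0;
        for (int i = 0; i < gcAttempts; i++) {
            System.gc(); // best effort; may be a no-op, e.g. with -XX:+DisableExplicitGC
            total += drain(queue);
        }
        return total;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent, queue);

        // Enqueue manually so the demo does not depend on collector behaviour.
        ref.enqueue();

        System.out.println("drained: " + beforeCheckpoint(queue, 2));
    }
}
```

Repeating the GC request is only one possible answer to the "is a single call sufficient" question; whether HotSpot needs it is exactly what the thread leaves open.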