From duke at openjdk.java.net Tue May 3 16:35:49 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Tue, 3 May 2022 16:35:49 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To:
References:
Message-ID:

On Thu, 28 Apr 2022 12:25:20 GMT, Anton Kozlov wrote:

>> VM changes: To identify properties that can be modified on restore,
>> added a new bool field SystemProperty::_modifiable_on_restore.
>> All the jdk related properties are marked unmodifiable. Rest of the
>> properties are considered modifiable.
>> When the JVM is launched with -XX:CRaCRestoreFrom option, then the
>> properties prefixed with "-D" are maintained in a separate list in
>> Arguments::_system_properties_for_restore. This list is passed to the JVM
>> being restored by writing to a shared memory object.
>> When the JVM is restored, it reads the new properties from shared memory
>> object and updates its existing list of properties maintained in
>> Arguments::_system_properties.
>>
>> JDK changes: System::props needs to be updated on restore to account for
>> new system properties. For this purpose j.l.System registers a new
>> JDKResource which queries new properties from the VM in afterRestore()
>> notification and updates System::props. The JDKResource registered by
>> j.l.System is given highest priority so it is the first resource to get
>> afterRestore() notification.
>>
>> Signed-off-by: Ashutosh Mehra
>
>> VM changes: To identify properties that can be modified on restore, added a new bool field SystemProperty::_modifiable_on_restore. All the jdk-related properties are marked unmodifiable. The rest of the properties are considered modifiable.
>
> On the java level, all properties can be changed with `System.setProperty`. Our property model is much closer to what a program can do, e.g. after restore:
>
>     System.setProperty("sun.boot.library.path", "test");
>
> There is no mechanism (except the deprecated SecurityManager) that prevents this from working, although no one promises the property value will be considered if set in this way. So I propose not to limit ourselves artificially. Who knows, maybe some JDK system property is OK to fix on restore, e.g. if the checkpoint is done very early and a class that reads the property is not yet initialized.
>
>> When the JVM is launched with -XX:CRaCRestoreFrom option, then the properties prefixed with "-D" are maintained in a separate list in Arguments::_system_properties_for_restore.
>
> Have you considered how to make this change smaller? E.g. adding an Explicit marker on the property? It's unfortunate that implicit properties are already added to the list before we get to the restore, and explicit properties cannot be distinguished from them. I'm fine with the change in Arguments::parse_each_vm_init_arg, but refactoring or adding a set of methods to manage the for_restore set is a bit overkill. Initializing a full VM just to be replaced by another restored one (as we do for now) does not make a lot of sense and needs fixing at some point. So I'd like to avoid too many changes in the VM code that we'll have to revert.
>
> Another way is to pass all properties from the restoring VM to the one being restored, for simplicity and flexibility, and to filter properties in the VM being restored -- that will be the single point of responsibility. But here there is a drawback: VMs may have different versions, so implicit properties in the VM being restored will be implicitly overwritten.
>
>> This list is passed to the JVM being restored by writing to a shared memory object. When the JVM is restored, it reads the new properties from shared memory object and updates its existing list of properties maintained in Arguments::_system_properties.
> > I could not find users of the _system_properties after VM was initialized and JDK pulled the set of properties from the VM. I think it's not necessary to do, just as a call to j.l.System.setProperty is not reflected in the VM. > >> JDK changes: System::props needs to be updated on restore to account for new system properties. For this purpose j.l.System registers a new JDKResource which queries new properties from the VM in afterRestore() notification and updates System::props. The JDKResource registered by j.l.System is given highest priority so it is the first resource to get afterRestore() notification. > > AFAIU, for the regular bootstrap procedure, "pulling" of properties is required as the VM does not know when j.l.System will be initialized, so instead, they are pulled at the j.l.System's initialization. But I think we can assume j.l.System is always initialized at the checkpoint, so we can avoid the resource, and make jdk.crac.Core just to set the new properties with the j.l.System.setProperty. @AntonKozlov thanks for reviewing the patch. > So I propose not to limit ourselves artificially. Who knows, maybe some JDK system property is OK to fix it on restore, e.g. if the checkpoint is done very early and a class that reads the property is not yet initialized. My main intention of this patch is to allow users to specify different properties through command line when restoring the process. I am assuming in this scenario the user would want the new properties to have same effect as when starting a new JVM process. In this respect this mechanism is different from calling `System.setProperty()` for which, as you mentioned in your comment, no promise is made that they would be considered. Now, for some jdk internal properties, we may be able to consider the new values, but I believe it would not be possible for every jdk internal property. This is the reason why I wanted to separate out the properties that can be modified from those which cannot be. 
I intentionally chose to mark all jdk-related properties as unmodifiable because I am focusing mainly on application specific properties. However, this should not prevent us from marking a jdk-related property as modifiable in future, but it would have to be done on case-by-case basis as the need arises. For example, if we want to allow the user to specify different directory for native libraries on restore, we can mark `java.library.path` as modifiable and handle its users at the Java level using the restore hooks. > Have you considered how to make this change smaller? E.g adding an Explicit marker on the property? That's unfortunate that implicit properties are already added to the list before we get to the restore, and explicit properties cannot be distinguished from them. I'm fine with the change in Arguments::parse_each_vm_init_arg, but refactorings or adding a set of methods to manage the for_restore set is a bit overkill. Initializing a full VM just to be replaced by another restored one (as we do for now) does not make a lot of sense and needs fixing at some moment. So I'd like to avoid too many changes in the VM code that we'll have to revert. > Another way is to pass all properties from restoring VM to the one being restored, for simplicity and flexibility. And to filter properties in the being restored VM -- that will be the single point of responsibility. But here there is a drawback: VMs may have different versions, so implicit properties in the being restored VM will be implicitly overwritten. I agree this is an overkill. The reason why I didn't add an Explicit like marker is because it adds another bool field to each SystemProperty object, which has no use in the regular JVM. It will only be used in the initiating JVM, although I agree it would be a temporary change, provided we find another way to initiate the restore operation instead of using the full JVM. 
I also feel adding explicit methods to handle properties for restore is much cleaner, as it separates the code that handles properties for the restore operation from the regular processing. It makes the code easier to follow, IMHO.
I don't have a strong preference for either of these mechanisms, considering these are temporary changes that are expected to be removed once we find a better mechanism to initiate the restore operation. So if you feel adding an Explicit marker is better, I can introduce that change.

I also thought about passing all the properties and filtering them in the restored VM, but dropped the idea for the reason you mentioned.

> I could not find users of the _system_properties after VM was initialized and JDK pulled the set of properties from the VM. I think it's not necessary to do, just as a call to j.l.System.setProperty is not reflected in the VM.

As I stated in my first comment, I believe this is exactly the difference between `System.setProperty()` and this mechanism of providing new properties on restore. We would want the VM view of the system properties to also be updated.
Another reason for updating `_system_properties` is the JVMTI API `GetSystemProperties()`, which relies on `_system_properties`. After restore, `GetSystemProperties()` should provide the system properties that take into account the new properties specified on the command line.

> But I think we can assume j.l.System is always initialized at the checkpoint, so we can avoid the resource, and make jdk.crac.Core just to set the new properties with the j.l.System.setProperty.

Fair point. I will make the change to update properties using `System.setProperty()` directly after restore from `jdk.crac.Core`.
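
To make the idea concrete, here is a minimal Java sketch of applying new `-D` options through `System.setProperty()` after restore. It is only an illustration under assumptions: the `RestoreProperties` class and its `parse`/`apply` methods are hypothetical names, not code from this patch, and the sketch does not show how `jdk.crac.Core` actually receives the options from the restoring process.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of the CRaC patch: it only illustrates
// "collect the -D options, then apply them with System.setProperty".
class RestoreProperties {

    // Collects "-Dkey=value" options from a restore command line.
    // "-Dkey" without '=' maps to an empty value; everything else is skipped.
    static Map<String, String> parse(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-D")) {
                continue; // not a property option (e.g. -XX: flags, app args)
            }
            String body = arg.substring(2);
            int eq = body.indexOf('=');
            String key = eq < 0 ? body : body.substring(0, eq);
            String value = eq < 0 ? "" : body.substring(eq + 1);
            if (!key.isEmpty()) { // System.setProperty rejects empty keys
                props.put(key, value);
            }
        }
        return props;
    }

    // Updates only the Java-level view (System properties); the VM-side
    // list is deliberately left untouched in this sketch.
    static void apply(Map<String, String> props) {
        props.forEach(System::setProperty);
    }

    public static void main(String[] args) {
        apply(parse(args));
        System.out.println(System.getProperty("app.mode"));
    }
}
```

As with any call to `System.setProperty()`, code that cached a property value before the checkpoint will not notice the update.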
-------------

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Thu May 5 03:01:46 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Thu, 5 May 2022 03:01:46 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v2]
In-Reply-To:
References:
Message-ID:

Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision:

  Some refactoring to update system properties from jdk.crac.Core

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/be595dab..927bf3b2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=00-01

  Stats: 378 lines in 8 files changed: 162 ins; 180 del; 36 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Thu May 5 03:03:35 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Thu, 5 May 2022 03:03:35 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v3]
In-Reply-To:
References:
Message-ID: <_IOodr-rIXuvg-rNG24sz4HRp7ACrEDiLz6JGTwtG6s=.a2623e15-dd72-4f94-b6f2-fd2e4f100330@github.com>

Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision:

  Remove unused JVM_GetModifiableProperties

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/927bf3b2..04c4f970

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=01-02

  Stats: 41 lines in 3 files changed: 0 ins; 41 del; 0 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From akozlov at openjdk.java.net Thu May 5 09:03:50 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 5 May 2022 09:03:50 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To:
References:
Message-ID:

On Tue, 3 May 2022 16:32:03 GMT, Ashutosh Mehra wrote:

> My main intention of this patch is to allow users to specify different properties through command line when restoring the process.

Completely agree. I think this is the most useful part of the properties update feature. That's unfortunate that JDK properties are such problematic, that's why I'd like to avoid doing something completely new with an unclear path to validate, and instead do something familiar for users.

> I am assuming in this scenario the user would want the new properties to have same effect as when starting a new JVM process.
> In this respect this mechanism is different from calling `System.setProperty()`

> We would want the VM view of the system properties to also be updated. Another reason for updating `_system_properties` is JVMTI api `GetSystemProperties()` which relies on `_system_properties`. After restore `GetSystemProperties()` should provide the system properties that take into account new properties specified on command line.

Could you elaborate a bit more on setting the properties as in a new JVM process? E.g. why this cannot be avoided, or when this can be useful compared to other approaches?

I like the JVMTI example because it is another model we can follow. I think we need to choose one model (JVMTI and j.l.S.properties are identified for now), follow it, and try to understand how much it satisfies users' needs. Or at least ours :)

Updating _system_properties scares me. They are not changed with System.getProperties and a JVMTI agent can only change them during OnLoad phase [1]. Is there a way to change them once the java code has started to execute?

[1] https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#SetSystemProperty

> Now, for some jdk internal properties, we may be able to consider the new values, but I believe it would not be possible for every jdk internal property. This is the reason why I wanted to separate out the properties that can be modified from those which cannot be.

I would like to distinguish between setting a value for a property and the way, or the point, at which the value is processed by some entity unrelated to how the property is set. The handling is very much an implementation detail.

System.setProperty can change any system property. JVMTI SetSystemProperty can change any property except those with SystemProperty::_writeable == false, and there are not many of them. For example, "java.vm.specification.name" cannot be set in _system_properties, but can be set by System.setProperty.

Depending on the property, it may be processed at any time.
A common pattern is to cache the value in the initialization of a class that will use that later. So updating a JDK property in System.properties may make sense early. And a set of VM properties follow a pattern to get the value early in JDK bootstrap before it can be changed by System.setProperty. For example, "sun.nio.MaxDirectMemorySize". So even if the property is updated in _system_properties, the value is likely to be cached in JDK. But the visible update in _system_properties may trick JVMTI code about the actual value used by the platform (JDK in this case). > I don't have strong preference about either of these mechanisms, considering these are temporary changes and expected to be removed once we find a better mechanism to initiate restore operation. So if you feel adding Explicit marker is better I can introduce that change. I'm not sure. Having that there is another pass over all of the arguments anyway, is it possible to provide all explicit native arguments to the function doing the restore (os::Linux::restore for now)? Then we won't need to introduce a lot of aux code in the VM and it will be simpler to move out VM eventually. PS. Thanks for the recent changes, I haven't read them in detail, but the structure looks better IMO. ------------- PR: https://git.openjdk.java.net/crac/pull/21 From heidinga at redhat.com Thu May 5 17:23:00 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Thu, 5 May 2022 13:23:00 -0400 Subject: Execution models Message-ID: Hi, partly inspired by Anton's "Provide arguments for restore" PR [1], I wrote up some thoughts on how we can use those changes to make our source code phase-aware. The key takeaway in my post [2] is that being able to specify one main class for the checkpoint phase and another for the restore phase lets developers reflect the structure of these phases in their source code. That's powerful! Something to be thinking about as we evolve our approach to checkpoint/restore. 
--Dan

[1] https://github.com/openjdk/crac/pull/16
[2] https://danheidinga.github.io/phase-aware-source-code/

From duke at openjdk.java.net Fri May 6 15:17:25 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Fri, 6 May 2022 15:17:25 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To:
References:
Message-ID: <1DfzVHDKw73BMvi1K26B-osgfJzB0e9SAz99uKvNlaQ=.51e789c0-0718-4185-aed8-2a96851b5899@github.com>

On Thu, 5 May 2022 09:00:17 GMT, Anton Kozlov wrote:

> I think we need to choose one model (JVMTI and j.l.S.properties are identified for now), follow it, and try to understand how much it satisfies users' needs. Or at least ours :)

From this perspective I think we can go with not updating `_system_properties` on restore. Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in `j.l.System`.

> I would like to distinguish setting a value for a property and the way or the point when the value is processed by some unrelated entity to the way the property is set. Handling is a very implementation detail.

I get your point. Let's drop the "modifiable" marker on the properties and just update the `java.lang.System.getProperties` view so that applications can make use of the new/updated properties.

> Having that there is another pass over all of the arguments anyway, is it possible to provide all explicit native arguments to the function doing the restore (os::Linux::restore for now)? Then we won't need to introduce a lot of aux code in the VM and it will be simpler to move out VM eventually.

Did you mean we just pass the properties and application arguments directly to `os::Linux::restore()` instead of getting it through `Arguments` class? It should be possible. I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process.
The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments. The java launcher adds following properties to the `args` in addition to the user specified: 1. -Djava.class.path 2. -Dsun.java.command 3. -Dsun.java.launcher We can ignore 1 and 3, collect the application arguments from 2, filter any option starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting? ------------- PR: https://git.openjdk.java.net/crac/pull/21 From akozlov at openjdk.java.net Wed May 11 18:31:30 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 11 May 2022 18:31:30 GMT Subject: [crac] RFR: Allow users to pass new properties on restore In-Reply-To: <1DfzVHDKw73BMvi1K26B-osgfJzB0e9SAz99uKvNlaQ=.51e789c0-0718-4185-aed8-2a96851b5899@github.com> References: <1DfzVHDKw73BMvi1K26B-osgfJzB0e9SAz99uKvNlaQ=.51e789c0-0718-4185-aed8-2a96851b5899@github.com> Message-ID: <5HuepVFd38Nizg5Vs_OC8fGCTx1J7yKfMOscfAGGa7Q=.703ba870-753b-4705-85fb-9bbeb5765f38@github.com> On Fri, 6 May 2022 15:14:11 GMT, Ashutosh Mehra wrote: > Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in j.l.System. Seems so, although not something I would call a strict requirement -- something that has emerged as useful and clear. > I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process. Yeah, `os::Linux::restore(char** args)` for example, with -D and arguments. At the moment we need a few -XX: arguments parsed, CRaCRestoreFrom, CREngine, and probably more. It was convenient to assume all -XX arguments are parsed and available in os::Linux::restore. 
But since this won't last forever, let's see how and if to preserve the existing arguments parsing. I would avoid doing big changes, but up to you.

> The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments. The java launcher adds following properties to the `args` in addition to the user specified:
>
> 1. -Djava.class.path
> 2. -Dsun.java.command
> 3. -Dsun.java.launcher
>
> We can ignore 1 and 3, collect the application arguments from 2, filter options in `args` starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting?

Apparently yes. Although if possible, it would be better to collect positional arguments directly from the arguments set instead of sun.java.command, so arguments with blank chars will be handled correctly -- this is a problem for now.

Thanks!

-------------

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Wed May 18 16:36:05 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 18 May 2022 16:36:05 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v4]
In-Reply-To:
References:
Message-ID:

Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision:

  Parse arguments for restore separately

  Also
  - removed "modifiable" marker for system properties
  - pass all system properties to the process being restored

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/04c4f970..9859e9c0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=02-03

  Stats: 214 lines in 6 files changed: 77 ins; 116 del; 21 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Wed May 18 17:58:11 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 18 May 2022 17:58:11 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v5]
In-Reply-To:
References:
Message-ID:

Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Parse arguments for restore separately

  Also
  - removed "modifiable" marker for system properties
  - pass all system properties to the process being restored

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/9859e9c0..e8476c58

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=03-04

  Stats: 40 lines in 5 files changed: 11 ins; 26 del; 3 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Wed May 18 18:08:52 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 18 May 2022 18:08:52 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v6]
In-Reply-To:
References:
Message-ID:

Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Parse arguments for restore separately and perform restore early

  Also
  - removed "modifiable" marker for system properties
  - pass all system properties to the process being restored

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/e8476c58..7603303d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=04-05

  Stats: 9 lines in 1 file changed: 0 ins; 9 del; 0 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Wed May 18 18:11:38 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 18 May 2022 18:11:38 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore [v7]
In-Reply-To:
References:
Message-ID:

Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Parse arguments for restore separately and perform restore early

  Also
  - removed "modifiable" marker for system properties
  - pass all system properties to the process being restored

  Signed-off-by: Ashutosh Mehra

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/21/files
  - new: https://git.openjdk.java.net/crac/pull/21/files/7603303d..6943dcbb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/crac/pull/21.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From duke at openjdk.java.net Wed May 18 18:20:22 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 18 May 2022 18:20:22 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To: <5HuepVFd38Nizg5Vs_OC8fGCTx1J7yKfMOscfAGGa7Q=.703ba870-753b-4705-85fb-9bbeb5765f38@github.com>
References: <1DfzVHDKw73BMvi1K26B-osgfJzB0e9SAz99uKvNlaQ=.51e789c0-0718-4185-aed8-2a96851b5899@github.com> <5HuepVFd38Nizg5Vs_OC8fGCTx1J7yKfMOscfAGGa7Q=.703ba870-753b-4705-85fb-9bbeb5765f38@github.com>
Message-ID:

On Wed, 11 May 2022 18:27:59 GMT, Anton Kozlov wrote:

>>> I think we need to choose one model (JVMTI and j.l.S.properties are identified for now), follow it, and try to understand how much it satisfies users' needs.
Or at least ours :) >> >> From this perspective I think we can go with not updating `_system_properties` on restore. Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in `j.l.System`. >> >>> I would like to distinguish setting a value for a property and the way or the point when the value is processed by some unrelated entity to the way the property is set. Handling is a very implementation detail. >> >> I get your point. Let's drop the "modifiable" marker on the properties and just update the `java.lang.System.getProperties` view so that applications can make use of the new/updated properties. >> >>> Having that there is another pass over all of the arguments anyway, is it possible to provide all explicit native arguments to the function doing the restore (os::Linux::restore for now)? Then we won't need to introduce a lot of aux code in the VM and it will be simpler to move out VM eventually. >> >> Did you mean we just pass the properties and application arguments directly to `os::Linux::restore()` instead of getting it through `Arguments` class? It should be possible. I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process. >> The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments. >> The java launcher adds following properties to the `args` in addition to the user specified: >> 1. -Djava.class.path >> 2. -Dsun.java.command >> 3. -Dsun.java.launcher >> >> We can ignore 1 and 3, collect the application arguments from 2, filter options in `args` starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting? 
> >> Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in j.l.System. > > Seems so, although not something I would call a strict requirement -- something that has emerged as useful and clear. > >> I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process. > > Yeah, `os::Linux::restore(char** args)` for example, with -D and arguments. > > At the moment we need a few -XX: arguments parsed, CRaCRestoreFrom, CREngine, and probably more. It was convenient to assume all -XX arguments are parsed and available in os::Linux::restore. But since this won't last forever, let's see how and if to preserve the existing arguments parsing. I would avoid doing big changes, but up to you. > >> The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments. The java launcher adds the following properties to the `args` in addition to the user specified: >> >> 1. -Djava.class.path >> 2. -Dsun.java.command >> 3. -Dsun.java.launcher >> >> We can ignore 1 and 3, collect the application arguments from 2, filter options in `args` starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting? > > Apparently yes. Although if possible, it would be better to collect positional arguments directly from the arguments set instead of sun.java.command, so arguments with blank chars will be handled correctly -- this is a problem for now. > > Thanks! @AntonKozlov > Apparently yes. Although if possible, it would be better to collect positional arguments directly from the arguments set instead of sun.java.command, so arguments with blank chars will be handled correctly -- this is a problem for now.
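The blank-character problem quoted above can be illustrated with a small sketch. It is written in Java purely for illustration (the change under review is HotSpot C++), and the names `RestoreArgsSketch`, `collectProperties`, `joinCommand` and `splitCommand` are hypothetical, not part of the patch. It assumes the launcher joins the main class and its arguments with single blanks to form `sun.java.command`, and it follows the proposal of filtering `-D` options while skipping the three launcher-added properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical helper names, not code from the pull request.
final class RestoreArgsSketch {
    // Properties the thread says the java launcher adds on its own.
    private static final Set<String> LAUNCHER_KEYS =
            Set.of("java.class.path", "sun.java.command", "sun.java.launcher");

    /** Collects user -Dkey=value options; later entries override earlier ones. */
    static Map<String, String> collectProperties(List<String> vmOptions) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String opt : vmOptions) {
            if (!opt.startsWith("-D")) {
                continue; // not a property definition
            }
            String body = opt.substring(2);
            int eq = body.indexOf('=');
            String key = eq < 0 ? body : body.substring(0, eq);
            String value = eq < 0 ? "" : body.substring(eq + 1);
            if (!LAUNCHER_KEYS.contains(key)) {
                props.put(key, value);
            }
        }
        return props;
    }

    /** Assumed model of sun.java.command: main class and args joined by blanks. */
    static String joinCommand(String mainClass, List<String> appArgs) {
        List<String> parts = new ArrayList<>();
        parts.add(mainClass);
        parts.addAll(appArgs);
        return String.join(" ", parts);
    }

    /** Recovers arguments by splitting on blanks; argument boundaries are lost. */
    static List<String> splitCommand(String command) {
        String[] parts = command.split(" ");
        return new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
    }

    public static void main(String[] args) {
        String cmd = joinCommand("Main", List.of("hello world", "x"));
        System.out.println(splitCommand(cmd));
    }
}
```

With the two arguments `hello world` and `x`, the joined string is `Main hello world x`, and splitting it back yields three arguments instead of two, which is the mishandling described in the quoted comment.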
Even in normal mode the positional arguments are gathered from `sun.java.command`. They don't seem to be available directly from the arguments. So I have stayed with that approach. I have updated the change set based on previous comments. Please review. ------------- PR: https://git.openjdk.java.net/crac/pull/21 From volker.simonis at gmail.com Thu May 19 12:09:09 2022 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 19 May 2022 14:09:09 +0200 Subject: Snapsafety of core library classes Message-ID: Hi, I wonder if anybody has thought about how snapsafety for the core library classes should be implemented in CRaC? By "snapsafety" I mean correct and secure operation after restoring a JVM process which was previously checkpointed and possibly cloned. The first question is about deciding which classes can be considered snapsafe? Naively any class whose objects hold some state will be affected by snapshotting and cloning. For simple classes like String or Integer we know that their objects are constant and cloning them doesn't do any harm. Objects of other classes might however contain more sensitive state like caches, unique identifiers, certificates, encryption keys etc. which shouldn't be cloned or which become invalid after restore. By looking at the current CRaC repository [1] I can see that some classes (e.g. sun.security.provider.SecureRandom or sun.security.provider.NativePRNG.RandomIO) directly implement j.i.c.JDKResource in order to make them snapsafe. But all the classes which do so, are non-public. This means that snapsafety is currently a "hidden", implicit feature of some classes in the core library (i.e. if I create a new j.s.SecureRandom object, I can not know if it will be snapsafe or not). Do we want to make snapsafety an undocumented, implicit feature or do we want to explicitly call it out in the JavaDoc, e.g. by forcing classes which want to be snapsafe to implement javax.crac.Resource (similar to implementing Serializable)? 
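To make the "explicit" option concrete, here is a minimal sketch of a class that opts in to snapsafety through callbacks. The `SnapshotHooks` interface is a simplified stand-in defined only for this example; it is not the real `jdk.crac.Resource` or `javax.crac.Resource`, whose method signatures may differ. `InstanceToken` is likewise a hypothetical class. The sketch assumes that creating a fresh `SecureRandom` after restore yields fresh entropy.

```java
import java.security.SecureRandom;
import java.util.Arrays;

/** Simplified stand-in for the CRaC callbacks; NOT the real Resource interface. */
interface SnapshotHooks {
    void beforeCheckpoint();
    void afterRestore();
}

/** Hypothetical holder of a per-process secret that must not be cloned. */
final class InstanceToken implements SnapshotHooks {
    private byte[] token = freshToken();

    private static byte[] freshToken() {
        byte[] bytes = new byte[16];
        // A new SecureRandom per call, so no generator state is carried over.
        new SecureRandom().nextBytes(bytes);
        return bytes;
    }

    /** Returns a copy of the token, or fails while it is wiped for a checkpoint. */
    synchronized byte[] token() {
        if (token == null) {
            throw new IllegalStateException("token wiped for checkpoint");
        }
        return token.clone();
    }

    @Override
    public synchronized void beforeCheckpoint() {
        if (token != null) {
            Arrays.fill(token, (byte) 0); // do not leave the secret in the image
            token = null;
        }
    }

    @Override
    public synchronized void afterRestore() {
        token = freshToken(); // every restored clone gets its own value
    }

    public static void main(String[] args) {
        InstanceToken t = new InstanceToken();
        t.beforeCheckpoint();
        t.afterRestore();
        System.out.println(t.token().length);
    }
}
```

In this shape the contract is visible in the type, much like `Serializable`, but nothing checks that the callback bodies actually do the right thing.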
I think both approaches have their pros and cons. If we make snapsafety an explicit feature, we tell users that the corresponding classes will behave correctly on snapshot and restore events. But what about all the other classes in the core libraries? Are they all snapsafe or snapunsafe by default? If we make snapsafety an implicit feature it would become an "implementation detail". This means we could have JDKs which are snapsafe while others are not. It also means we could make older JDK versions snapsafe which would not be possible with the explicit model because it is impossible to retrofit classes in older releases to implement new interfaces. @Dan: I remember you've mentioned that you've experimented with CRIU in OpenJ9 as well. I'd be specifically interested in the core library changes you had to do in order to make the JDK snapsafe. I took a look at the OpenJ9 snapshot branch [2], but couldn't find any library changes there at all. Could you please share more details on this topic if possible? What are your thoughts on this issue? Best regards, Volker [1] https://github.com/openjdk/crac/compare/crac?expand=1#diff-b7061481 [2] https://github.com/eclipse-openj9/openj9/compare/snapshot#diff-54ac925d From duke at openjdk.java.net Thu May 19 19:22:18 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Thu, 19 May 2022 19:22:18 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v8] In-Reply-To: References: Message-ID: > VM changes: To identify properties that can be modified on restore, > added a new bool field SystemProperty::_modifiable_on_restore. > All the jdk related properties are marked unmodifiable. Rest of the > properties are considered modifiable. > When the JVM is launched with -XX:CRaCRestoreFrom option, then the > properties prefixed with "-D" are maintained in a separate list in > Arguments::_system_properties_for_restore. This list is passed to the JVM > being restored by writing to a shared memory object.
> When the JVM is restored, it reads the new properties from shared memory > object and updates its existing list of properties maintained in > Arguments::_system_properties. > > JDK changes: System::props needs to be updated on restore to account for > new system properties. For this purpose j.l.System registers a new > JDKResource which queries new properties from the VM in afterRestore() > notification and updates System::props. The JDKResource registered by > j.l.System is given highest priority so it is the first resource to get > afterRestore() notification. > > Signed-off-by: Ashutosh Mehra Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Parse arguments for restore separately and perform restore early Also - removed "modifiable" marker for system properties - pass all system properties to the process being restored Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/21/files - new: https://git.openjdk.java.net/crac/pull/21/files/6943dcbb..6356599f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=07 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=06-07 Stats: 16 lines in 1 file changed: 8 ins; 8 del; 0 mod Patch: https://git.openjdk.java.net/crac/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21 PR: https://git.openjdk.java.net/crac/pull/21 From heidinga at redhat.com Fri May 20 13:38:18 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Fri, 20 May 2022 09:38:18 -0400 Subject: Snapsafety of core library classes In-Reply-To: References: Message-ID: (CCing a couple of the OpenJ9 developers involved in the CRIU-efforts for their awareness as well) On Thu, May 19, 2022 at 8:09 AM Volker Simonis wrote: > > Hi, > > I wonder if anybody 
has thought about how snapsafety for the core > library classes should be implemented in CRaC? By "snapsafety" I mean > correct and secure operation after restoring a JVM process which was > previously checkpointed and possibly cloned. This is currently being developed on an ad-hoc basis in CRaC. Look for classes that implement the jdk.crac.Resource interface and the actions they take in the ::afterRestore / ::beforeCheckpoint methods to see how each class has addressed its own "snapsafety". To your point, I think we're still exploring and determining the cases that are snapsafe (or not). We can look at the classes GraalVM has patched with Substitutions as a starting set of classes that will need adaptation to be snapsafe. That will help identify a starting set but the full set will be larger. > The first question is about deciding which classes can be considered > snapsafe? Naively any class whose objects hold some state will be > affected by snapshotting and cloning. For simple classes like String > or Integer we know that their objects are constant and cloning them > doesn't do any harm. Objects of other classes might however contain > more sensitive state like caches, unique identifiers, certificates, > encryption keys etc. which shouldn't be cloned or which become invalid > after restore. Agreed. Though each class will need to be individually examined to ensure that the changes to make it snapsafe don't break the invariants of the class. Looking just at caches as an example, it may seem safe to clean out the cache before a checkpoint but doing so may break invariants about canonicalization of values, as those looked up prior to the checkpoint may be different (not ==) from those looked up after restore. > By looking at the current CRaC repository [1] I can see that some > classes (e.g. sun.security.provider.SecureRandom or > sun.security.provider.NativePRNG.RandomIO) directly implement > j.i.c.JDKResource in order to make them snapsafe.
But all the classes > which do so, are non-public. This means that snapsafety is currently a > "hidden", implicit feature of some classes in the core library (i.e. > if I create a new j.s.SecureRandom object, I can not know if it will > be snapsafe or not). > > Do we want to make snapsafety an undocumented, implicit feature or do > we want to explicitly call it out in the JavaDoc, e.g. by forcing > classes which want to be snapsafe to implement javax.crac.Resource > (similar to implementing Serializable)? Bringing snapsafety into the language makes sense. Implementing Resource is probably overkill for most classes as their safety is an emergent property of the field's snap safety. Can we reverse this to tag "snap-unsafe" classes and have javac warn / error when compiling a class with snap-unsafe fields unless they implement Resource? Does the concept of snapsafety need to differentiate between the static state of the class and its instances? > > I think both approaches have their pros and cons. If we make > snapsafety an explicit feature, we tell users that the corresponding > classes will behave correctly on snapshot and restore events. But what > about all the other classes in the core libraries. Are they all > snapsafe or snapunsafe by default? > > If we make snapsafety an implicit feature it would become an > "implementation detail". This means we could have JDKs which are > snapsafe while other are not. It also means we could make older JDK > version snapsafe which would not be possible with the explicit model > because it is impossible to retrofit classes in older releases to > implement new interfaces. I'd prefer to make it explicit in the programming model to avoid the "sins of serialization". Brian wrote a document titled "Towards Better Serialization" [A] where it outlines the issues with serialization, including: * "Pretends to be a library feature, but isn't", * "Pretends to be a statically typed feature, but isn't", and * "Magic methods and fields". 
We should be thinking about snapsafety in the context of serialization (as that's effectively what a snapshot is) and any solution we propose should be clear on how it avoids the sins of serialization. > > @Dan: I remember you've mentioned that you've experimented with CRIU > in OpenJ9 as well. I'd be specifically interested in the core > library changes you had to do in order to make the JDK snapsafe. I > took a look at the OpenJ9 snapshot branch [2], but couldn't find any > library changes there at all. Could you please share more details on > this topic if possible? The snapshot branch (now inactive) was our experiment to do snapshot/restore directly in the JVM. OpenJ9 has since switched to working on CRIU checkpoint/restore as it allows solving a smaller problem (the libraries, basically) first. The code for this is in the master branch under feature flags. The J9 approach to CRIU has a slightly different model than that used in CRaC. Its model is to treat lifecycle hooks (equivalent of jdk.crac.Resource) in basically three layers: * application level hooks, * class library hooks, and * JVM hooks. The VM enters a single-threaded mode to execute the class library hooks and the JVM hooks. This avoids some difficult interactions between updating things and having both the updated and original values being consumed at the same time, at the cost of some (potential) deadlock concerns. Similarly, we use a single-threaded mode on restore as well. The two major class library level hooks we've added so far address environment variables [B] and security providers [C]. For the env vars, we only allow setting new env vars at restore. This honours the spirit of the JVM's existing approach to cache the env vars on first access, and prevents inconsistent views of what the env is actually set to, so we avoid having old vs new consistency issues. For security providers, J9 installs a minimal provider prior to the checkpoint to avoid caching sensitive state in the checkpoint.
At restore, it removes the minimal provider and installs the full set of real providers. For the JVM hooks, J9 uses a heap walk to apply per-object fixups. This is how j.u.Random is reseeded [D]. We've also been looking at Timers to determine how to adapt them to account for the time lapse between the checkpoint and restore [E]. The use cases we've been looking at are primarily containerized applications. The level of snapsafety needed when the container has limited distribution is probably less than needed when the container (or checkpoint) will be broadly distributed. We need to look at snapsafety as a layered approach that depends in part on the deployment model. The more widely deployed a checkpoint image is shared, the more care is needed to redact/fixup the info included in the image. Sorry for the slightly rambling response, but there are lots of different angles we can look at snapsafety through. --Dan [A] https://openjdk.java.net/projects/amber/design-notes/towards-better-serialization [B] https://github.com/eclipse-openj9/openj9/blob/45e4b0bd91018ffd35c5e2d72dd27632a84af5d2/jcl/src/openj9.criu/share/classes/org/eclipse/openj9/criu/CRIUSupport.java#L489 [C] https://github.com/eclipse-openj9/openj9/blob/45e4b0bd91018ffd35c5e2d72dd27632a84af5d2/jcl/src/openj9.criu/share/classes/org/eclipse/openj9/criu/SecurityProviders.java#L28-L39 [D] https://github.com/eclipse-openj9/openj9/blob/cc586f03bd5359157f99fb342015f24f4e064755/runtime/vm/CRIUHelpers.cpp#L160-L169 [E] https://github.com/eclipse-openj9/openj9/issues/14211#issuecomment-1117937739 > > What are your thoughts on this issue? 
> > Best regards, > Volker > > [1] https://github.com/openjdk/crac/compare/crac?expand=1#diff-b7061481 > [2] https://github.com/eclipse-openj9/openj9/compare/snapshot#diff-54ac925d > From akozlov at azul.com Mon May 23 20:11:29 2022 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 23 May 2022 23:11:29 +0300 Subject: Snapsafety of core library classes In-Reply-To: References: Message-ID: <3094e92a-c1a9-ff76-a5eb-e323fd8b73ff@azul.com> Hi, On 5/19/22 15:09, Volker Simonis wrote: > I wonder if anybody has thought about how snapsafety for the core > library classes should be implemented in CRaC? By "snapsafety" I mean > correct and secure operation after restoring a JVM process which was > previously checkpointed and possibly cloned. First of all, thanks for starting this interesting discussion. I just skimmed through Dan's answer and I will need some time to think properly about that. I think "snapsafety" needs a more precise definition. For correctness, we know of problems when a field of a Java object is interpreted as an external resource id: a pointer packed as long, an fd as int, a String that is a key in a network DB. Problems arise with the interpretation, and only if the interpretation involves some objects that are outside of the current Java instance. A Java object itself (as a data type and associated methods) is always correct after the restore, since the bit representation of its primitives and object refs is not changed, and the associated methods are the same. Clones of an object do not interfere, so all of them should be correct. For the security of the operation, SecureRandom vs Random is a great example. One is secure and the other is not, and the difference is very domain-specific, rather than some universal predicate for an instance of j.l.Object. I assume you are looking for some means for Java programmers to ensure some properties of their programs w.r.t. checkpoint and restore?
or regarding an implementation of java with CRaC, to avoid or report using java classes that were not changed for CRaC, although they should? > If we make > snapsafety an explicit feature, we tell users that the corresponding > classes will behave correctly on snapshot and restore events. The fact a class implements the interface does not necessarily imply something useful. The implementation of the methods can be empty or buggy. Although JDK classes can ensure something strict, we'd need some way to check implementation of the interface for user classes. Thanks, Anton From akozlov at openjdk.java.net Wed May 25 11:02:20 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 25 May 2022 11:02:20 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v8] In-Reply-To: References: Message-ID: On Thu, 19 May 2022 19:22:18 GMT, Ashutosh Mehra wrote: >> VM changes: To identify properties that can be modified on restore, >> added a new bool field SystemProperty::_modifiable_on_restore. >> All the jdk related properties are marked unmodifiable. Rest of the >> properties are considered modifiable. >> When the JVM is launched with -XX:CRaCRestoreFrom option, then the >> properties prefixed with "-D" are maintained in a separate list in >> Arguments::_system_properties_for_restore. This list is passed to the JVM >> being restored by writing to a shared memory object. >> When the JVM is restored, it reads the new properties from shared memory >> object and updates its existing list of properties maintained in >> Arguments::_system_properties. >> >> JDK changes: System::props needs to be updated on restore to account for >> new system properties. For this purpose j.l.System registers a new >> JDKResource which queries new properties from the VM in afterRestore() >> notification and updates System::props. The JDKResource registered by >> j.l.System is given highest priority so it is the first resource to get >> afterRestore() notification. 
>> >> Signed-off-by: Ashutosh Mehra > > Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Parse arguments for restore separately and perform restore early > > Also > - removed "modifiable" marker for system properties > - pass all system properties to the process being restored > > Signed-off-by: Ashutosh Mehra Minor comments below. I've tried the patch with CRIU and CREngine=pauseengine and it works as expected: it's possible to set new user properties and override existing ones. Great! src/hotspot/os/linux/os_linux.cpp line 285: > 283: }; > 284: > 285: class VM_CracRestoreParameters : public CHeapObj { The `VM_` prefix makes it look like this class inherits from VM_Op, but it does not. Removing the prefix would be fine. src/hotspot/os/linux/os_linux.cpp line 308: > 306: _nprops(0), > 307: _properties(new (ResourceObj::C_HEAP, mtInternal) GrowableArray(0, mtInternal)), > 308: _args(args) This constructor is apparently not used; remove it? ------------- PR: https://git.openjdk.java.net/crac/pull/21 From duke at openjdk.java.net Wed May 25 14:52:17 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Wed, 25 May 2022 14:52:17 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v9] In-Reply-To: References: Message-ID: <3gzQjo_0ocvvO0fqeTGMid7REsMlo306Acnc0TnGhMM=.80e3f510-6e56-46c3-946d-6199ad2a314c@github.com> > VM changes: To identify properties that can be modified on restore, > added a new bool field SystemProperty::_modifiable_on_restore. > All the jdk related properties are marked unmodifiable. Rest of the > properties are considered modifiable. > When the JVM is launched with -XX:CRaCRestoreFrom option, then the > properties prefixed with "-D" are maintained in a separate list in > Arguments::_system_properties_for_restore.
This list is passed to the JVM > being restored by writing to a shared memory object. > When the JVM is restored, it reads the new properties from shared memory > object and updates its existing list of properties maintained in > Arguments::_system_properties. > > JDK changes: System::props needs to be updated on restore to account for > new system properties. For this purpose j.l.System registers a new > JDKResource which queries new properties from the VM in afterRestore() > notification and updates System::props. The JDKResource registered by > j.l.System is given highest priority so it is the first resource to get > afterRestore() notification. > > Signed-off-by: Ashutosh Mehra Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Rename VM_CracRestoreParameters to CracRestoreParameters Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/21/files - new: https://git.openjdk.java.net/crac/pull/21/files/6356599f..7cdd91b0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=08 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=07-08 Stats: 10 lines in 1 file changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/crac/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21 PR: https://git.openjdk.java.net/crac/pull/21 From duke at openjdk.java.net Wed May 25 14:52:19 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Wed, 25 May 2022 14:52:19 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v8] In-Reply-To: References: Message-ID: On Thu, 19 May 2022 19:22:18 GMT, Ashutosh Mehra wrote: >> VM changes: To identify properties that can be modified on restore, >> added a new bool field SystemProperty::_modifiable_on_restore. >> All the jdk related properties are marked unmodifiable. Rest of the >> properties are considered modifiable. 
>> When the JVM is launched with -XX:CRaCRestoreFrom option, then the >> properties prefixed with "-D" are maintained in a separate list in >> Arguments::_system_properties_for_restore. This list is passed to the JVM >> being restored by writing to a shared memory object. >> When the JVM is restored, it reads the new properties from shared memory >> object and updates its existing list of properties maintained in >> Arguments::_system_properties. >> >> JDK changes: System::props needs to be updated on restore to account for >> new system properties. For this purpose j.l.System registers a new >> JDKResource which queries new properties from the VM in afterRestore() >> notification and updates System::props. The JDKResource registered by >> j.l.System is given highest priority so it is the first resource to get >> afterRestore() notification. >> >> Signed-off-by: Ashutosh Mehra > > Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Parse arguments for restore separately and perform restore early > > Also > - removed "modifiable" marker for system properties > - pass all system properties to the process being restored > > Signed-off-by: Ashutosh Mehra Added new commit to address the suggestion for renaming `VM_CracRestoreParameters`. ------------- PR: https://git.openjdk.java.net/crac/pull/21 From duke at openjdk.java.net Wed May 25 14:52:21 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Wed, 25 May 2022 14:52:21 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v8] In-Reply-To: References: Message-ID: On Tue, 24 May 2022 15:47:17 GMT, Anton Kozlov wrote: >> Ashutosh Mehra has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> Parse arguments for restore separately and perform restore early >> >> Also >> - removed "modifiable" marker for system properties >> - pass all system properties to the process being restored >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/os/linux/os_linux.cpp line 285: > >> 283: }; >> 284: >> 285: class VM_CracRestoreParameters : public CHeapObj { > > `VM_` prefix look like this class inherits VM_Op, but it's not. Removing the prefix would be fine. Done > src/hotspot/os/linux/os_linux.cpp line 308: > >> 306: _nprops(0), >> 307: _properties(new (ResourceObj::C_HEAP, mtInternal) GrowableArray(0, mtInternal)), >> 308: _args(args) > > This constructor apparently is not used, remove? This constructor is used in `os::Linux::restore` ------------- PR: https://git.openjdk.java.net/crac/pull/21 From akozlov at openjdk.java.net Thu May 26 13:25:09 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 26 May 2022 13:25:09 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v9] In-Reply-To: <3gzQjo_0ocvvO0fqeTGMid7REsMlo306Acnc0TnGhMM=.80e3f510-6e56-46c3-946d-6199ad2a314c@github.com> References: <3gzQjo_0ocvvO0fqeTGMid7REsMlo306Acnc0TnGhMM=.80e3f510-6e56-46c3-946d-6199ad2a314c@github.com> Message-ID: On Wed, 25 May 2022 14:52:17 GMT, Ashutosh Mehra wrote: >> VM changes: To identify properties that can be modified on restore, >> added a new bool field SystemProperty::_modifiable_on_restore. >> All the jdk related properties are marked unmodifiable. Rest of the >> properties are considered modifiable. >> When the JVM is launched with -XX:CRaCRestoreFrom option, then the >> properties prefixed with "-D" are maintained in a separate list in >> Arguments::_system_properties_for_restore. This list is passed to the JVM >> being restored by writing to a shared memory object. 
>> When the JVM is restored, it reads the new properties from shared memory >> object and updates its existing list of properties maintained in >> Arguments::_system_properties. >> >> JDK changes: System::props needs to be updated on restore to account for >> new system properties. For this purpose j.l.System registers a new >> JDKResource which queries new properties from the VM in afterRestore() >> notification and updates System::props. The JDKResource registered by >> j.l.System is given highest priority so it is the first resource to get >> afterRestore() notification. >> >> Signed-off-by: Ashutosh Mehra > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM_CracRestoreParameters to CracRestoreParameters > > Signed-off-by: Ashutosh Mehra Thanks! LGTM ------------- Marked as reviewed by akozlov (Lead). PR: https://git.openjdk.java.net/crac/pull/21 From heidinga at redhat.com Thu May 26 14:25:55 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Thu, 26 May 2022 10:25:55 -0400 Subject: FJP::setParallelism update Message-ID: How often should we plan to update the crac branch from the mainline? With the recent merge of Loom, j.u.c.ForkJoinPool gained a new capability that allows users to resize the existing FJPs. This was added by Doug Lea (big thanks!) to support Loom's use cases and also to help us adapt to changes in the number of threads between checkpoint & restore. The new method is ::setParallelism(int) [0]. Once the crac branch consumes the latest jdk stream, I'll update FJP to implement Resource and we can update the parallelism so pools, in particular the common pool, work better when the available # of threads changes.
--Dan [0] https://github.com/openjdk/crac/blob/f235955eefb1141a2e72116dfcf345e40416f059/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L2938 From akozlov at azul.com Fri May 27 10:44:56 2022 From: akozlov at azul.com (Anton Kozlov) Date: Fri, 27 May 2022 13:44:56 +0300 Subject: FJP::setParallelism update In-Reply-To: References: Message-ID: On 5/26/22 17:25, Dan Heidinga wrote: > How often should we plan to update the crac branch from the mainline? I'd like to keep the branch with JDK17 (the latest LTS). How about the crac branch receiving updates from openjdk/jdk/master, and crac-17u from openjdk/jdk17u/master? We could encourage backports of CRaC-specific changes from crac to crac-17u, but not make that mandatory. If this is OK, we'll need to fork off crac-17u first. Thanks, Anton From akozlov at azul.com Fri May 27 11:12:40 2022 From: akozlov at azul.com (Anton Kozlov) Date: Fri, 27 May 2022 14:12:40 +0300 Subject: FJP::setParallelism update In-Reply-To: References: Message-ID: <12e3b897-37b6-5104-fe4e-8e05cc5d391b@azul.com> On 5/26/22 17:25, Dan Heidinga wrote: > The new method is ::setParallelism(int) [0]. Once the crac branch > consumes the latest jdk stream, I'll update FJP to implement Resource > and we can update the parallelism so pools, in particular the common > pool, work better when the available # of threads changes. > > [0] https://github.com/openjdk/crac/blob/f235955eefb1141a2e72116dfcf345e40416f059/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L2938 Are you going to change the common pool behavior or all instances of FJP? Or all instances of FJP with the default parallelism value? I think we need to carefully avoid changing the parallelism of an FJP if it was set explicitly.
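One possible policy along these lines, as a sketch only: keep an explicitly configured parallelism untouched and recompute the default from the CPU count seen after restore. `RestoreParallelismPolicy` and `targetParallelism` are hypothetical names, and how a pool would know that its value was set explicitly is left open. The default mirrors the documented common-pool rule of one less than the number of available processors; the clamp to at least 1 is an assumption of this sketch. The result would then be passed to the `::setParallelism(int)` method mentioned above, which the sketch deliberately does not call.

```java
// Illustrative only: a hypothetical policy, not part of FJP or of CRaC.
final class RestoreParallelismPolicy {
    /**
     * @param current          parallelism the pool had at checkpoint time
     * @param explicitlySet    true if the user chose that value
     * @param cpusAfterRestore processors available to the restored process
     * @return the parallelism the pool should have after restore
     */
    static int targetParallelism(int current, boolean explicitlySet, int cpusAfterRestore) {
        if (explicitlySet) {
            return current; // preserve user intent
        }
        // Default rule: one less than the available processors, at least 1.
        return Math.max(1, cpusAfterRestore - 1);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(targetParallelism(7, false, cpus));
    }
}
```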
Thanks, Anton From heidinga at redhat.com Fri May 27 12:51:14 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Fri, 27 May 2022 08:51:14 -0400 Subject: FJP::setParallelism update In-Reply-To: References: Message-ID: > > How often should we plan to update the crac branch from the mainline? > > I'd like to keep the branch with JDK17 (the latest LTS). How about crac > branch receiving updates from openjdk/jdk/master, and crac-17u from > openjdk/jdk17u/master? And to encourage doing backports of CRaC-specific > changes from crac to crac-17u, but don't make that mandatory. > > If this is OK, we'll need to fork off crac-17u first. That sounds like a reasonable approach if we want to track both the main dev line and the last LTS. Backporting patches may become a nuisance but given the rate of change in crac today, this seems like an OK starting point. I'm curious about the value of tracking the LTS though given we'll need to merge to mainline when crac graduates. Is the intention to make it easier for users to try out under their existing deployments? --Dan > > Thanks, > Anton > From heidinga at redhat.com Fri May 27 12:59:56 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Fri, 27 May 2022 08:59:56 -0400 Subject: FJP::setParallelism update In-Reply-To: <12e3b897-37b6-5104-fe4e-8e05cc5d391b@azul.com> References: <12e3b897-37b6-5104-fe4e-8e05cc5d391b@azul.com> Message-ID: > > The new method is ::setParallelism(int) [0]. Once the crac branch > > consumes the latest jdk stream, I'll update FJP to implement Resource > > and we can update the parallelism so pools, in particular the common > > pool, work better when the available # of threads changes. > > > > [0] https://github.com/openjdk/crac/blob/f235955eefb1141a2e72116dfcf345e40416f059/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L2938 > > Are you going to change the common pool behavior or all instances of FJP? > Or all instances of FJP with the default parallelism value?
I think we > need to carefully avoid changing parallelism of a FJP if that was set > explicitly. Definitely the common pool as it's managed by the JVM (though we won't change it if the parallelism has been set by env var). For other pools, I'm not sure what the right behaviour is yet. They would have been tuned based on the # of processors of the machine they were running on prior to the checkpoint. After the checkpoint, if we're running on a different machine or have a different share of the # of processors, then previous tunings - even if explicit - are likely wrong. We could develop an adjustment factor: abs(prevCPUs - currentCPUs) and apply that to each pool in an attempt to preserve user intent, but it may be better to have users directly update their pools. This would be a great use for Volker's "snapsafety" concept if we could treat use of an FJP as a warning/error that moved up the call stack and either prevented checkpointing or was addressed by someone who implemented Resource. Some of this will depend on looking at FJP usage to see how deeply embedded it is in applications. Do users have control to tune these usages? Is it mainly driven by libraries / frameworks tuning to # of cpus? Lots to look at here. --Dan > > Thanks, > Anton > From akozlov at azul.com Fri May 27 13:43:03 2022 From: akozlov at azul.com (Anton Kozlov) Date: Fri, 27 May 2022 16:43:03 +0300 Subject: FJP::setParallelism update In-Reply-To: References: Message-ID: <65a995f1-7dc9-e5f1-52fc-c1ea803a6e25@azul.com> On 5/27/22 15:51, Dan Heidinga wrote: > I'm curious about the value of tracking the LTS though given we'll > need to merge to mainline when crac graduates. Is the intention to > make it easier for users to try out under their existing deployments? Yes, exactly. That will hopefully give us more feedback, and the behavior of changes made for CRaC (and possible problems caused by them) won't interfere with upstream changes.
From volker.simonis at gmail.com Fri May 27 13:51:32 2022 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 27 May 2022 06:51:32 -0700 Subject: FJP::setParallelism update In-Reply-To: References: Message-ID: Dan Heidinga wrote on Fri, May 27, 2022, 05:51: > > > How often should we plan to update the crac branch from the mainline? > > > > I'd like to keep the branch with JDK17 (the latest LTS). How about crac > > branch receiving updates from openjdk/jdk/master, and crac-17u from > > openjdk/jdk17u/master? And to encourage doing backports of CRaC-specific > > changes from crac to crac-17u, but don't make that mandatory. > > > > If this is OK, we'll need to fork off crac-17u first. > > That sounds like a reasonable approach if we want to track both the > main dev line and the last LTS. Backporting patches may become a > nuisance but given the rate of change in crac today, this seems like > an OK starting point. > I'm curious about the value of tracking the LTS though given we'll > need to merge to mainline when crac graduates. Is the intention to > make it easier for users to try out under their existing deployments? > This sounds reasonable to me as well. I agree that the main dev line should be master, but we should try to downport on a best-effort basis. I think it's already hard for many users to run their applications on 17, and supporting only the latest version would complicate experimentation even more. > --Dan > > > > > Thanks, > > Anton > > > > From akozlov at azul.com Fri May 27 14:25:33 2022 From: akozlov at azul.com (Anton Kozlov) Date: Fri, 27 May 2022 17:25:33 +0300 Subject: FJP::setParallelism update In-Reply-To: References: <12e3b897-37b6-5104-fe4e-8e05cc5d391b@azul.com> Message-ID: On 5/27/22 15:59, Dan Heidinga wrote: > Definitely the common pool as it's managed by the JVM (though we won't > change it if the parallelism has been set by env var). Got it, thanks. > For other pools, I'm not sure what the right behaviour is yet.
They > would have been tuned based on the # of processors of the machine they > were running on prior to the checkpoint. After the checkpoint, if > we're running on a different machine or have a different share of the > # of processors, then previous tunings - even if explicit - are likely > wrong. On the other hand, won't that be a performance problem rather than a correctness issue? As a user, I would prefer my settings to be preserved, if I am able to change them, rather than an automatic adjustment that I probably don't want and would need to adjust back. > We could develop an adjustment factor: abs(prevCPUs - currentCPUs) > and apply that to each pool in an attempt to preserve user intent, but > it may be better to have users directly update their pools. Agree that explicitly fixing parallelism looks like the best approach, since we don't know the intent. But that can probably be expressed directly at the interface level, like setting parallelism as a fraction of available cores. Then it will be possible to differentiate between parallelism given as a fixed number and parallelism that is actually a function of available cores -- the only kind that needs adjustment. > This > would be a great use for Volker's "snapsafety" concept if we could > treat use of a FJP as a warning/error that moved up the call stack and > either prevented checkpointing or was addressed by someone who > implemented Resource. I think it is possible to hack in something similar right now. E.g. make the FJP implement the Resource interface so that it throws an exception "you should not have FJP" on checkpoint. To get checkpoint working again, a programmer would need to override the FJP's Resource methods with sensible logic for checkpoint/restore, e.g. do nothing. I'm not sure whether this fits the snapsafety concept.
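
The fraction-of-cores idea from the message above could look roughly like the sketch below. This is only an illustration under stated assumptions: `ParallelismPolicy`, `fixed`, `fractionOfCores`, `tracksCores` and `resolve` are hypothetical names invented here, not part of the JDK or of CRaC. The point is that a pool created from a fraction can be told apart from a pool created with a fixed number, so only the former would be re-resolved after restore.

```java
import java.util.concurrent.ForkJoinPool;

/**
 * Hypothetical policy object (not a JDK or CRaC class): records whether a
 * pool's parallelism was meant as a fixed number or as a fraction of cores.
 */
final class ParallelismPolicy {
    private final int fixedValue;   // meaningful only when fraction is NaN
    private final double fraction;  // NaN marks a fixed-number policy

    private ParallelismPolicy(int fixedValue, double fraction) {
        this.fixedValue = fixedValue;
        this.fraction = fraction;
    }

    /** Parallelism given as an absolute number; never adjusted. */
    static ParallelismPolicy fixed(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        return new ParallelismPolicy(parallelism, Double.NaN);
    }

    /** Parallelism given as a fraction of whatever cores are available. */
    static ParallelismPolicy fractionOfCores(double fraction) {
        // The negated comparison also rejects NaN.
        if (!(fraction > 0.0) || Double.isInfinite(fraction)) {
            throw new IllegalArgumentException("fraction must be a finite number > 0");
        }
        return new ParallelismPolicy(0, fraction);
    }

    /** True if the value is a function of the core count and may need adjusting. */
    boolean tracksCores() {
        return !Double.isNaN(fraction);
    }

    /** Computes the parallelism for the given core count, never below 1. */
    int resolve(int availableCores) {
        if (!tracksCores()) {
            return fixedValue;
        }
        return Math.max(1, (int) Math.round(fraction * availableCores));
    }
}

final class ParallelismPolicyDemo {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        ParallelismPolicy half = ParallelismPolicy.fractionOfCores(0.5);
        ForkJoinPool pool = new ForkJoinPool(half.resolve(cores));
        System.out.println("cores=" + cores + " parallelism=" + pool.getParallelism());
        // After a restore on a different machine, only pools whose policy
        // tracksCores() would need a fresh value from resolve(newCores).
        pool.shutdown();
    }
}
```

With such a policy kept next to each pool, an afterRestore hook could call `resolve` with the new core count and pass the result to the new `setParallelism(int)` method, leaving fixed-number pools untouched.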
Thanks, Anton From heidinga at openjdk.java.net Fri May 27 15:53:30 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 27 May 2022 15:53:30 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v3] In-Reply-To: References: Message-ID: On Mon, 25 Apr 2022 17:59:57 GMT, Anton Kozlov wrote: >> This change updates Reference Handling after Alan's comments [1]. >> * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive. >> * The method for waiting threads now accepts a timeout, as it turned out to be needed by CleanerImpl.beforeCheckpoint(). >> * Now reference handling outside of the JDK code is supported, see the supplied test update. The handling does not depend on the first-level reference processing thread's beforeCheckpoint being called first. After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority. >> >> The methods for waiting threads cannot guarantee that reference handling is complete for a particular queue and a set of threads, as nothing prevents another concurrent thread from changing the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint. That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and the corresponding clean-up is performed, as demonstrated by the test. >> >> [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855 > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Make jdk.crac.Misc final Marked as reviewed by heidinga (Committer).
------------- PR: https://git.openjdk.java.net/crac/pull/22 From heidinga at openjdk.java.net Fri May 27 15:53:31 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 27 May 2022 15:53:31 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v2] In-Reply-To: References: Message-ID: On Mon, 25 Apr 2022 17:52:12 GMT, Anton Kozlov wrote: >> src/java.base/share/classes/java/lang/ref/Reference.java line 331: >> >>> 329: >>> 330: @Override >>> 331: public boolean waitForQueueProcessed(ReferenceQueue queue, >> >> Should this method be static as it doesn't use the instance's state? >> >> Alternatively, if it's for *this* reference's queue, then the instance variable should be used and the queue parameter can be removed. >> >> Actually, I'm starting to think this method shouldn't exist on reference. It belongs on the ReferenceQueue rather than here. Possibly as a static helper method if there's a need to expose the version that takes a queue. > > This is an interface method of JavaLangRefAccess that exposes some package-private java.lang.ref methods. So it cannot be static. `this` in this context is the instance of that JavaLangRefAccess. And the interface method just calls the instance method of ReferenceQueue as you suggest. > > That instance method is intentionally made non-public, so our EA code will be compatible with JDK17. After some thought, this looks like a better approach than introducing new public methods before those methods are agreed to be good ones. I don't feel that way about this method yet; after all, I found a better name since the last patch in this area :) Reading this again, I now see the new method is a member of the `new JavaLangRefAccess() {}` class, which makes perfect sense of why it's here like this.
------------- PR: https://git.openjdk.java.net/crac/pull/22 From duke at openjdk.java.net Fri May 27 20:45:14 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Fri, 27 May 2022 20:45:14 GMT Subject: [crac] RFR: Allow users to pass new properties on restore In-Reply-To: <5HuepVFd38Nizg5Vs_OC8fGCTx1J7yKfMOscfAGGa7Q=.703ba870-753b-4705-85fb-9bbeb5765f38@github.com> References: <1DfzVHDKw73BMvi1K26B-osgfJzB0e9SAz99uKvNlaQ=.51e789c0-0718-4185-aed8-2a96851b5899@github.com> <5HuepVFd38Nizg5Vs_OC8fGCTx1J7yKfMOscfAGGa7Q=.703ba870-753b-4705-85fb-9bbeb5765f38@github.com> Message-ID: On Wed, 11 May 2022 18:27:59 GMT, Anton Kozlov wrote: >>> I think we need to choose one model (JVMTI and j.l.S.properties are identified for now), follow it, and try to understand how much it satisfies users' needs. Or at least ours :) >> >> From this perspective I think we can go with not updating `_system_properties` on restore. Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in `j.l.System`. >> >>> I would like to distinguish setting a value for a property and the way or the point when the value is processed by some unrelated entity to the way the property is set. Handling is a very implementation detail. >> >> I get your point. Let's drop the "modifiable" marker on the properties and just update the `java.lang.System.getProperties` view so that applications can make use of the new/updated properties. >> >>> Having that there is another pass over all of the arguments anyway, is it possible to provide all explicit native arguments to the function doing the restore (os::Linux::restore for now)? Then we won't need to introduce a lot of aux code in the VM and it will be simpler to move out VM eventually. 
>> >> Did you mean we just pass the properties and application arguments directly to `os::Linux::restore()` instead of getting them through the `Arguments` class? It should be possible. I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process. >> The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments. >> The java launcher adds the following properties to `args` in addition to the user-specified ones: >> 1. -Djava.class.path >> 2. -Dsun.java.command >> 3. -Dsun.java.launcher >> >> We can ignore 1 and 3, collect the application arguments from 2, filter options in `args` starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting? > >> Right now the requirement, as I understand, is to provide a way to specify new or update existing application specific properties on restore which can be achieved by just updating the properties in j.l.System. > > Seems so, although it is not something I would call a strict requirement -- rather something that has emerged as useful and clear. > >> I guess as soon as we get access to the args, we can call `os::Linux::restore()` and avoid all the JVM initialization process. > > Yeah, `os::Linux::restore(char** args)` for example, with -D and arguments. > > At the moment we need a few -XX: arguments parsed: CRaCRestoreFrom, CREngine, and probably more. It was convenient to assume all -XX arguments are parsed and available in os::Linux::restore. But since this won't last forever, let's see whether and how to preserve the existing argument parsing. I would avoid doing big changes, but up to you. > >> The `args` parameter to `Threads::create_vm()` has all the stuff that we need to pass to `os::Linux::restore()`. We can do a quick pass over `args` to extract the properties and application arguments.
The java launcher adds the following properties to `args` in addition to the user-specified ones: >> >> 1. -Djava.class.path >> 2. -Dsun.java.command >> 3. -Dsun.java.launcher >> >> We can ignore 1 and 3, collect the application arguments from 2, filter options in `args` starting with `-D` and pass the set to `os::Linux::restore()`. Is that what you are suggesting? > > Apparently yes. Although if possible, it would be better to collect positional arguments directly from the arguments set instead of sun.java.command, so that arguments containing blanks are handled correctly -- this is a problem for now. > > Thanks! @AntonKozlov can this be integrated now? ------------- PR: https://git.openjdk.java.net/crac/pull/21 From akozlov at openjdk.java.net Sat May 28 10:17:08 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 28 May 2022 10:17:08 GMT Subject: [crac] RFR: Allow users to pass new properties on restore [v9] In-Reply-To: <3gzQjo_0ocvvO0fqeTGMid7REsMlo306Acnc0TnGhMM=.80e3f510-6e56-46c3-946d-6199ad2a314c@github.com> References: <3gzQjo_0ocvvO0fqeTGMid7REsMlo306Acnc0TnGhMM=.80e3f510-6e56-46c3-946d-6199ad2a314c@github.com> Message-ID: On Wed, 25 May 2022 14:52:17 GMT, Ashutosh Mehra wrote: >> VM changes: To identify properties that can be modified on restore, >> added a new bool field SystemProperty::_modifiable_on_restore. >> All the jdk related properties are marked unmodifiable. Rest of the >> properties are considered modifiable. >> When the JVM is launched with -XX:CRaCRestoreFrom option, then the >> properties prefixed with "-D" are maintained in a separate list in >> Arguments::_system_properties_for_restore. This list is passed to the JVM >> being restored by writing to a shared memory object. >> When the JVM is restored, it reads the new properties from shared memory >> object and updates its existing list of properties maintained in >> Arguments::_system_properties.
>> >> JDK changes: System::props needs to be updated on restore to account for >> new system properties. For this purpose j.l.System registers a new >> JDKResource which queries new properties from the VM in afterRestore() >> notification and updates System::props. The JDKResource registered by >> j.l.System is given highest priority so it is the first resource to get >> afterRestore() notification. >> >> Signed-off-by: Ashutosh Mehra > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM_CracRestoreParameters to CracRestoreParameters > > Signed-off-by: Ashutosh Mehra Of course, please proceed with the instructions in the comment https://github.com/openjdk/crac/pull/21#issuecomment-1111571364. You need to comment on the PR with `/integrate` ------------- PR: https://git.openjdk.java.net/crac/pull/21 From akozlov at openjdk.java.net Mon May 30 08:23:53 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 30 May 2022 08:23:53 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v3] In-Reply-To: References: Message-ID: On Mon, 25 Apr 2022 17:59:57 GMT, Anton Kozlov wrote: >> This change updates Reference Handling after Alan's comments [1]. >> * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive. >> * The method for waiting threads now accepts a timeout, as it turned out to be needed by CleanerImpl.beforeCheckpoint(). >> * Now reference handling outside of the JDK code is supported, see the supplied test update. The handling does not depend on the first-level reference processing thread's beforeCheckpoint being called first. After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority.
>> >> The methods for waiting threads cannot guarantee that reference handling is complete for a particular queue and a set of threads, as nothing prevents another concurrent thread from changing the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint. That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and the corresponding clean-up is performed, as demonstrated by the test. >> >> [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855 > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Make jdk.crac.Misc final Thanks for the review! I'll request a cross-review on jdk-dev, since this change follows suggestions from there. ------------- PR: https://git.openjdk.java.net/crac/pull/22 From duke at openjdk.java.net Mon May 30 13:43:14 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Mon, 30 May 2022 13:43:14 GMT Subject: [crac] Integrated: Allow users to pass new properties on restore In-Reply-To: References: Message-ID: On Thu, 14 Apr 2022 15:07:22 GMT, Ashutosh Mehra wrote: > VM changes: To identify properties that can be modified on restore, > added a new bool field SystemProperty::_modifiable_on_restore. > All the jdk related properties are marked unmodifiable. Rest of the > properties are considered modifiable. > When the JVM is launched with -XX:CRaCRestoreFrom option, then the > properties prefixed with "-D" are maintained in a separate list in > Arguments::_system_properties_for_restore. This list is passed to the JVM > being restored by writing to a shared memory object. > When the JVM is restored, it reads the new properties from shared memory > object and updates its existing list of properties maintained in > Arguments::_system_properties. > > JDK changes: System::props needs to be updated on restore to account for > new system properties.
For this purpose j.l.System registers a new > JDKResource which queries new properties from the VM in afterRestore() > notification and updates System::props. The JDKResource registered by > j.l.System is given highest priority so it is the first resource to get > afterRestore() notification. > > Signed-off-by: Ashutosh Mehra This pull request has now been integrated. Changeset: b2783c90 Author: Ashutosh Mehra Committer: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/b2783c90a8ad81f6a8564e6cacf97a1ea0190ccd Stats: 330 lines in 6 files changed: 239 ins; 52 del; 39 mod Allow users to pass new properties on restore Reviewed-by: akozlov ------------- PR: https://git.openjdk.java.net/crac/pull/21 From akozlov at azul.com Mon May 30 14:17:34 2022 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 30 May 2022 17:17:34 +0300 Subject: [crac] RFR: Ensure empty Reference Handler and Cleaners queues In-Reply-To: References: <1c2136a0-90c2-cabc-a948-bc4a02f1533b@oracle.com> Message-ID: <4990dc41-f466-007b-6128-d5b9d410c553@azul.com> Could you please look at the updated version at [0]? The new API for ReferenceQueue still targets the problem of synchronizing Reference handling with CRaC. An example of that is a java object that becomes unreachable just before the checkpoint, and an associated Reference needs to be processed to release some native resource. Creating an image of the VM with that native resource linked is both unsafe (the native resource may not exist at the restore -- CRaC VM does its best to prevent successful checkpoint in this case) and inefficient, as every restored instance will perform the same processing of the same Reference that was captured by the image. So we need to ensure Reference processing is complete. For the processing done by a thread (or a set of threads), the change provides an updated API to await the set of threads blocked on the Queue awaiting references. This ensures that threads are done processing References from that Queue. 
> > Once the method returns then there is no guarantee that the number > > of waiters hasn't changed, but I think you know that > > I hoped to guarantee all Queues are empty by waiting a sufficient > number of waiters for each Queue, in the order of Queues passing > References between each other (for a single thread). But now even > there, I see handling of a Reference later in the order may make > another one pending, filling up a Queue that was supposed to be empty. > For a strong guarantee that all Queues are empty, some sort of > iteration may be required, that will check no Queue had a new > reference since the last check. Processing a single Reference may enqueue an arbitrary number of further References. More formally, ReferenceQueues and their processors form a directed graph in which nodes are Queues and an edge represents the relation "handling of a Reference from the source may enqueue another Reference into the destination". Edges are defined by the processing code, not by data. The graph can have an arbitrary shape, e.g. it can contain cycles, so Reference processing need not even converge. So the only reasonable way to bring reference processing to quiescence is to ensure that References for each Queue are processed (by calling the new API), in the order in which Queues may receive References. > I think a public API is needed as users may have the same problem as > we do. But the current code does not support this (we need to allow > user code after JDK Queues are emptied). The API now fully supports being called from user code. Each invocation of the new API ensures all unreachable objects are discovered and pushed to a Queue before the Queue is checked for pending references and the number of waiting threads. > > At a high level it should be okay to provide a JDK-internal way to > > await quiescent. You've added it as a public API which might be okay > > for the current exploration but I don't think it would be exposed in > > its current form.
The new ReferenceQueue API is moved into the jdk.crac.* package, to avoid polluting the Java API of CRaC EA builds, which are based on JDK 17 for now. [0] https://github.com/openjdk/crac/pull/22 [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/WeakHashMap.java#L361 Thanks, Anton From duke at openjdk.java.net Tue May 31 14:19:47 2022 From: duke at openjdk.java.net (Kuznetsov Ilya Alexandrovich) Date: Tue, 31 May 2022 14:19:47 GMT Subject: [crac] RFR: X11 CRaC reinitializing on CheckpointRestore Message-ID: Allows CRaC to perform a CheckpointRestore operation for applications using a GUI (Swing, AWT) and an X11 connection. Resources are registered only if the application uses the GUI. The order in which resources are reinitialized matters: Toolkit should be cleared before reference handling for proper garbage collection, and GraphicsEnvironment after reference handling for a correct X11 disconnection. Some resources are restored lazily. The `beforeCheckpoint()` operation disposes of the necessary toolkit and connection resources and disconnects from X11. This allows CRaC to perform a Checkpoint since there is no external connection. The `afterRestore()` operation reconnects to X11 and then restores the necessary connection and toolkit resources. Thus, after the Restore operation, we have a clean X11 connection. It is ready to restore the original GUI state. ------------- Commit messages: - Whitespace fix attempt - Merge remote-tracking branch 'origin/crac' into crac - X11 CRaC reinitializing on CheckpointRestore - X11 reinitializing on CheckpointRestore Changes: https://git.openjdk.java.net/crac/pull/19/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=19&range=00 Stats: 573 lines in 20 files changed: 530 ins; 8 del; 35 mod Patch: https://git.openjdk.java.net/crac/pull/19.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/19/head:pull/19 PR: https://git.openjdk.java.net/crac/pull/19
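
The dispose/disconnect and reconnect/restore ordering described in the X11 message above can be modelled with a small sketch. Everything here is an assumption made for illustration: `CheckpointResource` is a simplified stand-in for a CRaC resource (the real `jdk.crac.Resource` callbacks also take a context argument), and `ToolkitState`, `DisplayConnection` and `CheckpointDriver` are hypothetical names that do not come from the patch. The driver's ordering (forward on checkpoint, backward on restore) is chosen only to match the description and is not a statement about CRaC's own ordering contract.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical stand-in for a checkpoint/restore resource. */
interface CheckpointResource {
    void beforeCheckpoint();
    void afterRestore();
}

/** Hypothetical model of toolkit state that depends on the connection. */
final class ToolkitState implements CheckpointResource {
    private final List<String> log;

    ToolkitState(List<String> log) { this.log = log; }

    @Override public void beforeCheckpoint() { log.add("toolkit: disposed"); }
    @Override public void afterRestore() { log.add("toolkit: restored"); }
}

/** Hypothetical model of an external connection such as one to an X11 server. */
final class DisplayConnection implements CheckpointResource {
    private final List<String> log;
    private boolean connected = true;

    DisplayConnection(List<String> log) { this.log = log; }

    boolean isConnected() { return connected; }

    @Override public void beforeCheckpoint() {
        connected = false;              // no external connection may survive
        log.add("display: disconnected");
    }

    @Override public void afterRestore() {
        connected = true;               // fresh connection after restore
        log.add("display: reconnected");
    }
}

/** Hypothetical driver: forward order on checkpoint, reverse order on restore. */
final class CheckpointDriver {
    private final List<CheckpointResource> resources = new ArrayList<>();

    void register(CheckpointResource resource) { resources.add(resource); }

    void checkpoint() {
        for (CheckpointResource r : resources) {
            r.beforeCheckpoint();
        }
    }

    void restore() {
        for (int i = resources.size() - 1; i >= 0; i--) {
            resources.get(i).afterRestore();
        }
    }
}

final class X11LifecycleDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CheckpointDriver driver = new CheckpointDriver();
        driver.register(new ToolkitState(log));       // disposed first
        driver.register(new DisplayConnection(log));  // disconnected last
        driver.checkpoint();
        driver.restore();
        log.forEach(System.out::println);
    }
}
```

Registering the toolkit before the connection means the toolkit is disposed while the connection is still open, and is restored only once the connection is back.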