From duke at openjdk.java.net Fri Apr 1 16:20:07 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Fri, 1 Apr 2022 16:20:07 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com> References: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com> <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com> Message-ID: On Thu, 24 Mar 2022 14:52:48 GMT, Anton Kozlov wrote: > I can imagine that a micro-service may need to know coordinates of other services on restore. Command line arguments may be used to provide the path to a config file then. @AntonKozlov Thanks for the explanation. This makes the requirement a lot more clear now. I have a slightly different take on the application's perspective for this requirement. I am looking at the need for new arguments as no different than the post-restore adjustments that the application may have to do. Currently this is implemented using `Resource` interface and I feel the same mechanism can be extended for handling the new arguments at the application level as well. This can be done by treating the class responsible for processing arguments as a `Resource` which would get the `afterRestore` notification and can then get the new arguments from the Context passed to it. For instance, in the JavaCompilerCRaC example, `JavaCompilerCRaC` can extend `Resource`, implement `afterRestore()`, get the new args from the `Context` and make internal adjustments which could be reading the config file, or opening a socket on the port number, or in this example call `runJavac()`. I have created a [patch](https://github.com/ashu-mehra/crac/commit/3360123f4d4671bfdcee129115300d0f108d3d8f) on top of your changes to highlight this behavior. 
This approach also removes the requirement for the application to create another main class to be used only on restore. In fact, the user can just do `java -XX:CRaCRestoreFrom= arg1 arg2` and the new arguments would be available to the application on restore.

Your thoughts on this approach?

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Fri Apr 1 17:52:01 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Fri, 1 Apr 2022 17:52:01 GMT
Subject: [crac] RFR: Provide arguments for restore [v5]
In-Reply-To:
References: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com> <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com>
Message-ID:

On Fri, 1 Apr 2022 16:16:37 GMT, Ashutosh Mehra wrote:

> I am looking at the need for new arguments as no different than the post-restore adjustments that the application may have to do.

Indeed, I was thinking about the same, and now I realize there is a subtle difference. There are a lot of Resources; if each of them may process arguments, how would they interpret them? E.g. does the `-f` parameter mean the same thing for all Resources in the system? If not, how do you specify that a parameter is designated for a particular Resource? Technically, nothing prevents two Resources from assuming they are in a position to process arguments.

Resources are required when an object of a class may require checkpoint/restore notification, but the existence of the object is not clear to the program (the object may not be created, the object may be GCed, the class may not be loaded at all in this program configuration). If it's possible to program without Resources, it's better than with them.

BTW, the JavaCompilerCRaC object on line 44 won't survive GC, so it won't be notified.
https://github.com/ashu-mehra/crac/commit/3360123f4d4671bfdcee129115300d0f108d3d8f#diff-4d20e6a15e5a09de1e430f18a7c84eb257eb3d2b98721a35f4853d3fd2dadf37R44

> Currently this is implemented using `Resource` interface and I feel the same mechanism can be extended for handling the new arguments at the application level as well. This can be done by treating the class responsible for processing arguments as a `Resource` which would get the `afterRestore` notification and can then get the new arguments from the Context passed to it.

I still think that taking args as method parameters is cleaner and more streamlined. I would agree that arguments should be stored somewhere and then retrieved by a method call if regular CLI arguments were stored somewhere and then retrieved by the main method. But they are provided as parameters to the main method, so I think we should not diverge from this. I mean, we don't write Java programs this way:

    class Main {
        public static void main() {
            String[] args = getArgs();
        }
    }

An additional benefit of arguments as parameters: they can be GCed as soon as the method processing them completes. Storing the arguments would make them consume memory forever.

The context does not look like the right place to store args: contexts form a hierarchy, so you'd need to set the args for a downstream context when it's notified from the parent context. And arguments make sense only during afterRestore execution.
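
To make the parameter-passing style argued for above concrete, here is a minimal sketch. It assumes a restore-time entry point whose `main` receives the new arguments; the class name `RestoreMain`, the `--config` flag, and the helper `parseConfigPath` are all hypothetical and are not taken from the PR.

```java
import java.util.Optional;

// Hypothetical restore-time entry point; not part of the CRaC API.
final class RestoreMain {
    // Returns the value that follows "--config", if the flag is present.
    static Optional<String> parseConfigPath(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--config".equals(args[i])) {
                return Optional.of(args[i + 1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The arguments are only referenced while this method runs,
        // so nothing keeps them alive once it returns.
        String path = parseConfigPath(args).orElse("(none)");
        System.out.println("config path on restore: " + path);
    }
}
```

Nothing here is stored in a field, which is the property the message above relies on: the argument array can be collected as soon as `main` returns.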
------------- PR: https://git.openjdk.java.net/crac/pull/16 From duke at openjdk.java.net Fri Apr 1 21:21:07 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Fri, 1 Apr 2022 21:21:07 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com> <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com> Message-ID: On Fri, 1 Apr 2022 17:49:11 GMT, Anton Kozlov wrote: > There are a lot Resources, if each of them may process arguments, how would they interpret them? E.g. does -f parameter means the same thing for all resources in the system? If not, how to specify a parameter is designated to a particular Resource? Technically nothing prevents two Resources to assume they are in position to process arguments. > Resources are required when an object of a class may require checkpoint/restore notification, but for the program the existence of the object is not clear (the object may not be created, the object may be GCed, the class may not be loaded at all in this program configuration). If it's possible to program without Resource's, it's better than with them. So I am assuming that if an application needs to accept new arguments to restore, it would be to make some internal adjustments. And I would expect the application to model any adjustments post restore using Resource interface. So essentially, what I am trying to say is if the application is using new arguments, there would be _some_ `Resource` that would take _some_ action in `afterRestore()` notification based on the new args. Do you see any other way the new arguments could be used? > An additional benefit of arguments as parameters -- they could be GCed as soon as a method processing them completes. Having arguments stored would make them eternally consuming memory. 
Well, this can be achieved if the new arguments are passed as parameters to `afterRestore()`. We then wouldn't need to store them anywhere. I think the main concern from my point of view with the current approach was the new main class that the user needs to create to access the new arguments. And I realized another implication of this approach which somehow I missed out earlier (not sure if this has already been thought about) - it essentially gives unlimited control to the user to do anything on restore. It allows the application developer to update the objects, instead of doing through the Resource interface. So now they have to choose which approach to use for post-restore adjustments. More importantly, it allows the user to basically run a different application on restore! I think this is not what we want, right? ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Mon Apr 4 13:51:02 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 4 Apr 2022 13:51:02 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com> <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com> Message-ID: On Fri, 1 Apr 2022 21:17:52 GMT, Ashutosh Mehra wrote: > So I am assuming that if an application needs to accept new arguments to restore, it would be to make some internal adjustments. And I would expect the application to model any adjustments post restore using Resource interface. So essentially, what I am trying to say is if the application is using new arguments, there would be _some_ `Resource` that would take _some_ action in `afterRestore()` notification based on the new args. Do you see any other way the new arguments could be used? The problem is that you cannot guarantee a single resource will handle the arguments. 
Providing arguments to every Resource will encourage Resources to use them, while only a single Resource should actually do that. Resources are required to register different, unrelated modules of the program, from third-party libraries to the JDK. Each Resource, at the smallest, is a file handle, or a network connection, or e.g. a native part of Selector on Linux (EPoll). Resources are more suited for a non-precise checkpoint that comes at a random moment in time, when you cannot provide a single routine to de/re-initialize, or it's hard. E.g. a microservice main endpoint and a service logging endpoint: each endpoint should likely be a Resource, as they are likely handled by different parts of the program, and fitting them into a single Resource would be rather hard (was the logging endpoint configured, was it initialized, ...).

> Well, this can be achieved if the new arguments are passed as parameters to afterRestore()

This will encourage Resources to use new arguments even more. I've thought about adding another kind of resource (related or completely unrelated to the existing ones), but I see only a little value in them, and they are no better than the class.

> I think the main concern from my point of view with the current approach was the new main class that the user needs to create to access the new arguments.

I don't completely understand the concern. How is a new Resource different from this point of view? It's a change anyway. But with the new class it's possible to make it easier and to provide a common class in the JDK for all users (I'm not sure this particular class is a great idea):

    package jdk.crac.util;

    public class StoreArgs {
        // Static, so that the static methods below can access it.
        private static String[] newArgs;

        public static void main(String[] args) {
            newArgs = args;
        }

        public static String[] getNewArgs() {
            return newArgs;
        }
    }

That is, a class that is called if you really need to store new arguments somewhere. Or another class that prints the image state (thread dump?) and exits.
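
The second idea, a class that prints the image state and exits, could look roughly like the sketch below. It assumes that a thread dump is an adequate notion of "image state"; the class name `DumpState` and the method `threadDump` are hypothetical, and only `Thread.getAllStackTraces()` from the standard library is used.

```java
import java.util.Map;

// Hypothetical restore-time entry point; the name DumpState is illustrative.
final class DumpState {
    // Builds a textual dump of every live thread and its stack frames.
    static String threadDump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("\tat ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the dump and terminate instead of resuming the application.
        System.out.print(threadDump());
        System.exit(0);
    }
}
```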
> And I realized another implication of this approach which somehow I missed out earlier (not sure if this has already been thought about) - it essentially gives unlimited control to the user to do anything on restore. It allows the application developer to update the objects, instead of doing through the Resource interface. So now they have to choose which approach to use for post-restore adjustments.

I hope I've addressed this above. I expect that Resources won't need arguments, as they do not need them now, and a separate means of handling arguments would not necessarily need to be a Resource.

> More importantly, it allows the user to basically run a different application on restore! I think this is not what we want, right?

Why not? An image could be suited for different purposes. A regular Java program may have different entry points (classes with a main method), and only one of them is employed in a single run. With restore, the old program does not go away; just another class's main method is called before returning from checkpointRestore().
------------- PR: https://git.openjdk.java.net/crac/pull/16 From noreply at github.com Thu Apr 7 06:17:54 2022 From: noreply at github.com (Anton Kozlov) Date: Wed, 06 Apr 2022 23:17:54 -0700 Subject: [CRaC/criu] cb6168: Update GHA Message-ID: Branch: refs/heads/crac Home: https://github.com/CRaC/criu Commit: cb616884bff8c512ad880b214cefc4b984b48072 https://github.com/CRaC/criu/commit/cb616884bff8c512ad880b214cefc4b984b48072 Author: Anton Kozlov Date: 2022-04-07 (Thu, 07 Apr 2022) Changed paths: M .github/workflows/ccpp.yml Log Message: ----------- Update GHA Commit: 467699114895610f41d7e06bf42be0f0f9ecc6c8 https://github.com/CRaC/criu/commit/467699114895610f41d7e06bf42be0f0f9ecc6c8 Author: Anton Kozlov Date: 2022-04-07 (Thu, 07 Apr 2022) Changed paths: M criu/files-reg.c Log Message: ----------- Workaround ubuntu kernel bug Compare: https://github.com/CRaC/criu/compare/b3a210513e5a...467699114895 From akozlov at openjdk.java.net Thu Apr 7 13:35:56 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 7 Apr 2022 13:35:56 GMT Subject: [crac] RFR: Provide arguments for restore [v7] In-Reply-To: References: Message-ID: > This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. 

Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:

  Handle missing env (although should not happen)

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/16/files
  - new: https://git.openjdk.java.net/crac/pull/16/files/44126ba4..235ea1a9

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=05-06

Stats: 4 lines in 3 files changed: 1 ins; 0 del; 3 mod
Patch: https://git.openjdk.java.net/crac/pull/16.diff
Fetch: git fetch https://git.openjdk.java.net/crac pull/16/head:pull/16

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Thu Apr 7 13:40:16 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 7 Apr 2022 13:40:16 GMT
Subject: [crac] RFR: Provide arguments for restore [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 7 Apr 2022 13:35:56 GMT, Anton Kozlov wrote:

>> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`.
>
> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Handle missing env (although should not happen)

I propose to move forward with the PR in its current form, unless it is wrong. If not, we can fix it later based on users' feedback. I deliberately don't add the documentation, to preserve some flexibility.

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Thu Apr 7 14:09:46 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 7 Apr 2022 14:09:46 GMT
Subject: [crac] RFR: Collect CREngine child
Message-ID:

When the VM calls the CREngine, it does not properly await termination of the engine. In the case of `criuengine`, this could lead to a dangling process in the CRIU image [1]. This patch adds the missing waitpid, which eliminates the problem.

[1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68

-------------

Commit messages:
 - Collect CREngine child

Changes: https://git.openjdk.java.net/crac/pull/18/files
Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=18&range=00
Stats: 16 lines in 1 file changed: 14 ins; 0 del; 2 mod
Patch: https://git.openjdk.java.net/crac/pull/18.diff
Fetch: git fetch https://git.openjdk.java.net/crac pull/18/head:pull/18

PR: https://git.openjdk.java.net/crac/pull/18

From duke at openjdk.java.net Thu Apr 7 20:24:13 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Thu, 7 Apr 2022 20:24:13 GMT
Subject: [crac] RFR: Provide arguments for restore [v7]
In-Reply-To:
References:
Message-ID: <5i9zV_LiBgjAYymHAmDTiePWSXuLmBZwtZGBPR1wSyY=.b3f49628-68ee-4477-8210-c3ba8b248c65@github.com>

On Thu, 7 Apr 2022 13:35:56 GMT, Anton Kozlov wrote:

>> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`.
>
> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Handle missing env (although should not happen)

Dan and I had a discussion internally about this PR, and we agree it's worth going ahead with what we have in this PR. At this point there isn't much clarity on the way applications would need to access the new arguments, but providing some mechanism to make the new arguments available on restore would at least enable us to experiment with more real-world applications and use cases that have these requirements. Based on that experience we can always revisit and make adjustments if required.

------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Fri Apr 8 09:47:25 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 8 Apr 2022 09:47:25 GMT Subject: [crac] RFR: Provide arguments for restore [v7] In-Reply-To: References: Message-ID: <9TuI5vtmHgaE_X9d-Yzq1S0Cs3nBEfVmRNk8mFbPs6E=.ab7710fa-d573-449c-a514-c0091aa79e6f@github.com> On Thu, 7 Apr 2022 13:35:56 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Handle missing env (although should not happen) Thanks for discussing this! Yes, let's try to use this and then adjust. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Fri Apr 8 09:47:26 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 8 Apr 2022 09:47:26 GMT Subject: [crac] Integrated: Provide arguments for restore In-Reply-To: References: Message-ID: On Fri, 11 Feb 2022 11:56:10 GMT, Anton Kozlov wrote: > This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. This pull request has now been integrated. 

Changeset: 828ea227
Author: Anton Kozlov
URL: https://git.openjdk.java.net/crac/commit/828ea227ca4636af52f7bec79c395956ac89a7cd
Stats: 209 lines in 8 files changed: 177 ins; 10 del; 22 mod

Provide arguments for restore

Reviewed-by: abakhtin

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Fri Apr 8 10:16:03 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Fri, 8 Apr 2022 10:16:03 GMT
Subject: [crac] RFR: Fix NPE on ZipFile$Source access
In-Reply-To:
References:
Message-ID: <9nyEAoh60oypJ5VeQSNm1J1D6Gt5yFbej1VosRYP3T0=.b539168b-8431-4ae8-8aa4-10b64f8d9852@github.com>

On Thu, 3 Mar 2022 13:52:31 GMT, Anton Kozlov wrote:

> A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking.

Ping? It's not critical, but it fixes an annoying exception during checkpoint.

-------------

PR: https://git.openjdk.java.net/crac/pull/17

From heidinga at openjdk.java.net Fri Apr 8 12:51:17 2022
From: heidinga at openjdk.java.net (Dan Heidinga)
Date: Fri, 8 Apr 2022 12:51:17 GMT
Subject: [crac] RFR: Fix NPE on ZipFile$Source access
In-Reply-To:
References:
Message-ID:

On Thu, 3 Mar 2022 13:52:31 GMT, Anton Kozlov wrote:

> A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking.
>
> The exception looks like below:
>
> java.lang.reflect.InvocationTargetException
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>         at java.base/jdk.internal.loader.URLClassPath$JarLoader$ClassLoaderJarFile.beforeCheckpoint(URLClassPath.java:837)
>         at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66)
>         at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66)
>         at java.base/jdk.crac.Core.checkpointRestore1(Core.java:108)
>         at java.base/jdk.crac.Core.checkpointRestore(Core.java:182)
>         at JavaCompilerCRaC.main(JavaCompilerCRaC.java:27)
> Caused by: java.lang.NullPointerException: Cannot invoke "java.util.zip.ZipFile$Source.getFile()" because the return value of "java.util.zip.ZipFile$CleanableResource.getSource()" is null
>         at java.base/java.util.zip.ZipFile.beforeCheckpoint(ZipFile.java:1088)
>         ... 10 more

lgtm - Seems like a fairly straightforward refactoring.

-------------

Marked as reviewed by heidinga (Committer).

PR: https://git.openjdk.java.net/crac/pull/17

From omikhaltcova at openjdk.java.net Fri Apr 8 13:43:08 2022
From: omikhaltcova at openjdk.java.net (Olga Mikhaltsova)
Date: Fri, 8 Apr 2022 13:43:08 GMT
Subject: [crac] RFR: Fix NPE on ZipFile$Source access
In-Reply-To:
References:
Message-ID:

On Thu, 3 Mar 2022 13:52:31 GMT, Anton Kozlov wrote:

> A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking.
> > The exception looks like below: > > java.lang.reflect.InvocationTargetException > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at java.base/jdk.internal.loader.URLClassPath$JarLoader$ClassLoaderJarFile.beforeCheckpoint(URLClassPath.java:837) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.Core.checkpointRestore1(Core.java:108) > at java.base/jdk.crac.Core.checkpointRestore(Core.java:182) > at JavaCompilerCRaC.main(JavaCompilerCRaC.java:27) > Caused by: java.lang.NullPointerException: Cannot invoke "java.util.zip.ZipFile$Source.getFile()" because the return value of "java.util.zip.ZipFile$CleanableResource.getSource()" is null > at java.base/java.util.zip.ZipFile.beforeCheckpoint(ZipFile.java:1088) > ... 10 more src/java.base/share/classes/java/util/zip/ZipFile.java line 1797: > 1795: FileDescriptor fd = null; > 1796: try { > 1797: fd = f.getFD(); Could you pls check: zfile.getFD() ? ------------- PR: https://git.openjdk.java.net/crac/pull/17 From asmehra at redhat.com Fri Apr 8 15:47:37 2022 From: asmehra at redhat.com (Ashutosh Mehra) Date: Fri, 8 Apr 2022 11:47:37 -0400 Subject: Provide ability to pass new system property or update existing property on restore Message-ID: Similar to the requirement for providing new command line arguments to the application on restore, it would be useful if the user is able to provide new system properties as well. 
For example: > java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr would pass the system property key1=value1 and key2=value2 to the process being restored. If key1 is already present as a system property, its current value will be replaced with value1. If key1 is not present, it gets added as a new system property in the JVM. Any other system property already present in the JVM would not be touched. To implement this we can use the same mechanism as for passing the new application arguments, i.e. the new system properties would be written to the shared memory by the initiating JVM (the instance of the JVM that invokes the criu engine for restore). I intend to use the same shared memory that is currently used for passing the new application arguments. The shared memory can be partitioned to store the system properties followed by the application arguments. The restored JVM would read the new system properties from the shared memory and add them or update them in the Arguments::_system_properties which is the set of system properties maintained by the VM. In addition, these properties need to be updated in the JDK as well. To accomplish that, the java.lang.System class would need to be updated to recompute the System::props field. Note that this only takes care of the JDK and JVM. If the application/library has read the property before checkpoint and cached it, it would have to register the appropriate resource(s) for afterRestore() notification to refresh the property value. If this makes sense, I can work on the changes. If not, feel free to suggest any alternatives. 
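
The add-or-replace semantics described above can be modelled in plain Java as follows. This is only an illustration of the intended behaviour, not the proposed implementation, which would live in the VM (`Arguments::_system_properties`) and in `java.lang.System`; the class `RestoreProperties` and the method `merge` are hypothetical names.

```java
import java.util.Properties;

// Illustrative model of the proposed semantics; not JDK or VM code.
final class RestoreProperties {
    // Returns a copy of 'current' in which every entry of 'overrides'
    // has been added or has replaced the existing value.
    static Properties merge(Properties current, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(current);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        Properties current = new Properties();
        current.setProperty("key1", "old");
        current.setProperty("other", "untouched");

        // Stands in for -Dkey1=value1 -Dkey2=value2 given on restore.
        Properties overrides = new Properties();
        overrides.setProperty("key1", "value1");
        overrides.setProperty("key2", "value2");

        Properties merged = merge(current, overrides);
        System.out.println(merged.getProperty("key1"));  // value1 (replaced)
        System.out.println(merged.getProperty("key2"));  // value2 (added)
        System.out.println(merged.getProperty("other")); // untouched
    }
}
```

A value that application code read and cached before the checkpoint is not affected by such a merge, which is why the message notes that the application would still need an afterRestore() hook to refresh it.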
Regards, Ashutosh Mehra From akozlov at openjdk.java.net Fri Apr 8 19:14:10 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 8 Apr 2022 19:14:10 GMT Subject: [crac] RFR: Fix NPE on ZipFile$Source access [v2] In-Reply-To: References: Message-ID: > A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking. > > The exception looks like below: > > java.lang.reflect.InvocationTargetException > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at java.base/jdk.internal.loader.URLClassPath$JarLoader$ClassLoaderJarFile.beforeCheckpoint(URLClassPath.java:837) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.Core.checkpointRestore1(Core.java:108) > at java.base/jdk.crac.Core.checkpointRestore(Core.java:182) > at JavaCompilerCRaC.main(JavaCompilerCRaC.java:27) > Caused by: java.lang.NullPointerException: Cannot invoke "java.util.zip.ZipFile$Source.getFile()" because the return value of "java.util.zip.ZipFile$CleanableResource.getSource()" is null > at java.base/java.util.zip.ZipFile.beforeCheckpoint(ZipFile.java:1088) > ... 
10 more Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'zip-fix' of https://github.com/AntonKozlov/crac into zip-fix - Fix NPE on ZipFile$Source access ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/17/files - new: https://git.openjdk.java.net/crac/pull/17/files/d1a74be6..8718c10a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=17&range=01 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=17&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/crac/pull/17.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/17/head:pull/17 PR: https://git.openjdk.java.net/crac/pull/17 From akozlov at openjdk.java.net Fri Apr 8 19:14:12 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 8 Apr 2022 19:14:12 GMT Subject: [crac] RFR: Fix NPE on ZipFile$Source access [v2] In-Reply-To: References: Message-ID: On Fri, 8 Apr 2022 13:35:59 GMT, Olga Mikhaltsova wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'zip-fix' of https://github.com/AntonKozlov/crac into zip-fix >> - Fix NPE on ZipFile$Source access > > src/java.base/share/classes/java/util/zip/ZipFile.java line 1797: > >> 1795: FileDescriptor fd = null; >> 1796: try { >> 1797: fd = f.getFD(); > > Could you pls check, may be, zfile.getFD() instead of f.getFD() ? You're a hawkeye! I had an amended version all the time locally, and tested that. The difference is indeed in this line. Thanks! Fixed. 
------------- PR: https://git.openjdk.java.net/crac/pull/17 From asmehra at redhat.com Mon Apr 11 14:24:30 2022 From: asmehra at redhat.com (Ashutosh Mehra) Date: Mon, 11 Apr 2022 10:24:30 -0400 Subject: Provide ability to pass new system property or update existing property on restore In-Reply-To: References: Message-ID: *> For example:> java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr> would pass the system property key1=value1 and key2=value2 to the process being restored.* Currently the initiating JVM only serves to write the command line arguments to shared memory and then calls *exec() *to run the criu engine. I am wondering if the initiating JVM needs to recognize these properties? It can probably ignore these properties if the *-XX:CRaCRestoreFrom* option is present, but then in future if ever there is a need to pass a system property to the initiating JVM, it wouldn't be possible. To keep that option open, we would need to differentiate between the properties that apply to the initiating JVM and the ones that apply to the JVM being restored. Any thoughts on this? Ashutosh Mehra, Red Hat Runtimes On Fri, Apr 8, 2022 at 11:47 AM Ashutosh Mehra wrote: > Similar to the requirement for providing new command line arguments to the > application on restore, > it would be useful if the user is able to provide new system properties as > well. > For example: > > > java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr > > would pass the system property key1=value1 and key2=value2 to the process > being restored. > > If key1 is already present as a system property, its current value will be > replaced with value1. > If key1 is not present, it gets added as a new system property in the JVM. > Any other system property already present in the JVM would not be touched. > > To implement this we can use the same mechanism as for passing the new > application arguments, > i.e. 
the new system properties would be written to the shared memory by > the initiating JVM (the instance > of the JVM that invokes the criu engine for restore). > I intend to use the same shared memory that is currently used for passing > the new application arguments. > The shared memory can be partitioned to store the system properties > followed by the application arguments. > > The restored JVM would read the new system properties from the shared > memory and add them or update them > in the Arguments::_system_properties which is the set of system properties > maintained by the VM. > In addition, these properties need to be updated in the JDK as well. To > accomplish that, the java.lang.System > class would need to be updated to recompute the System::props field. > > Note that this only takes care of the JDK and JVM. > If the application/library has read the property before checkpoint and > cached it, it would have to register > the appropriate resource(s) for afterRestore() notification to refresh the > property value. > > If this makes sense, I can work on the changes. If not, feel free to > suggest any alternatives. > > Regards, > Ashutosh Mehra > From heidinga at redhat.com Mon Apr 11 14:44:49 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Mon, 11 Apr 2022 10:44:49 -0400 Subject: Provide ability to pass new system property or update existing property on restore In-Reply-To: References: Message-ID: > > *> For example:> java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr> > would pass the system property key1=value1 and key2=value2 to the process > being restored.* > > Currently the initiating JVM only serves to write the command line > arguments to shared memory and then calls *exec() *to run the criu engine. > > I am wondering if the initiating JVM needs to recognize these properties? 
> It can probably ignore these properties if the *-XX:CRaCRestoreFrom* > option is present, but then in future if ever there is a need to pass a > system property to the initiating JVM, it wouldn't be possible. > To keep that option open, we would need to differentiate between the > properties that apply to the initiating JVM and the ones that apply to the > JVM being restored. OpenJ9 looked at how to deal with environment variables [1] and came up with a reasonable approach that we could copy here. They register a file as part of the checkpoint call that will contain the new environment variable values on restore. During the restore, they read the specified file and set the env vars. We could use a similar approach for System properties. Rather than setting the -D options on the command line, register prior to the checkpoint which file to read, and use that file to pass in the new properties. It changes the way that command lines are used between the checkpoint and restore runs which is slightly less convenient, though not outside the realm of reasonable approaches given the restore already needs an extra -XX option. --Dan [1] https://github.com/eclipse-openj9/openj9/issues/13545 > > Any thoughts on this? > > Ashutosh Mehra, > Red Hat Runtimes > > > On Fri, Apr 8, 2022 at 11:47 AM Ashutosh Mehra wrote: > > > Similar to the requirement for providing new command line arguments to the > > application on restore, > > it would be useful if the user is able to provide new system properties as > > well. > > For example: > > > > > java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr > > > > would pass the system property key1=value1 and key2=value2 to the process > > being restored. > > > > If key1 is already present as a system property, its current value will be > > replaced with value1. > > If key1 is not present, it gets added as a new system property in the JVM. > > Any other system property already present in the JVM would not be touched. 
> > > > To implement this we can use the same mechanism as for passing the new > > application arguments, > > i.e. the new system properties would be written to the shared memory by > > the initiating JVM (the instance > > of the JVM that invokes the criu engine for restore). > > I intend to use the same shared memory that is currently used for passing > > the new application arguments. > > The shared memory can be partitioned to store the system properties > > followed by the application arguments. > > > > The restored JVM would read the new system properties from the shared > > memory and add them or update them > > in the Arguments::_system_properties which is the set of system properties > > maintained by the VM. > > In addition, these properties need to be updated in the JDK as well. To > > accomplish that, the java.lang.System > > class would need to be updated to recompute the System::props field. > > > > Note that this only takes care of the JDK and JVM. > > If the application/library has read the property before checkpoint and > > cached it, it would have to register > > the appropriate resource(s) for afterRestore() notification to refresh the > > property value. > > > > If this makes sense, I can work on the changes. If not, feel free to > > suggest any alternatives. > > > > Regards, > > Ashutosh Mehra > > > From akozlov at azul.com Mon Apr 11 16:02:43 2022 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 11 Apr 2022 19:02:43 +0300 Subject: Provide ability to pass new system property or update existing property on restore In-Reply-To: References: Message-ID: On 4/8/22 18:47, Ashutosh Mehra wrote: > it would be useful if the user is able to provide new system properties as > well. > For example: > >> java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr > > would pass the system property key1=value1 and key2=value2 to the process > being restored. I think this is great idea! The proposal and the proposed implementation make a lot of sense to me. 
I will happily look at the change. Thanks, Anton From akozlov at azul.com Mon Apr 11 16:24:53 2022 From: akozlov at azul.com (Anton Kozlov) Date: Mon, 11 Apr 2022 19:24:53 +0300 Subject: Provide ability to pass new system property or update existing property on restore In-Reply-To: References: Message-ID: <1b1121de-f0c0-d98a-7e82-e462adf097ee@azul.com> On 4/11/22 17:44, Dan Heidinga wrote: >> >> *> For example:> java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr> >> would pass the system property key1=value1 and key2=value2 to the process >> being restored.* >> >> Currently the initiating JVM only serves to write the command line >> arguments to shared memory and then calls *exec() *to run the criu engine. >> >> I am wondering if the initiating JVM needs to recognize these properties? >> It can probably ignore these properties if the *-XX:CRaCRestoreFrom* >> option is present, but then in future if ever there is a need to pass a >> system property to the initiating JVM, it wouldn't be possible. >> To keep that option open, we would need to differentiate between the >> properties that apply to the initiating JVM and the ones that apply to the >> JVM being restored. > > OpenJ9 looked at how to deal with environment variables [1] and came > up with a reasonable approach that we could copy here. They register > a file as part of the checkpoint call that will contain the new > environment variable values on restore. During the restore, they read > the specified file and set the env vars. > > We could use a similar approach for System properties. Rather than > setting the -D options on the command line, register prior to the > checkpoint which file to read, and use that file to pass in the new > properties. > > It changes the way that command lines are used between the checkpoint > and restore runs which is slightly less convenient, though not outside > the realm of reasonable approaches given the restore already needs an > extra -XX option. 
> > --Dan > > [1] https://github.com/eclipse-openj9/openj9/issues/13545 From my perspective, initializing a VM only to have it replaced is not a very straightforward way to restore another VM. I would abandon this in favor of the java launcher detecting the -XX:CRaCRestoreFrom argument. But the launcher would need to handle -XX:CREngine as well, and there would be some code duplication between the launcher and the VM. So I didn't rush to do things right. I don't think that we'd ever want to handle properties in the initiating VM that we would not want to pass to the VM being restored. On the contrary, I like having the same interface for passing properties. Suppose that -XX:CRaCRestoreFrom fails and -XX:+CRaCIgnoreRestoreIfUnavailable is enabled: the properties provided for restore would then be treated as the usual system properties in the VM that runs as the fallback. So I'd vote for using the existing syntax: > java -Dkey1=value1 -Dkey2=value2 -XX:CRaCRestoreFrom=cr ... Thanks, Anton From akozlov at openjdk.java.net Mon Apr 11 17:55:22 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 11 Apr 2022 17:55:22 GMT Subject: [crac] RFR: Fix NPE on ZipFile$Source access [v2] In-Reply-To: References: Message-ID: <8VgNWnXu_9WLgq4qfW2vvLSOF4za5I6gIvR99KGxa5E=.b3faa43e-f213-47cd-84f5-b9b0d7f9d3b9@github.com> On Fri, 8 Apr 2022 19:14:10 GMT, Anton Kozlov wrote: >> A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking.
>> >> The exception looks like below: >> >> java.lang.reflect.InvocationTargetException >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:568) >> at java.base/jdk.internal.loader.URLClassPath$JarLoader$ClassLoaderJarFile.beforeCheckpoint(URLClassPath.java:837) >> at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) >> at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) >> at java.base/jdk.crac.Core.checkpointRestore1(Core.java:108) >> at java.base/jdk.crac.Core.checkpointRestore(Core.java:182) >> at JavaCompilerCRaC.main(JavaCompilerCRaC.java:27) >> Caused by: java.lang.NullPointerException: Cannot invoke "java.util.zip.ZipFile$Source.getFile()" because the return value of "java.util.zip.ZipFile$CleanableResource.getSource()" is null >> at java.base/java.util.zip.ZipFile.beforeCheckpoint(ZipFile.java:1088) >> ... 10 more > > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'zip-fix' of https://github.com/AntonKozlov/crac into zip-fix > - Fix NPE on ZipFile$Source access Thanks for reviews! 
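A minimal sketch of the fix pattern described in the quoted description: the checkpoint path takes the same lock as the cleanup path and checks whether the source has already been released. GuardedSource, its lock, and its fields are hypothetical stand-ins for illustration only, not the actual ZipFile code.

```java
// Hypothetical stand-in for a cleanable resource; this is NOT the real
// java.util.zip.ZipFile code, only an illustration of the locking pattern.
final class GuardedSource {
    private final Object lock = new Object();
    private String source = "archive.zip"; // plays the role of zsrc
    private int checkpointVisits;

    // Cleanup path: may run on another thread and releases the source.
    void run() {
        synchronized (lock) {
            source = null;
        }
    }

    // Checkpoint path: same lock as run(), and tolerant of a released source.
    // Returns true if the source was still present and was processed.
    boolean beforeCheckpoint() {
        synchronized (lock) {
            if (source == null) {
                return false; // already cleaned up, nothing to do
            }
            checkpointVisits++;
            return true;
        }
    }

    int checkpointVisits() {
        synchronized (lock) {
            return checkpointVisits;
        }
    }
}
```

Because both paths hold the same lock, beforeCheckpoint() either sees the source fully present or sees it as null and skips it; it never dereferences a value that run() is clearing at the same time.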
------------- PR: https://git.openjdk.java.net/crac/pull/17 From akozlov at openjdk.java.net Mon Apr 11 17:59:19 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 11 Apr 2022 17:59:19 GMT Subject: [crac] Integrated: Fix NPE on ZipFile$Source access In-Reply-To: References: Message-ID: <7U1PWfKN1zahlC-iCQYuteNljTYfL8YWYBHwjS19_4A=.07795dc6-e98f-40c8-b2b3-bc1cad2d856a@github.com> On Thu, 3 Mar 2022 13:52:31 GMT, Anton Kozlov wrote: > A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking. > > The exception looks like below: > > java.lang.reflect.InvocationTargetException > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at java.base/jdk.internal.loader.URLClassPath$JarLoader$ClassLoaderJarFile.beforeCheckpoint(URLClassPath.java:837) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.impl.AbstractContextImpl.beforeCheckpoint(AbstractContextImpl.java:66) > at java.base/jdk.crac.Core.checkpointRestore1(Core.java:108) > at java.base/jdk.crac.Core.checkpointRestore(Core.java:182) > at JavaCompilerCRaC.main(JavaCompilerCRaC.java:27) > Caused by: java.lang.NullPointerException: Cannot invoke "java.util.zip.ZipFile$Source.getFile()" because the return value of "java.util.zip.ZipFile$CleanableResource.getSource()" is null > at java.base/java.util.zip.ZipFile.beforeCheckpoint(ZipFile.java:1088) > ... 
10 more This pull request has now been integrated. Changeset: 6363393d Author: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/6363393d025d823e531cfd894d33e6777aef4021 Stats: 28 lines in 1 file changed: 13 ins; 10 del; 5 mod Fix NPE on ZipFile$Source access Reviewed-by: heidinga ------------- PR: https://git.openjdk.java.net/crac/pull/17 From akozlov at openjdk.java.net Tue Apr 12 07:04:28 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 12 Apr 2022 07:04:28 GMT Subject: [crac] RFR: Disable non-buildable platforms in GHA Message-ID: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> Disable builds on GitHub Actions for platforms that is not expected to build successfully. The change removes GHA failures noise in forks of the project repo. https://github.com/AntonKozlov/crac/actions/runs/2109490740 I've also requested GHA to be enabled to the project repo, so tests report should appear for this PR. ------------- Commit messages: - Disable non-buildable platforms in GHA Changes: https://git.openjdk.java.net/crac/pull/20/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=20&range=00 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/crac/pull/20.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/20/head:pull/20 PR: https://git.openjdk.java.net/crac/pull/20 From heidinga at openjdk.java.net Thu Apr 14 15:21:03 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Thu, 14 Apr 2022 15:21:03 GMT Subject: [crac] RFR: Disable non-buildable platforms in GHA In-Reply-To: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> References: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> Message-ID: On Tue, 12 Apr 2022 06:56:31 GMT, Anton Kozlov wrote: > Disable builds on GitHub Actions for platforms that is not expected to build successfully. 
The change removes GHA failures noise in forks of the project repo. https://github.com/AntonKozlov/crac/actions/runs/2109490740 > > I've also requested GHA to be enabled to the project repo, so tests report should appear for this PR. I'm not a github actions expert but this seems like a reasonable approach to disable the unneeded platforms. And it's not worse than what we have now - so go for it! ------------- Marked as reviewed by heidinga (Committer). PR: https://git.openjdk.java.net/crac/pull/20 From akozlov at openjdk.java.net Mon Apr 18 10:58:23 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 18 Apr 2022 10:58:23 GMT Subject: [crac] RFR: Disable non-buildable platforms in GHA In-Reply-To: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> References: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> Message-ID: On Tue, 12 Apr 2022 06:56:31 GMT, Anton Kozlov wrote: > Disable builds on GitHub Actions for platforms that is not expected to build successfully. The change removes GHA failures noise in forks of the project repo. https://github.com/AntonKozlov/crac/actions/runs/2109490740 > > I've also requested GHA to be enabled to the project repo, so tests report should appear for this PR. Thanks! 
------------- PR: https://git.openjdk.java.net/crac/pull/20 From akozlov at openjdk.java.net Mon Apr 18 10:58:23 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 18 Apr 2022 10:58:23 GMT Subject: [crac] Integrated: Disable non-buildable platforms in GHA In-Reply-To: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> References: <2BoYpl2UUJJU9kGhCwzN-VjjO932otZwVloVOioB0qc=.0a4b6bf4-9076-4099-916c-0e6278bc4436@github.com> Message-ID: <222QCbVN-_Kc4FdXhbF1_jq3vLqgkm5b4Doo5ryYT0k=.8f126ec9-faf4-45e8-8f01-65b990e6c68d@github.com> On Tue, 12 Apr 2022 06:56:31 GMT, Anton Kozlov wrote: > Disable builds on GitHub Actions for platforms that is not expected to build successfully. The change removes GHA failures noise in forks of the project repo. https://github.com/AntonKozlov/crac/actions/runs/2109490740 > > I've also requested GHA to be enabled to the project repo, so tests report should appear for this PR. This pull request has now been integrated. Changeset: ae1644a7 Author: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/ae1644a790c6579aeea637b1568044ab90b3e8fe Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Disable non-buildable platforms in GHA Reviewed-by: heidinga ------------- PR: https://git.openjdk.java.net/crac/pull/20 From akozlov at openjdk.java.net Mon Apr 18 18:21:35 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 18 Apr 2022 18:21:35 GMT Subject: [crac] RFR: Update Reference Handling for CRaC Message-ID: This change updates Reference Handling after Alan's comments [1]. * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive. * The method for waiting threads now accepts timeout, as turns to be needed by CleanerImpl.beforeCheckpoint(). * Now reference handling outside of the JDK code is supported, see the supplied test update. 
The handling does not depend on the first-level reference processing thread's beforeCheckpoint being called first. After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority. The methods for waiting threads cannot guarantee that reference handling is complete for a particular queue and a set of threads, as nothing prevents another concurrent thread from changing the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint. That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and the corresponding clean-up is performed, as demonstrated by the test. [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855 ------------- Commit messages: - Update ReferenceHandle for CRaC Changes: https://git.openjdk.java.net/crac/pull/22/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=22&range=00 Stats: 227 lines in 7 files changed: 185 ins; 24 del; 18 mod Patch: https://git.openjdk.java.net/crac/pull/22.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/22/head:pull/22 PR: https://git.openjdk.java.net/crac/pull/22 From akozlov at openjdk.java.net Tue Apr 19 07:44:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 19 Apr 2022 07:44:47 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v2] In-Reply-To: References: Message-ID: > This change updates Reference Handling after Alan's comments [1]. > * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive. > * The method for waiting threads now accepts a timeout, as it turns out to be needed by CleanerImpl.beforeCheckpoint(). > * Now reference handling outside of the JDK code is supported, see the supplied test update. The handling does not depend on the first-level reference processing thread's beforeCheckpoint being called first.
After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority. > > The methods for waiting threads cannot guarantee that a reference handling is complete for a particular queue and a set of threads, as nothing prevents an another concurrent thread to change the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint. That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and corresponding clean-up is performed, as demonstrated by the test. > > [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855 Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'jdk/crac/crac' into refqueue-2 - Update ReferenceHandle for CRaC * Hide new API for internal use * Add timeout * Remove REFERENCE_HANDLER resource ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/22/files - new: https://git.openjdk.java.net/crac/pull/22/files/c913ea55..5c4e03f6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=22&range=01 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=22&range=00-01 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/crac/pull/22.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/22/head:pull/22 PR: https://git.openjdk.java.net/crac/pull/22 From akozlov at openjdk.java.net Tue Apr 19 12:50:54 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 19 Apr 2022 12:50:54 GMT Subject: [crac] RFR: Collect CREngine child In-Reply-To: References: Message-ID: On Thu, 7 Apr 2022 14:02:39 GMT, Anton Kozlov wrote: > When VM calls for CREngine it does not properly awaits termination of the 
engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. > > [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 Ping? This a bit annoying problem that we checkpoint/restore an unnecessary process. ------------- PR: https://git.openjdk.java.net/crac/pull/18 From heidinga at openjdk.java.net Wed Apr 20 18:02:33 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Wed, 20 Apr 2022 18:02:33 GMT Subject: [crac] RFR: Collect CREngine child In-Reply-To: References: Message-ID: On Thu, 7 Apr 2022 14:02:39 GMT, Anton Kozlov wrote: > When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. > > [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 Apart from the possible fork failure case question, this looks reasonable And not really related to this PR, but didn't we disable the non linux x64 pr builds? Do we need to relook at that change? src/hotspot/os/linux/os_linux.cpp line 5833: > 5831: } > 5832: > 5833: pid_t pid = fork(); Should we check for fork() to fail? If so, we need to handle `-1` separately from `0` (child) and `> 0` (parent) ------------- PR: https://git.openjdk.java.net/crac/pull/18 From heidinga at openjdk.java.net Wed Apr 20 20:24:51 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Wed, 20 Apr 2022 20:24:51 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v2] In-Reply-To: References: Message-ID: On Tue, 19 Apr 2022 07:44:47 GMT, Anton Kozlov wrote: >> This change updates Reference Handling after Alan's comments [1]. >> * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive. 
>> * The method for waiting threads now accepts timeout, as turns to be needed by CleanerImpl.beforeCheckpoint(). >> * Now reference handling outside of the JDK code is supported, see the supplied test update. The handing does not depend on the first-level reference processing thread's beforeCheckpoint is called first. After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority. >> >> The methods for waiting threads cannot guarantee that a reference handling is complete for a particular queue and a set of threads, as nothing prevents an another concurrent thread to change the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint. That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and corresponding clean-up is performed, as demonstrated by the test. >> >> [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855 > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge remote-tracking branch 'jdk/crac/crac' into refqueue-2 > - Update ReferenceHandle for CRaC > > * Hide new API for internal use > * Add timeout > * Remove REFERENCE_HANDLER resource I'm less certain of the test, as it requires System.gc() to cause the reference to be enqueued. Such tests tend to be flaky across different GC policies, but I don't have a great suggestion on how to make it more reliable. src/java.base/share/classes/java/lang/ref/Reference.java line 331: > 329: > 330: @Override > 331: public boolean waitForQueueProcessed(ReferenceQueue queue, Should this method be static as it doesn't use the instance's state?
Alternatively, if it's for *this* reference's queue, then the instance variable should be used and the queue parameter can be removed. Actually, I'm starting to think this method shouldn't exist on reference. It belongs on the ReferenceQueue rather than here. Possibly as a static helper method if there's a need to expose the version that takes a queue. src/java.base/share/classes/jdk/crac/Misc.java line 10: > 8: * Additional utilities. > 9: */ > 10: public class Misc { Given this is a utility class, it should probably be final. No use in allowing subclasses ------------- PR: https://git.openjdk.java.net/crac/pull/22 From akozlov at openjdk.java.net Thu Apr 21 09:10:00 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 21 Apr 2022 09:10:00 GMT Subject: [crac] RFR: Collect CREngine child [v2] In-Reply-To: References: Message-ID: > When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. > > [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Check fork error - Merge remote-tracking branch 'jdk/crac/crac' into crengine-child - Collect CREngine child ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/18/files - new: https://git.openjdk.java.net/crac/pull/18/files/984c58f6..923d1090 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=18&range=01 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=18&range=00-01 Stats: 247 lines in 10 files changed: 194 ins; 20 del; 33 mod Patch: https://git.openjdk.java.net/crac/pull/18.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/18/head:pull/18 PR: https://git.openjdk.java.net/crac/pull/18 From akozlov at openjdk.java.net Thu Apr 21 10:54:01 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 21 Apr 2022 10:54:01 GMT Subject: [crac] RFR: Collect CREngine child [v2] In-Reply-To: References: Message-ID: On Thu, 21 Apr 2022 09:10:00 GMT, Anton Kozlov wrote: >> When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. >> >> [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Check fork error > - Merge remote-tracking branch 'jdk/crac/crac' into crengine-child > - Collect CREngine child I've added the check for the failed fork, thanks! I forgot to merge upstream changes to the branch, now the status of the PR is green. 
------------- PR: https://git.openjdk.java.net/crac/pull/18 From heidinga at openjdk.java.net Thu Apr 21 13:16:01 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Thu, 21 Apr 2022 13:16:01 GMT Subject: [crac] RFR: Collect CREngine child [v2] In-Reply-To: References: Message-ID: On Thu, 21 Apr 2022 09:10:00 GMT, Anton Kozlov wrote: >> When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. >> >> [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Check fork error > - Merge remote-tracking branch 'jdk/crac/crac' into crengine-child > - Collect CREngine child Looks good to me ------------- Marked as reviewed by heidinga (Committer). PR: https://git.openjdk.java.net/crac/pull/18 From akozlov at openjdk.java.net Fri Apr 22 09:12:29 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 22 Apr 2022 09:12:29 GMT Subject: [crac] RFR: Collect CREngine child [v2] In-Reply-To: References: Message-ID: On Thu, 21 Apr 2022 09:10:00 GMT, Anton Kozlov wrote: >> When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. >> >> [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Check fork error > - Merge remote-tracking branch 'jdk/crac/crac' into crengine-child > - Collect CREngine child Thanks for review! ------------- PR: https://git.openjdk.java.net/crac/pull/18 From akozlov at openjdk.java.net Fri Apr 22 09:12:31 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 22 Apr 2022 09:12:31 GMT Subject: [crac] Integrated: Collect CREngine child In-Reply-To: References: Message-ID: On Thu, 7 Apr 2022 14:02:39 GMT, Anton Kozlov wrote: > When VM calls for CREngine it does not properly awaits termination of the engine. In case of `criuengine`, this could lead for a dangling process in the CRIU image [1]. This patch adds missing waitpid that eliminates the problem. > > [1] https://github.com/openjdk/crac/blob/crac/src/java.base/unix/native/criuengine/criuengine.c#L68 This pull request has now been integrated. Changeset: d2b19e91 Author: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/d2b19e9119cdf6d7b4e6197ff14c05327524071a Stats: 20 lines in 1 file changed: 18 ins; 0 del; 2 mod Collect CREngine child Reviewed-by: heidinga ------------- PR: https://git.openjdk.java.net/crac/pull/18 From akozlov at openjdk.java.net Mon Apr 25 17:55:58 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 25 Apr 2022 17:55:58 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v2] In-Reply-To: References: Message-ID: On Wed, 20 Apr 2022 19:46:49 GMT, Dan Heidinga wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: >> >> - Merge remote-tracking branch 'jdk/crac/crac' into refqueue-2 >> - Update ReferenceHandle for CRaC >> >> * Hide new API for internal use >> * Add timeout >> * Remove REFERENCE_HANDLER resource > > src/java.base/share/classes/java/lang/ref/Reference.java line 331: > >> 329: >> 330: @Override >> 331: public boolean waitForQueueProcessed(ReferenceQueue queue, > > Should this method be static as it doesn't use the instance's state? > > Alternatively, if it's for *this* reference's queue, then the instance variable should be used and the queue parameter can be removed. > > Actually, I'm starting to think this method shouldn't exist on reference. It belongs on the ReferenceQueue rather than here. Possibly as a static helper method if there's a need to expose the version that takes a queue. This is an interface method of JavaLangRefAccess, which exposes some package-private java.lang.ref methods, so it cannot be static. `this` in this context is the instance of that JavaLangRefAccess. The interface method just calls the instance method of ReferenceQueue, as you suggest. That instance method is intentionally not public, so our EA code stays compatible with JDK 17. After some thought, this looks like a better approach than introducing new public methods before those methods are agreed to be good ones. I don't feel that way about this method yet; after all, I have found a better name since the last patch in this area :) ------------- PR: https://git.openjdk.java.net/crac/pull/22 From akozlov at openjdk.java.net Mon Apr 25 18:00:00 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 25 Apr 2022 18:00:00 GMT Subject: [crac] RFR: Update Reference Handling for CRaC [v2] In-Reply-To: References: Message-ID: On Wed, 20 Apr 2022 20:03:42 GMT, Dan Heidinga wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase.
>> The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
>>
>> - Merge remote-tracking branch 'jdk/crac/crac' into refqueue-2
>> - Update ReferenceHandle for CRaC
>>
>> * Hide new API for internal use
>> * Add timeout
>> * Remove REFERENCE_HANDLER resource
>
> src/java.base/share/classes/jdk/crac/Misc.java line 10:
>
>> 8: * Additional utilities.
>> 9: */
>> 10: public class Misc {
>
> Given this is a utility class, it should probably be final. No use in allowing subclasses

Right, thanks!

-------------

PR: https://git.openjdk.java.net/crac/pull/22

From akozlov at openjdk.java.net Mon Apr 25 17:59:57 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Mon, 25 Apr 2022 17:59:57 GMT
Subject: [crac] RFR: Update Reference Handling for CRaC [v3]
In-Reply-To:
References:
Message-ID:

> This change updates Reference Handling after Alan's comments [1].
> * The new API is moved to the CRaC-related classes to avoid polluting or changing standard classes. This should make EA builds more attractive.
> * The method for waiting threads now accepts a timeout, as turned out to be needed by CleanerImpl.beforeCheckpoint().
> * Reference handling outside of the JDK code is now supported, see the supplied test update. The handling no longer depends on the first-level reference processing thread's beforeCheckpoint being called first. After that, it was possible to remove the resource and the corresponding REFERENCE_HANDLER priority.
>
> The methods for waiting threads cannot guarantee that reference handling is complete for a particular queue and a set of threads, as nothing prevents another concurrent thread from changing the reachability of a random object that will end up in the queue after the method returns. The method is designed to synchronize reference handling and checkpoint.
> That is, to ensure that objects that are on the way to be enqueued are indeed enqueued and corresponding clean-up is performed, as demonstrated by the test.
>
> [1] https://github.com/openjdk/crac/pull/13#issuecomment-1028024855

Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:

Make jdk.crac.Misc final

-------------

Changes:
- all: https://git.openjdk.java.net/crac/pull/22/files
- new: https://git.openjdk.java.net/crac/pull/22/files/5c4e03f6..24d2f2e4

Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=crac&pr=22&range=02
- incr: https://webrevs.openjdk.java.net/?repo=crac&pr=22&range=01-02

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/crac/pull/22.diff
Fetch: git fetch https://git.openjdk.java.net/crac pull/22/head:pull/22

PR: https://git.openjdk.java.net/crac/pull/22

From duke at openjdk.java.net Wed Apr 27 23:26:40 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Wed, 27 Apr 2022 23:26:40 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
Message-ID:

VM changes: To identify properties that can be modified on restore,
added a new bool field SystemProperty::_modifiable_on_restore.
All the jdk related properties are marked unmodifiable. Rest of the
properties are considered modifiable.
When the JVM is launched with -XX:CRaCRestoreFrom option, then the
properties prefixed with "-D" are maintained in a separate list in
Arguments::_system_properties_for_restore. This list is passed to the JVM
being restored by writing to a shared memory object.
When the JVM is restored, it reads the new properties from shared memory
object and updates its existing list of properties maintained in
Arguments::_system_properties.

JDK changes: System::props needs to be updated on restore to account for
new system properties.
For this purpose j.l.System registers a new
JDKResource which queries new properties from the VM in afterRestore()
notification and updates System::props. The JDKResource registered by
j.l.System is given highest priority so it is the first resource to get
afterRestore() notification.

Signed-off-by: Ashutosh Mehra

-------------

Commit messages:
- Allow users to pass new properties on restore

Changes: https://git.openjdk.java.net/crac/pull/21/files
Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=21&range=00
Stats: 391 lines in 10 files changed: 324 ins; 15 del; 52 mod
Patch: https://git.openjdk.java.net/crac/pull/21.diff
Fetch: git fetch https://git.openjdk.java.net/crac pull/21/head:pull/21

PR: https://git.openjdk.java.net/crac/pull/21

From akozlov at openjdk.java.net Wed Apr 27 23:26:40 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Wed, 27 Apr 2022 23:26:40 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To:
References:
Message-ID:

On Thu, 14 Apr 2022 15:07:22 GMT, Ashutosh Mehra wrote:

> VM changes: To identify properties that can be modified on restore,
> added a new bool field SystemProperty::_modifiable_on_restore.
> All the jdk related properties are marked unmodifiable. Rest of the
> properties are considered modifiable.
> When the JVM is launched with -XX:CRaCRestoreFrom option, then the
> properties prefixed with "-D" are maintained in a separate list in
> Arguments::_system_properties_for_restore. This list is passed to the JVM
> being restored by writing to a shared memory object.
> When the JVM is restored, it reads the new properties from shared memory
> object and updates its existing list of properties maintained in
> Arguments::_system_properties.
>
> JDK changes: System::props needs to be updated on restore to account for
> new system properties.
> For this purpose j.l.System registers a new
> JDKResource which queries new properties from the VM in afterRestore()
> notification and updates System::props. The JDKResource registered by
> j.l.System is given highest priority so it is the first resource to get
> afterRestore() notification.
>
> Signed-off-by: Ashutosh Mehra

Thanks for the patch! The change is big, so I'll need some time to read and understand it.

-------------

PR: https://git.openjdk.java.net/crac/pull/21

From akozlov at openjdk.java.net Thu Apr 28 12:31:19 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 28 Apr 2022 12:31:19 GMT
Subject: [crac] RFR: Allow users to pass new properties on restore
In-Reply-To:
References:
Message-ID:

On Thu, 14 Apr 2022 15:07:22 GMT, Ashutosh Mehra wrote:

> VM changes: To identify properties that can be modified on restore,
> added a new bool field SystemProperty::_modifiable_on_restore.
> All the jdk related properties are marked unmodifiable. Rest of the
> properties are considered modifiable.
> When the JVM is launched with -XX:CRaCRestoreFrom option, then the
> properties prefixed with "-D" are maintained in a separate list in
> Arguments::_system_properties_for_restore. This list is passed to the JVM
> being restored by writing to a shared memory object.
> When the JVM is restored, it reads the new properties from shared memory
> object and updates its existing list of properties maintained in
> Arguments::_system_properties.
>
> JDK changes: System::props needs to be updated on restore to account for
> new system properties. For this purpose j.l.System registers a new
> JDKResource which queries new properties from the VM in afterRestore()
> notification and updates System::props. The JDKResource registered by
> j.l.System is given highest priority so it is the first resource to get
> afterRestore() notification.
>
> Signed-off-by: Ashutosh Mehra

> VM changes: To identify properties that can be modified on restore, added a new bool field SystemProperty::_modifiable_on_restore. All the jdk-related properties are marked unmodifiable. The rest of the properties are considered modifiable.

At the Java level, all properties can be changed with `System.setProperty`. Our property model is much closer to what a program would do, e.g. after restore:

    System.setProperty("sun.boot.library.path", "test");

There is no mechanism (except the deprecated SecurityManager) that prevents this from working, although no one promises that a property value set this way will be taken into account. So I propose not to limit ourselves artificially. Who knows, maybe some JDK system property is OK to fix on restore, e.g. if the checkpoint is done very early and a class that reads the property is not yet initialized.

> When the JVM is launched with -XX:CRaCRestoreFrom option, then the properties prefixed with "-D" are maintained in a separate list in Arguments::_system_properties_for_restore.

Have you considered how to make this change smaller? E.g., adding an Explicit marker on the property? It's unfortunate that implicit properties are already added to the list before we get to the restore, and explicit properties cannot be distinguished from them. I'm fine with the change in Arguments::parse_each_vm_init_arg, but refactoring or adding a set of methods to manage the for_restore set is a bit of overkill. Initializing a full VM just to have it replaced by another, restored one (as we do for now) does not make a lot of sense and needs fixing at some point. So I'd like to avoid too many changes in the VM code that we'll have to revert.

Another way is to pass all properties from the restoring VM to the one being restored, for simplicity and flexibility. And to filter properties in the VM being restored -- that will be the single point of responsibility.
But here there is a drawback: the VMs may have different versions, so implicit properties in the VM being restored will be implicitly overwritten.

> This list is passed to the JVM being restored by writing to a shared memory object. When the JVM is restored, it reads the new properties from shared memory object and updates its existing list of properties maintained in Arguments::_system_properties.

I could not find any users of _system_properties after the VM is initialized and the JDK has pulled the set of properties from the VM. I think updating that list is not necessary, just as a call to j.l.System.setProperty is not reflected in the VM.

> JDK changes: System::props needs to be updated on restore to account for new system properties. For this purpose j.l.System registers a new JDKResource which queries new properties from the VM in afterRestore() notification and updates System::props. The JDKResource registered by j.l.System is given highest priority so it is the first resource to get afterRestore() notification.

AFAIU, for the regular bootstrap procedure, "pulling" of properties is required because the VM does not know when j.l.System will be initialized, so instead they are pulled at j.l.System's initialization. But I think we can assume j.l.System is always initialized at the checkpoint, so we can avoid the resource and have jdk.crac.Core just set the new properties with j.l.System.setProperty.

src/hotspot/os/linux/os_linux.cpp line 5858:

> 5856: }
> 5857:
> 5858: write(shmfd, (void *)&props_count, sizeof(props_count));

This seems to write a header for the subsequent data. Could you create a `struct` for that, so the format is written down explicitly?

src/hotspot/share/prims/jvm.cpp line 322:

> 320: * names and values from the jvm SystemProperty which are modifiable on restore.
> 321: */
> 322: JVM_ENTRY(jobjectArray, JVM_GetModifiableProperties(JNIEnv *env))

Instead of a new JVM interface and JNI function, please just return the properties object along with the rest of the information from os::Linux::checkpoint.

src/java.base/share/classes/java/lang/System.java line 2110:

> 2108:
> 2109: private static void registerCRaCResource() {
> 2110: jdk.internal.crac.Core.getJDKContext().register(new CRaCResource());

The weak link to the CRaCResource object won't prevent the object from becoming unreachable, and thus the properties won't be updated, e.g. after a few GCs.

-------------

PR: https://git.openjdk.java.net/crac/pull/21

From akozlov at azul.com Thu Apr 28 19:52:14 2022
From: akozlov at azul.com (Anton Kozlov)
Date: Thu, 28 Apr 2022 22:52:14 +0300
Subject: CRaC example: AWS Lambda
Message-ID: <944dacc0-cfa9-35a9-f7a4-75e6bc1ff868@azul.com>

Hi,

I've recently finished another example where CRaC's faster startup may be useful: a simple Java serverless app for Amazon Lambda that runs on a build of Project CRaC [1]. I invite you to try the example. Please report issues to that GitHub project.

Our CRIU fork now supports [2] restoring in more restricted environments, e.g. without the ptrace syscall available. Regardless of the example, I hope the relaxed requirements for CRIU are another hint that an implementation of a simple checkpoint-restore mechanism that works on something beyond Linux is feasible, although this needs more investigation.

Thanks,
Anton

[1] https://github.com/CRaC/example-lambda
[2] https://github.com/CRaC/criu/commit/db457a80298fca7963c1477b7a95ab4d46ce2885
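
Editorial sketch, referring back to the review of pull/21 above: the remark about `register(new CRaCResource())` says that a registration held only through a weak link does not keep the resource alive. The following self-contained Java example is a minimal illustration of that effect, not the jdk.crac implementation. `WeakContext`, `RestoreListener`, `PropertyUpdater` and `KEPT` are hypothetical names introduced here; the only assumption taken from the thread is that the context refers to registered objects weakly.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are part of the jdk.crac API.
final class WeakRegistrationSketch {

    // Stand-in for a resource that wants a notification after restore.
    interface RestoreListener {
        void afterRestore();
    }

    // Stand-in for a context that refers to registered listeners only weakly.
    static final class WeakContext {
        private final List<WeakReference<RestoreListener>> listeners = new ArrayList<>();

        void register(RestoreListener listener) {
            listeners.add(new WeakReference<>(listener));
        }

        // Notifies every listener that is still reachable and returns how many were notified.
        int fireAfterRestore() {
            int notified = 0;
            for (WeakReference<RestoreListener> ref : listeners) {
                RestoreListener listener = ref.get();
                if (listener != null) {
                    listener.afterRestore();
                    notified++;
                }
            }
            return notified;
        }
    }

    static final class PropertyUpdater implements RestoreListener {
        int calls;

        @Override
        public void afterRestore() {
            calls++;
        }
    }

    // A static field keeps this listener strongly reachable, so it cannot be collected.
    static final PropertyUpdater KEPT = new PropertyUpdater();

    static int demo() {
        WeakContext context = new WeakContext();
        context.register(KEPT);
        // Registered the way the reviewed line does it: after this call returns,
        // only the context's weak reference points at the new object.
        context.register(new PropertyUpdater());
        System.gc(); // only a hint; the second listener may or may not be collected
        return context.fireAfterRestore();
    }

    public static void main(String[] args) {
        int notified = demo();
        System.out.println("listeners notified: " + notified + ", KEPT.calls: " + KEPT.calls);
    }
}
```

`KEPT` is always notified, while the listener created inline is notified only if the collector has not yet cleared its weak reference. Under the stated assumption, holding the resource in a static field of the registering class would be one way to avoid the problem described in the review.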