From akozlov at openjdk.java.net Tue Mar 1 12:13:26 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 1 Mar 2022 12:13:26 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: References: Message-ID: > This change adds a new API and implementation to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > The current implementation obligates the first argument of the new set not to start with the dash, otherwise, the java launcher will interpret it as its own parameter. So the first argument should be a "verb" similar to the Main class. Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Rename to newArguments, add javax.crac mirror ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/16/files - new: https://git.openjdk.java.net/crac/pull/16/files/fc947311..b3a09209 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=02 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=01-02 Stats: 20 lines in 3 files changed: 14 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/crac/pull/16.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/16/head:pull/16 PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Thu Mar 3 13:57:57 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Mar 2022 13:57:57 GMT Subject: [crac] RFR: Fix NPE on ZipFile$Source access Message-ID: A race between ZipFile$CleanableResource.run() and ZipFile$Resource.beforeCheckpoint() can lead to NullPointerException when zsrc == null. This is observed on some runs of JavaCompilerCRaC.java from #16. The change aligns beforeCheckpoint() with the run(), providing the zsrc check and the proper locking. 
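A minimal sketch of the kind of fix described above, assuming a resource whose backing source can be released by a cleanup action while a checkpoint hook runs on another thread. The class `GuardedResource` and its `source` field are hypothetical stand-ins, not the actual ZipFile code; the point is only that both paths take the same lock and that the checkpoint path checks for null before touching the source.

```java
// Hypothetical stand-in for a resource with a cleanup path and a checkpoint hook.
final class GuardedResource {
    private final Object lock = new Object();
    // Stand-in for the zsrc field; set to null once the resource is cleaned up.
    private StringBuilder source = new StringBuilder("open");

    // Cleanup path: releases the source under the lock.
    void run() {
        synchronized (lock) {
            source = null;
        }
    }

    // Checkpoint path: takes the same lock and checks for null first.
    // Returns true if there was a live source to prepare for checkpoint.
    boolean beforeCheckpoint() {
        synchronized (lock) {
            if (source == null) {
                return false; // already cleaned up, nothing to do
            }
            source.append(":checkpointed");
            return true;
        }
    }
}
```

With this shape, a checkpoint that arrives after cleanup becomes a no-op instead of dereferencing a null field.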
------------- Commit messages: - Fix NPE on ZipFile$Source access Changes: https://git.openjdk.java.net/crac/pull/17/files Webrev: https://webrevs.openjdk.java.net/?repo=crac&pr=17&range=00 Stats: 28 lines in 1 file changed: 13 ins; 10 del; 5 mod Patch: https://git.openjdk.java.net/crac/pull/17.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/17/head:pull/17 PR: https://git.openjdk.java.net/crac/pull/17 From akozlov at openjdk.java.net Thu Mar 3 14:05:34 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Mar 2022 14:05:34 GMT Subject: [crac] RFR: [TEST] check if some j.l.* methods time out on restore immediately In-Reply-To: References: Message-ID: On Fri, 4 Feb 2022 15:44:33 GMT, Alexander Stepanov wrote: > add a test to check if Thread.join(timeout), Thread.sleep(timeout) and Object.wait(timeout) will be completed on restore immediately if their end time fell on the CRaC pause period > > checked on Ubuntu 20.04 Linux (x86-64), passed Marked as reviewed by akozlov (Lead). ------------- PR: https://git.openjdk.java.net/crac/pull/15 From akozlov at openjdk.java.net Thu Mar 3 14:27:35 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Mar 2022 14:27:35 GMT Subject: [crac] RFR: [TEST] check if some j.l.* methods time out on restore immediately In-Reply-To: References: Message-ID: On Fri, 4 Feb 2022 15:44:33 GMT, Alexander Stepanov wrote: > add a test to check if Thread.join(timeout), Thread.sleep(timeout) and Object.wait(timeout) will be completed on restore immediately if their end time fell on the CRaC pause period > > checked on Ubuntu 20.04 Linux (x86-64), passed For the record, the intention of the test is to specify and check the behavior of methods which wait with the timeout, as written in the test description. I think these methods are a good start. 
AFAICS together with j.l.Process.waitFor() and j.l.r.ReferenceQueue.remove they consist complete set of such methods in java.lang package.The Process.waitFor() and ReferenceQueue.remove() are implemented via tested methods, so omitted. ------------- PR: https://git.openjdk.java.net/crac/pull/15 From abakhtin at openjdk.java.net Thu Mar 3 15:50:35 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Thu, 3 Mar 2022 15:50:35 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 12:13:26 GMT, Anton Kozlov wrote: >> This change adds a new API and implementation to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. >> >> The current implementation obligates the first argument of the new set not to start with the dash, otherwise, the java launcher will interpret it as its own parameter. So the first argument should be a "verb" similar to the Main class. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Rename to newArguments, add javax.crac mirror src/java.base/share/classes/javax/crac/Core.java line 87: > 85: > 86: /** > 87: * Gets new arguments provided after restore. May be description could be extended. I think it's worth mentioning that the new arguments do not replace original arguments, it is application-level arguments and it's application's responsibility to parse and apply these additional arguments. An empty array is returned in case of no additional arguments are provided. 
------------- PR: https://git.openjdk.java.net/crac/pull/16 From heidinga at openjdk.java.net Fri Mar 4 15:39:41 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 4 Mar 2022 15:39:41 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: References: Message-ID: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com> On Tue, 1 Mar 2022 12:13:26 GMT, Anton Kozlov wrote: >> This change adds a new API and implementation to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. >> >> The current implementation obligates the first argument of the new set not to start with the dash, otherwise, the java launcher will interpret it as its own parameter. So the first argument should be a "verb" similar to the Main class. > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Rename to newArguments, add javax.crac mirror Is there a design doc that describes the intended behaviour and limitations of this approach? I've read the description in the PR but there's more to call out about what options can reasonably change. ie: jvm options are unlikely to be successfully changed while passing new application args may be OK (if the app is written to handle that) src/hotspot/os/linux/os_linux.cpp line 5846: > 5844: static int set_new_args(int id, const char *args) { > 5845: char shmpath[128]; > 5846: snprintf(shmpath, sizeof(shmpath), "/crac_%d", id); Should the return value be checked here? Something like: int written = snprintf(shmpath, sizeof(shmpath), "/crac_%d", id); if ((written < 0) || (written >= sizeof(shmpath))) { return -1; } src/hotspot/os/linux/os_linux.cpp line 5862: > 5860: fprintf(stderr, "write shm truncated"); > 5861: } > 5862: close(shmfd); Should this close the file? 
Given the data didn't write correctly I think we should `shm_unlink` the shared mapping so it gets released.

src/hotspot/os/linux/os_linux.cpp line 5936:

> 5934:
> 5935: if (0 < info.si_int) {
> 5936: *argp = get_new_args(info.si_int);

Is `info.si_int` unique per restored process or will restoring the same image twice at the same time corrupt the shared memory?

Basically, is the `info.si_int` unique enough to be the id for the `/crac_%d` file?

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Sat Mar 5 07:45:29 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Sat, 5 Mar 2022 07:45:29 GMT
Subject: [crac] RFR: Provide arguments for restore [v3]
In-Reply-To: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com>
References: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com>
Message-ID:

On Fri, 4 Mar 2022 15:36:14 GMT, Dan Heidinga wrote:

> Is there a design doc that describes the intended behaviour and limitations of this approach? I've read the description in the PR but there's more to call out about what options can reasonably change. ie: jvm options are unlikely to be successfully changed while passing new application args may be OK (if the app is written to handle that)

At the moment only CLI arguments are passed. I think it won't be hard to extend this to system properties (-D...). Environment variables would be more tricky. JVM options are of course under a big question.

I'm looking at CLI arguments as they cannot change, in contrast with properties or the environment (I'm thinking e.g. of another Java thread that may change them).

Some JVM properties can be changed by existing mechanisms (`MANAGEABLE`). But it is unlikely that all JVM options can be made manageable -- although possible in theory, this won't be practical (resize heaps because of Xmx, regenerate code because of codegen options,..).
But some options are not manageable but can be allowed, like PrintCompilation. Either they need to be made manageable, or another term should be invented, or we may use ad-hoc checks for each and warn the user that the option may not be changed.

Let's try to find a useful and practical set? It looks to be CLI arguments, system properties, environment, and a subset of JVM options.

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Sat Mar 5 07:58:30 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Sat, 5 Mar 2022 07:58:30 GMT
Subject: [crac] RFR: Provide arguments for restore [v3]
In-Reply-To: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com>
References: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com>
Message-ID:

On Wed, 2 Mar 2022 18:53:41 GMT, Dan Heidinga wrote:

>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Rename to newArguments, add javax.crac mirror
>
> src/hotspot/os/linux/os_linux.cpp line 5936:
>
>> 5934:
>> 5935: if (0 < info.si_int) {
>> 5936: *argp = get_new_args(info.si_int);
>
> Is `info.si_int` unique per restored process or will restoring the same image twice at the same time corrupt the shared memory?
>
> Basically, is the `info.si_int` unique enough to be the id for the `/crac_%d` file?

The idea is exactly to avoid data corruption. For criu, it's the PID of the restoring process, that is, the PID that the java process would receive with the usual start (without checkpoint/restore). The restoring process exists for the whole life of the restored java instance, so it is impossible for another restoring process to obtain the same ID. For pauseengine, there is a slight chance of ID reuse, as the restoring process may exit before the target VM picks up the data.
But since chances are small, and pauseengine is mostly for debugging, I don't think it is a big problem. To mitigate, another ID generation scheme could be used, e.g. the ID could be the PID of the target VM. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Sat Mar 5 09:17:26 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 5 Mar 2022 09:17:26 GMT Subject: [crac] RFR: Provide arguments for restore [v4] In-Reply-To: References: Message-ID: > This change adds a new API and implementation to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > The current implementation obligates the first argument of the new set not to start with the dash, otherwise, the java launcher will interpret it as its own parameter. So the first argument should be a "verb" similar to the Main class. Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: - Better error handling in hotspot - Call the main of provided class ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/16/files - new: https://git.openjdk.java.net/crac/pull/16/files/b3a09209..b45a5b85 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=03 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=02-03 Stats: 73 lines in 5 files changed: 40 ins; 26 del; 7 mod Patch: https://git.openjdk.java.net/crac/pull/16.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/16/head:pull/16 PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Sat Mar 5 09:17:28 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 5 Mar 2022 09:17:28 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: <3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com> References: 
<3nFFUKv-YOkeU9CiMHWsKW1bnmhtgshV1QgopdeX6vo=.89fdf6cd-cbbc-4f77-b7f8-6f91ede7806c@github.com> Message-ID: On Wed, 2 Mar 2022 18:50:01 GMT, Dan Heidinga wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename to newArguments, add javax.crac mirror > > src/hotspot/os/linux/os_linux.cpp line 5862: > >> 5860: fprintf(stderr, "write shm truncated"); >> 5861: } >> 5862: close(shmfd); > > Should this close the file? Given the data didn't write correctly I think we should `shm_unlink` the shared mapping so it gets released. Nice catch, thanks! ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Sat Mar 5 09:19:16 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 5 Mar 2022 09:19:16 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 15:47:14 GMT, Alexey Bakhtin wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename to newArguments, add javax.crac mirror > > src/java.base/share/classes/javax/crac/Core.java line 87: > >> 85: >> 86: /** >> 87: * Gets new arguments provided after restore. > > May be description could be extended. I think it's worth mentioning that the new arguments do not replace original arguments, it is application-level arguments and it's application's responsibility to parse and apply these additional arguments. An empty array is returned in case of no additional arguments are provided. I was trying to provide better docs without over-specifying current implementation and realized that the new functionality can be done without an API change. Please look at the latest state. 
-------------

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Sat Mar 5 09:39:32 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Sat, 5 Mar 2022 09:39:32 GMT
Subject: [crac] RFR: Provide arguments for restore [v4]
In-Reply-To:
References:
Message-ID:

On Sat, 5 Mar 2022 09:17:26 GMT, Anton Kozlov wrote:

>> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`.
>
> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision:
>
> - Better error handling in hotspot
> - Call the main of provided class

I think I found a better approach to achieve the same. I didn't like that the first argument was going to be ignored most of the time (considering use-cases like in the demo). So instead, we can require that the arguments provided for restore be an actual class, whose main method will be called, followed by the arguments for it. So `java -XX:CRaCRestore Main arg1` will restore the instance and execute Main.main(arg1) in that instance.

Compared with the previous approach, it's as if the program did:

    checkpointRestore();
    mainClass, arguments = getAndParseNewArguments()
    invoke(mainClass, "main", arguments)

Now the target Main class may handle new arguments right away, or store them to be used later, as was done in versions up to v3[1].

The demo is extended so that Compile is the Class, so the examples above are still valid. Start-up time is roughly the same (actually a bit better, but that is likely noise).
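The dispatch step described above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the code from the pull request: `RestoreDispatch`, `invokeNewMain` and the nested `Echo` class are hypothetical names, the new arguments are assumed to arrive as a `String[]`, and the class is looked up through the helper's own class loader. Failures are rethrown as `IllegalStateException` only to keep the sketch short.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical helper illustrating "first argument is a class, the rest are its arguments".
final class RestoreDispatch {

    // Stand-in for a user-provided class named on the restore command line.
    static final class Echo {
        static String[] seen;

        public static void main(String[] args) {
            seen = args;
        }
    }

    // Treats newArgs[0] as a class name and passes the remaining elements to its main method.
    static void invokeNewMain(String[] newArgs) {
        if (newArgs.length == 0) {
            return; // nothing was supplied on restore
        }
        try {
            Class<?> mainClass = Class.forName(
                    newArgs[0], true, RestoreDispatch.class.getClassLoader());
            Method main = mainClass.getDeclaredMethod("main", String[].class);
            main.setAccessible(true);
            // Cast to Object so the String[] is passed as one argument, not spread as varargs.
            main.invoke(null, (Object) Arrays.copyOfRange(newArgs, 1, newArgs.length));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot run " + newArgs[0], e);
        }
    }
}
```

For example, passing `{ RestoreDispatch.Echo.class.getName(), "arg1" }` ends up calling `Echo.main` with a one-element array containing `arg1`.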
[1] https://mail.openjdk.java.net/pipermail/crac-dev/2022-March/000132.html ------------- PR: https://git.openjdk.java.net/crac/pull/16 From abakhtin at openjdk.java.net Tue Mar 15 16:41:18 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Tue, 15 Mar 2022 16:41:18 GMT Subject: [crac] RFR: Provide arguments for restore [v4] In-Reply-To: References: Message-ID: On Sat, 5 Mar 2022 09:17:26 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Better error handling in hotspot > - Call the main of provided class Marked as reviewed by abakhtin (no project role). ------------- PR: https://git.openjdk.java.net/crac/pull/16 From abakhtin at openjdk.java.net Tue Mar 15 16:41:19 2022 From: abakhtin at openjdk.java.net (Alexey Bakhtin) Date: Tue, 15 Mar 2022 16:41:19 GMT Subject: [crac] RFR: Provide arguments for restore [v3] In-Reply-To: References: Message-ID: On Sat, 5 Mar 2022 09:16:57 GMT, Anton Kozlov wrote: >> src/java.base/share/classes/javax/crac/Core.java line 87: >> >>> 85: >>> 86: /** >>> 87: * Gets new arguments provided after restore. >> >> May be description could be extended. I think it's worth mentioning that the new arguments do not replace original arguments, it is application-level arguments and it's application's responsibility to parse and apply these additional arguments. An empty array is returned in case of no additional arguments are provided. > > I was trying to provide better docs without over-specifying current implementation and realized that the new functionality can be done without an API change. Please look at the latest state. New approach looks great. 
Thank you ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Tue Mar 15 17:32:23 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Mar 2022 17:32:23 GMT Subject: [crac] RFR: Provide arguments for restore [v4] In-Reply-To: References: Message-ID: <9hAuCZwNIv7NDBKbwPR44ZZLqN_gaDun-CBTx4koHbs=.65fa7641-149a-4042-83d9-36c58ab0db8d@github.com> On Tue, 15 Mar 2022 16:37:48 GMT, Alexey Bakhtin wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Better error handling in hotspot >> - Call the main of provided class > > Marked as reviewed by abakhtin (no project role). @alexeybakhtin Thank you very much for the review! ------------- PR: https://git.openjdk.java.net/crac/pull/16 From avstepan at openjdk.java.net Wed Mar 16 11:23:59 2022 From: avstepan at openjdk.java.net (Alexander Stepanov) Date: Wed, 16 Mar 2022 11:23:59 GMT Subject: [crac] Integrated: [TEST] check if some j.l.* methods time out on restore immediately In-Reply-To: References: Message-ID: <4hDd3-dZQLYtWUF7_TMKOP6dnja_rz73mbPakzVjPvs=.b8bfce9d-2439-4f68-9308-437b075eee45@github.com> On Fri, 4 Feb 2022 15:44:33 GMT, Alexander Stepanov wrote: > add a test to check if Thread.join(timeout), Thread.sleep(timeout) and Object.wait(timeout) will be completed on restore immediately if their end time fell on the CRaC pause period > > checked on Ubuntu 20.04 Linux (x86-64), passed This pull request has now been integrated. 
Changeset: 5fdb727d Author: Alexander Stepanov Committer: Anton Kozlov URL: https://git.openjdk.java.net/crac/commit/5fdb727df01107ab3e09a45eb6873efc96cfbc44 Stats: 202 lines in 1 file changed: 202 ins; 0 del; 0 mod [TEST] check if some j.l.* methods time out on restore immediately Reviewed-by: akozlov ------------- PR: https://git.openjdk.java.net/crac/pull/15 From heidinga at openjdk.java.net Wed Mar 16 13:13:19 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Wed, 16 Mar 2022 13:13:19 GMT Subject: [crac] RFR: Provide arguments for restore [v4] In-Reply-To: References: Message-ID: On Sat, 5 Mar 2022 09:17:26 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Better error handling in hotspot > - Call the main of provided class src/java.base/share/classes/jdk/crac/Core.java line 172: > 170: newMain.invoke(null, > 171: (Object)Arrays.copyOfRange(args, 1, args.length)); > 172: } catch (Throwable e) { Is `Throwable` the right thing to catch here? I think Error-subclasses, like OutOfMemoryError or VerifyError, should propagate past here without being added to the suppressed set of the `restoreException`. src/java.base/share/classes/jdk/crac/Core.java line 216: > 214: try { > 215: checkpointInProgress = true; > 216: checkpointRestore1(Reflection.getCallerClass()); The caller Class is being passed in to get access to a Classloader object to load the new main class. This means that if the `Core::checkpointRestore` caller is loaded by for example, application specific classloader, it may not have access to the core classpath. Would it be a cleaner model if we require that the new main class must be a sibling of the original main class and therefore on the classpath/modulepath? 
We could avoid using `getCallerClass` and directly pass the `ClassLoader::getSystemClassLoader` in that case. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Thu Mar 17 14:44:51 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 17 Mar 2022 14:44:51 GMT Subject: [crac] RFR: Provide arguments for restore [v4] In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 13:09:36 GMT, Dan Heidinga wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Better error handling in hotspot >> - Call the main of provided class > > src/java.base/share/classes/jdk/crac/Core.java line 216: > >> 214: try { >> 215: checkpointInProgress = true; >> 216: checkpointRestore1(Reflection.getCallerClass()); > > The caller Class is being passed in to get access to a Classloader object to load the new main class. This means that if the `Core::checkpointRestore` caller is loaded by for example, application specific classloader, it may not have access to the core classpath. > > Would it be a cleaner model if we require that the new main class must be a sibling of the original main class and therefore on the classpath/modulepath? We could avoid using `getCallerClass` and directly pass the `ClassLoader::getSystemClassLoader` in that case. Thanks. This makes sense, the proposed behavior is indeed more clear. It will cut out some cases (e.g. running the app from .java file directly, its class loader won't be the System Class Loader). But also will make some cases easier to reason about (e.g. running checkpoint from jcmd -- the caller's class loader is the bootstrap, we may require user to place the class there, but this does not look good). Although with the existing behavior it's possible to implement the proposed one, and vice versa looks impossible, I still would stick to the one that is more clear. 
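To make the proposed rule concrete, here is a small sketch of resolving the new main class against the system class loader regardless of who called the checkpoint. `MainClassResolver` and `resolve` are hypothetical names used only for illustration; the only library calls are `Class.forName` and `ClassLoader.getSystemClassLoader`.

```java
import java.util.Optional;

// Hypothetical helper: looks up the class named on the restore command line
// through the system class loader, independent of the checkpoint caller.
final class MainClassResolver {

    static Optional<Class<?>> resolve(String className) {
        try {
            // initialize=false: only locate the class here; it is initialized on first use.
            return Optional.of(
                    Class.forName(className, false, ClassLoader.getSystemClassLoader()));
        } catch (ClassNotFoundException e) {
            // Not visible on the class path or module path seen by the system loader.
            return Optional.empty();
        }
    }
}
```

Under this rule a class that is visible only to an application-specific loader would not be found, which is the trade-off discussed above.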
------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Thu Mar 17 19:20:50 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 17 Mar 2022 19:20:50 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: > This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: - Use System class loader - Make catch() preceise ------------- Changes: - all: https://git.openjdk.java.net/crac/pull/16/files - new: https://git.openjdk.java.net/crac/pull/16/files/b45a5b85..745e75d1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=04 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=03-04 Stats: 10 lines in 1 file changed: 5 ins; 1 del; 4 mod Patch: https://git.openjdk.java.net/crac/pull/16.diff Fetch: git fetch https://git.openjdk.java.net/crac pull/16/head:pull/16 PR: https://git.openjdk.java.net/crac/pull/16 From heidinga at openjdk.java.net Fri Mar 18 14:00:08 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 18 Mar 2022 14:00:08 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 19:20:50 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. 
> > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Use System class loader > - Make catch() preceise src/java.base/share/classes/jdk/crac/Core.java line 171: > 169: Method newMain = newMainClass.getDeclaredMethod("main", > 170: String[].class); > 171: newMain.setAccessible(true); Good catch on adding the `setAccessible` call. Given that the VM is doing the lookup based on the command line parameters, we should be wrapping this with a `AccessController::doPrivileged` block as a SecurityManager shouldn't be able to block this accessibility request. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From heidinga at openjdk.java.net Fri Mar 18 14:12:00 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Fri, 18 Mar 2022 14:12:00 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 19:20:50 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. > > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Use System class loader > - Make catch() preceise src/java.base/share/classes/jdk/crac/Core.java line 177: > 175: InvocationTargetException | > 176: NoSuchMethodException | > 177: IllegalAccessException e) { Thanks for adapting this to be more specific on what it catches. I don't think it's quite right yet though as the current code will allow exceptions thrown by the `newMain` method (and its callees) to propagate past the checkpoint spot and makes the caller of the checkpoint code need to handle them. 
At the time `Core.checkpointRestore();` is called, we may have a stack that looks like:

    TOS
    Core.checkpointRestore();
    Foo.bar();
    Foo.foobar();
    SomeOtherClass.method();
    OriginalClass.main();

And when we restore, we load and execute a new `main()` method as though it was called where `Core.checkpointRestore();` was previously on the stack, resulting in:

    TOS
    newMain.main();
    Core.checkpointRestore();
    Foo.bar();
    Foo.foobar();
    SomeOtherClass.method();
    OriginalClass.main();

So exceptions thrown by code called from `newMain` should not propagate past `Core.checkpointRestore();` without being wrapped in a `RestoreException`. `Error`-subclasses should propagate.

I think the code should be refactored to something like:

    } catch (Exception e) {
        assert checkpointException == null : "should not have new arguments";
        if (restoreException == null) {
            restoreException = new RestoreException();
        }
        restoreException.addSuppressed(e);
    }

as that correctly catches all Exceptions but lets the Errors propagate past

-------------

PR: https://git.openjdk.java.net/crac/pull/16

From duke at openjdk.java.net Fri Mar 18 14:51:00 2022
From: duke at openjdk.java.net (Ashutosh Mehra)
Date: Fri, 18 Mar 2022 14:51:00 GMT
Subject: [crac] RFR: Provide arguments for restore [v5]
In-Reply-To:
References:
Message-ID:

On Thu, 17 Mar 2022 19:20:50 GMT, Anton Kozlov wrote:

>> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`.
>
> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision:
>
> - Use System class loader
> - Make catch() preceise

I feel it is a useful feature to be able to pass new arguments to the restored process.
But looking from the user perspective, I see few concerns in the current approach to implement it: - the current implementation is forcing user to create two classes with the main() method - consequently the command line to run the application also differs depending on whether the user is running for the first time to take checkpoint, or restoring from the checkpoint - the stack trace of the main thread after restore has remnant frames from the "before checkpoint" state, which at the least would be confusing to the user When I was thinking of ways to work around these concerns, I ended up with a solution very similar to the initial approach where an additional API to get the new env variables was added without the requirement to introduce a new class with main method. I feel that would have been a better approach. I guess the concern with that approach is that the first argument is to be ignored as mentioned [here](https://github.com/openjdk/crac/pull/16#issuecomment-1059730025). I think that shouldn't matter from the user perspective. A user would prefer to use same class name every time to start the application and therefore the JVM can safely ignore the first argument. If the user specifies a different class name during restore, then the JVM should interpret it as a different application and the current checkpoint should not be used for restoring it. In this case the JVM would bail out with an exception. Does that make sense? ------------- PR: https://git.openjdk.java.net/crac/pull/16 From heidinga at openjdk.java.net Mon Mar 21 16:07:11 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Mon, 21 Mar 2022 16:07:11 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 19:20:50 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. 
> > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Use System class loader > - Make catch() preceise Ashu raises some good points. In my view, this PR is really looking at how to "daemonize" a java application such that the first invocation is used to create, initialize, and warmup the daemon process. This is similar to how Nailgun used to work to keep a "hot" copy of the process in memory and reuse it except here we don't need to keep the process running. And likely not that different from what some serverless frameworks (OpenWhisk for one) are doing with hot-standbys. Does this match your vision for where the feature is going? If so, I think we want to play around with both options - must specify new class vs just the new args - and see which model makes more sense when writing the code. As the example in the PR shows, users will need to rearchitect to fit this model so we should write some more examples to see which is more natural. To be explicit - which is better? Option 1 (current code in this PR) $ java Foo_Initializer arg1 arg2 arg3 // checkpoint taken $ java -XX:CRaCRestoreFrom=./cr New_Class arg1 arg2 arg3 Option 2 (original code in this PR with the `Core::new_arguments()` api) $ java Foo arg1 arg2 arg3 // checkpoint taken $ java -XX:CRaCRestoreFrom=./cr Foo arg1 arg2 arg3 ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Mon Mar 21 18:38:02 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 21 Mar 2022 18:38:02 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 19:20:50 GMT, Anton Kozlov wrote: >> This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. 
> > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Use System class loader > - Make catch() preceise I'm looking at the new arguments as the way to specify an additional service action on restore. In the javac example, the action is the main reason to restore the instance, after which the original program completes. But in general, it's a way to influence the program behavior on restore, after it has been executed for a while before the checkpoint. After the action is completed, the execution returns back to the original program. The purpose is to allow more flexible images that can be purposed better to the current environment, like in general command line arguments allow more flexible program. > * the current implementation is forcing user to create two classes with the main() method It's not necessary two (initializing and restoring) main methods, there could be more. With the old approach, users would be forced to dispatch execution themselves. I had some not pretty array juggling code, and it will be repeated over and over in every program that attempts to use this feature. The new arguments are optional. And JDK may provide predefined services. Such as a store for arguments for later consuming, to get back to the old approach. > * consequently the command line to run the application also differs depending on whether the user is running for the first time to take checkpoint, or restoring from the checkpoint It may be a good feature. Restore brings us into the middle of execution, not to the start of the original program. I think the opposite looks confusing, if the same class name means different things if used for checkpoint (start the main from the class) or to restore (continue from some point of the main of the specified class). 
> * the stack trace of the main thread after restore has remnant frames from the "before checkpoint" state, which at the least would be confusing to the user Some documentation indeed would be useful. I'm thinking about a comment for checkpointRestore(): "implementation may accept an additional action and arguments to be run after restore, after which they will be used to locate a class which `main` method will be called with provided arguments". There should be less surprise then. > I guess the concern with that approach is that the first argument is to be ignored as mentioned [here](https://github.com/openjdk/crac/pull/16#issuecomment-1059730025). It's better to avoid the first argument then. I just did not want to modify the launcher code too much. With the new approach, the first argument of restore means something very similar to the first argument of the normal start. > A user would prefer to use same class name every time to start the application and therefore the JVM can safely ignore the first argument. If the user specifies a different class name during restore, then the JVM should interpret it as a different application and the current checkpoint should not be used for restoring it. In this case, the JVM would bail out with an exception. For java -jar, would it be required to match -jar or with the name of the main class? I see this may be appealing, but then two commands in @DanHeidinga 's Option 2 are too similar to each other, and the second one means something very different. $ java -XX:CRaCRestoreFrom=./cr Foo arg1 arg2 arg3 On the opposite, treating Foo as the class name allows us to implement a fallback if CRaCRestoreFrom fails. `Foo` then may proceed as usual, or it may detect the VM was not restored and fail gracefully. -XX:+CRaCIgnoreRestoreIfUnavailable was designed for the fallback, but unfortunately, I broke it in #3. 
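
The dispatch described in this message, where the first restore argument names a class and the remaining arguments go to its `main`, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `RestoreDispatchSketch`, `Greeter`, `dispatch` and the `seen` field are hypothetical names, the class loader is passed in explicitly (the PR's commit list mentions the system class loader), and `IllegalStateException` merely stands in for the `RestoreException` handling discussed in the review.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class RestoreDispatchSketch {
    // Records what the hypothetical action received, so the effect is observable.
    public static String seen;

    // Hypothetical "action" class selected by the first restore argument.
    public static class Greeter {
        public static void main(String[] args) {
            seen = String.join(",", args);
        }
    }

    /**
     * Treats newArguments[0] as a class name and calls its main(String[])
     * with the remaining arguments. Returns false when there is nothing to
     * run, since the new arguments are optional.
     */
    public static boolean dispatch(String[] newArguments, ClassLoader loader) {
        if (newArguments == null || newArguments.length == 0) {
            return false;
        }
        String[] rest = Arrays.copyOfRange(newArguments, 1, newArguments.length);
        try {
            Class<?> mainClass = Class.forName(newArguments[0], true, loader);
            Method main = mainClass.getDeclaredMethod("main", String[].class);
            main.setAccessible(true);
            // Cast to Object so the array is passed as one argument, not as varargs.
            main.invoke(null, (Object) rest);
            return true;
        } catch (ReflectiveOperationException e) {
            // Stand-in for the RestoreException handling mentioned in the thread.
            throw new IllegalStateException("restore action failed", e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = RestoreDispatchSketch.class.getClassLoader();
        dispatch(new String[] {Greeter.class.getName(), "arg1", "arg2"}, loader);
        System.out.println("action received: " + seen);
    }
}
```

With this shape the program itself never has to slice the argument array to decide what to run; each action is an ordinary class with a `main` method, which is the convenience argued for above.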
------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Mon Mar 21 18:59:09 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 21 Mar 2022 18:59:09 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 14:08:42 GMT, Dan Heidinga wrote: > the current code will allow exceptions thrown by the `newMain` method (and its callees) to propagate past the checkpoint spot and makes the caller of the checkpoint code need to handle them. An exception from the newMain will be wrapped in InvocationTargetException, which is caught and is suppressed by the RestoreException. Right? Otherwise, this java code won't compile, as checkpointRestore is declared to throw only {Checkpoint,Restore}Exception. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From heidinga at openjdk.java.net Mon Mar 21 20:15:57 2022 From: heidinga at openjdk.java.net (Dan Heidinga) Date: Mon, 21 Mar 2022 20:15:57 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 18:56:00 GMT, Anton Kozlov wrote: >> src/java.base/share/classes/jdk/crac/Core.java line 177: >> >>> 175: InvocationTargetException | >>> 176: NoSuchMethodException | >>> 177: IllegalAccessException e) { >> >> Thanks for adapting this to be more specific on what it catches. >> >> I don't think it's quite right yet though as the current code will allow exceptions thrown by the `newMain` method (and its callees) to propagate past the checkpoint spot and makes the caller of the checkpoint code need to handle them. 
>>
>> At the time `Core.checkpointRestore();` is called, we may have a stack that looks like:
>>
>> TOS
>> Core.checkpointRestore();
>> Foo.bar();
>> Foo.foobar();
>> SomeOtherClass.method();
>> OriginalClass.main();
>>
>>
>> And when we restore, we load and execute a new `main()` method as though it was called where `Core.checkpointRestore();` was previously on the stack, resulting in:
>>
>> TOS
>> newMain.main();
>> Core.checkpointRestore();
>> Foo.bar();
>> Foo.foobar();
>> SomeOtherClass.method();
>> OriginalClass.main();
>>
>>
>> So exceptions thrown by code called from `newMain` should not propagate past `Core.checkpointRestore();` without being wrapped in a `RestoreException`. `Error`-subclasses should propagate.
>>
>> I think the code should be refactored to something like:
>>
>> } catch (Exception e) {
>>     assert checkpointException == null :
>>         "should not have new arguments";
>>     if (restoreException == null) {
>>         restoreException = new RestoreException();
>>     }
>>     restoreException.addSuppressed(e);
>> }
>>
>> as that correctly catches all Exceptions but lets the Errors propagate past
>
>> the current code will allow exceptions thrown by the `newMain` method (and its callees) to propagate past the checkpoint spot and makes the caller of the checkpoint code need to handle them.
>
> An exception from the newMain will be wrapped in InvocationTargetException, which is caught and is suppressed by the RestoreException. Right? Otherwise, this java code won't compile, as checkpointRestore is declared to throw only {Checkpoint,Restore}Exception.

I had to double check with `jshell` as I thought runtime exceptions would propagate past. Turns out you're 100% right about all Exceptions/Errors/Throwables being wrapped with `InvocationTargetException`.
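
The behavior confirmed here can be reproduced with a small standalone program. The sketch below is not taken from the PR; `InvokeWrappingDemo` and its methods are hypothetical names chosen for illustration. It calls two static methods through `Method.invoke`, one throwing a `RuntimeException` and one throwing an `Error`, and reports what the reflective caller actually observes.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class InvokeWrappingDemo {
    // Hypothetical stand-in for a "new main" that fails with an unchecked exception.
    public static void failWithRuntime(String[] args) {
        throw new IllegalStateException("boom");
    }

    // Hypothetical stand-in for a "new main" that fails with an Error.
    public static void failWithError(String[] args) {
        throw new AssertionError("fatal");
    }

    /**
     * Invokes the named static method reflectively and returns the throwable
     * seen by the caller of invoke(), or null if the call completed normally.
     */
    public static Throwable observed(String methodName) {
        try {
            Method m = InvokeWrappingDemo.class.getDeclaredMethod(methodName, String[].class);
            m.invoke(null, (Object) new String[0]);
            return null;
        } catch (InvocationTargetException e) {
            // Whatever the target threw is available as the cause.
            return e;
        } catch (ReflectiveOperationException e) {
            // Lookup or access problems; not expected in this demo.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"failWithRuntime", "failWithError"}) {
            Throwable t = observed(name);
            System.out.println(name + " -> " + t.getClass().getSimpleName()
                    + " caused by " + t.getCause().getClass().getSimpleName());
        }
    }
}
```

In both cases the caller receives an `InvocationTargetException` whose cause is the original throwable, so a `catch` clause listing `InvocationTargetException` also covers unchecked exceptions and errors raised by the invoked method.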
------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Tue Mar 22 16:00:12 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 22 Mar 2022 16:00:12 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 13:56:41 GMT, Dan Heidinga wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Use System class loader >> - Make catch() preceise > > src/java.base/share/classes/jdk/crac/Core.java line 171: > >> 169: Method newMain = newMainClass.getDeclaredMethod("main", >> 170: String[].class); >> 171: newMain.setAccessible(true); > > Good catch on adding the `setAccessible` call. > > Given that the VM is doing the lookup based on the command line parameters, we should be wrapping this with a `AccessController::doPrivileged` block as a SecurityManager shouldn't be able to block this accessibility request. I was unsure about SecurityManager, having it will be removed soon. However, while it's still there, and adding necessary code is not hard, it's worth fixing. I'm trying to push the fix, but GitHub sends me "Internal Server Error". Hope this will be resolved soon. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From duke at openjdk.java.net Tue Mar 22 18:10:07 2022 From: duke at openjdk.java.net (Ashutosh Mehra) Date: Tue, 22 Mar 2022 18:10:07 GMT Subject: [crac] RFR: Provide arguments for restore [v5] In-Reply-To: References: Message-ID: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com> On Mon, 21 Mar 2022 18:34:46 GMT, Anton Kozlov wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Use System class loader >> - Make catch() preceise > > I'm looking at the new arguments as the way to specify an additional service action on restore. 
In the javac example, the action is the main reason to restore the instance, after which the original program completes. But in general, it's a way to influence the program behavior on restore, after it has been executed for a while before the checkpoint. After the action is completed, the execution returns back to the original program. The purpose is to allow more flexible images that can be purposed better to the current environment, like in general command line arguments allow more flexible program. > >> * the current implementation is forcing user to create two classes with the main() method > > It's not necessary two (initializing and restoring) main methods, there could be more. With the old approach, users would be forced to dispatch execution themselves. I had some not pretty array juggling code, and it will be repeated over and over in every program that attempts to use this feature. > > The new arguments are optional. And JDK may provide predefined services. Such as a store for arguments for later consuming, to get back to the old approach. > >> * consequently the command line to run the application also differs depending on whether the user is running for the first time to take checkpoint, or restoring from the checkpoint > > It may be a good feature. Restore brings us into the middle of execution, not to the start of the original program. I think the opposite looks confusing, if the same class name means different things if used for checkpoint (start the main from the class) or to restore (continue from some point of the main of the specified class). > >> * the stack trace of the main thread after restore has remnant frames from the "before checkpoint" state, which at the least would be confusing to the user > > Some documentation indeed would be useful. 
I'm thinking about a comment for checkpointRestore(): "implementation may accept an additional action and arguments to be run after restore, after which they will be used to locate a class which `main` method will be called with provided arguments". There should be less surprise then. > >> I guess the concern with that approach is that the first argument is to be ignored as mentioned [here](https://github.com/openjdk/crac/pull/16#issuecomment-1059730025). > > It's better to avoid the first argument then. I just did not want to modify the launcher code too much. > > With the new approach, the first argument of restore means something very similar to the first argument of the normal start. > >> A user would prefer to use same class name every time to start the application and therefore the JVM can safely ignore the first argument. If the user specifies a different class name during restore, then the JVM should interpret it as a different application and the current checkpoint should not be used for restoring it. In this case, the JVM would bail out with an exception. > > For java -jar, would it be required to match -jar or with the name of the main class? > > I see this may be appealing, but then two commands in @DanHeidinga 's Option 2 are too similar to each other, and the second one means something very different. > > > $ java -XX:CRaCRestoreFrom=./cr Foo arg1 arg2 arg3 > > > On the opposite, treating Foo as the class name allows us to implement a fallback if CRaCRestoreFrom fails. `Foo` then may proceed as usual, or it may detect the VM was not restored and fail gracefully. > > -XX:+CRaCIgnoreRestoreIfUnavailable was designed for the fallback, but unfortunately, I broke it in #3. @AntonKozlov it appears I am looking at this feature from a different perspective. My model was the first invocation of the application would create the checkpoint, and for subsequent invocation, JVM would restore the process from the checkpoint. 
For javac kind of applications, user would want to pass different arguments for every invocation, and that's where this PR would be helpful. > it's a way to influence the program behavior on restore, after it has been executed for a while before the checkpoint. After the action is completed, the execution returns back to the original program. This is interesting approach as well. Can you share some scenarios where the application would benefit from this feature. While the javac example shows how this approach would work, in my opinion, it doesn't fall into the category of "influence the program behavior on restore" because the javac has already completed its original task when the checkpoint is taken. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From akozlov at openjdk.java.net Thu Mar 24 13:03:59 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 24 Mar 2022 13:03:59 GMT Subject: [crac] RFR: Provide arguments for restore [v6] In-Reply-To: References: Message-ID: > This change adds an ability to receive a new set of command-line arguments in the restored Java instance. The supplied demo code shows a faster replacement for `javac`. 
Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision:

  Handle SecurityManager

-------------

Changes:
  - all: https://git.openjdk.java.net/crac/pull/16/files
  - new: https://git.openjdk.java.net/crac/pull/16/files/745e75d1..44126ba4

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=crac&pr=16&range=04-05

  Stats: 19 lines in 1 file changed: 10 ins; 3 del; 6 mod
  Patch: https://git.openjdk.java.net/crac/pull/16.diff
  Fetch: git fetch https://git.openjdk.java.net/crac pull/16/head:pull/16

PR: https://git.openjdk.java.net/crac/pull/16

From akozlov at openjdk.java.net Thu Mar 24 14:56:19 2022
From: akozlov at openjdk.java.net (Anton Kozlov)
Date: Thu, 24 Mar 2022 14:56:19 GMT
Subject: [crac] RFR: Provide arguments for restore [v5]
In-Reply-To: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com>
References: <-5MLFvlRHgsD7CuAu4A80Lgx7tmd5mAs-eDnmU5U2Tc=.5129682a-5127-4bea-95da-b4f1a9d50e0d@github.com>
Message-ID: <-A36RM5qsAi14sNVDIQtU-xZAiEiosF9SSeSt_trKKc=.2adb419a-c9f4-413d-b78a-c11746c67e37@github.com>

On Tue, 22 Mar 2022 18:06:18 GMT, Ashutosh Mehra wrote:

> the first invocation of the application would create the checkpoint, and for subsequent invocation, JVM would restore the process from the checkpoint.

I don't want to give too much control to the JVM (and take the control from the user). Not every program can sensibly be restarted. Now we always restore to the point of the checkpoint. Defining it elsewhere (e.g. the start of the app) is a hard problem.

> Can you share some scenarios where the application would benefit from this feature. While the javac example shows how this approach would work, in my opinion, it doesn't fall into the category of "influence the program behavior on restore" because the javac has already completed its original task when the checkpoint is taken.
Here the example program is a loop over javac, not a pure javac. The current example exits right after restore https://github.com/openjdk/crac/pull/16/files#diff-4d20e6a15e5a09de1e430f18a7c84eb257eb3d2b98721a35f4853d3fd2dadf37R27, but with arguments we arrange another javac iteration before the exit. I can imagine that a micro-service may need to know coordinates of other services on restore. Command line arguments may be used to provide the path to a config file then. In general, I would like arguments on restore to be as powerful as arguments for normal java start, to parameterize program as necessary. ------------- PR: https://git.openjdk.java.net/crac/pull/16 From noreply at github.com Thu Mar 31 10:57:55 2022 From: noreply at github.com (Anton Kozlov) Date: Thu, 31 Mar 2022 03:57:55 -0700 Subject: [CRaC/criu] b3a210: Silence irrelevant iptables error Message-ID: Branch: refs/heads/crac Home: https://github.com/CRaC/criu Commit: b3a210513e5a76ba9c8d906d2460c11ec7ac4cb1 https://github.com/CRaC/criu/commit/b3a210513e5a76ba9c8d906d2460c11ec7ac4cb1 Author: Anton Kozlov Date: 2022-03-31 (Thu, 31 Mar 2022) Changed paths: M criu/netfilter.c Log Message: ----------- Silence irrelevant iptables error From akozlov at azul.com Thu Mar 31 11:47:35 2022 From: akozlov at azul.com (Anton Kozlov) Date: Thu, 31 Mar 2022 14:47:35 +0300 Subject: [CRaC/criu] b3a210: Silence irrelevant iptables error In-Reply-To: References: Message-ID: <715ee5dc-77af-aba5-6504-d33a14fb99ad@azul.com> I've set notifications on the repository for CRIU that we use for CRaC in its current form. Please let me know if you think this is not appropriate. I don't wait for the feedback before pushing to CRIU in the absence of reviewers. But if you'd like to review the code, let me know as well. 
Thanks, Anton On 3/31/22 13:57, Anton Kozlov wrote: > Branch: refs/heads/crac > Home: https://github.com/CRaC/criu > Commit: b3a210513e5a76ba9c8d906d2460c11ec7ac4cb1 > https://github.com/CRaC/criu/commit/b3a210513e5a76ba9c8d906d2460c11ec7ac4cb1 > Author: Anton Kozlov > Date: 2022-03-31 (Thu, 31 Mar 2022) > > Changed paths: > M criu/netfilter.c > > Log Message: > ----------- > Silence irrelevant iptables error > >