From tpushkin at openjdk.org Wed Apr 2 14:00:26 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 2 Apr 2025 14:00:26 GMT Subject: [crac] Integrated: 8352413: [CRaC] crexec fails to pass some options when CRAC_CRIU_OPTS is already used In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 12:04:49 GMT, Timofei Pushkin wrote: > ~~This contains the change from #216 so that should be merged first.~~ UPD: rebased. > > The fix itself is small but coming up with a way to test it was not trivial: > 1. I've split `jdk/crac/CracEngineOptionsTest.java` onto `jdk/crac/engineOptions/ParsingTest.java` and `jdk/crac/engineOptions/HelpTest.java` because it was getting too large (nothing added/removed, just split). > 2. Added `jdk/crac/engineOptions/CracCriuOptsTest.java` to regression-test the main fix of this PR (this test depends on #216). > 3. Removed a part that tested that `args` are actually applied by `crexec` from `jdk/crac/VMOptionsTest.java` because (2) is now effectively tests this (`VMOptionsTest` wasn't a proper place for this to begin with, it just was convenient). This pull request has now been integrated. Changeset: f1aa8900 Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/f1aa890020af46ae8903a58de68b475f34c53576 Stats: 601 lines in 6 files changed: 364 ins; 232 del; 5 mod 8352413: [CRaC] crexec fails to pass some options when CRAC_CRIU_OPTS is already used Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/217 From rvansa at openjdk.org Wed Apr 2 14:03:26 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 2 Apr 2025 14:03:26 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: On Mon, 31 Mar 2025 08:58:32 GMT, Timofei Pushkin wrote: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. 
> > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location exec_location src/hotspot/share/runtime/crac_engine.cpp line 415: > 413: } > 414: > 415: if (strcmp(id, "crexec") == 0) { I don't think you need the `crexec` check here. I would also drop the logging parts, unlikely to be useful in the practice. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2024895757 From rvansa at openjdk.org Wed Apr 2 14:34:31 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 2 Apr 2025 14:34:31 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v2] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As as anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). 
Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: fixup ------------- Changes: - all: https://git.openjdk.org/crac/pull/219/files - new: https://git.openjdk.org/crac/pull/219/files/1f46d9eb..3aabf07c Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=219&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=219&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From tpushkin at openjdk.org Wed Apr 2 14:55:07 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 2 Apr 2025 14:55:07 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:00:49 GMT, Radim Vansa wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. 
This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location exec_location > > src/hotspot/share/runtime/crac_engine.cpp line 415: > >> 413: } >> 414: >> 415: if (strcmp(id, "crexec") == 0) { > > I don't think you need the `crexec` check here. I would also drop the logging parts, unlikely to be useful in the practice. It may be an external C/R engine with `exec_location` option; JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. Regarding the logs, agree, I'll remove them. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2025002510 From tpushkin at openjdk.org Thu Apr 3 07:00:48 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Thu, 3 Apr 2025 07:00:48 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: Message-ID: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. 
> > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request incrementally with two additional commits since the last revision: - Use comma as a separator when printing controlled options - Simplify vm_controlled_options ------------- Changes: - all: https://git.openjdk.org/crac/pull/220/files - new: https://git.openjdk.org/crac/pull/220/files/71c4227f..9cf82953 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=220&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=220&range=00-01 Stats: 19 lines in 2 files changed: 6 ins; 6 del; 7 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From rvansa at openjdk.org Mon Apr 7 08:37:28 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 08:37:28 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: Message-ID: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> On Wed, 2 Apr 2025 14:52:52 GMT, Timofei Pushkin wrote: >> src/hotspot/share/runtime/crac_engine.cpp line 415: >> >>> 413: } >>> 414: >>> 415: if (strcmp(id, "crexec") == 0) { >> >> I don't think you need the `crexec` check here. 
I would also drop the logging parts, unlikely to be useful in the practice. > > It may be an external C/R engine with `exec_location` option ? JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. > > Regarding the logs, agree, I'll remove them. Let's ignore the theoretical ambiguity for `crexec`... > It may be an external C/R engine with exec_location option ? JVM won't block the user from using this option so it shouldn't be included here. I do not get the message. If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030733676 From tpushkin at openjdk.org Mon Apr 7 08:51:12 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 08:51:12 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 08:34:42 GMT, Radim Vansa wrote: >> It may be an external C/R engine with `exec_location` option ? JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. >> >> Regarding the logs, agree, I'll remove them. 
> > Let's ignore the theoretical ambiguity for `crexec`... > >> It may be an external C/R engine with exec_location option; JVM won't block the user from using this option so it shouldn't be included here. > > I do not get the message. If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location`; in no other case `exec_location` is used. So no other engine should be able to access this usage of `exec_location`: - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such a name) - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030758945 From rvansa at openjdk.org Mon Apr 7 09:38:20 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 09:38:20 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 08:48:18 GMT, Timofei Pushkin wrote: >> Let's ignore the theoretical ambiguity for `crexec`... >> >>> It may be an external C/R engine with exec_location option; JVM won't block the user from using this option so it shouldn't be included here. >> >> I do not get the message. 
If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. > > I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location`; in no other case `exec_location` is used. > > So no other engine should be able to access this usage of `exec_location`: > - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such a name) > - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) Although `crexec` is part of the JVM codebase and allows JVM to use executable-based engine implementations, I don't consider it a part of JVM; the separation line should be drawn at CRE API level. So it should not be a "JVM's implementation detail". In the first version of the API it was called `library_path`: informing the engine about a place where it should load other executables/libraries from. In fact, it is non-trivial to programmatically figure out from within a shared library where it was loaded from (if e.g. it needs to load some extra resource that should be in the same directory) so it might be useful. 
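For readers following the question of how a shared library can find its own location, here is a minimal sketch using `dladdr`, which comes up again later in this thread. It is not code from the PR: the function name `own_module_path` is made up for illustration, and it assumes a platform whose `<dlfcn.h>` provides `dladdr` (e.g. Linux with glibc, or macOS).

```cpp
#include <dlfcn.h>
#include <string>

// Illustrative helper (not part of the CRaC sources): returns the path of
// the module (shared library or executable) that contains this function,
// or an empty string if the dynamic loader cannot resolve it.
std::string own_module_path() {
  Dl_info info;
  // dladdr looks up the loaded object that contains the given address;
  // here we pass the address of this very function.
  if (dladdr(reinterpret_cast<const void*>(&own_module_path), &info) != 0 &&
      info.dli_fname != nullptr) {
    return info.dli_fname;
  }
  return std::string();
}
```

An engine library could strip the file name from the returned path to get the directory it was loaded from. On older glibc versions the program may need to be linked with `-ldl`.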
------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030861225 From tpushkin at openjdk.org Mon Apr 7 11:23:11 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 11:23:11 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 09:35:55 GMT, Radim Vansa wrote: >> I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location` ? in no other case `exec_location` is used. >> >> So no other engine should be able to access this usage of `exec_location`: >> - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such name) >> - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) > > Despite `crexec` is part of JVM codebase and it allows JVM to use executable-based engine implementations, I don't consider it a part of JVM; the separation line should be drawn at CRE API level. So it should not be a "JVM's implementation detail". > > In the first version of the API it was called 'library_path`: informing the engine about a place where it should load other executables/libraries from. In fact, it is non-trivial to programmatically figure out from within a shared library was loaded from (if e.g. it needs to load some extra resource that should be in the same directory) so it might be useful. I agree with what you say about "implementation detail", looks like we just interpret these words a bit differently. 
> In the first version of the API it was called `library_path`: informing the engine about a place where it should load other executables/libraries from. Yes, but just `library_path` is not enough: the engine executable does not always come from there since the user can provide an arbitrary absolute path. In the first version of the API `crexec` ended up using `args` instead of `library_path` for this. We could pass `exec_location` to all engines that accept it (would be better to rename it to `engine_location` then) but I am not sure how useful this would be to other engines besides `crexec`: I've never tried to get a file path of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031028270 From rvansa at openjdk.org Mon Apr 7 11:58:09 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 11:58:09 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 11:20:48 GMT, Timofei Pushkin wrote: > Yes, but just library_path is not enough: the engine executable does not always come from there since the user can provide an arbitrary absolute path. In the first version of the API crexec ended up using args instead of library_path for this. ... > would be better to rename it to `engine_location` In the current impl it tells the `lib` path within the JDK installation location, rather than the location of the engine (shared library). Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. > I've never tried to get a file path of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). 
Looking into this again I see that it is actually quite simple using `dladdr`. I am not sure why I have resorted to reading `/proc/self/maps` in some of my code... However, if the engine is set using absolute path outside JVM it might be problematic to get the JVM path (`JAVA_HOME` not being set...). > We could pass exec_location to all engines that accept it ... but I am not sure how useful this would be to other engines I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. Btw. there's third option: we could use the `dladdr/GetModuleFileName` in `crexec` and drop `exec_location` completely. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031081534 From tpushkin at openjdk.org Mon Apr 7 12:37:16 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 12:37:16 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 11:55:28 GMT, Radim Vansa wrote: > I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. > Btw. there's third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. 
But we'll still need to pass the location of the engine executable because it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031144476 From tpushkin at openjdk.org Mon Apr 7 13:06:27 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 13:06:27 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 12:33:32 GMT, Timofei Pushkin wrote: >>> Yes, but just library_path is not enough, engine executable does not always come from there sines the user can provide an arbitrary absolute path. In the first version of the API crexec ended up using args instead of library_path for this. >> ... >>> would be better to rename it to `engine_location` >> >> In the current impl it is telling the `lib` path within JDK installation location, rather than location of the engine (shared library). Anyway this doesn't have to be exhaustive: there might be other option (JVM wouldn't know about) to hint about some other path. >> >>> I've never tried to get a file address of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). >> >> Looking into this again I see that it is actually quite simple using `dladdr`. I am not sure why I have resorted to reading `/proc/self/maps` in some of my code... However, if the engine is set using absolute path outside JVM it might be problematic to get the JVM path (`JAVA_HOME` not being set...). >> >>> We could pass exec_location to all engines that accept it ... 
but I am not sure how useful this would be to other engines >> >> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. >> >> Btw. there's third option: we could use the `dladdr/GetModuleFileName` in `crexec` and drop `exec_location` completely. > >> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. > > Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. > >> Btw. there's third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. > > Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. > Anyway this doesn't have to be exhaustive: there might be other option (JVM wouldn't know about) to hint about some other path. If JVM doesn't know about it then it should be passed by the user directly: `-XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu`;
this would be somewhat inconvenient from UX perspective ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031197494 From rvansa at openjdk.org Mon Apr 7 15:12:22 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 15:12:22 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 13:03:03 GMT, Timofei Pushkin wrote: >>> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. >> >> Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. >> >>> Btw. there's third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. >> >> Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. > >> Anyway this doesn't have to be exhaustive: there might be other option (JVM wouldn't know about) to hint about some other path. > > If JVM doesn't know about it then it should be passed by the user directly: `-XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu` ? 
this would be somewhat inconvenient from UX perspective > But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, I had to re-read the code again - my understanding of the option's meaning was really wrong: it passes the resolved location of the executable. So now I understand your argument, and the "third option" is really not viable. And I have to withdraw the argument that it's useful to external implementations; in fact it's set to `nullptr` if `CRaCEngine` does not refer to a shared library. >> Anyway this doesn't have to be exhaustive: there might be other option (JVM wouldn't know about) to hint about some other path. > If JVM doesn't know about it then it should be passed by the user directly: -XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu; this would be somewhat inconvenient from UX perspective Due to my confusion we were talking about two different things. Let's imagine a hypothetical engine `foobar`; this engine would live in `$JAVA_HOME/lib/libfoobar.so` and be invoked with `-XX:CRaCEngine=foobar`. * if the engine would for some reason need to load `libjsig.so` it would need the `library_path` * if the engine would need to execute binary `foobar-tool` it could likely refer to the same directory through `dladdr` * if the engine needs to load `libzip.so` from a non-standard location, you would pass `-XX:CRaCEngineOptions=foobar.lib_dir=/opt/myzip` <- by non-exhaustive I meant that you could do this rather than look it up in the engine's directory or `$JAVA_HOME/lib` In any case, I would suggest having a fixed list of reserved CRE options and, as you noted, omitting engine-specific handling as much as possible. Let's keep things as simple as they can be. 
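The "fixed list of reserved options" suggested above could look roughly like the following. This is only a sketch, not the code from the PR: the identifiers `reserved_options`, `is_reserved_option` and `reserved_options_doc` are invented for illustration, and the two option names are simply the ones shown as JVM-controlled in the help output quoted in this thread.

```cpp
#include <cstring>
#include <string>

// Hypothetical static list of option names the JVM sets itself; users are
// not allowed to pass these, regardless of which engine is loaded.
static const char* const reserved_options[] = {"image_location", "exec_location"};

// Returns true if the given option name is reserved for the JVM.
static bool is_reserved_option(const char* key) {
  for (const char* reserved : reserved_options) {
    if (std::strcmp(key, reserved) == 0) {
      return true;
    }
  }
  return false;
}

// Builds the comma-separated list printed at the end of the help message.
static std::string reserved_options_doc() {
  std::string out;
  for (const char* reserved : reserved_options) {
    if (!out.empty()) {
      out += ", ";
    }
    out += reserved;
  }
  return out;
}
```

Because the list does not depend on the engine id, no `strcmp(id, "crexec")` style check is needed to decide which options to report as JVM-controlled.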
------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031462524 From tpushkin at openjdk.org Tue Apr 8 10:43:54 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:43:54 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v3] In-Reply-To: References: Message-ID: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". 
> > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: Make list of VM-controlled options static ------------- Changes: - all: https://git.openjdk.org/crac/pull/220/files - new: https://git.openjdk.org/crac/pull/220/files/9cf82953..a3622886 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=220&range=02 - incr: https://webrevs.openjdk.org/?repo=crac&pr=220&range=01-02 Stats: 76 lines in 4 files changed: 19 ins; 34 del; 23 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From tpushkin at openjdk.org Tue Apr 8 10:53:22 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:53:22 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v3] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 10:43:54 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. 
>> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. Rebased on top of the current main branch just in case (GitHub showed a conflict), no changes ------------- PR Comment: https://git.openjdk.org/crac/pull/220#issuecomment-2786026131 From tpushkin at openjdk.org Tue Apr 8 10:53:21 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:53:21 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: References: Message-ID: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. 
> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Make list of VM-controlled options static - Use comma as a separator when printing controlled options - Simplify vm_controlled_options - Show all options in engine help ------------- Changes: https://git.openjdk.org/crac/pull/220/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=220&range=03 Stats: 82 lines in 6 files changed: 61 ins; 8 del; 13 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From rvansa at openjdk.org Tue Apr 8 12:25:27 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 8 Apr 2025 12:25:27 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> References: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> Message-ID: On Tue, 8 Apr 2025 10:53:21 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. 
>> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Make list of VM-controlled options static > - Use comma as a separator when printing controlled options > - Simplify vm_controlled_options > - Show all options in engine help Marked as reviewed by rvansa (Committer). 
------------- PR Review: https://git.openjdk.org/crac/pull/220#pullrequestreview-2749785631 From duke at openjdk.org Tue Apr 8 13:53:35 2025 From: duke at openjdk.org (duke) Date: Tue, 8 Apr 2025 13:53:35 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> References: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> Message-ID: On Tue, 8 Apr 2025 10:53:21 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Make list of VM-controlled options static > - Use comma as a separator when printing controlled options > - Simplify vm_controlled_options > - Show all options in engine help @TimPushkin Your change (at version 747d25065ea2f9d1071627a0ba451e66fb5f7005) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/crac/pull/220#issuecomment-2786510403 From tpushkin at openjdk.org Tue Apr 8 14:02:48 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 14:02:48 GMT Subject: [crac] Integrated: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: <3AJ1yy7T1GG-CaznlFMf6dgb1Ddv4-D3sUKl9zl0k_8=.22a0972f-9e3f-49b6-a8ba-a0c0fbbd3c7b@github.com> On Mon, 31 Mar 2025 08:58:32 GMT, Timofei Pushkin wrote: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. 
This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location This pull request has now been integrated. Changeset: 410d0e16 Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/410d0e168c326b7d892af1b9e990eb4a2b5e0fa1 Stats: 82 lines in 6 files changed: 61 ins; 8 del; 13 mod 8353243: [CRaC] Show all options in engine help Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/220 From akozlov at azul.com Wed Apr 9 10:47:12 2025 From: akozlov at azul.com (Anton Kozlov) Date: Wed, 9 Apr 2025 13:47:12 +0300 Subject: CFV: New CRaC Committer: Timofei Pushkin Message-ID: <7816a446-b27a-4098-b061-01a001818d84@azul.com> I hereby nominate Timofei Pushkin to CRaC Committer. Timofei is an engineer at Azul who has contributed 45 patches [3]. He is an active contributor to the project, and we expect him to continue working on the project. Votes are due by Thu Apr 24 2025 9AM PST. Only current CRaC Committers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Anton Kozlov [1] https://openjdk.org/census [2] https://openjdk.org/projects/#committer-vote [3] https://github.com/openjdk/crac/pulls?q=is%3Apr+is%3Aclosed++author%3Atimpushkin+label%3Aintegrated From rvansa at azul.com Thu Apr 10 06:37:21 2025 From: rvansa at azul.com (Radim Vansa) Date: Thu, 10 Apr 2025 08:37:21 +0200 Subject: CFV: New CRaC Committer: Timofei Pushkin In-Reply-To: <7816a446-b27a-4098-b061-01a001818d84@azul.com> References: <7816a446-b27a-4098-b061-01a001818d84@azul.com> Message-ID: <5b923be7-0405-4fd7-adef-a997e044d2fd@azul.com> Vote: yes Radim On 09. 04. 
25 12:47, Anton Kozlov wrote: > I hereby nominate Timofei Pushkin to CRaC Committer. > > Timofei is an engineer at Azul who has contributed 45 patches [3]. He is > an active contributor to the project, and we expect him to continue > working on the project. > > Votes are due by Thu Apr 24 2025 9AM PST. > > Only current CRaC Committers [1] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Anton Kozlov > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#committer-vote > [3] > https://github.com/openjdk/crac/pulls?q=is%3Apr+is%3Aclosed++author%3Atimpushkin+label%3Aintegrated > From mz1999 at gmail.com Thu Apr 10 09:30:04 2025 From: mz1999 at gmail.com (ma zhen) Date: Thu, 10 Apr 2025 17:30:04 +0800 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC Message-ID: Hi CRaC developers, I'm currently exploring the integration of CRaC support into our company's middleware products. I'm also very interested in the underlying implementation details of CRaC and have been doing some research into its mechanics. As I understand it, CRaC leverages CRIU under the hood for checkpointing and restoring running processes. My research indicates that CRIU itself is capable of handling open file descriptors and established network connections during the checkpoint/restore cycle. However, the CRaC API requires developers to explicitly manage these resources, typically by closing them in the beforeCheckpoint() and re-establishing them in the afterRestore(). To understand the rationale behind this design choice, I looked into the initial CRaC prototype, specifically the first PR (https://github.com/openjdk/crac/pull/1).
It appears that even in this early version, the implementation iterated through all process file descriptors during checkpoint. It ignored certain FDs (like those related to classpath files, /dev/random, /dev/urandom, and files marked M_PERSISTENT - though I'm unclear on the exact meaning of M_PERSISTENT in this context). If any other application-opened files remained, the checkpoint process would fail. This suggests the requirement for manual resource management was present from the outset. As I'm not deeply familiar with JVM internals, I'm struggling to fully grasp the reasoning. Was this restriction primarily introduced to simplify the initial design and implementation of CRaC within the JVM? I also noticed that current versions of CRaC include File Descriptor Policies. These allow configuring an action: ignore for specific file descriptors, effectively delegating their handling to CRIU. This seems to demonstrate that letting CRIU manage certain open files is feasible within the CRaC framework. This leads me to wonder: if delegation to CRIU is possible and works (at least for some cases via policies), why isn't relying on CRIU for resource handling the default or more broadly encouraged approach? Why the strict requirement for manual closure and reopening in the general case? For instance, consider using System.getLogger() from the JDK Platform Logging API. As application developers, we don't typically manage the underlying file descriptor for the log file directly. To make this work with CRaC, we currently need to identify and configure a File Descriptor Policy for it, which can feel somewhat cumbersome. Wouldn't a smoother experience involve CRaC (perhaps optionally) defaulting to letting CRIU handle such internally managed resources, like those opened by standard JDK libraries? 
I would appreciate any insights or clarification you could offer on the design philosophy behind CRaC's approach to managing external resources like files and sockets, especially in contrast to CRIU's capabilities. Thanks for your time and any insights you can share. Best regards, mazhen -------------- next part -------------- An HTML attachment was scrubbed... URL: From rvansa at openjdk.org Thu Apr 10 21:27:11 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 10 Apr 2025 21:27:11 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As as anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Merge branch 'crac' into zgc - fixup - 8353241: CRaC ZGC support ------------- Changes: https://git.openjdk.org/crac/pull/219/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=219&range=02 Stats: 104 lines in 10 files changed: 85 ins; 7 del; 12 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From rvansa at openjdk.org Thu Apr 10 21:27:11 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 10 Apr 2025 21:27:11 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:34:31 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As as anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). 
> > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > fixup Conflicts resolved. ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2795198990 From rvansa at azul.com Fri Apr 11 07:17:16 2025 From: rvansa at azul.com (Radim Vansa) Date: Fri, 11 Apr 2025 09:17:16 +0200 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC In-Reply-To: References: Message-ID: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> Hi Ma Zhen, you have correctly observed that closing file descriptors is rather an architectural choice than purely a technical need. CRIU is really capable of restoring the process as-is, as its main motivation is migration of running containers. Containers already define the filesystem, and the runtime is in control of external connections - e.g. CRIU can checkpoint and later restore an open socket connection, and the container runtime restores the 'second half' of the socket so that the pause is transparent to the running process. If this is what you want, there's nothing preventing you from using CRIU on a Java process manually - at the risk of breaking the internal logic of the application. However the point of CRaC is not such a transparent restore: we want to preserve the valuable state of JVM and application but adapt it to the new environment. We want to do a conscious decision about any resource external to the process. Being forced to gracefully adapt to the restore is a feature. Yes, we have File Descriptor policies, but that's not a solution - it provides a workaround for proof-of-concepts, until some code that you can't easily fix gets updated to support CRaC properly. Ideas meet practicality, and you are responsible for realizing what should be done with particular external resource. 
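The close-and-reopen pattern discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `CheckpointAware` is a hypothetical stand-in interface defined here so that the example compiles without CRaC on the class path, and `ReopeningLog` is an invented example class. In a real application the callbacks would be those of the CRaC `Resource` interface (which, as far as I know, also receive a context argument) and the object would have to be registered with the CRaC context.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for the CRaC callbacks; not the real CRaC API. */
interface CheckpointAware {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

/** Example log sink that owns a file descriptor and releases it around a checkpoint. */
class ReopeningLog implements CheckpointAware {
    private final Path path;
    private Writer out; // null while the file is closed for the checkpoint

    ReopeningLog(Path path) throws IOException {
        this.path = path;
        open();
    }

    private void open() throws IOException {
        // Append so that lines written before the checkpoint are preserved.
        out = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    synchronized void log(String line) throws IOException {
        if (out == null) {
            throw new IllegalStateException("log is closed for checkpoint");
        }
        out.write(line);
        out.write('\n');
        out.flush();
    }

    synchronized boolean isOpen() {
        return out != null;
    }

    @Override
    public synchronized void beforeCheckpoint() throws IOException {
        // Release the descriptor so no application-opened file remains at checkpoint.
        out.close();
        out = null;
    }

    @Override
    public synchronized void afterRestore() throws IOException {
        // Re-acquire the resource; the path could be re-resolved for the new environment here.
        open();
    }
}
```

The point of the sketch is that the reopen step is where the application gets a chance to adapt to the new environment, which a fully transparent restore would not offer.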
You're right that ATM we don't handle JDK Platform Logging (and neither JUL) configured to write to a file, and since that is JDK code out of user control it is a bug. We attempt to fix those one by one (PRs are welcome!). I hope I have provided some insight to these choices - and yes, I understand the pain as we still have many places to fix. Cheers, Radim -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tpushkin at openjdk.org Fri Apr 11 07:18:00 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Fri, 11 Apr 2025 07:18:00 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As as anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support My review ETA: next week ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2796055268 From mz1999 at gmail.com Fri Apr 11 08:57:35 2025 From: mz1999 at gmail.com (ma zhen) Date: Fri, 11 Apr 2025 16:57:35 +0800 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC In-Reply-To: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> References: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> Message-ID: Hi Radim, Thanks a lot for the detailed explanation! That completely cleared up my understanding of the design philosophy behind CRaC. It makes perfect sense now that the goal isn't purely transparent restoration, but rather preserving the valuable internal JVM/application state while enabling robust adaptation to the new environment after restore - sacrificing some transparency for resilience by consciously managing external resources. Great project, and I appreciate the insight. Hope to be able to contribute down the line! Cheers, Ma Zhen -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpushkin at openjdk.org Fri Apr 11 14:50:35 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Fri, 11 Apr 2025 14:50:35 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time Message-ID: This is an alternative to #209 that also fixes `TimedWaitingTest`. The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs.
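To illustrate why the precision of the wall clock reading matters, here is a rough sketch of offsetting a monotonic clock by the wall clock time that elapsed between checkpoint and restore. It is not the HotSpot code: `MonotonicOffset`, `offsetNanos` and `truncateToMillis` are hypothetical names, the clamping of a backwards wall clock is my own assumption, and the authoritative explanation is the one in the JBS issue.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper; a sketch of the idea, not the JVM implementation. */
final class MonotonicOffset {
    private MonotonicOffset() {}

    /**
     * Returns the value to add to the raw monotonic clock after restore so that
     * it continues from its checkpoint value plus the elapsed wall clock time.
     */
    static long offsetNanos(Instant wallAtCheckpoint, long monoAtCheckpoint,
                            Instant wallAtRestore, long monoAtRestore) {
        long elapsedWall = Duration.between(wallAtCheckpoint, wallAtRestore).toNanos();
        if (elapsedWall < 0) {
            elapsedWall = 0; // assumption: treat a backwards wall clock as "no time passed"
        }
        return (monoAtCheckpoint + elapsedWall) - monoAtRestore;
    }

    /** Simulates a wall clock reading that only has millisecond precision. */
    static Instant truncateToMillis(Instant t) {
        return t.truncatedTo(ChronoUnit.MILLIS);
    }
}
```

For example, with a checkpoint at 10.0009 s and a restore at 12.0001 s the true elapsed time is 1,999,200,000 ns, while the same readings truncated to milliseconds give 2,000,000,000 ns. The adjusted monotonic clock would then run 0.8 ms ahead, so a timed wait based on it could appear to finish early when measured against the wall clock.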
------------- Commit messages: - Make nanoTime offsetting more precise Changes: https://git.openjdk.org/crac/pull/221/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=221&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354432 Stats: 20 lines in 2 files changed: 8 ins; 0 del; 12 mod Patch: https://git.openjdk.org/crac/pull/221.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/221/head:pull/221 PR: https://git.openjdk.org/crac/pull/221 From duke at openjdk.org Mon Apr 14 12:00:33 2025 From: duke at openjdk.org (duke) Date: Mon, 14 Apr 2025 12:00:33 GMT Subject: git: openjdk/crac: created branch 8354514_remove_set_reopened based on the branch crac containing 1 unique commit Message-ID: <38d576e6-9d1c-46f9-9d40-ea91ef716600@openjdk.org> The following commits are unique to the 8354514_remove_set_reopened branch: ======================================================== cbe08b5e: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl From tpushkin at openjdk.org Mon Apr 14 13:14:09 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:09 GMT Subject: [crac] Withdrawn: 8351402: [CRaC] Use System.nanoTime() in TimedWaitingTest In-Reply-To: <6D05THOPyL0EtPFFTPj5d2jDBYJoUxodkJnK1chTae8=.79e61d28-bf43-42bd-96a4-3129a050552d@github.com> References: <6D05THOPyL0EtPFFTPj5d2jDBYJoUxodkJnK1chTae8=.79e61d28-bf43-42bd-96a4-3129a050552d@github.com> Message-ID: <_GxWTMXhcEouXGuGlakU-STvyX5XpvhB8ULHxs5yzb4=.3655ef11-1aae-4984-9268-505c22c66c9a@github.com> On Fri, 7 Mar 2025 12:15:50 GMT, Timofei Pushkin wrote: > Replaces `System.currentTimeMillis()` with `System.nanoTime()` in `TimedWaitingTest` since the former can, in theory, jump back and forth and that may lead to the test failures. > > Also adds a diagnostic assert. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/crac/pull/209 From tpushkin at openjdk.org Mon Apr 14 13:14:14 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:14 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: References: Message-ID: > This is an alternative to #209 that also fixes `TimedWaitingTest`. > > The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. > > I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: Move docs to header ------------- Changes: - all: https://git.openjdk.org/crac/pull/221/files - new: https://git.openjdk.org/crac/pull/221/files/4de87e38..5497193d Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=221&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=221&range=00-01 Stats: 35 lines in 2 files changed: 9 ins; 9 del; 17 mod Patch: https://git.openjdk.org/crac/pull/221.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/221/head:pull/221 PR: https://git.openjdk.org/crac/pull/221 From rvansa at openjdk.org Mon Apr 14 13:14:33 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:33 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Message-ID: Fix errors in JCK due to newly exposed methods. 
------------- Commit messages: - 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Changes: https://git.openjdk.org/crac/pull/222/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=222&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354514 Stats: 63 lines in 6 files changed: 39 ins; 18 del; 6 mod Patch: https://git.openjdk.org/crac/pull/222.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/222/head:pull/222 PR: https://git.openjdk.org/crac/pull/222 From rvansa at openjdk.org Mon Apr 14 13:14:21 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:21 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: References: Message-ID: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> On Mon, 14 Apr 2025 11:03:38 GMT, Timofei Pushkin wrote: >> This is an alternative to #209 that also fixes `TimedWaitingTest`. >> >> The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. >> >> I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. > > Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: > > Move docs to header Great analysis, I am glad that we could have that test finally working! The change looks good. Originally you thought that the wallclock time is adjusting backwards; since the test enforces the first branch, I think it is still proof against going back during the checkpoint. The test would really fail if going back between checkpoint and the end of 1000 ms wait, correct? LGTM! 
src/hotspot/share/runtime/crac.cpp line 62: > 60: CracEngine *crac::_engine = nullptr; > 61: // Timestamps recorded before checkpoint > 62: jlong crac::checkpoint_wallclock_seconds; // Wall clock time, full seconds Not even a nitpick: I wonder if we should document private fields here or in the header file, WDYT? (I stand guilty for having docs for `javaTimeNanos_offset` here...). ------------- Marked as reviewed by rvansa (Committer). PR Review: https://git.openjdk.org/crac/pull/221#pullrequestreview-2763299969 PR Review: https://git.openjdk.org/crac/pull/221#pullrequestreview-2763744116 PR Review Comment: https://git.openjdk.org/crac/pull/221#discussion_r2041516657 From tpushkin at openjdk.org Mon Apr 14 13:14:24 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:24 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> References: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> Message-ID: On Mon, 14 Apr 2025 07:15:25 GMT, Radim Vansa wrote: > The test would really fail if going back between checkpoint and the end of 1000 ms wait, correct? Yes, it would (and still will) fail if either: - The time goes back between checkpoint and restore - The time goes back between the moment it is read in `record_time_before_checkpoint()` and in the test for `after` > src/hotspot/share/runtime/crac.cpp line 62: > >> 60: CracEngine *crac::_engine = nullptr; >> 61: // Timestamps recorded before checkpoint >> 62: jlong crac::checkpoint_wallclock_seconds; // Wall clock time, full seconds > > Not even a nitpick: I wonder if we should document private fields here or in the header file, WDYT? (I stand guilty for having docs for `javaTimeNanos_offset` here...). 
I think placing them in the header would be better ------------- PR Comment: https://git.openjdk.org/crac/pull/221#issuecomment-2800736723 PR Review Comment: https://git.openjdk.org/crac/pull/221#discussion_r2041562189 From tpushkin at openjdk.org Mon Apr 14 13:14:37 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:37 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl In-Reply-To: References: Message-ID: <9cu75gmGmFXdETU4ETV62okoRd9D67tbLsiYq1BFo8M=.a8c01265-0214-4ac8-80bf-32200ba99b2e@github.com> On Mon, 14 Apr 2025 12:00:30 GMT, Radim Vansa wrote: > Fix errors in JCK due to newly exposed methods. src/java.base/share/classes/jdk/internal/access/JavaNioChannelsSpiAccess.java line 1: > 1: package jdk.internal.access; A license header is missing ------------- PR Review Comment: https://git.openjdk.org/crac/pull/222#discussion_r2042075853 From tpushkin at openjdk.org Mon Apr 14 13:14:28 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:28 GMT Subject: [crac] Integrated: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time In-Reply-To: References: Message-ID: <7gNc72s5XOZF60Cu7sxVpbIpthypZerU3ek-xnEPtLU=.ddb008f7-873f-4ab4-b997-bd67c7be530c@github.com> On Fri, 11 Apr 2025 14:44:22 GMT, Timofei Pushkin wrote: > This is an alternative to #209 that also fixes `TimedWaitingTest`. > > The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. > > I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. This pull request has now been integrated. 
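To make the precision argument above concrete, here is a minimal sketch of how truncating wall clock timestamps to milliseconds can overstate the time elapsed across a checkpoint. It is not the code from this PR: the function names are hypothetical, and it assumes only that the offset is derived from the difference between a wall clock reading taken before checkpoint and one taken after restore.

```cpp
#include <cassert>
#include <cstdint>

// Illustration only: all names below are hypothetical, not HotSpot identifiers.
constexpr int64_t NANOS_PER_MILLI = 1000000;

// Elapsed wall clock time computed directly from nanosecond timestamps.
int64_t elapsed_ns_precise(int64_t before_ns, int64_t after_ns) {
  assert(after_ns >= before_ns);
  return after_ns - before_ns;
}

// Elapsed wall clock time when both timestamps are first truncated to
// whole milliseconds, then converted back to nanoseconds.
int64_t elapsed_ns_from_millis(int64_t before_ns, int64_t after_ns) {
  assert(after_ns >= before_ns);
  int64_t before_ms = before_ns / NANOS_PER_MILLI;
  int64_t after_ms = after_ns / NANOS_PER_MILLI;
  return (after_ms - before_ms) * NANOS_PER_MILLI;
}

// How far the millisecond-based value runs ahead of the precise one.
// A positive result means a monotonic clock offset by the millisecond-based
// value would jump further than the wall clock actually advanced.
int64_t overestimate_ns(int64_t before_ns, int64_t after_ns) {
  return elapsed_ns_from_millis(before_ns, after_ns) -
         elapsed_ns_precise(before_ns, after_ns);
}
```

With a timestamp taken just before a millisecond boundary and another taken just after the next boundary, the millisecond-based difference is almost a full millisecond too large, which is enough for a timed wait to appear to end early when compared against the wall clock.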
Changeset: ad63687e Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/ad63687e9a057517831af62c60275684bc668e3e Stats: 41 lines in 2 files changed: 15 ins; 7 del; 19 mod 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/221 From tpushkin at openjdk.org Mon Apr 14 13:14:31 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:31 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support src/hotspot/share/gc/z/zPageAllocator.cpp line 1067: > 1065: event.commit(uncommitted); > 1066: } > 1067: } No newline at the EOF src/hotspot/share/gc/z/zPageAllocator.hpp line 173: > 171: void threads_do(ThreadClosure* tc) const; > 172: > 173: void cleanup_unused(); This should not be addressed in this PR, but I find "cleanup_unused" name pretty undescriptive, both here and in `CollectedHeap` in general. It wasn't obvious to me that this is something used only by CRaC. I would propose `cleanup_before_checkpoint`, for example. src/hotspot/share/gc/z/zPageCache.cpp line 33: > 31: #include "gc/z/zStat.hpp" > 32: #include "gc/z/zValue.inline.hpp" > 33: #include "logging/log.hpp" There is no logging code added so these new imports shouldn't be needed.
src/hotspot/share/gc/z/zPageCache.cpp line 292: > 290: _delay(delay) { > 291: // Set initial timeout > 292: *_timeout = ZUncommitDelay; Shouldn't this also use `delay`? This shouldn't have any real influence now (the code that has `delay != ZUncommitDelay` doesn't use the timeout) but anyway. src/hotspot/share/runtime/crac.cpp line 429: > 427: MemTracker::final_report(tty); > 428: } > 429: Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. test/jdk/jdk/crac/fileDescriptors/ZGCTest.java line 37: > 35: * @requires (os.family == "linux") > 36: */ > 37: public class ZGCTest implements CracTest { A bit unclear why this is in `jdk/crac/fileDescriptors` directory. I guess because of the required `memfd` support but I would still put it in the general `jdk/crac` directory. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041490955 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041524248 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041509158 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041503636 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041476340 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041517297 From rvansa at openjdk.org Mon Apr 14 13:14:14 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:14 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: <3Lco9IcmSYym-KcVGkgG1MYvqMHNjTI0Cwf81MH5gFw=.f841ff1b-ed91-446b-92fe-9648d05c373d@github.com> On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. 
As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support Updated. ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2801615156 From rvansa at openjdk.org Mon Apr 14 13:14:36 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:36 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> Message-ID: On Mon, 14 Apr 2025 06:57:19 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Merge branch 'crac' into zgc >> - fixup >> - 8353241: CRaC ZGC support > > src/hotspot/share/gc/z/zPageAllocator.hpp line 173: > >> 171: void threads_do(ThreadClosure* tc) const; >> 172: >> 173: void cleanup_unused(); > > This should not be addressed in this PR, but I find "cleanup_unused" name pretty undescriptive, both here and in `CollectedHeap` in general. It wasn't obvious to me that this is something used only by CRaC. I would propose `cleanup_before_checkpoint`, for example. The name is not mandated by any interface so I can rename it even here... > src/hotspot/share/runtime/crac.cpp line 429: > >> 427: MemTracker::final_report(tty); >> 428: } >> 429: > > Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. Yes, I've left this intentionally.
Well spotted, I'll make sure that the final report will run as well. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2042020387 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041549347 From tpushkin at openjdk.org Mon Apr 14 13:14:40 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:40 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> Message-ID: <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> On Mon, 14 Apr 2025 07:16:53 GMT, Radim Vansa wrote: >> src/hotspot/share/runtime/crac.cpp line 429: >> >>> 427: MemTracker::final_report(tty); >>> 428: } >>> 429: >> >> Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. > > Yes, I've left this intentionally. Well spotted, I'll make sure that the final report will run as well. I actually think that this should be done as a separate change then. I would propose to either document that `PrintNMTStatistics` prints on checkpoint, or make it have several modes (0 - off, 1 - print on exit, 2 - print on checkpoint, 4 - print on both exit and checkpoint), or add a separate `PrintNMTStatisticsOnCheckpoint`. The reason why I am proposing this is because I believe we had a related request from the community for this somewhere, so it would be nice to have this documented.
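The concern about the report not being printed again on VM exit comes down to a one-shot guard. Below is a minimal standalone sketch of that pattern using `std::atomic` rather than HotSpot's `Atomic::cmpxchg`; the flag name, the function name and the report text are all hypothetical and only illustrate why a second call becomes a no-op.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical stand-in for a "report already ran" flag.
static std::atomic<bool> g_report_did_run{false};

// Appends the report to `out` only on the first call.
// Returns true if this call produced the report, false if it was skipped.
bool final_report_once(std::string& out) {
  bool expected = false;
  // Atomically flip the flag from false to true. If it was already true,
  // an earlier call has claimed the report and this call does nothing.
  if (!g_report_did_run.compare_exchange_strong(expected, true)) {
    return false;
  }
  assert(g_report_did_run.load());
  out += "NMT report\n";
  return true;
}
```

If the first call happens at checkpoint, a later call at exit after restore returns without printing, which is the behaviour questioned in the review. Printing at both points would need either a reset of the flag or a separate code path that bypasses the guard.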
------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041556788 From rvansa at openjdk.org Mon Apr 14 13:14:42 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:42 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> Message-ID: <2OQ64PbSvXOX5ryuplAmOmXLB2ZdsawET_9dpwFs3ww=.40b00c93-b591-4e00-98df-37b565fbb938@github.com> On Mon, 14 Apr 2025 07:22:12 GMT, Timofei Pushkin wrote: >> Yes, I've left this intentionally. Well spotted, I'll make sure that the final report will run as well. > > I actually think that this should be done as a separate change then. I would propose to either document that `PrintNMTStatistics` prints on checkpoint, or make it have several modes (0 - off, 1 - print on exit, 2 - print on checkpoint, 4 - print on both exit and checkpoint), or add a separate `PrintNMTStatisticsOnCheckpoint`. The reason why I am proposing this is because I believe we had a related request from the community for this somewhere, so it would be nice to have this documented. OK, while I've used this for diagnostics on ZGC it would really make sense as its own PR. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041560779 From rvansa at openjdk.org Mon Apr 14 13:17:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:17:51 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v4] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint.
As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/crac/pull/219/files - new: https://git.openjdk.org/crac/pull/219/files/75e2bc41..3b7d15e7 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=219&range=03 - incr: https://webrevs.openjdk.org/?repo=crac&pr=219&range=02-03 Stats: 12 lines in 6 files changed: 0 ins; 7 del; 5 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From duke at openjdk.org Mon Apr 14 13:58:52 2025 From: duke at openjdk.org (duke) Date: Mon, 14 Apr 2025 13:58:52 GMT Subject: git: openjdk/crac: 8354514_remove_set_reopened: Add license header Message-ID: <1c82a037-5e5d-4a24-bb92-52fa1323f3b5@openjdk.org> Changeset: a15a667b Branch: 8354514_remove_set_reopened Author: Radim Vansa Date: 2025-04-14 15:54:45 +0000 URL: https://git.openjdk.org/crac/commit/a15a667b1bc6af639d2fa1216a0d510490320791 Add license header ! src/java.base/share/classes/jdk/internal/access/JavaNioChannelsSpiAccess.java From rvansa at openjdk.org Mon Apr 14 14:00:54 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 14:00:54 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: Message-ID: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> > Fix errors in JCK due to newly exposed methods.
Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add license header ------------- Changes: - all: https://git.openjdk.org/crac/pull/222/files - new: https://git.openjdk.org/crac/pull/222/files/cbe08b5e..a15a667b Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=222&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=222&range=00-01 Stats: 24 lines in 1 file changed: 24 ins; 0 del; 0 mod Patch: https://git.openjdk.org/crac/pull/222.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/222/head:pull/222 PR: https://git.openjdk.org/crac/pull/222 From tpushkin at openjdk.org Mon Apr 14 14:06:22 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 14:06:22 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:00:54 GMT, Radim Vansa wrote: >> Fix errors in JCK due to newly exposed methods. > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Add license header LGTM, but I wonder why the tests are not run? 
------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2801823465 From rvansa at openjdk.org Mon Apr 14 14:29:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 14:29:00 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:02:55 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Add license header > > LGTM, but I wonder why the tests are not run? > > I guess because the source branch is in this (openjdk/crac) repo, although not sure why exactly this is a problem. Anyway, I believe we should run the tests before merging. @TimPushkin Looks like GitHub was just stalled; the tests are running now. https://github.com/rvansa/crac/actions/runs/14447478163 ------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2801897937 From rvansa at openjdk.org Mon Apr 14 17:48:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 17:48:00 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:02:55 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Add license header > > LGTM, but I wonder why the tests are not run? > > I guess because the source branch is in this (openjdk/crac) repo, although not sure why exactly this is a problem. Anyway, I believe we should run the tests before merging. @TimPushkin The test overview looks crazy for some reason, but when I click through it seems to have passed. 
When you give the green light I'll integrate. ------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2802436767 From tpushkin at openjdk.org Tue Apr 15 06:22:07 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 15 Apr 2025 06:22:07 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:00:54 GMT, Radim Vansa wrote: >> Fix errors in JCK due to newly exposed methods. > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Add license header "1059 successful checks" - clearly everything has been tested here. ------------- Marked as reviewed by tpushkin (Author). PR Review: https://git.openjdk.org/crac/pull/222#pullrequestreview-2766941974 From rvansa at openjdk.org Tue Apr 15 06:31:10 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 15 Apr 2025 06:31:10 GMT Subject: [crac] Integrated: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl In-Reply-To: References: Message-ID: On Mon, 14 Apr 2025 12:00:30 GMT, Radim Vansa wrote: > Fix errors in JCK due to newly exposed methods. This pull request has now been integrated.
Changeset: d64fb30c Author: Radim Vansa URL: https://git.openjdk.org/crac/commit/d64fb30c0874d93c986ad04ac3995a727b7a1ac8 Stats: 87 lines in 6 files changed: 63 ins; 18 del; 6 mod 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Reviewed-by: tpushkin ------------- PR: https://git.openjdk.org/crac/pull/222 From tpushkin at openjdk.org Tue Apr 15 06:37:16 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 15 Apr 2025 06:37:16 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v4] In-Reply-To: References: Message-ID: On Mon, 14 Apr 2025 13:17:51 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Marked as reviewed by tpushkin (Author). ------------- PR Review: https://git.openjdk.org/crac/pull/219#pullrequestreview-2766976843 From rvansa at openjdk.org Tue Apr 15 06:40:02 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 15 Apr 2025 06:40:02 GMT Subject: [crac] Integrated: 8353241: [CRaC] Support ZGC In-Reply-To: References: Message-ID: On Mon, 31 Mar 2025 08:18:43 GMT, Radim Vansa wrote: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). This pull request has now been integrated.
Changeset: 64710538 Author: Radim Vansa URL: https://git.openjdk.org/crac/commit/647105388b66b7acedf03d049dc60323912a8fe7 Stats: 98 lines in 9 files changed: 78 ins; 7 del; 13 mod 8353241: [CRaC] Support ZGC Reviewed-by: tpushkin ------------- PR: https://git.openjdk.org/crac/pull/219 From dcherepanov at openjdk.org Tue Apr 15 08:16:56 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Tue, 15 Apr 2025 08:16:56 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 Message-ID: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Merge with jdk-25:3 There are several conflicts between CRaC-specific changes (https://github.com/openjdk/crac/pull/10) and incoming JDK changes for https://bugs.openjdk.org/browse/JDK-8342995 https://github.com/openjdk/crac/pull/10 moved some parts from `linux/attachListener_linux.cpp` to - `linux/linuxAttachOperation.hpp` which was later renamed to `posix/posixAttachOperation.hpp` - `linux/attachListener_linux.hpp` which was later renamed to `posix/attachListener_posix.hpp` As a part of this merge, I manually applied JDK changes for `posix/attachListener_posix.cpp` to `posix/posixAttachOperation.hpp` & `posix/attachListener_posix.hpp` - new `SocketChannel` class moved to `posix/posixAttachOperation.hpp` - changes in `PosixAttachOperation` class incorporated into `posix/posixAttachOperation.hpp` - added `#include "os_posix.hpp"` to define `RESTARTABLE` - kept `socket()` function in `PosixAttachOperation` class as it's used by [VM_Crac::is_socket_from_jcmd](https://github.com/openjdk/crac/blob/647105388b66b7acedf03d049dc60323912a8fe7/src/hotspot/os/linux/crac_linux.cpp#L279) - changes in `PosixAttachListener` class incorporated into `posix/attachListener_posix.hpp` Additional changes in `posix/attachListener_posix.cpp` - changes in `PosixAttachOperation::complete` incorporated into `write_operation_result`
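For readers unfamiliar with why `RESTARTABLE` is needed by the new `SocketChannel`, the following is a rough standalone sketch of the retry-on-`EINTR` pattern that such a macro wraps around `read` and `write`. It is not HotSpot code: the helper names are hypothetical, and it only assumes standard POSIX semantics where an interrupted call returns -1 with `errno` set to `EINTR`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: repeat a system call while it is interrupted by a signal.
template <typename Call>
ssize_t retry_on_eintr(Call call) {
  ssize_t n;
  do {
    n = call();
  } while (n == -1 && errno == EINTR);
  return n;
}

// Read up to `size` bytes from `fd`, retrying if a signal interrupts the call.
ssize_t read_retrying(int fd, void* buf, size_t size) {
  assert(buf != nullptr);
  return retry_on_eintr([&] { return ::read(fd, buf, size); });
}

// Write up to `size` bytes to `fd`, retrying if a signal interrupts the call.
// Like plain write(), this may still write fewer bytes than requested.
ssize_t write_retrying(int fd, const void* buf, size_t size) {
  assert(buf != nullptr);
  return retry_on_eintr([&] { return ::write(fd, buf, size); });
}
```

Note that retrying on `EINTR` is different from the removed `write_fully` loop, which also continued after short writes until the whole buffer was sent.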
Conflicts commit c54dd827b39e7e0066959e4985e4aaefd5452a10 (HEAD -> merge-jdk, dmitry-crac/merge-jdk) Merge: 410d0e168c3 23d6f747824 Author: Dmitry Cherepanov Date: Mon Apr 14 13:55:59 2025 +0400 Merge with jdk:jdk-25+3 diff --git a/.jcheck/conf b/.jcheck/conf remerge CONFLICT (content): Merge conflict in .jcheck/conf index 1d117b1d825..25bd8dd0b94 100644 --- a/.jcheck/conf +++ b/.jcheck/conf @@ -4,12 +4,7 @@ jbs=JDK version=25 [checks] -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) error=whitespace -======= -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists,copyright -warning=issuestitle,binary ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) [checks "reviewers"] committers=1 @@ -18,31 +13,3 @@ ignore=duke [census] version=0 domain=openjdk.org -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -======= - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] -dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp - -[checks "copyright"] -files=^(?!LICENSE|license.txt|.*.bin|.*.gif|.*.jpg|.*.png|.*.icon|.*.tiff|.*.dat|.*.patch|.*.wav|.*.class|.*-header|.*.jar|).* -oracle_locator=.*Copyright (c)(.*)Oracle and/or its affiliates. All rights reserved. -oracle_validator=.*Copyright (c) (\d{4})(?:, (\d{4}))?, Oracle and/or its affiliates. All rights reserved. 
->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) diff --git a/src/hotspot/os/posix/attachListener_posix.cpp b/src/hotspot/os/posix/attachListener_posix.cpp remerge CONFLICT (content): Merge conflict in src/hotspot/os/posix/attachListener_posix.cpp index f1ad8d81a14..49b53130608 100644 --- a/src/hotspot/os/posix/attachListener_posix.cpp +++ b/src/hotspot/os/posix/attachListener_posix.cpp @@ -45,10 +45,6 @@ #if INCLUDE_SERVICES #ifndef AIX -#ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) -#endif - // The attach mechanism on Linux and BSD uses a UNIX domain socket. An attach // listener thread is created at startup or is created on-demand via a signal // from the client tool. The attach listener creates a socket and binds it to a @@ -65,102 +61,6 @@ // obtain the credentials of client. We check that the effective uid // of the client matches this process. -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -======= -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - public: - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - static PosixAttachOperation* dequeue(); -}; - -class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { 
-private: - int _socket; -public: - SocketChannel(int socket) : _socket(socket) {} - ~SocketChannel() { - close(); - } - - bool opened() const { - return _socket != -1; - } - - void close() { - if (opened()) { - ::close(_socket); - _socket = -1; - } - } - - // RequestReader - int read(void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::read(_socket, buffer, (size_t)size), n); - return checked_cast(n); - } - - // ReplyWriter - int write(const void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::write(_socket, buffer, size), n); - return checked_cast(n); - } - // called after writing all data - void flush() override { - ::shutdown(_socket, SHUT_RDWR); - } -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - SocketChannel _socket_channel; - - public: - void complete(jint res, bufferedStream* st) override; - - PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { - } - - bool read_request() { - return AttachOperation::read_request(&_socket_channel, &_socket_channel); - } -}; - ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // statics char PosixAttachListener::_path[UNIX_PATH_MAX]; bool PosixAttachListener::_has_path; @@ -318,22 +218,6 @@ PosixAttachOperation* PosixAttachListener::dequeue() { } } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - // An operation completion is splitted into two parts. // For proper handling the jcmd connection at CRaC checkpoint action. // An effectively_complete_raw is called in checkpoint processing, before criu engine calls, for properly closing the socket. 
@@ -346,8 +230,6 @@ void PosixAttachOperation::complete(jint result, bufferedStream* st) { delete this; } -======= ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // Complete an operation by sending the operation result and any result // output to the client. At this time the socket is in blocking mode so // potentially we can block if there is a lot of data and the client is @@ -363,7 +245,6 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* return; } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) // write operation result Thread* thread = Thread::current(); if (thread->is_Java_thread()) { @@ -376,24 +257,10 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* } void PosixAttachOperation::write_operation_result(jint result, bufferedStream* st) { - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), SHUT_RDWR); - } - - // done - ::close(this->socket()); - st->reset(); -======= write_reply(&_socket_channel, result, st); - delete this; ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) + _socket_channel.close(); + st->reset(); } static void assert_listener_thread() { diff --git a/src/hotspot/os/posix/attachListener_posix.hpp b/src/hotspot/os/posix/attachListener_posix.hpp index b945020e20d..a0fca688b5f 100644 --- a/src/hotspot/os/posix/attachListener_posix.hpp +++ b/src/hotspot/os/posix/attachListener_posix.hpp @@ -36,7 +36,7 @@ class PosixAttachListener; #include #ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(((struct sockaddr_un *)0)->sun_path) +#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) #endif class 
PosixAttachListener: AllStatic { @@ -53,17 +53,7 @@ class PosixAttachListener: AllStatic { // this is for proper reporting JDK.Chekpoint processing to jcmd peer static PosixAttachOperation* _current_op; - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - static void set_path(char* path) { if (path == nullptr) { _path[0] = '\0'; @@ -84,9 +74,6 @@ class PosixAttachListener: AllStatic { static bool has_path() { return _has_path; } static int listener() { return _listener; } - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - static PosixAttachOperation* dequeue(); static PosixAttachOperation* get_current_op(); static void reset_current_op(); diff --git a/src/hotspot/os/posix/posixAttachOperation.hpp b/src/hotspot/os/posix/posixAttachOperation.hpp index 1d031d882da..10f253a3f76 100644 --- a/src/hotspot/os/posix/posixAttachOperation.hpp +++ b/src/hotspot/os/posix/posixAttachOperation.hpp @@ -26,31 +26,79 @@ #ifndef OS_POSIX_POSIXATTACHOPERATION_HPP #define OS_POSIX_POSIXATTACHOPERATION_HPP +#include "os_posix.hpp" #include "services/attachListener.hpp" class PosixAttachOperation; #if INCLUDE_SERVICES +class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { +private: + int _socket; +public: + SocketChannel(int socket) : _socket(socket) {} + ~SocketChannel() { + close(); + } + + int socket() const { + return _socket; + } + + bool opened() const { + return _socket != -1; + } + + void close() { + if (opened()) { + ::close(_socket); + _socket = -1; + } + } + + // RequestReader + int read(void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::read(_socket, buffer, (size_t)size), n); + return checked_cast(n); + } + + // ReplyWriter + int write(const void* buffer, int size) override { + ssize_t n; + 
RESTARTABLE(::write(_socket, buffer, size), n); + return checked_cast(n); + } + // called after writing all data + void flush() override { + ::shutdown(_socket, SHUT_RDWR); + } +}; + class PosixAttachOperation: public AttachOperation { private: // the connection to the client - int _socket; + SocketChannel _socket_channel; bool _effectively_completed; void write_operation_result(jint result, bufferedStream* st); public: - void complete(jint res, bufferedStream* st); + void complete(jint res, bufferedStream* st) override; void effectively_complete_raw(jint res, bufferedStream* st); bool is_effectively_completed() { return _effectively_completed; } - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } + int socket() { + return _socket_channel.socket();; + } - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); + PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { _effectively_completed = false; } + + bool read_request() { + return AttachOperation::read_request(&_socket_channel, &_socket_channel); + } }; #endif // INCLUDE_SERVICES
Conflicts (diff3) commit c54dd827b39e7e0066959e4985e4aaefd5452a10 (HEAD -> merge-jdk, dmitry-crac/merge-jdk) Merge: 410d0e168c3 23d6f747824 Author: Dmitry Cherepanov Date: Mon Apr 14 13:55:59 2025 +0400 Merge with jdk:jdk-25+3 diff --git a/.jcheck/conf b/.jcheck/conf remerge CONFLICT (content): Merge conflict in .jcheck/conf index 4816067cada..25bd8dd0b94 100644 --- a/.jcheck/conf +++ b/.jcheck/conf @@ -4,15 +4,7 @@ jbs=JDK version=25 [checks] -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) error=whitespace -||||||| ceb4366ebf0 -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists -warning=issuestitle,binary -======= -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists,copyright -warning=issuestitle,binary ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) [checks "reviewers"] committers=1 @@ -21,52 +13,3 @@ ignore=duke [census] version=0 domain=openjdk.org -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -||||||| ceb4366ebf0 - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] -dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp -======= - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] 
-dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp - -[checks "copyright"] -files=^(?!LICENSE|license.txt|.*.bin|.*.gif|.*.jpg|.*.png|.*.icon|.*.tiff|.*.dat|.*.patch|.*.wav|.*.class|.*-header|.*.jar|).* -oracle_locator=.*Copyright (c)(.*)Oracle and/or its affiliates. All rights reserved. -oracle_validator=.*Copyright (c) (\d{4})(?:, (\d{4}))?, Oracle and/or its affiliates. All rights reserved. ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) diff --git a/src/hotspot/os/posix/attachListener_posix.cpp b/src/hotspot/os/posix/attachListener_posix.cpp remerge CONFLICT (content): Merge conflict in src/hotspot/os/posix/attachListener_posix.cpp index b98f7d437a6..49b53130608 100644 --- a/src/hotspot/os/posix/attachListener_posix.cpp +++ b/src/hotspot/os/posix/attachListener_posix.cpp @@ -45,10 +45,6 @@ #if INCLUDE_SERVICES #ifndef AIX -#ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) -#endif - // The attach mechanism on Linux and BSD uses a UNIX domain socket. An attach // listener thread is created at startup or is created on-demand via a signal // from the client tool. The attach listener creates a socket and binds it to a @@ -65,170 +61,6 @@ // obtain the credentials of client. We check that the effective uid // of the client matches this process. 
-<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -||||||| ceb4366ebf0 -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - - static PosixAttachOperation* dequeue(); -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - int _socket; - - public: - void complete(jint res, bufferedStream* st); - - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } - - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); - } -}; - -======= -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - 
public: - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - static PosixAttachOperation* dequeue(); -}; - -class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { -private: - int _socket; -public: - SocketChannel(int socket) : _socket(socket) {} - ~SocketChannel() { - close(); - } - - bool opened() const { - return _socket != -1; - } - - void close() { - if (opened()) { - ::close(_socket); - _socket = -1; - } - } - - // RequestReader - int read(void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::read(_socket, buffer, (size_t)size), n); - return checked_cast(n); - } - - // ReplyWriter - int write(const void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::write(_socket, buffer, size), n); - return checked_cast(n); - } - // called after writing all data - void flush() override { - ::shutdown(_socket, SHUT_RDWR); - } -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - SocketChannel _socket_channel; - - public: - void complete(jint res, bufferedStream* st) override; - - PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { - } - - bool read_request() { - return AttachOperation::read_request(&_socket_channel, &_socket_channel); - } -}; - ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // statics char PosixAttachListener::_path[UNIX_PATH_MAX]; bool PosixAttachListener::_has_path; @@ -386,22 +218,6 @@ PosixAttachOperation* PosixAttachListener::dequeue() { } } 
-<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - // An operation completion is splitted into two parts. // For proper handling the jcmd connection at CRaC checkpoint action. // An effectively_complete_raw is called in checkpoint processing, before criu engine calls, for properly closing the socket. @@ -414,24 +230,6 @@ void PosixAttachOperation::complete(jint result, bufferedStream* st) { delete this; } -||||||| ceb4366ebf0 -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - -======= ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // Complete an operation by sending the operation result and any result // output to the client. 
At this time the socket is in blocking mode so // potentially we can block if there is a lot of data and the client is @@ -447,7 +245,6 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* return; } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) // write operation result Thread* thread = Thread::current(); if (thread->is_Java_thread()) { @@ -460,40 +257,10 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* } void PosixAttachOperation::write_operation_result(jint result, bufferedStream* st) { - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), SHUT_RDWR); - } - - // done - ::close(this->socket()); - st->reset(); -||||||| ceb4366ebf0 - // write operation result - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), 2); - } - - // done - ::close(this->socket()); - - delete this; -======= write_reply(&_socket_channel, result, st); - delete this; ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) + _socket_channel.close(); + st->reset(); } static void assert_listener_thread() { diff --git a/src/hotspot/os/posix/attachListener_posix.hpp b/src/hotspot/os/posix/attachListener_posix.hpp index b945020e20d..a0fca688b5f 100644 --- a/src/hotspot/os/posix/attachListener_posix.hpp +++ b/src/hotspot/os/posix/attachListener_posix.hpp @@ -36,7 +36,7 @@ class PosixAttachListener; #include #ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX 
sizeof(((struct sockaddr_un *)0)->sun_path) +#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) #endif class PosixAttachListener: AllStatic { @@ -53,17 +53,7 @@ class PosixAttachListener: AllStatic { // this is for proper reporting JDK.Chekpoint processing to jcmd peer static PosixAttachOperation* _current_op; - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - static void set_path(char* path) { if (path == nullptr) { _path[0] = '\0'; @@ -84,9 +74,6 @@ class PosixAttachListener: AllStatic { static bool has_path() { return _has_path; } static int listener() { return _listener; } - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - static PosixAttachOperation* dequeue(); static PosixAttachOperation* get_current_op(); static void reset_current_op(); diff --git a/src/hotspot/os/posix/posixAttachOperation.hpp b/src/hotspot/os/posix/posixAttachOperation.hpp index 1d031d882da..10f253a3f76 100644 --- a/src/hotspot/os/posix/posixAttachOperation.hpp +++ b/src/hotspot/os/posix/posixAttachOperation.hpp @@ -26,31 +26,79 @@ #ifndef OS_POSIX_POSIXATTACHOPERATION_HPP #define OS_POSIX_POSIXATTACHOPERATION_HPP +#include "os_posix.hpp" #include "services/attachListener.hpp" class PosixAttachOperation; #if INCLUDE_SERVICES +class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { +private: + int _socket; +public: + SocketChannel(int socket) : _socket(socket) {} + ~SocketChannel() { + close(); + } + + int socket() const { + return _socket; + } + + bool opened() const { + return _socket != -1; + } + + void close() { + if (opened()) { + ::close(_socket); + _socket = -1; + } + } + + // RequestReader + int read(void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::read(_socket, buffer, (size_t)size), n); + 
return checked_cast(n); + } + + // ReplyWriter + int write(const void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::write(_socket, buffer, size), n); + return checked_cast(n); + } + // called after writing all data + void flush() override { + ::shutdown(_socket, SHUT_RDWR); + } +}; + class PosixAttachOperation: public AttachOperation { private: // the connection to the client - int _socket; + SocketChannel _socket_channel; bool _effectively_completed; void write_operation_result(jint result, bufferedStream* st); public: - void complete(jint res, bufferedStream* st); + void complete(jint res, bufferedStream* st) override; void effectively_complete_raw(jint res, bufferedStream* st); bool is_effectively_completed() { return _effectively_completed; } - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } + int socket() { + return _socket_channel.socket();; + } - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); + PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { _effectively_completed = false; } + + bool read_request() { + return AttachOperation::read_request(&_socket_channel, &_socket_channel); + } }; #endif // INCLUDE_SERVICES
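For readers skimming the resolution above: `PosixAttachOperation` no longer holds a raw `int _socket` but a `SocketChannel` that closes the descriptor itself and restarts interrupted reads and writes. Below is a rough standalone illustration of that pattern, not HotSpot code. `FdChannel`, `retry_on_eintr`, `write_all` and `round_trip` are names made up for this sketch, the lambda-based helper merely stands in for the `RESTARTABLE` macro, and the loop over short writes mirrors what the removed `write_fully` did.

```cpp
// Standalone sketch for POSIX systems; all names are illustrative, not HotSpot's.
#include <cassert>   // used by the usage check mentioned below
#include <cerrno>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Repeat a syscall while it fails with EINTR (the role RESTARTABLE plays in the patch).
template <typename F>
static ssize_t retry_on_eintr(F call) {
  ssize_t n;
  do {
    n = call();
  } while (n == -1 && errno == EINTR);
  return n;
}

// Owns one connected socket and closes it at most once.
class FdChannel {
  int _fd;
 public:
  explicit FdChannel(int fd) : _fd(fd) {}
  FdChannel(const FdChannel&) = delete;
  FdChannel& operator=(const FdChannel&) = delete;
  ~FdChannel() { close(); }

  bool opened() const { return _fd != -1; }

  void close() {
    if (opened()) {
      ::close(_fd);
      _fd = -1;
    }
  }

  ssize_t read(void* buf, size_t size) {
    return retry_on_eintr([&] { return ::read(_fd, buf, size); });
  }

  // Writes the whole buffer, looping over short writes; returns false on error.
  bool write_all(const void* buf, size_t size) {
    const char* p = static_cast<const char*>(buf);
    while (size > 0) {
      ssize_t n = retry_on_eintr([&] { return ::write(_fd, p, size); });
      if (n <= 0) return false;
      p += n;
      size -= static_cast<size_t>(n);
    }
    return true;
  }

  // Tell the peer that the reply is complete, as flush() does in the patch.
  void flush() { ::shutdown(_fd, SHUT_RDWR); }
};

// Sends msg over a local socket pair and returns what the other end received.
static std::string round_trip(const std::string& msg) {
  int fds[2];
  if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
  FdChannel writer(fds[0]);
  FdChannel reader(fds[1]);
  if (!writer.write_all(msg.data(), msg.size())) return "";
  writer.flush();
  std::string out;
  char buf[64];
  ssize_t n;
  while ((n = reader.read(buf, sizeof(buf))) > 0) {
    out.append(buf, static_cast<size_t>(n));
  }
  return out;  // both descriptors are closed by the FdChannel destructors
}
```

A quick check such as `assert(round_trip("ping") == "ping");` should hold, because data queued before `shutdown` is still delivered to the peer before it sees end-of-stream. Tying the descriptor's lifetime to the channel object is what lets the merged `write_operation_result` simply call `_socket_channel.close()` instead of a bare `::close(this->socket())`.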
-------------

Commit messages:
 - Merge with jdk:jdk-25+3
 - 8346463: Add test coverage for deploying the default provider as a module
 - 8346306: Unattached thread can cause crash during VM exit if it calls wait_if_vm_exited
 - 8340401: DcmdMBeanPermissionsTest.java and SystemDumpMapTest.java fail with assert(_stack_base != nullptr) failed: Sanity check
 - 8346475: RISC-V: Small improvement for MacroAssembler::ctzc_bit
 - 8346016: Problemlist vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_a in virtual thread mode
 - 8346132: fallbacklinker.c failed compilation due to unused variable
 - 8346570: SM cleanup of tests for Beans and Serialization
 - 8346532: XXXVector::rearrangeTemplate misses null check
 - 8346300: Add @Test annotation to TCKZoneId.test_constant_OLD_IDS_POST_2024b test
 - ... and 84 more: https://git.openjdk.org/crac/compare/410d0e16...c54dd827

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - crac: https://webrevs.openjdk.org/?repo=crac&pr=224&range=00.0
 - jdk:jdk-25+3: https://webrevs.openjdk.org/?repo=crac&pr=224&range=00.1

Changes: https://git.openjdk.org/crac/pull/224/files
  Stats: 14344 lines in 943 files changed: 9675 ins; 2416 del; 2253 mod
  Patch: https://git.openjdk.org/crac/pull/224.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/224/head:pull/224

PR: https://git.openjdk.org/crac/pull/224

From tpushkin at openjdk.org Wed Apr 16 13:19:58 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Wed, 16 Apr 2025 13:19:58 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
Message-ID:

Fixes the failing test, for simplicity pretending that both `jdk.crac` and `jdk.management/jdk.crac.management` were added in JDK 24 and before that there was no CRaC in the JDK. Otherwise we would need to retroactively generate symbols for JDKs 17–23 which is a decent amount of work (there are no public CRaC builds for some of these versions).

JDK 24 symbols were updated this way:
1. Create a custom build from the last OpenJDK 24 CRaC commit 884d0746b168550f13bdc687b1d96d468aec4411 (the last commit before JDK 25 was merged).
2. Update the symbols from that build using `make/scripts/generate-symbol-data.sh`.
3. Manually remove the CRaC methods removed in d64fb30c0874d93c986ad04ac3995a727b7a1ac8 from the symbols.

Also adds the since-checking tests to CI.

I initially wanted to also add a since-checking test for the `jdk.crac` module but `SinceChecker` seems to have a bug which makes the test fail with "module: jdk.crac: `@since` version is 24 but the element exists before JDK 10". I believe this is a `SinceChecker` bug because the same happens for other modules added after JDK 9 without a legacy preview, e.g. `jdk.graal.compiler`.

-------------

Commit messages:
 - Fix since checker test

Changes: https://git.openjdk.org/crac/pull/225/files
  Webrev: https://webrevs.openjdk.org/?repo=crac&pr=225&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8354679
  Stats: 80 lines in 7 files changed: 75 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/crac/pull/225.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/225/head:pull/225

PR: https://git.openjdk.org/crac/pull/225