From tpushkin at openjdk.org Wed Apr 2 14:00:26 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 2 Apr 2025 14:00:26 GMT Subject: [crac] Integrated: 8352413: [CRaC] crexec fails to pass some options when CRAC_CRIU_OPTS is already used In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 12:04:49 GMT, Timofei Pushkin wrote: > ~~This contains the change from #216 so that should be merged first.~~ UPD: rebased. > > The fix itself is small but coming up with a way to test it was not trivial: > 1. I've split `jdk/crac/CracEngineOptionsTest.java` into `jdk/crac/engineOptions/ParsingTest.java` and `jdk/crac/engineOptions/HelpTest.java` because it was getting too large (nothing added/removed, just split). > 2. Added `jdk/crac/engineOptions/CracCriuOptsTest.java` to regression-test the main fix of this PR (this test depends on #216). > 3. Removed a part that tested that `args` are actually applied by `crexec` from `jdk/crac/VMOptionsTest.java` because (2) now effectively tests this (`VMOptionsTest` wasn't a proper place for this to begin with, it was just convenient). This pull request has now been integrated. Changeset: f1aa8900 Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/f1aa890020af46ae8903a58de68b475f34c53576 Stats: 601 lines in 6 files changed: 364 ins; 232 del; 5 mod 8352413: [CRaC] crexec fails to pass some options when CRAC_CRIU_OPTS is already used Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/217 From rvansa at openjdk.org Wed Apr 2 14:03:26 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 2 Apr 2025 14:03:26 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: On Mon, 31 Mar 2025 08:58:32 GMT, Timofei Pushkin wrote: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. 
> > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location exec_location src/hotspot/share/runtime/crac_engine.cpp line 415: > 413: } > 414: > 415: if (strcmp(id, "crexec") == 0) { I don't think you need the `crexec` check here. I would also drop the logging parts, unlikely to be useful in practice. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2024895757 From rvansa at openjdk.org Wed Apr 2 14:34:31 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 2 Apr 2025 14:34:31 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v2] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). 
Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: fixup ------------- Changes: - all: https://git.openjdk.org/crac/pull/219/files - new: https://git.openjdk.org/crac/pull/219/files/1f46d9eb..3aabf07c Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=219&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=219&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From tpushkin at openjdk.org Wed Apr 2 14:55:07 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 2 Apr 2025 14:55:07 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:00:49 GMT, Radim Vansa wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. 
This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location exec_location > > src/hotspot/share/runtime/crac_engine.cpp line 415: > >> 413: } >> 414: >> 415: if (strcmp(id, "crexec") == 0) { > > I don't think you need the `crexec` check here. I would also drop the logging parts, unlikely to be useful in practice. It may be an external C/R engine with `exec_location` option - JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. Regarding the logs, agree, I'll remove them. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2025002510 From tpushkin at openjdk.org Thu Apr 3 07:00:48 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Thu, 3 Apr 2025 07:00:48 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: Message-ID: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. 
> > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request incrementally with two additional commits since the last revision: - Use comma as a separator when printing controlled options - Simplify vm_controlled_options ------------- Changes: - all: https://git.openjdk.org/crac/pull/220/files - new: https://git.openjdk.org/crac/pull/220/files/71c4227f..9cf82953 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=220&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=220&range=00-01 Stats: 19 lines in 2 files changed: 6 ins; 6 del; 7 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From rvansa at openjdk.org Mon Apr 7 08:37:28 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 08:37:28 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: Message-ID: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> On Wed, 2 Apr 2025 14:52:52 GMT, Timofei Pushkin wrote: >> src/hotspot/share/runtime/crac_engine.cpp line 415: >> >>> 413: } >>> 414: >>> 415: if (strcmp(id, "crexec") == 0) { >> >> I don't think you need the `crexec` check here. 
I would also drop the logging parts, unlikely to be useful in practice. > > It may be an external C/R engine with `exec_location` option - JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. > > Regarding the logs, agree, I'll remove them. Let's ignore the theoretical ambiguity for `crexec`... > It may be an external C/R engine with exec_location option - JVM won't block the user from using this option so it shouldn't be included here. I do not get the message. If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030733676 From tpushkin at openjdk.org Mon Apr 7 08:51:12 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 08:51:12 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 08:34:42 GMT, Radim Vansa wrote: >> It may be an external C/R engine with `exec_location` option - JVM won't block the user from using this option so it shouldn't be included here. It is possible for this engine to also be named "crexec" so this check is not fully robust (it would be more robust to record that we've loaded _our crexec_) but I've decided this is a decent balance between robustness and code simplicity. >> >> Regarding the logs, agree, I'll remove them. 
> > Let's ignore the theoretical ambiguity for `crexec`... > >> It may be an external C/R engine with exec_location option - JVM won't block the user from using this option so it shouldn't be included here. > > I do not get the message. If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location` - in no other case `exec_location` is used. So no other engine should be able to access this usage of `exec_location`: - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such a name) - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030758945 From rvansa at openjdk.org Mon Apr 7 09:38:20 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 09:38:20 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 08:48:18 GMT, Timofei Pushkin wrote: >> Let's ignore the theoretical ambiguity for `crexec`... >> >>> It may be an external C/R engine with exec_location option - JVM won't block the user from using this option so it shouldn't be included here. >> >> I do not get the message. 
If an external engine supports that option, then JVM should provide the information (maybe the engine wants to load some shared library that is in the lib directory?). It shouldn't treat `our crexec` differently than any external engine. > > I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location` - in no other case `exec_location` is used. > > So no other engine should be able to access this usage of `exec_location`: > - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such a name) > - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) Although `crexec` is part of the JVM codebase and it allows JVM to use executable-based engine implementations, I don't consider it a part of JVM; the separation line should be drawn at CRE API level. So it should not be a "JVM's implementation detail". In the first version of the API it was called `library_path`: informing the engine about a place where it should load other executables/libraries from. In fact, it is non-trivial to programmatically figure out from within a shared library where it was loaded from (if e.g. it needs to load some extra resource that should be in the same directory) so it might be useful. 
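For reference on the lookup discussed here, the following is a minimal sketch of how a shared library could ask the dynamic linker for its own file path on Linux/glibc with `dladdr`. The helper names `own_module_path` and `own_module_dir` are hypothetical and are not taken from the CRaC sources; a Windows build would need `GetModuleFileName` instead.

```cpp
#include <dlfcn.h>   // dladdr, Dl_info (GNU extension; g++ defines _GNU_SOURCE by default)
#include <string>

// Returns the file name of the shared object (or executable) that contains
// this code, as reported by the dynamic linker, or an empty string on failure.
// Hypothetical helper, shown only to illustrate the technique.
std::string own_module_path() {
  static const char anchor = 0;  // any address inside this module will do
  Dl_info info;
  if (dladdr(&anchor, &info) == 0 || info.dli_fname == nullptr) {
    return std::string();
  }
  return std::string(info.dli_fname);
}

// Directory part of own_module_path(); "." when the path has no slash
// (including the failure case where the path is empty).
std::string own_module_dir() {
  const std::string path = own_module_path();
  const std::string::size_type slash = path.rfind('/');
  if (slash == std::string::npos) {
    return std::string(".");
  }
  return slash == 0 ? std::string("/") : path.substr(0, slash);
}
```

An engine library could use something like `own_module_dir()` to look for companion files next to itself. Older glibc versions may require linking with `-ldl`.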
------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2030861225 From tpushkin at openjdk.org Mon Apr 7 11:23:11 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 11:23:11 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 09:35:55 GMT, Radim Vansa wrote: >> I treat `exec_location` and the whole `crexec` as a JVM's implementation detail: if JVM determines that the engine passed to `CRaCEngine` is an executable then it uses `crexec` and passes it the location of the real engine via `exec_location` - in no other case `exec_location` is used. >> >> So no other engine should be able to access this usage of `exec_location`: >> - if the engine is a library, JVM won't use `exec_location` at all (and it will allow the user to use an option with such a name) >> - if the engine is an executable, JVM will pass `exec_location` to `crexec` but it won't be passed to the engine executable itself (the user can use `args` to pass arguments to the executable) > > Although `crexec` is part of the JVM codebase and it allows JVM to use executable-based engine implementations, I don't consider it a part of JVM; the separation line should be drawn at CRE API level. So it should not be a "JVM's implementation detail". > > In the first version of the API it was called `library_path`: informing the engine about a place where it should load other executables/libraries from. In fact, it is non-trivial to programmatically figure out from within a shared library where it was loaded from (if e.g. it needs to load some extra resource that should be in the same directory) so it might be useful. I agree with what you say about "implementation detail", looks like we just interpret these words a bit differently. 
> In the first version of the API it was called `library_path`: informing the engine about a place where it should load other executables/libraries from. Yes, but just `library_path` is not enough, engine executable does not always come from there since the user can provide an arbitrary absolute path. In the first version of the API `crexec` ended up using `args` instead of `library_path` for this. We could pass `exec_location` to all engines that accept it (would be better to rename it to `engine_location` then) but I am not sure how useful this would be to other engines besides `crexec`: I've never tried to get a file address of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031028270 From rvansa at openjdk.org Mon Apr 7 11:58:09 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 11:58:09 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 11:20:48 GMT, Timofei Pushkin wrote: > Yes, but just library_path is not enough, engine executable does not always come from there since the user can provide an arbitrary absolute path. In the first version of the API crexec ended up using args instead of library_path for this. ... > would be better to rename it to `engine_location` In the current impl it is telling the `lib` path within JDK installation location, rather than location of the engine (shared library). Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. > I've never tried to get a file address of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). 
Looking into this again I see that it is actually quite simple using `dladdr`. I am not sure why I have resorted to reading `/proc/self/maps` in some of my code... However, if the engine is set using an absolute path outside JVM it might be problematic to get the JVM path (`JAVA_HOME` not being set...). > We could pass exec_location to all engines that accept it ... but I am not sure how useful this would be to other engines I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. Btw. there's a third option: we could use the `dladdr/GetModuleFileName` in `crexec` and drop `exec_location` completely. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031081534 From tpushkin at openjdk.org Mon Apr 7 12:37:16 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 12:37:16 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 11:55:28 GMT, Radim Vansa wrote: > I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. > Btw. there's a third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. 
But we'll still need to pass the location of the engine executable because it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031144476 From tpushkin at openjdk.org Mon Apr 7 13:06:27 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 7 Apr 2025 13:06:27 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 12:33:32 GMT, Timofei Pushkin wrote: >>> Yes, but just library_path is not enough, engine executable does not always come from there since the user can provide an arbitrary absolute path. In the first version of the API crexec ended up using args instead of library_path for this. >> ... >>> would be better to rename it to `engine_location` >> >> In the current impl it is telling the `lib` path within JDK installation location, rather than location of the engine (shared library). Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. >> >>> I've never tried to get a file address of a shared library from within itself but after some googling it doesn't seem too complicated (for a not so commonly needed thing). >> >> Looking into this again I see that it is actually quite simple using `dladdr`. I am not sure why I have resorted to reading `/proc/self/maps` in some of my code... However, if the engine is set using an absolute path outside JVM it might be problematic to get the JVM path (`JAVA_HOME` not being set...). >> >>> We could pass exec_location to all engines that accept it ... 
but I am not sure how useful this would be to other engines >> >> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. >> >> Btw. there's a third option: we could use the `dladdr/GetModuleFileName` in `crexec` and drop `exec_location` completely. > >> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. > > Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. > >> Btw. there's a third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. > > Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. > Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. If JVM doesn't know about it then it should be passed by the user directly: `-XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu` - 
this would be somewhat inconvenient from a UX perspective ------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031197494 From rvansa at openjdk.org Mon Apr 7 15:12:22 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 7 Apr 2025 15:12:22 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v2] In-Reply-To: References: <17eW_jw6o7L6-p_JCfLavMTGOiqfw6jC92AFSrIKGFY=.844191e2-a2c0-4689-8a6f-2d544f9bf2cf@github.com> Message-ID: On Mon, 7 Apr 2025 13:03:03 GMT, Timofei Pushkin wrote: >>> I might have focused too much on this being "handy", but my main objective was to not have any code in JVM that would treat different engines differently. >> >> Well, JVM will treat `crexec` somewhat specially anyway because this is the only engine that is not requested by the user directly but rather by the JVM (when it finds that the user requested an executable engine), but I get it that you want to reduce such special handling as much as possible. >> >>> Btw. there's a third option: we could use the dladdr/GetModuleFileName in crexec and drop exec_location completely. >> >> Through `dladdr`/`GetModuleFileName` crexec can find where it is located, i.e. this is indeed a substitution for `library_path`. But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, not only `lib` within JDK installation. The most convenient way would be to use an option which leads us to `exec_location` again. > >> Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. > > If JVM doesn't know about it then it should be passed by the user directly: `-XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu` - 
this would be somewhat inconvenient from a UX perspective > But we'll still need to pass the location of the engine executable (otherwise crexec won't know what exactly to call), and it can be any user-provided absolute path, I had to re-read the code again - my understanding of the option meaning was really wrong, it is passing the resolved location of the executable. So now I understand your argument, and the "third option" is really not viable. And I have to withdraw the argument that it's useful to external implementations; in fact it's set to `nullptr` if `CRaCEngine` does not refer to a shared library. >> Anyway this doesn't have to be exhaustive: there might be another option (JVM wouldn't know about) to hint about some other path. > If JVM doesn't know about it then it should be passed by the user directly: -XX:CRaCEngine=crexec -XX:CRaCEngineOptions=engine=criu - this would be somewhat inconvenient from a UX perspective Due to my confusion we were talking about two different things. Let's imagine a hypothetical engine `foobar`; this engine would live in `$JAVA_HOME/lib/libfoobar.so` and be invoked with `-XX:CRaCEngine=foobar`. * if the engine would for some reason need to load `libjsig.so` it would need the `library_path` * if the engine would need to execute binary `foobar-tool` it could likely refer to the same directory through `dladdr` * if the engine needs to load `libzip.so` from a non-standard location, you would pass `-XX:CRaCEngineOptions=foobar.lib_dir=/opt/myzip` <- by non-exhaustive I meant that you could do this rather than lookup in engine's directory or `$JAVA_HOME/lib` In any case, I would suggest having a fixed list of reserved CRE options, and as you noted, omit engine-specific handling as much as possible. Let's keep things as simple as they can be. 
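To illustrate the suggestion of a fixed list of reserved options, here is a minimal sketch of a static table that is checked the same way for every engine, together with a help line that joins the names with commas. The two option names come from the crexec help output quoted in this thread; the array and function names and their exact shape are assumptions and may differ from what `crac_engine.cpp` actually does.

```cpp
#include <cstring>
#include <string>

// Assumed fixed list of option names the JVM sets itself (illustrative only).
static const char* const vm_controlled_options[] = {
  "image_location",
  "exec_location"
};

// True if the user must not set this option directly, whatever the engine is.
bool is_vm_controlled(const char* name) {
  for (const char* opt : vm_controlled_options) {
    if (std::strcmp(name, opt) == 0) {
      return true;
    }
  }
  return false;
}

// Builds the trailing help line, using a comma as the separator.
std::string vm_controlled_help_line() {
  std::string line = "Configuration options controlled by the JVM: ";
  bool first = true;
  for (const char* opt : vm_controlled_options) {
    if (!first) {
      line += ", ";
    }
    line += opt;
    first = false;
  }
  return line;
}
```

With a table like this there is no need for a `strcmp(id, "crexec")` branch: the same check applies regardless of which engine was loaded.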
------------- PR Review Comment: https://git.openjdk.org/crac/pull/220#discussion_r2031462524 From tpushkin at openjdk.org Tue Apr 8 10:43:54 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:43:54 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v3] In-Reply-To: References: Message-ID: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". 
> > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: Make list of VM-controlled options static ------------- Changes: - all: https://git.openjdk.org/crac/pull/220/files - new: https://git.openjdk.org/crac/pull/220/files/9cf82953..a3622886 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=220&range=02 - incr: https://webrevs.openjdk.org/?repo=crac&pr=220&range=01-02 Stats: 76 lines in 4 files changed: 19 ins; 34 del; 23 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From tpushkin at openjdk.org Tue Apr 8 10:53:22 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:53:22 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v3] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 10:43:54 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. 
>> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. Rebased on top of the current main branch just in case (GitHub showed a conflict), no changes ------------- PR Comment: https://git.openjdk.org/crac/pull/220#issuecomment-2786026131 From tpushkin at openjdk.org Tue Apr 8 10:53:21 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 10:53:21 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: References: Message-ID: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. 
> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Make list of VM-controlled options static - Use comma as a separator when printing controlled options - Simplify vm_controlled_options - Show all options in engine help ------------- Changes: https://git.openjdk.org/crac/pull/220/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=220&range=03 Stats: 82 lines in 6 files changed: 61 ins; 8 del; 13 mod Patch: https://git.openjdk.org/crac/pull/220.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/220/head:pull/220 PR: https://git.openjdk.org/crac/pull/220 From rvansa at openjdk.org Tue Apr 8 12:25:27 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 8 Apr 2025 12:25:27 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> References: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> Message-ID: On Tue, 8 Apr 2025 10:53:21 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. 
>> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Make list of VM-controlled options static > - Use comma as a separator when printing controlled options > - Simplify vm_controlled_options > - Show all options in engine help Marked as reviewed by rvansa (Committer). 
------------- PR Review: https://git.openjdk.org/crac/pull/220#pullrequestreview-2749785631 From duke at openjdk.org Tue Apr 8 13:53:35 2025 From: duke at openjdk.org (duke) Date: Tue, 8 Apr 2025 13:53:35 GMT Subject: [crac] RFR: 8353243: [CRaC] Show all options in engine help [v4] In-Reply-To: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> References: <0gvDH_Ady16eBFVeu_UC1YjNlO95F8uVFsJ0gasTMjA=.25493308-9972-4d42-96ef-b1841ca4e4bb@github.com> Message-ID: On Tue, 8 Apr 2025 10:53:21 GMT, Timofei Pushkin wrote: >> C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. >> >> crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. >> >> This is how crexec's help looks with this change: >> >> $ java -XX:CRaCEngineOptions=help >> crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. >> >> Configuration options: >> * image_location= (no default) - path to a directory with checkpoint/restore files. >> * exec_location= (no default) - path to the engine executable. >> * keep_running= (default: false) - keep the process running after the checkpoint or kill it. >> * direct_map= (default: true) - on restore, map process data directly from saved files. This may speedup the restore but the resulting process will not be the same as before the checkpoint. >> * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". >> >> Configuration options controlled by the JVM: image_location, exec_location > > Timofei Pushkin has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Make list of VM-controlled options static > - Use comma as a separator when printing controlled options > - Simplify vm_controlled_options > - Show all options in engine help @TimPushkin Your change (at version 747d25065ea2f9d1071627a0ba451e66fb5f7005) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/crac/pull/220#issuecomment-2786510403 From tpushkin at openjdk.org Tue Apr 8 14:02:48 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 8 Apr 2025 14:02:48 GMT Subject: [crac] Integrated: 8353243: [CRaC] Show all options in engine help In-Reply-To: References: Message-ID: <3AJ1yy7T1GG-CaznlFMf6dgb1Ddv4-D3sUKl9zl0k_8=.22a0972f-9e3f-49b6-a8ba-a0c0fbbd3c7b@github.com> On Mon, 31 Mar 2025 08:58:32 GMT, Timofei Pushkin wrote: > C/R engines are now advised to list all options in `configuration_doc`. If JVM does not let users to control some options it states that in the engine help message. > > crexec now documents internal options, such as `image_location` and `exec_location`, in its doc message. > > This is how crexec's help looks with this change: > > $ java -XX:CRaCEngineOptions=help > crexec - pseudo-CRaC-engine used to relay data from JVM to a "real" engine implemented as an executable (instead of a library). The engine executable is expected to have CRaC-CRIU-like CLI. Support of the configuration options also depends on the engine executable. > > Configuration options: > * image_location= (no default) - path to a directory with checkpoint/restore files. > * exec_location= (no default) - path to the engine executable. > * keep_running= (default: false) - keep the process running after the checkpoint or kill it. > * direct_map= (default: true) - on restore, map process data directly from saved files. 
This may speedup the restore but the resulting process will not be the same as before the checkpoint. > * args= (default: "") - free space-separated arguments passed directly to the engine executable, e.g. "--arg1 --arg2 --arg3". > > Configuration options controlled by the JVM: image_location, exec_location This pull request has now been integrated. Changeset: 410d0e16 Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/410d0e168c326b7d892af1b9e990eb4a2b5e0fa1 Stats: 82 lines in 6 files changed: 61 ins; 8 del; 13 mod 8353243: [CRaC] Show all options in engine help Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/220 From akozlov at azul.com Wed Apr 9 10:47:12 2025 From: akozlov at azul.com (Anton Kozlov) Date: Wed, 9 Apr 2025 13:47:12 +0300 Subject: CFV: New CRaC Committer: Timofei Pushkin Message-ID: <7816a446-b27a-4098-b061-01a001818d84@azul.com> I hereby nominate Timofei Pushkin to CRaC Committer. Timofei is an engineer at Azul who has contributed 45 patches [3]. He is an active contributor to the project, and we expect him to continue working on the project. Votes are due by Thu Apr 24 2025 9AM PST. Only current CRaC Committers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Anton Kozlov [1] https://openjdk.org/census [2] https://openjdk.org/projects/#committer-vote [3] https://github.com/openjdk/crac/pulls?q=is%3Apr+is%3Aclosed++author%3Atimpushkin+label%3Aintegrated From rvansa at azul.com Thu Apr 10 06:37:21 2025 From: rvansa at azul.com (Radim Vansa) Date: Thu, 10 Apr 2025 08:37:21 +0200 Subject: CFV: New CRaC Committer: Timofei Pushkin In-Reply-To: <7816a446-b27a-4098-b061-01a001818d84@azul.com> References: <7816a446-b27a-4098-b061-01a001818d84@azul.com> Message-ID: <5b923be7-0405-4fd7-adef-a997e044d2fd@azul.com> Vote: yes Radim On 09. 04. 
25 12:47, Anton Kozlov wrote: > Caution: This email originated from outside of the organization. Do > not click links or open attachments unless you recognize the sender > and know the content is safe. > > > I hereby nominate Timofei Pushkin to CRaC Committer. > > Timofei is an engineer at Azul who has contributed 45 patches [3]. He is > an active contributor to the project, and we expect him to continue > working > on the project. > > Votes are due by Thu Apr 24 2025 9AM PST. > > Only current CRaC Committers [1] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Anton Kozlov > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#committer-vote > [3] > https://github.com/openjdk/crac/pulls?q=is%3Apr+is%3Aclosed++author%3Atimpushkin+label%3Aintegrated > From mz1999 at gmail.com Thu Apr 10 09:30:04 2025 From: mz1999 at gmail.com (ma zhen) Date: Thu, 10 Apr 2025 17:30:04 +0800 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC Message-ID: Hi CRaC developers, I'm currently exploring the integration of CRaC support into our company's middleware products. I'm also very interested in the underlying implementation details of CRaC and have been doing some research into its mechanics. As I understand it, CRaC leverages CRIU under the hood for checkpointing and restoring running processes. My research indicates that CRIU itself is capable of handling open file descriptors and established network connections during the checkpoint/restore cycle. However, the CRaC API requires developers to explicitly manage these resources, typically by closing them in the beforeCheckpoint() and re-establishing them in the afterRestore(). To understand the rationale behind this design choice, I looked into the initial CRaC prototype, specifically the first PR ( https://github.com/openjdk/crac/pull/1). 
It appears that even in this early version, the implementation iterated through all process file descriptors during checkpoint. It ignored certain FDs (like those related to classpath files, /dev/random, /dev/urandom, and files marked M_PERSISTENT - though I'm unclear on the exact meaning of M_PERSISTENT in this context). If any other application-opened files remained, the checkpoint process would fail. This suggests the requirement for manual resource management was present from the outset. As I'm not deeply familiar with JVM internals, I'm struggling to fully grasp the reasoning. Was this restriction primarily introduced to simplify the initial design and implementation of CRaC within the JVM? I also noticed that current versions of CRaC include File Descriptor Policies. These allow configuring an action: ignore for specific file descriptors, effectively delegating their handling to CRIU. This seems to demonstrate that letting CRIU manage certain open files is feasible within the CRaC framework. This leads me to wonder: if delegation to CRIU is possible and works (at least for some cases via policies), why isn't relying on CRIU for resource handling the default or more broadly encouraged approach? Why the strict requirement for manual closure and reopening in the general case? For instance, consider using System.getLogger() from the JDK Platform Logging API. As application developers, we don't typically manage the underlying file descriptor for the log file directly. To make this work with CRaC, we currently need to identify and configure a File Descriptor Policy for it, which can feel somewhat cumbersome. Wouldn't a smoother experience involve CRaC (perhaps optionally) defaulting to letting CRIU handle such internally managed resources, like those opened by standard JDK libraries? 
I would appreciate any insights or clarification you could offer on the design philosophy behind CRaC's approach to managing external resources like files and sockets, especially in contrast to CRIU's capabilities. Thanks for your time and any insights you can share. Best regards, mazhen From rvansa at openjdk.org Thu Apr 10 21:27:11 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 10 Apr 2025 21:27:11 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Merge branch 'crac' into zgc - fixup - 8353241: CRaC ZGC support ------------- Changes: https://git.openjdk.org/crac/pull/219/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=219&range=02 Stats: 104 lines in 10 files changed: 85 ins; 7 del; 12 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From rvansa at openjdk.org Thu Apr 10 21:27:11 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 10 Apr 2025 21:27:11 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:34:31 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`).
> > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > fixup Conflicts resolved. ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2795198990 From rvansa at azul.com Fri Apr 11 07:17:16 2025 From: rvansa at azul.com (Radim Vansa) Date: Fri, 11 Apr 2025 09:17:16 +0200 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC In-Reply-To: References: Message-ID: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> Hi Ma Zhen, you have correctly observed that closing file descriptors is rather an architectural choice than purely a technical need. CRIU is really capable of restoring the process as-is, as its main motivation is migration of running containers. Containers already define the filesystem, and the runtime is in control of external connections - e.g. CRIU can checkpoint and later restore an open socket connection, and the container runtime restores the 'second half' of the socket so that the pause is transparent to the running process. If this is what you want, there's nothing preventing you from using CRIU on a Java process manually - at the risk of breaking the internal logic of the application. However the point of CRaC is not such a transparent restore: we want to preserve the valuable state of JVM and application but adapt it to the new environment. We want to do a conscious decision about any resource external to the process. Being forced to gracefully adapt to the restore is a feature. Yes, we have File Descriptor policies, but that's not a solution - it provides a workaround for proof-of-concepts, until some code that you can't easily fix gets updated to support CRaC properly. Ideas meet practicality, and you are responsible for realizing what should be done with particular external resource. 
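As an illustration of such a conscious decision about an external resource, here is a minimal sketch (not code from CRaC itself) of a log file that is closed before a checkpoint and reopened after a restore. The `CheckpointAware` interface and the `LogFile` class are hypothetical stand-ins so that the example runs on any JDK; in a real application the callbacks would be the `beforeCheckpoint(...)` and `afterRestore(...)` methods of the CRaC `Resource` interface, registered with the CRaC context.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReopeningLog {
    // Hypothetical stand-in for the CRaC resource callbacks (assumption, not the real API).
    interface CheckpointAware {
        void beforeCheckpoint() throws Exception;
        void afterRestore() throws Exception;
    }

    static final class LogFile implements CheckpointAware {
        private final Path path;
        private OutputStream out; // null while the file is closed for a checkpoint

        LogFile(Path path) throws IOException {
            this.path = path;
            open();
        }

        private void open() throws IOException {
            // Append so that lines written before the checkpoint are preserved.
            out = Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }

        void log(String line) throws IOException {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        }

        boolean isOpen() {
            return out != null;
        }

        @Override
        public void beforeCheckpoint() throws IOException {
            // Release the file descriptor so that nothing external stays open.
            out.close();
            out = null;
        }

        @Override
        public void afterRestore() throws IOException {
            // Re-acquire the resource; the path could be re-resolved here for a new environment.
            open();
        }
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("crac-sketch", ".log");
        LogFile log = new LogFile(p);
        log.log("before checkpoint");
        log.beforeCheckpoint();
        // A real checkpoint and restore would happen here.
        log.afterRestore();
        log.log("after restore");
        System.out.println(Files.readAllLines(p));
    }
}
```

The point of the sketch is only the shape of the contract: the descriptor does not exist at checkpoint time, and the code that owns it decides how to bring it back.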
You're right that ATM we don't handle JDK Platform Logging (and neither JUL) configured to write to a file, and since that is JDK code out of user control it is a bug. We attempt to fix those one by one (PRs are welcome!). I hope I have provided some insight to these choices - and yes, I understand the pain as we still have many places to fix. Cheers, Radim On 10. 04. 25 11:30, ma zhen wrote: > Hi CRaC developers, > > I'm currently exploring the integration of CRaC support into our > company's middleware products. I'm also very interested in the > underlying implementation details of CRaC and have been doing some > research into its mechanics. > > As I understand it, CRaC leverages CRIU under the hood for > checkpointing and restoring running processes. My research indicates > that CRIU itself is capable of handling open file descriptors and > established network connections during the checkpoint/restore cycle. > > However, the CRaC API requires developers to explicitly manage these > resources, typically by closing them in the beforeCheckpoint() and > re-establishing them in the afterRestore(). > > To understand the rationale behind this design choice, I looked into > the initial CRaC prototype, specifically the first PR > (https://github.com/openjdk/crac/pull/1). It appears that even in this > early version, the implementation iterated through all process file > descriptors during checkpoint. It ignored certain FDs (like those > related to classpath files, /dev/random, /dev/urandom, and files > marked M_PERSISTENT - though I'm unclear on the exact meaning of > M_PERSISTENT in this context). If any other application-opened files > remained, the checkpoint process would fail. This suggests the > requirement for manual resource management was present from the outset.
> > As I'm not deeply familiar with JVM internals, I'm struggling to fully > grasp the reasoning. Was this restriction primarily introduced to > simplify the initial design and implementation of CRaC within the JVM? > > I also noticed that current versions of CRaC include File Descriptor > Policies. These allow configuring an action: ignore for specific file > descriptors, effectively delegating their handling to CRIU. This seems > to demonstrate that letting CRIU manage certain open files is feasible > within the CRaC framework. > > This leads me to wonder: if delegation to CRIU is possible and works > (at least for some cases via policies), why isn't relying on CRIU for > resource handling the default or more broadly encouraged approach? Why > the strict requirement for manual closure and reopening in the general > case? > > For instance, consider using System.getLogger() from the JDK Platform > Logging API. As application developers, we don't typically manage the > underlying file descriptor for the log file directly. To make this > work with CRaC, we currently need to identify and configure a File > Descriptor Policy for it, which can feel somewhat cumbersome. Wouldn't > a smoother experience involve CRaC (perhaps optionally) defaulting to > letting CRIU handle such internally managed resources, like those > opened by standard JDK libraries? > > I would appreciate any insights or clarification you could offer on > the design philosophy behind CRaC's approach to managing external > resources like files and sockets, especially in contrast to CRIU's > capabilities. > > Thanks for your time and any insights you can share. 
> > Best regards, > > mazhen
From tpushkin at openjdk.org Fri Apr 11 07:18:00 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Fri, 11 Apr 2025 07:18:00 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support My review ETA: next week ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2796055268 From mz1999 at gmail.com Fri Apr 11 08:57:35 2025 From: mz1999 at gmail.com (ma zhen) Date: Fri, 11 Apr 2025 16:57:35 +0800 Subject: Question regarding the design rationale for handling file descriptors/network connections in CRaC In-Reply-To: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> References: <3d944bd2-5a31-4081-b7f0-dc3a81a51151@azul.com> Message-ID: Hi Radim, Thanks a lot for the detailed explanation! That completely cleared up my understanding of the design philosophy behind CRaC. It makes perfect sense now that the goal isn't purely transparent restoration, but rather preserving the valuable internal JVM/application state while enabling robust adaptation to the new environment after restore, sacrificing some transparency for resilience by consciously managing external resources. Great project, and I appreciate the insight. Hope to be able to contribute down the line! 
Cheers, Ma Zhen On Fri, 11 Apr 2025 at 15:17, Radim Vansa wrote: > Hi Ma Zhen, > > you have correctly observed that closing file descriptors is rather an > architectural choice than purely a technical need.
CRIU is really capable > of restoring the process as-is, as its main motivation is migration of > running containers. Containers already define the filesystem, and the > runtime is in control of external connections - e.g. CRIU can checkpoint > and later restore an open socket connection, and the container runtime > restores the 'second half' of the socket so that the pause is transparent > to the running process. > > If this is what you want, there's nothing preventing you from using CRIU > on a Java process manually - at the risk of breaking the internal logic of > the application. However the point of CRaC is not such a transparent > restore: we want to preserve the valuable state of JVM and application but > adapt it to the new environment. We want to do a conscious decision about > any resource external to the process. Being forced to gracefully adapt to > the restore is a feature. > > Yes, we have File Descriptor policies, but that's not a solution - it > provides a workaround for proof-of-concepts, until some code that you can't > easily fix gets updated to support CRaC properly. Ideas meet practicality, > and you are responsible for realizing what should be done with particular > external resource. > > You're right that ATM we don't handle JDK Platform Logging (and neither > JUL) configured to write to a file, and since that is JDK code out of user > control it is a bug. We attempt to fix those one by one (PRs are welcome!). > > I hope I have provided some insight to these choices - and yes, I > understand the pain as we still have many places to fix. > > Cheers, > > Radim > On 10. 04. 25 11:30, ma zhen wrote: > > Hi CRaC developers, > > I'm currently exploring the integration of CRaC support into our company's > middleware products.
I'm also very interested in the underlying > implementation details of CRaC and have been doing some research into its > mechanics. > > As I understand it, CRaC leverages CRIU under the hood for checkpointing > and restoring running processes. My research indicates that CRIU itself is > capable of handling open file descriptors and established network > connections during the checkpoint/restore cycle. > > However, the CRaC API requires developers to explicitly manage these > resources, typically by closing them in the beforeCheckpoint() and > re-establishing them in the afterRestore(). > > To understand the rationale behind this design choice, I looked into the > initial CRaC prototype, specifically the first PR ( > https://github.com/openjdk/crac/pull/1). It appears that even in this > early version, the implementation iterated through all process file > descriptors during checkpoint. It ignored certain FDs (like those related > to classpath files, /dev/random, /dev/urandom, and files marked > M_PERSISTENT - though I'm unclear on the exact meaning of M_PERSISTENT in > this context). If any other application-opened files remained, the > checkpoint process would fail. This suggests the requirement for manual > resource management was present from the outset. > > As I'm not deeply familiar with JVM internals, I'm struggling to fully > grasp the reasoning. Was this restriction primarily introduced to simplify > the initial design and implementation of CRaC within the JVM? > > I also noticed that current versions of CRaC include File Descriptor > Policies. These allow configuring an action: ignore for specific file > descriptors, effectively delegating their handling to CRIU. This seems to > demonstrate that letting CRIU manage certain open files is feasible > within the CRaC framework. 
> > This leads me to wonder: if delegation to CRIU is possible and works (at > least for some cases via policies), why isn't relying on CRIU for resource > handling the default or more broadly encouraged approach? Why the strict > requirement for manual closure and reopening in the general case? > > For instance, consider using System.getLogger() from the JDK Platform > Logging API. As application developers, we don't typically manage the > underlying file descriptor for the log file directly. To make this work > with CRaC, we currently need to identify and configure a File Descriptor > Policy for it, which can feel somewhat cumbersome. Wouldn't a smoother > experience involve CRaC (perhaps optionally) defaulting to letting CRIU > handle such internally managed resources, like those opened by standard JDK > libraries? > > I would appreciate any insights or clarification you could offer on the > design philosophy behind CRaC's approach to managing external resources > like files and sockets, especially in contrast to CRIU's capabilities. > > Thanks for your time and any insights you can share. > > Best regards, > > mazhen > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpushkin at openjdk.org Fri Apr 11 14:50:35 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Fri, 11 Apr 2025 14:50:35 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time Message-ID: This is an alternative to #209 that also fixes `TimedWaitingTest`. The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. 
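A rough sketch of why the precision matters (the actual analysis is in the JBS issue). This assumes the offset is derived from the wall-clock time that elapsed between checkpoint and restore. The class, method names and timestamps below are made up for illustration and are not the HotSpot code. When both timestamps are truncated to whole milliseconds before subtracting, the computed elapsed time can exceed the real one by almost a millisecond, which would push the adjusted `nanoTime` ahead of the wall clock.

```java
public class NanoOffsetSketch {
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Elapsed time when both wall-clock readings are first truncated to whole milliseconds.
    static long elapsedNanosFromMillis(long beforeNanos, long afterNanos) {
        long beforeMs = beforeNanos / NANOS_PER_MILLI;
        long afterMs = afterNanos / NANOS_PER_MILLI;
        return (afterMs - beforeMs) * NANOS_PER_MILLI;
    }

    // Elapsed time when the wall-clock readings keep nanosecond precision.
    static long elapsedNanosExact(long beforeNanos, long afterNanos) {
        return afterNanos - beforeNanos;
    }

    public static void main(String[] args) {
        long checkpoint = 10_000_900_000L; // 10 s + 900 us (illustrative value)
        long restore = 10_001_100_000L;    // 200 us later, but across a millisecond boundary
        System.out.println(elapsedNanosFromMillis(checkpoint, restore)); // 1000000
        System.out.println(elapsedNanosExact(checkpoint, restore));      // 200000
    }
}
```

In this made-up case the millisecond-based difference is 1 ms although only 200 us passed, so a timed wait measured with the adjusted clock would appear to end about 800 us early relative to the wall clock.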
------------- Commit messages: - Make nanoTime offsetting more precise Changes: https://git.openjdk.org/crac/pull/221/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=221&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354432 Stats: 20 lines in 2 files changed: 8 ins; 0 del; 12 mod Patch: https://git.openjdk.org/crac/pull/221.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/221/head:pull/221 PR: https://git.openjdk.org/crac/pull/221 From duke at openjdk.org Mon Apr 14 12:00:33 2025 From: duke at openjdk.org (duke) Date: Mon, 14 Apr 2025 12:00:33 GMT Subject: git: openjdk/crac: created branch 8354514_remove_set_reopened based on the branch crac containing 1 unique commit Message-ID: <38d576e6-9d1c-46f9-9d40-ea91ef716600@openjdk.org> The following commits are unique to the 8354514_remove_set_reopened branch: ======================================================== cbe08b5e: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl From tpushkin at openjdk.org Mon Apr 14 13:14:09 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:09 GMT Subject: [crac] Withdrawn: 8351402: [CRaC] Use System.nanoTime() in TimedWaitingTest In-Reply-To: <6D05THOPyL0EtPFFTPj5d2jDBYJoUxodkJnK1chTae8=.79e61d28-bf43-42bd-96a4-3129a050552d@github.com> References: <6D05THOPyL0EtPFFTPj5d2jDBYJoUxodkJnK1chTae8=.79e61d28-bf43-42bd-96a4-3129a050552d@github.com> Message-ID: <_GxWTMXhcEouXGuGlakU-STvyX5XpvhB8ULHxs5yzb4=.3655ef11-1aae-4984-9268-505c22c66c9a@github.com> On Fri, 7 Mar 2025 12:15:50 GMT, Timofei Pushkin wrote: > Replaces `System.currentTimeMillis()` with `System.nanoTime()` in `TimedWaitingTest` since the former can, in theory, jump back and forth and that may lead to the test failures. > > Also adds a diagnostic assert. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/crac/pull/209 From tpushkin at openjdk.org Mon Apr 14 13:14:14 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:14 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: References: Message-ID: > This is an alternative to #209 that also fixes `TimedWaitingTest`. > > The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. > > I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: Move docs to header ------------- Changes: - all: https://git.openjdk.org/crac/pull/221/files - new: https://git.openjdk.org/crac/pull/221/files/4de87e38..5497193d Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=221&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=221&range=00-01 Stats: 35 lines in 2 files changed: 9 ins; 9 del; 17 mod Patch: https://git.openjdk.org/crac/pull/221.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/221/head:pull/221 PR: https://git.openjdk.org/crac/pull/221 From rvansa at openjdk.org Mon Apr 14 13:14:33 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:33 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Message-ID: Fix errors in JCK due to newly exposed methods. 
------------- Commit messages: - 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Changes: https://git.openjdk.org/crac/pull/222/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=222&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354514 Stats: 63 lines in 6 files changed: 39 ins; 18 del; 6 mod Patch: https://git.openjdk.org/crac/pull/222.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/222/head:pull/222 PR: https://git.openjdk.org/crac/pull/222 From rvansa at openjdk.org Mon Apr 14 13:14:21 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:21 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: References: Message-ID: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> On Mon, 14 Apr 2025 11:03:38 GMT, Timofei Pushkin wrote: >> This is an alternative to #209 that also fixes `TimedWaitingTest`. >> >> The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. >> >> I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. > > Timofei Pushkin has updated the pull request incrementally with one additional commit since the last revision: > > Move docs to header Great analysis, I am glad that we could have that test finally working! The change looks good. Originally you thought that the wallclock time is adjusting backwards; since the test enforces the first branch, I think it is still proof against going back during the checkpoint. The test would really fail if going back between checkpoint and the end of 1000 ms wait, correct? LGTM! 
src/hotspot/share/runtime/crac.cpp line 62: > 60: CracEngine *crac::_engine = nullptr; > 61: // Timestamps recorded before checkpoint > 62: jlong crac::checkpoint_wallclock_seconds; // Wall clock time, full seconds Not even a nitpick: I wonder if we should document private fields here or in the header file, WDYT? (I stand guilty for having docs for `javaTimeNanos_offset` here...). ------------- Marked as reviewed by rvansa (Committer). PR Review: https://git.openjdk.org/crac/pull/221#pullrequestreview-2763299969 PR Review: https://git.openjdk.org/crac/pull/221#pullrequestreview-2763744116 PR Review Comment: https://git.openjdk.org/crac/pull/221#discussion_r2041516657 From tpushkin at openjdk.org Mon Apr 14 13:14:24 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:24 GMT Subject: [crac] RFR: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time [v2] In-Reply-To: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> References: <7vCpEIO7uGwxwCH01TfQaoXRZedT-iMeu7qkpGpKys0=.ad7893cf-e608-44b5-b0e8-b450fa9c12f5@github.com> Message-ID: On Mon, 14 Apr 2025 07:15:25 GMT, Radim Vansa wrote: > The test would really fail if going back between checkpoint and the end of 1000 ms wait, correct? Yes, it would (and still will) fail if either: - The time goes back between checkpoint and restore - The time goes back between the moment it is read in `record_time_before_checkpoint()` and in the test for `after` > src/hotspot/share/runtime/crac.cpp line 62: > >> 60: CracEngine *crac::_engine = nullptr; >> 61: // Timestamps recorded before checkpoint >> 62: jlong crac::checkpoint_wallclock_seconds; // Wall clock time, full seconds > > Not even a nitpick: I wonder if we should document private fields here or in the header file, WDYT? (I stand guilty for having docs for `javaTimeNanos_offset` here...). 
I think placing them in the header would be better ------------- PR Comment: https://git.openjdk.org/crac/pull/221#issuecomment-2800736723 PR Review Comment: https://git.openjdk.org/crac/pull/221#discussion_r2041562189 From tpushkin at openjdk.org Mon Apr 14 13:14:37 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:37 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl In-Reply-To: References: Message-ID: <9cu75gmGmFXdETU4ETV62okoRd9D67tbLsiYq1BFo8M=.a8c01265-0214-4ac8-80bf-32200ba99b2e@github.com> On Mon, 14 Apr 2025 12:00:30 GMT, Radim Vansa wrote: > Fix errors in JCK due to newly exposed methods. src/java.base/share/classes/jdk/internal/access/JavaNioChannelsSpiAccess.java line 1: > 1: package jdk.internal.access; A license header is missing ------------- PR Review Comment: https://git.openjdk.org/crac/pull/222#discussion_r2042075853 From tpushkin at openjdk.org Mon Apr 14 13:14:28 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:28 GMT Subject: [crac] Integrated: 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time In-Reply-To: References: Message-ID: <7gNc72s5XOZF60Cu7sxVpbIpthypZerU3ek-xnEPtLU=.ddb008f7-873f-4ab4-b997-bd67c7be530c@github.com> On Fri, 11 Apr 2025 14:44:22 GMT, Timofei Pushkin wrote: > This is an alternative to #209 that also fixes `TimedWaitingTest`. > > The solution is to offset `os::javaTimeNanos()` using nanosecond-precision wall clock time instead of millisecond-precision. See the explanation in the related JBS issue. > > I've run the test 100 times in CI and haven't witnessed it failing. To compare, without the fix it would usually fail in the first 5-10 runs. This pull request has now been integrated. 
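For readers following the precision argument in this thread, here is a minimal standalone sketch of how truncating wall clock readings to milliseconds distorts the elapsed time that is then used to offset a monotonic clock. This is not the code from the PR: all function names below are hypothetical, the offset formula is an assumption made for illustration, and the related JBS issue remains the place for the actual analysis.

```cpp
#include <cstdint>

constexpr int64_t NANOS_PER_MILLI = 1000000;

// Elapsed wall clock time computed from millisecond-precision readings:
// the sub-millisecond part of each reading is truncated before subtracting.
int64_t elapsed_nanos_from_millis(int64_t checkpoint_wall_ns, int64_t restore_wall_ns) {
  int64_t checkpoint_ms = checkpoint_wall_ns / NANOS_PER_MILLI;
  int64_t restore_ms = restore_wall_ns / NANOS_PER_MILLI;
  return (restore_ms - checkpoint_ms) * NANOS_PER_MILLI;
}

// Elapsed wall clock time computed from nanosecond-precision readings.
int64_t elapsed_nanos_exact(int64_t checkpoint_wall_ns, int64_t restore_wall_ns) {
  return restore_wall_ns - checkpoint_wall_ns;
}

// Assumed offset: the value to add to the monotonic clock after restore so that
// it continues from its checkpoint value plus the wall clock time that elapsed.
int64_t monotonic_offset(int64_t checkpoint_mono_ns, int64_t restore_mono_ns,
                         int64_t wall_elapsed_ns) {
  return checkpoint_mono_ns + wall_elapsed_ns - restore_mono_ns;
}
```

With a checkpoint at 1000.9 ms and a restore at 2000.1 ms of wall clock time, the millisecond-based variant reports a full 1000 ms while only 999.2 ms passed, so an offset derived from it runs the adjusted clock ahead by 0.8 ms. The error can reach almost 1 ms in either direction.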
Changeset: ad63687e Author: Timofei Pushkin Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/ad63687e9a057517831af62c60275684bc668e3e Stats: 41 lines in 2 files changed: 15 ins; 7 del; 19 mod 8354432: [CRaC] Timed waiting finishes early w.r.t. wall clock time Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/221 From tpushkin at openjdk.org Mon Apr 14 13:14:31 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:31 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support src/hotspot/share/gc/z/zPageAllocator.cpp line 1067: > 1065: event.commit(uncommitted); > 1066: } > 1067: } No newline at the EOF src/hotspot/share/gc/z/zPageAllocator.hpp line 173: > 171: void threads_do(ThreadClosure* tc) const; > 172: > 173: void cleanup_unused(); This should not be addressed in this PR, but I find "cleanup_unused" name pretty undescriptive, both here and in `CollectedHeap` in general. It wasn't obvious to me that this is something used only by CRaC. I would propose `cleanup_before_checkpoint`, for example. src/hotspot/share/gc/z/zPageCache.cpp line 33: > 31: #include "gc/z/zStat.hpp" > 32: #include "gc/z/zValue.inline.hpp" > 33: #include "logging/log.hpp" There is no logging code added so these new imports shouldn't be needed.
src/hotspot/share/gc/z/zPageCache.cpp line 292: > 290: _delay(delay) { > 291: // Set initial timeout > 292: *_timeout = ZUncommitDelay; Shouldn't this also use `delay`? This shouldn't have any real influence now (the code that has `delay != ZUncommitDelay` doesn't use the timeout) but anyway. src/hotspot/share/runtime/crac.cpp line 429: > 427: MemTracker::final_report(tty); > 428: } > 429: Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. test/jdk/jdk/crac/fileDescriptors/ZGCTest.java line 37: > 35: * @requires (os.family == "linux") > 36: */ > 37: public class ZGCTest implements CracTest { A bit unclear why this is in `jdk/crac/fileDescriptors` directory. I guess because of the required `memfd` support but I would still put it in the general `jdk/crac` directory. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041490955 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041524248 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041509158 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041503636 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041476340 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041517297 From rvansa at openjdk.org Mon Apr 14 13:14:14 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:14 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: Message-ID: <3Lco9IcmSYym-KcVGkgG1MYvqMHNjTI0Cwf81MH5gFw=.f841ff1b-ed91-446b-92fe-9648d05c373d@github.com> On Thu, 10 Apr 2025 21:27:11 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. 
As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'crac' into zgc > - fixup > - 8353241: CRaC ZGC support Updated. ------------- PR Comment: https://git.openjdk.org/crac/pull/219#issuecomment-2801615156 From rvansa at openjdk.org Mon Apr 14 13:14:36 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:36 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> Message-ID: On Mon, 14 Apr 2025 06:57:19 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Merge branch 'crac' into zgc >> - fixup >> - 8353241: CRaC ZGC support > > src/hotspot/share/gc/z/zPageAllocator.hpp line 173: > >> 171: void threads_do(ThreadClosure* tc) const; >> 172: >> 173: void cleanup_unused(); > > This should not be addressed in this PR, but I find "cleanup_unused" name pretty undescriptive, both here and in `CollectedHeap` in general. It wasn't obvious to me that this is something used only by CRaC. I would propose `cleanup_before_checkpoint`, for example. The name is not mandated by any interface so I can rename it even here... > src/hotspot/share/runtime/crac.cpp line 429: > >> 427: MemTracker::final_report(tty); >> 428: } >> 429: > > Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. Yes, I've left this intentionally.
Well spotted, I'll make sure that the final report will run as well. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2042020387 PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041549347 From tpushkin at openjdk.org Mon Apr 14 13:14:40 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 13:14:40 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> Message-ID: <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> On Mon, 14 Apr 2025 07:16:53 GMT, Radim Vansa wrote: >> src/hotspot/share/runtime/crac.cpp line 429: >> >>> 427: MemTracker::final_report(tty); >>> 428: } >>> 429: >> >> Is this left intentionally? If yes, I think nothing will be printed after this on the real VM exit (after restore) since there is `Atomic::cmpxchg(&g_final_report_did_run, false, true) == false` in `MemTracker::final_report()`. > > Yes, I've left this intentionally. Well spotted, I'll make sure that the final report will run as well. I actually think that this should be done as a separate change then. I would propose to either document that `PrintNMTStatistics` prints on checkpoint, or make it have several modes (0 - off, 1 - print on exit, 2 - print on checkpoint, 4 - print on both exit and checkpoint), or add a separate `PrintNMTStatisticsOnCheckpoint`. The reason I am proposing this is that I believe we had a related request from the community for this somewhere, so it would be nice to have this documented.
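To illustrate the concern about the guard raised in this comment: a compare-and-swap flag lets only the first caller through, so a report printed at checkpoint would suppress the one at VM exit. The standalone sketch below uses `std::atomic` instead of HotSpot's `Atomic::cmpxchg`. The names `report_did_run`, `try_final_report` and `reset_final_report_guard` are hypothetical, and the reset is only one conceivable way to let the report run again, not something this PR is known to do.

```cpp
#include <atomic>

// Hypothetical stand-in for a "run once" flag such as g_final_report_did_run.
static std::atomic<bool> report_did_run{false};

// Returns true only for the first caller; every later call is a no-op.
bool try_final_report() {
  bool expected = false;
  // Flips the flag from false to true exactly once; fails if it is already true.
  if (!report_did_run.compare_exchange_strong(expected, true)) {
    return false;  // the report has already been produced
  }
  // ... the report would be printed here ...
  return true;
}

// Hypothetical reset that a checkpoint-time report could use so that
// the report on VM exit is not suppressed.
void reset_final_report_guard() {
  report_did_run.store(false);
}
```

Calling `try_final_report()` twice in a row returns `true` and then `false`, which is the behaviour described above; only after `reset_final_report_guard()` would a further call succeed again.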
------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041556788 From rvansa at openjdk.org Mon Apr 14 13:14:42 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:14:42 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v3] In-Reply-To: <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> References: <8pyKCaIA6RDkPMSZs_T7g0JpvU4lL5cTKqTWPCZgMEs=.eeaabb19-d8ed-4acb-9796-6dff495ae0b8@github.com> <3V-y2KH4UaJnZr-TBd4UgquBy-pq7Y92-Fs_iAPEXU0=.05f03d53-2933-41aa-9c1e-c4177bc4db89@github.com> Message-ID: <2OQ64PbSvXOX5ryuplAmOmXLB2ZdsawET_9dpwFs3ww=.40b00c93-b591-4e00-98df-37b565fbb938@github.com> On Mon, 14 Apr 2025 07:22:12 GMT, Timofei Pushkin wrote: >> Yes, I've left this intentionally. Well spotted, I'll make sure that the final report will run as well. > > I actually think that this should be done as a separate change then. I would propose to either document that `PrintNMTStatistics` prints on checkpoint, or make it have several modes (0 - off, 1 - print on exit, 2 - print on checkpoint, 4 - print on both exit and checkpoint), or add a separate `PrintNMTStatisticsOnCheckpoint`. The reason I am proposing this is that I believe we had a related request from the community for this somewhere, so it would be nice to have this documented. OK, while I've used this for diagnostics on ZGC it would really make sense for its own PR. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/219#discussion_r2041560779 From rvansa at openjdk.org Mon Apr 14 13:17:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 13:17:51 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v4] In-Reply-To: References: Message-ID: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint.
As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/crac/pull/219/files - new: https://git.openjdk.org/crac/pull/219/files/75e2bc41..3b7d15e7 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=219&range=03 - incr: https://webrevs.openjdk.org/?repo=crac&pr=219&range=02-03 Stats: 12 lines in 6 files changed: 0 ins; 7 del; 5 mod Patch: https://git.openjdk.org/crac/pull/219.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/219/head:pull/219 PR: https://git.openjdk.org/crac/pull/219 From duke at openjdk.org Mon Apr 14 13:58:52 2025 From: duke at openjdk.org (duke) Date: Mon, 14 Apr 2025 13:58:52 GMT Subject: git: openjdk/crac: 8354514_remove_set_reopened: Add license header Message-ID: <1c82a037-5e5d-4a24-bb92-52fa1323f3b5@openjdk.org> Changeset: a15a667b Branch: 8354514_remove_set_reopened Author: Radim Vansa Date: 2025-04-14 15:54:45 +0000 URL: https://git.openjdk.org/crac/commit/a15a667b1bc6af639d2fa1216a0d510490320791 Add license header ! src/java.base/share/classes/jdk/internal/access/JavaNioChannelsSpiAccess.java From rvansa at openjdk.org Mon Apr 14 14:00:54 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 14:00:54 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: Message-ID: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> > Fix errors in JCK due to newly exposed methods.
Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add license header ------------- Changes: - all: https://git.openjdk.org/crac/pull/222/files - new: https://git.openjdk.org/crac/pull/222/files/cbe08b5e..a15a667b Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=222&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=222&range=00-01 Stats: 24 lines in 1 file changed: 24 ins; 0 del; 0 mod Patch: https://git.openjdk.org/crac/pull/222.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/222/head:pull/222 PR: https://git.openjdk.org/crac/pull/222 From tpushkin at openjdk.org Mon Apr 14 14:06:22 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Mon, 14 Apr 2025 14:06:22 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:00:54 GMT, Radim Vansa wrote: >> Fix errors in JCK due to newly exposed methods. > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Add license header LGTM, but I wonder why the tests are not run? 
------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2801823465 From rvansa at openjdk.org Mon Apr 14 14:29:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 14:29:00 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:02:55 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Add license header > > LGTM, but I wonder why the tests are not run? > > I guess because the source branch is in this (openjdk/crac) repo, although not sure why exactly this is a problem. Anyway, I believe we should run the tests before merging. @TimPushkin Looks like GitHub was just stalled; the tests are running now. https://github.com/rvansa/crac/actions/runs/14447478163 ------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2801897937 From rvansa at openjdk.org Mon Apr 14 17:48:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 14 Apr 2025 17:48:00 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:02:55 GMT, Timofei Pushkin wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Add license header > > LGTM, but I wonder why the tests are not run? > > I guess because the source branch is in this (openjdk/crac) repo, although not sure why exactly this is a problem. Anyway, I believe we should run the tests before merging. @TimPushkin The test overview looks crazy for some reason, but when I click through it seems to have passed. 
When you give the green light I'll integrate. ------------- PR Comment: https://git.openjdk.org/crac/pull/222#issuecomment-2802436767 From tpushkin at openjdk.org Tue Apr 15 06:22:07 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 15 Apr 2025 06:22:07 GMT Subject: [crac] RFR: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl [v2] In-Reply-To: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> References: <2CUfo9NcCW_hkyiDtD2q_lJIzrdHU9Rz2xlUCUdgCsc=.59ada9fb-f15c-4b62-8e58-b1123d0989a5@github.com> Message-ID: On Mon, 14 Apr 2025 14:00:54 GMT, Radim Vansa wrote: >> Fix errors in JCK due to newly exposed methods. > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Add license header "1059 successful checks" - clearly everything has been tested here. ------------- Marked as reviewed by tpushkin (Author). PR Review: https://git.openjdk.org/crac/pull/222#pullrequestreview-2766941974 From rvansa at openjdk.org Tue Apr 15 06:31:10 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 15 Apr 2025 06:31:10 GMT Subject: [crac] Integrated: 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl In-Reply-To: References: Message-ID: On Mon, 14 Apr 2025 12:00:30 GMT, Radim Vansa wrote: > Fix errors in JCK due to newly exposed methods. This pull request has now been integrated.
Changeset: d64fb30c Author: Radim Vansa URL: https://git.openjdk.org/crac/commit/d64fb30c0874d93c986ad04ac3995a727b7a1ac8 Stats: 87 lines in 6 files changed: 63 ins; 18 del; 6 mod 8354514: [CRaC] Remove new methods from AbstractInterruptibleChannel and SocketImpl Reviewed-by: tpushkin ------------- PR: https://git.openjdk.org/crac/pull/222 From tpushkin at openjdk.org Tue Apr 15 06:37:16 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 15 Apr 2025 06:37:16 GMT Subject: [crac] RFR: 8353241: [CRaC] Support ZGC [v4] In-Reply-To: References: Message-ID: On Mon, 14 Apr 2025 13:17:51 GMT, Radim Vansa wrote: >> During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Marked as reviewed by tpushkin (Author). ------------- PR Review: https://git.openjdk.org/crac/pull/219#pullrequestreview-2766976843 From rvansa at openjdk.org Tue Apr 15 06:40:02 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 15 Apr 2025 06:40:02 GMT Subject: [crac] Integrated: 8353241: [CRaC] Support ZGC In-Reply-To: References: Message-ID: On Mon, 31 Mar 2025 08:18:43 GMT, Radim Vansa wrote: > During my tests with https://github.com/CRaC/example-spring-boot I could not get the image size as low as with G1, but the presented changes improve the image footprint. As anecdotal data, the image is 177 MB with G1 while 215 MB with ZGC (fastdebug build, `-Xmx1G`). This pull request has now been integrated.
Changeset: 64710538 Author: Radim Vansa URL: https://git.openjdk.org/crac/commit/647105388b66b7acedf03d049dc60323912a8fe7 Stats: 98 lines in 9 files changed: 78 ins; 7 del; 13 mod 8353241: [CRaC] Support ZGC Reviewed-by: tpushkin ------------- PR: https://git.openjdk.org/crac/pull/219 From dcherepanov at openjdk.org Tue Apr 15 08:16:56 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Tue, 15 Apr 2025 08:16:56 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 Message-ID: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Merge with jdk-25+3 There are several conflicts between CRaC-specific changes (https://github.com/openjdk/crac/pull/10) and incoming JDK changes for https://bugs.openjdk.org/browse/JDK-8342995 https://github.com/openjdk/crac/pull/10 moved some parts from `linux/attachListener_linux.cpp` to - `linux/linuxAttachOperation.hpp` which was later renamed to `posix/posixAttachOperation.hpp` - `linux/attachListener_linux.hpp` which was later renamed to `posix/attachListener_posix.hpp` As a part of this merge, I manually applied JDK changes for `posix/attachListener_posix.cpp` to `posix/posixAttachOperation.hpp` & `posix/attachListener_posix.hpp` - new `SocketChannel` class moved to `posix/posixAttachOperation.hpp` - changes in `PosixAttachOperation` class incorporated into `posix/posixAttachOperation.hpp` - added `#include "os_posix.hpp"` to define `RESTARTABLE` - kept `socket()` function in `PosixAttachOperation` class as it's used by [VM_Crac::is_socket_from_jcmd](https://github.com/openjdk/crac/blob/647105388b66b7acedf03d049dc60323912a8fe7/src/hotspot/os/linux/crac_linux.cpp#L279) - changes in `PosixAttachOperation::complete` incorporated into `write_operation_result`
Conflicts commit c54dd827b39e7e0066959e4985e4aaefd5452a10 (HEAD -> merge-jdk, dmitry-crac/merge-jdk) Merge: 410d0e168c3 23d6f747824 Author: Dmitry Cherepanov Date: Mon Apr 14 13:55:59 2025 +0400 Merge with jdk:jdk-25+3 diff --git a/.jcheck/conf b/.jcheck/conf remerge CONFLICT (content): Merge conflict in .jcheck/conf index 1d117b1d825..25bd8dd0b94 100644 --- a/.jcheck/conf +++ b/.jcheck/conf @@ -4,12 +4,7 @@ jbs=JDK version=25 [checks] -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) error=whitespace -======= -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists,copyright -warning=issuestitle,binary ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) [checks "reviewers"] committers=1 @@ -18,31 +13,3 @@ ignore=duke [census] version=0 domain=openjdk.org -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -======= - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] -dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp - -[checks "copyright"] -files=^(?!LICENSE|license.txt|.*.bin|.*.gif|.*.jpg|.*.png|.*.icon|.*.tiff|.*.dat|.*.patch|.*.wav|.*.class|.*-header|.*.jar|).* -oracle_locator=.*Copyright (c)(.*)Oracle and/or its affiliates. All rights reserved. -oracle_validator=.*Copyright (c) (\d{4})(?:, (\d{4}))?, Oracle and/or its affiliates. All rights reserved. 
->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) diff --git a/src/hotspot/os/posix/attachListener_posix.cpp b/src/hotspot/os/posix/attachListener_posix.cpp remerge CONFLICT (content): Merge conflict in src/hotspot/os/posix/attachListener_posix.cpp index f1ad8d81a14..49b53130608 100644 --- a/src/hotspot/os/posix/attachListener_posix.cpp +++ b/src/hotspot/os/posix/attachListener_posix.cpp @@ -45,10 +45,6 @@ #if INCLUDE_SERVICES #ifndef AIX -#ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) -#endif - // The attach mechanism on Linux and BSD uses a UNIX domain socket. An attach // listener thread is created at startup or is created on-demand via a signal // from the client tool. The attach listener creates a socket and binds it to a @@ -65,102 +61,6 @@ // obtain the credentials of client. We check that the effective uid // of the client matches this process. -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -======= -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - public: - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - static PosixAttachOperation* dequeue(); -}; - -class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { 
-private: - int _socket; -public: - SocketChannel(int socket) : _socket(socket) {} - ~SocketChannel() { - close(); - } - - bool opened() const { - return _socket != -1; - } - - void close() { - if (opened()) { - ::close(_socket); - _socket = -1; - } - } - - // RequestReader - int read(void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::read(_socket, buffer, (size_t)size), n); - return checked_cast(n); - } - - // ReplyWriter - int write(const void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::write(_socket, buffer, size), n); - return checked_cast(n); - } - // called after writing all data - void flush() override { - ::shutdown(_socket, SHUT_RDWR); - } -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - SocketChannel _socket_channel; - - public: - void complete(jint res, bufferedStream* st) override; - - PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { - } - - bool read_request() { - return AttachOperation::read_request(&_socket_channel, &_socket_channel); - } -}; - ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // statics char PosixAttachListener::_path[UNIX_PATH_MAX]; bool PosixAttachListener::_has_path; @@ -318,22 +218,6 @@ PosixAttachOperation* PosixAttachListener::dequeue() { } } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - // An operation completion is splitted into two parts. // For proper handling the jcmd connection at CRaC checkpoint action. // An effectively_complete_raw is called in checkpoint processing, before criu engine calls, for properly closing the socket. 
@@ -346,8 +230,6 @@ void PosixAttachOperation::complete(jint result, bufferedStream* st) { delete this; } -======= ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // Complete an operation by sending the operation result and any result // output to the client. At this time the socket is in blocking mode so // potentially we can block if there is a lot of data and the client is @@ -363,7 +245,6 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* return; } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) // write operation result Thread* thread = Thread::current(); if (thread->is_Java_thread()) { @@ -376,24 +257,10 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* } void PosixAttachOperation::write_operation_result(jint result, bufferedStream* st) { - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), SHUT_RDWR); - } - - // done - ::close(this->socket()); - st->reset(); -======= write_reply(&_socket_channel, result, st); - delete this; ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) + _socket_channel.close(); + st->reset(); } static void assert_listener_thread() { diff --git a/src/hotspot/os/posix/attachListener_posix.hpp b/src/hotspot/os/posix/attachListener_posix.hpp index b945020e20d..a0fca688b5f 100644 --- a/src/hotspot/os/posix/attachListener_posix.hpp +++ b/src/hotspot/os/posix/attachListener_posix.hpp @@ -36,7 +36,7 @@ class PosixAttachListener; #include #ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(((struct sockaddr_un *)0)->sun_path) +#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) #endif class 
PosixAttachListener: AllStatic { @@ -53,17 +53,7 @@ class PosixAttachListener: AllStatic { // this is for proper reporting JDK.Chekpoint processing to jcmd peer static PosixAttachOperation* _current_op; - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - static void set_path(char* path) { if (path == nullptr) { _path[0] = '\0'; @@ -84,9 +74,6 @@ class PosixAttachListener: AllStatic { static bool has_path() { return _has_path; } static int listener() { return _listener; } - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - static PosixAttachOperation* dequeue(); static PosixAttachOperation* get_current_op(); static void reset_current_op(); diff --git a/src/hotspot/os/posix/posixAttachOperation.hpp b/src/hotspot/os/posix/posixAttachOperation.hpp index 1d031d882da..10f253a3f76 100644 --- a/src/hotspot/os/posix/posixAttachOperation.hpp +++ b/src/hotspot/os/posix/posixAttachOperation.hpp @@ -26,31 +26,79 @@ #ifndef OS_POSIX_POSIXATTACHOPERATION_HPP #define OS_POSIX_POSIXATTACHOPERATION_HPP +#include "os_posix.hpp" #include "services/attachListener.hpp" class PosixAttachOperation; #if INCLUDE_SERVICES +class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { +private: + int _socket; +public: + SocketChannel(int socket) : _socket(socket) {} + ~SocketChannel() { + close(); + } + + int socket() const { + return _socket; + } + + bool opened() const { + return _socket != -1; + } + + void close() { + if (opened()) { + ::close(_socket); + _socket = -1; + } + } + + // RequestReader + int read(void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::read(_socket, buffer, (size_t)size), n); + return checked_cast(n); + } + + // ReplyWriter + int write(const void* buffer, int size) override { + ssize_t n; + 
RESTARTABLE(::write(_socket, buffer, size), n); + return checked_cast<int>(n); + } + // called after writing all data + void flush() override { + ::shutdown(_socket, SHUT_RDWR); + } +}; + class PosixAttachOperation: public AttachOperation { private: // the connection to the client - int _socket; + SocketChannel _socket_channel; bool _effectively_completed; void write_operation_result(jint result, bufferedStream* st); public: - void complete(jint res, bufferedStream* st); + void complete(jint res, bufferedStream* st) override; void effectively_complete_raw(jint res, bufferedStream* st); bool is_effectively_completed() { return _effectively_completed; } - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } + int socket() { + return _socket_channel.socket();; + } - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); + PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { _effectively_completed = false; } + + bool read_request() { + return AttachOperation::read_request(&_socket_channel, &_socket_channel); + } }; #endif // INCLUDE_SERVICES
Conflicts (diff3) commit c54dd827b39e7e0066959e4985e4aaefd5452a10 (HEAD -> merge-jdk, dmitry-crac/merge-jdk) Merge: 410d0e168c3 23d6f747824 Author: Dmitry Cherepanov Date: Mon Apr 14 13:55:59 2025 +0400 Merge with jdk:jdk-25+3 diff --git a/.jcheck/conf b/.jcheck/conf remerge CONFLICT (content): Merge conflict in .jcheck/conf index 4816067cada..25bd8dd0b94 100644 --- a/.jcheck/conf +++ b/.jcheck/conf @@ -4,15 +4,7 @@ jbs=JDK version=25 [checks] -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) error=whitespace -||||||| ceb4366ebf0 -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists -warning=issuestitle,binary -======= -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists,copyright -warning=issuestitle,binary ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) [checks "reviewers"] committers=1 @@ -21,52 +13,3 @@ ignore=duke [census] version=0 domain=openjdk.org -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -||||||| ceb4366ebf0 - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] -dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp -======= - -[checks "whitespace"] -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|.*.m4|.*.ac|Makefile -ignore-tabs=.*.gmk|Makefile - -[checks "merge"] -message=Merge - -[checks "reviewers"] -reviewers=1 -ignore=duke - -[checks "committer"] -role=committer - -[checks "issues"] -pattern=^([124-8][0-9]{6}): (\S.*)$ - -[checks "problemlists"] 
-dirs=test/jdk|test/langtools|test/lib-test|test/hotspot/jtreg|test/jaxp - -[checks "copyright"] -files=^(?!LICENSE|license.txt|.*.bin|.*.gif|.*.jpg|.*.png|.*.icon|.*.tiff|.*.dat|.*.patch|.*.wav|.*.class|.*-header|.*.jar|).* -oracle_locator=.*Copyright (c)(.*)Oracle and/or its affiliates. All rights reserved. -oracle_validator=.*Copyright (c) (\d{4})(?:, (\d{4}))?, Oracle and/or its affiliates. All rights reserved. ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) diff --git a/src/hotspot/os/posix/attachListener_posix.cpp b/src/hotspot/os/posix/attachListener_posix.cpp remerge CONFLICT (content): Merge conflict in src/hotspot/os/posix/attachListener_posix.cpp index b98f7d437a6..49b53130608 100644 --- a/src/hotspot/os/posix/attachListener_posix.cpp +++ b/src/hotspot/os/posix/attachListener_posix.cpp @@ -45,10 +45,6 @@ #if INCLUDE_SERVICES #ifndef AIX -#ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) -#endif - // The attach mechanism on Linux and BSD uses a UNIX domain socket. An attach // listener thread is created at startup or is created on-demand via a signal // from the client tool. The attach listener creates a socket and binds it to a @@ -65,170 +61,6 @@ // obtain the credentials of client. We check that the effective uid // of the client matches this process. 
-<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -||||||| ceb4366ebf0 -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - - static PosixAttachOperation* dequeue(); -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - int _socket; - - public: - void complete(jint res, bufferedStream* st); - - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } - - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); - } -}; - -======= -// forward reference -class PosixAttachOperation; - -class PosixAttachListener: AllStatic { - private: - // the path to which we bind the UNIX domain socket - static char _path[UNIX_PATH_MAX]; - static bool _has_path; - - // the file descriptor for the listening socket - static volatile int _listener; - - static bool _atexit_registered; - - 
public: - static void set_path(char* path) { - if (path == nullptr) { - _path[0] = '\0'; - _has_path = false; - } else { - strncpy(_path, path, UNIX_PATH_MAX); - _path[UNIX_PATH_MAX-1] = '\0'; - _has_path = true; - } - } - - static void set_listener(int s) { _listener = s; } - - // initialize the listener, returns 0 if okay - static int init(); - - static char* path() { return _path; } - static bool has_path() { return _has_path; } - static int listener() { return _listener; } - - static PosixAttachOperation* dequeue(); -}; - -class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { -private: - int _socket; -public: - SocketChannel(int socket) : _socket(socket) {} - ~SocketChannel() { - close(); - } - - bool opened() const { - return _socket != -1; - } - - void close() { - if (opened()) { - ::close(_socket); - _socket = -1; - } - } - - // RequestReader - int read(void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::read(_socket, buffer, (size_t)size), n); - return checked_cast(n); - } - - // ReplyWriter - int write(const void* buffer, int size) override { - ssize_t n; - RESTARTABLE(::write(_socket, buffer, size), n); - return checked_cast(n); - } - // called after writing all data - void flush() override { - ::shutdown(_socket, SHUT_RDWR); - } -}; - -class PosixAttachOperation: public AttachOperation { - private: - // the connection to the client - SocketChannel _socket_channel; - - public: - void complete(jint res, bufferedStream* st) override; - - PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { - } - - bool read_request() { - return AttachOperation::read_request(&_socket_channel, &_socket_channel); - } -}; - ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // statics char PosixAttachListener::_path[UNIX_PATH_MAX]; bool PosixAttachListener::_has_path; @@ -386,22 +218,6 @@ PosixAttachOperation* PosixAttachListener::dequeue() { } } 
-<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - // An operation completion is splitted into two parts. // For proper handling the jcmd connection at CRaC checkpoint action. // An effectively_complete_raw is called in checkpoint processing, before criu engine calls, for properly closing the socket. @@ -414,24 +230,6 @@ void PosixAttachOperation::complete(jint result, bufferedStream* st) { delete this; } -||||||| ceb4366ebf0 -// write the given buffer to the socket -int PosixAttachListener::write_fully(int s, char* buf, size_t len) { - do { - ssize_t n = ::write(s, buf, len); - if (n == -1) { - if (errno != EINTR) return -1; - } else { - buf += n; - len -= n; - } - } - while (len > 0); - return 0; -} - -======= ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) // Complete an operation by sending the operation result and any result // output to the client. 
At this time the socket is in blocking mode so // potentially we can block if there is a lot of data and the client is @@ -447,7 +245,6 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* return; } -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help) // write operation result Thread* thread = Thread::current(); if (thread->is_Java_thread()) { @@ -460,40 +257,10 @@ void PosixAttachOperation::effectively_complete_raw(jint result, bufferedStream* } void PosixAttachOperation::write_operation_result(jint result, bufferedStream* st) { - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), SHUT_RDWR); - } - - // done - ::close(this->socket()); - st->reset(); -||||||| ceb4366ebf0 - // write operation result - char msg[32]; - os::snprintf_checked(msg, sizeof(msg), "%d\n", result); - int rc = PosixAttachListener::write_fully(this->socket(), msg, strlen(msg)); - - // write any result data - if (rc == 0) { - PosixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size()); - ::shutdown(this->socket(), 2); - } - - // done - ::close(this->socket()); - - delete this; -======= write_reply(&_socket_channel, result, st); - delete this; ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module) + _socket_channel.close(); + st->reset(); } static void assert_listener_thread() { diff --git a/src/hotspot/os/posix/attachListener_posix.hpp b/src/hotspot/os/posix/attachListener_posix.hpp index b945020e20d..a0fca688b5f 100644 --- a/src/hotspot/os/posix/attachListener_posix.hpp +++ b/src/hotspot/os/posix/attachListener_posix.hpp @@ -36,7 +36,7 @@ class PosixAttachListener; #include #ifndef UNIX_PATH_MAX -#define UNIX_PATH_MAX 
sizeof(((struct sockaddr_un *)0)->sun_path) +#define UNIX_PATH_MAX sizeof(sockaddr_un::sun_path) #endif class PosixAttachListener: AllStatic { @@ -53,17 +53,7 @@ class PosixAttachListener: AllStatic { // this is for proper reporting JDK.Chekpoint processing to jcmd peer static PosixAttachOperation* _current_op; - // reads a request from the given connected socket - static PosixAttachOperation* read_request(int s); - public: - enum { - ATTACH_PROTOCOL_VER = 1 // protocol version - }; - enum { - ATTACH_ERROR_BADVERSION = 101 // error codes - }; - static void set_path(char* path) { if (path == nullptr) { _path[0] = '\0'; @@ -84,9 +74,6 @@ class PosixAttachListener: AllStatic { static bool has_path() { return _has_path; } static int listener() { return _listener; } - // write the given buffer to a socket - static int write_fully(int s, char* buf, size_t len); - static PosixAttachOperation* dequeue(); static PosixAttachOperation* get_current_op(); static void reset_current_op(); diff --git a/src/hotspot/os/posix/posixAttachOperation.hpp b/src/hotspot/os/posix/posixAttachOperation.hpp index 1d031d882da..10f253a3f76 100644 --- a/src/hotspot/os/posix/posixAttachOperation.hpp +++ b/src/hotspot/os/posix/posixAttachOperation.hpp @@ -26,31 +26,79 @@ #ifndef OS_POSIX_POSIXATTACHOPERATION_HPP #define OS_POSIX_POSIXATTACHOPERATION_HPP +#include "os_posix.hpp" #include "services/attachListener.hpp" class PosixAttachOperation; #if INCLUDE_SERVICES +class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter { +private: + int _socket; +public: + SocketChannel(int socket) : _socket(socket) {} + ~SocketChannel() { + close(); + } + + int socket() const { + return _socket; + } + + bool opened() const { + return _socket != -1; + } + + void close() { + if (opened()) { + ::close(_socket); + _socket = -1; + } + } + + // RequestReader + int read(void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::read(_socket, buffer, (size_t)size), n); + 
return checked_cast<int>(n); + } + + // ReplyWriter + int write(const void* buffer, int size) override { + ssize_t n; + RESTARTABLE(::write(_socket, buffer, size), n); + return checked_cast<int>(n); + } + // called after writing all data + void flush() override { + ::shutdown(_socket, SHUT_RDWR); + } +}; + class PosixAttachOperation: public AttachOperation { private: // the connection to the client - int _socket; + SocketChannel _socket_channel; bool _effectively_completed; void write_operation_result(jint result, bufferedStream* st); public: - void complete(jint res, bufferedStream* st); + void complete(jint res, bufferedStream* st) override; void effectively_complete_raw(jint res, bufferedStream* st); bool is_effectively_completed() { return _effectively_completed; } - void set_socket(int s) { _socket = s; } - int socket() const { return _socket; } + int socket() { + return _socket_channel.socket();; + } - PosixAttachOperation(char* name) : AttachOperation(name) { - set_socket(-1); + PosixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) { _effectively_completed = false; } + + bool read_request() { + return AttachOperation::read_request(&_socket_channel, &_socket_channel); + } }; #endif // INCLUDE_SERVICES
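The diff above relies on two ideas: `RESTARTABLE` retries a system call that was interrupted by a signal, and `SocketChannel` owns the descriptor so it is closed exactly once. A minimal standalone sketch of both, assuming a POSIX system, might look like the following. `retry_on_eintr` and `FdChannel` are hypothetical names chosen for illustration; they are not HotSpot APIs, and the real macro and class may differ in detail.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: repeat a call while it fails with EINTR,
// which is the idea behind a RESTARTABLE-style retry.
template <typename Call>
ssize_t retry_on_eintr(Call call) {
  ssize_t n;
  do {
    n = call();
  } while (n == -1 && errno == EINTR);
  return n;
}

// Hypothetical stand-in for SocketChannel: owns a file descriptor
// and guarantees it is closed at most once.
class FdChannel {
 private:
  int _fd;

 public:
  explicit FdChannel(int fd) : _fd(fd) {}
  ~FdChannel() { close(); }

  // Copying would lead to a double close, so it is disabled.
  FdChannel(const FdChannel&) = delete;
  FdChannel& operator=(const FdChannel&) = delete;

  bool opened() const { return _fd != -1; }

  // Safe to call repeatedly: only the first call closes the descriptor.
  void close() {
    if (opened()) {
      ::close(_fd);
      _fd = -1;
    }
  }

  // A single read attempt, retried only if interrupted by a signal.
  ssize_t read(void* buf, size_t size) {
    return retry_on_eintr([&] { return ::read(_fd, buf, size); });
  }

  // A single write attempt, retried only if interrupted by a signal.
  // It may still write fewer bytes than requested.
  ssize_t write(const void* buf, size_t size) {
    return retry_on_eintr([&] { return ::write(_fd, buf, size); });
  }
};
```

Constructing an `FdChannel` from each end of a `pipe()` is enough to try it out: after the writing end is closed, a read on the other end reports end-of-file. Note that this sketch, like a plain retry macro, does not loop until the whole buffer is written, which is what the removed `write_fully` did.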
-------------

Commit messages:
 - Merge with jdk:jdk-25+3
 - 8346463: Add test coverage for deploying the default provider as a module
 - 8346306: Unattached thread can cause crash during VM exit if it calls wait_if_vm_exited
 - 8340401: DcmdMBeanPermissionsTest.java and SystemDumpMapTest.java fail with assert(_stack_base != nullptr) failed: Sanity check
 - 8346475: RISC-V: Small improvement for MacroAssembler::ctzc_bit
 - 8346016: Problemlist vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_a in virtual thread mode
 - 8346132: fallbacklinker.c failed compilation due to unused variable
 - 8346570: SM cleanup of tests for Beans and Serialization
 - 8346532: XXXVector::rearrangeTemplate misses null check
 - 8346300: Add @Test annotation to TCKZoneId.test_constant_OLD_IDS_POST_2024b test
 - ... and 84 more: https://git.openjdk.org/crac/compare/410d0e16...c54dd827

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - crac: https://webrevs.openjdk.org/?repo=crac&pr=224&range=00.0
 - jdk:jdk-25+3: https://webrevs.openjdk.org/?repo=crac&pr=224&range=00.1

Changes: https://git.openjdk.org/crac/pull/224/files
  Stats: 14344 lines in 943 files changed: 9675 ins; 2416 del; 2253 mod
  Patch: https://git.openjdk.org/crac/pull/224.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/224/head:pull/224

PR: https://git.openjdk.org/crac/pull/224

From tpushkin at openjdk.org Wed Apr 16 13:19:58 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Wed, 16 Apr 2025 13:19:58 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
Message-ID:

Fixes the failing test, for simplicity pretending that both `jdk.crac` and `jdk.management/jdk.crac.management` were added in JDK 24 and before that there was no CRaC in the JDK. Otherwise we would need to retroactively generate symbols for JDKs 17–23 which is a decent amount of work (there are no public CRaC builds for some of these versions).

JDK 24 symbols were updated this way:
1. Create a custom build from the last OpenJDK 24 CRaC commit 884d0746b168550f13bdc687b1d96d468aec4411 (the last commit before JDK 25 was merged).
2. Update the symbols from that build using `make/scripts/generate-symbol-data.sh`.
3. Manually remove the CRaC methods removed in d64fb30c0874d93c986ad04ac3995a727b7a1ac8 from the symbols.

Also adds the since-checking tests to CI.

I initially wanted to also add a since-checking test for `jdk.crac` module but `SinceChecker` seems to have a bug which makes the test fail with "module: jdk.crac: `@since` version is 24 but the element exists before JDK 10". I believe this is a `SinceChecker` bug because the same happens for other modules added after JDK 9 without a legacy preview, e.g. `jdk.graal.compiler`.

-------------

Commit messages:
 - Fix since checker test

Changes: https://git.openjdk.org/crac/pull/225/files
  Webrev: https://webrevs.openjdk.org/?repo=crac&pr=225&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8354679
  Stats: 80 lines in 7 files changed: 75 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/crac/pull/225.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/225/head:pull/225

PR: https://git.openjdk.org/crac/pull/225

From rvansa at openjdk.org Thu Apr 17 07:26:06 2025
From: rvansa at openjdk.org (Radim Vansa)
Date: Thu, 17 Apr 2025 07:26:06 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
In-Reply-To:
References:
Message-ID:

On Wed, 16 Apr 2025 13:13:20 GMT, Timofei Pushkin wrote:

> Fixes the failing test, for simplicity pretending that both `jdk.crac` and `jdk.management/jdk.crac.management` were added in JDK 24 and before that there was no CRaC in the JDK. Otherwise we would need to retroactively generate symbols for JDKs 17–23 which is a decent amount of work (there are no public CRaC builds for some of these versions).
>
> JDK 24 symbols were updated this way:
> 1. Create a custom build from the last OpenJDK 24 CRaC commit 884d0746b168550f13bdc687b1d96d468aec4411 (the last commit before JDK 25 was merged).
> 2. Update the symbols from that build using `make/scripts/generate-symbol-data.sh`.
> 3. Manually remove the CRaC methods removed in d64fb30c0874d93c986ad04ac3995a727b7a1ac8 from the symbols.
>
> Also adds the since-checking tests to CI.
>
> I initially wanted to also add a since-checking test for `jdk.crac` module but `SinceChecker` seems to have a bug which makes the test fail with "module: jdk.crac: `@since` version is 24 but the element exists before JDK 10". I believe this is a `SinceChecker` bug because the same happens for other modules added after JDK 9 without a legacy preview, e.g. `jdk.graal.compiler`.

Marked as reviewed by rvansa (Committer).

I think that it's OK to add the symbols to 24 only; CRaC versioning can be considered orthogonal to upstream JDK versioning. At this stage we keep backward compatibility for users' convenience, not as a rule. Testing API compatibility is useful as CRaC'ed JDK should be 100% usable as a drop-in replacement for upstream JDKs. If `jdk.crac` can't be addressed, we can leave it out of scope. However could you file a bug for `SinceChecker` in upstream?

Last but not least; please outline the steps (can be just a comment on this PR) what will we have to do when we rebase on top of JDK 26.

-------------

PR Review: https://git.openjdk.org/crac/pull/225#pullrequestreview-2774834418
PR Comment: https://git.openjdk.org/crac/pull/225#issuecomment-2812026344

From tpushkin at openjdk.org Thu Apr 17 08:36:02 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Thu, 17 Apr 2025 08:36:02 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
In-Reply-To:
References:
Message-ID:

On Thu, 17 Apr 2025 07:21:24 GMT, Radim Vansa wrote:

> However could you file a bug for SinceChecker in upstream?

Filed the `SinceChecker` bug as [JDK-8354921](https://bugs.openjdk.org/browse/JDK-8354921).

> Last but not least; please outline the steps (can be just a comment on this PR) what will we have to do when we rebase on top of JDK 26.

Whenever we merge an update of the symbols from the mainline (can happen multiple times between a rampdown and the respective GA) we'll need to first overwrite our symbols with the incoming changes and then run `make/scripts/generate-symbol-data.sh` to update the symbols with CRaC's ones. Whenever we add a method to the public CRaC Java API we also need to run the script to update the symbols.

I also found [JDK-8345212](https://bugs.openjdk.org/browse/JDK-8345212), which we don't yet have in this fork. It should allow us to continue using `@since TBD` (or `@since CRaC`, or whatever) and the test will pretend like the CRaC API has been added in the current version - we won't need to do the symbols updates.

@rvansa WDYT?

-------------

PR Comment: https://git.openjdk.org/crac/pull/225#issuecomment-2812179267

From rvansa at openjdk.org Tue Apr 22 07:01:08 2025
From: rvansa at openjdk.org (Radim Vansa)
Date: Tue, 22 Apr 2025 07:01:08 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
In-Reply-To:
References:
Message-ID:

On Thu, 17 Apr 2025 08:33:15 GMT, Timofei Pushkin wrote:

> Whenever we merge an update of the symbols from the mainline (can happen multiple times between a rampdown and the respective GA) we'll need to first overwrite our symbols with the incoming changes and then run make/scripts/generate-symbol-data.sh to update the symbols with CRaC's ones.

OK, I thought that we could somehow preserve the oldest CRaC JDK version where the symbol appeared. But we don't need to change the taglet on too many places, so let's use this as is.

> I also found [JDK-8345212](https://bugs.openjdk.org/browse/JDK-8345212), which we don't yet have in this fork. It should allow us to continue using @since TBD (or @since CRaC, or whatever) and the test will pretend like the CRaC API has been added in the current version - we won't need to do the symbols updates.

That sounds useful, but we don't need to merge changes out-of-band, this PR is good as it is. Let's integrate this and use the `-ignoreSince` the next time this code needs updating anyway.

-------------

PR Comment: https://git.openjdk.org/crac/pull/225#issuecomment-2820259677

From tpushkin at openjdk.org Tue Apr 22 07:21:03 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Tue, 22 Apr 2025 07:21:03 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
In-Reply-To:
References:
Message-ID:

On Tue, 22 Apr 2025 06:57:50 GMT, Radim Vansa wrote:

> OK, I thought that we could somehow preserve the oldest CRaC JDK version where the symbol appeared. But we don't need to change the taglet on too many places, so let's use this as is.

Symbols are fixed after GA so after GA they'll be preserved. But between rampdown and GA we'll need to be updating them (only the version that has not yet been GA-ed) when they are updated upstream.

Example for JDK 26:
1. 25 rampdown happens, 26 is created and symbols for 25 along with it - when merging this, we'll take the new 25 symbols and run the script to update them from the last 25-CRaC build.
2. 25's symbols get updated - when merging this, we'll overwrite our 25's symbols with the incoming ones, then run the script again to update them with the last 25-CRaC build (the same build as in step one), then fixup the symbols manually to remove the changes unrelated to CRaC (there will be such changes because the last 25-CRaC build will be based on an older 25 than the updated symbols). This step can happen a few times.
3. 25 GA happens - after this moment 25's symbols are fixed, we won't need to touch them ever again.

> That sounds useful, but we don't need to merge changes out-of-band, this PR is good as it is. Let's integrate this and use the -ignoreSince the next time this code needs updating anyway.

I was proposing to wait until that change gets merged-in. I believe we need to decide now: we either merge this PR and start generating CRaC symbols from now on or wait for the `SinceChecker` update that should allow us to continue without generating the symbols (I actually need to try this locally by cherry-picking to see how it works before we decide). It will be weird if now we start generating the symbols but then we suddenly stop.

-------------

PR Comment: https://git.openjdk.org/crac/pull/225#issuecomment-2820335231

From tpushkin at openjdk.org Tue Apr 22 07:24:23 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Tue, 22 Apr 2025 07:24:23 GMT
Subject: [crac] RFR: 8354679: [CRaC] jdk.crac.management makes JdkManagementCheckSince fail
In-Reply-To:
References:
Message-ID:

On Wed, 16 Apr 2025 13:13:20 GMT, Timofei Pushkin wrote:

> Fixes the failing test, for simplicity pretending that both `jdk.crac` and `jdk.management/jdk.crac.management` were added in JDK 24 and before that there was no CRaC in the JDK. Otherwise we would need to retroactively generate symbols for JDKs 17–23 which is a decent amount of work (there are no public CRaC builds for some of these versions).
>
> JDK 24 symbols were updated this way:
> 1. Create a custom build from the last OpenJDK 24 CRaC commit 884d0746b168550f13bdc687b1d96d468aec4411 (the last commit before JDK 25 was merged).
> 2. Update the symbols from that build using `make/scripts/generate-symbol-data.sh`.
> 3. Manually remove the CRaC methods removed in d64fb30c0874d93c986ad04ac3995a727b7a1ac8 from the symbols.
>
> Also adds the since-checking tests to CI.
>
> I initially wanted to also add a since-checking test for `jdk.crac` module but `SinceChecker` seems to have a bug which makes the test fail with "module: jdk.crac: `@since` version is 24 but the element exists before JDK 10". I believe this is a `SinceChecker` bug because the same happens for other modules added after JDK 9 without a legacy preview, e.g. `jdk.graal.compiler`.

Oh, actually, I believe we are still merging 25 updates that occurred before 24 GA, so we'll go through the steps 2 and 3 above even for 24's symbols.

-------------

PR Comment: https://git.openjdk.org/crac/pull/225#issuecomment-2820347779

From jkratochvil at openjdk.org Tue Apr 22 10:25:23 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Tue, 22 Apr 2025 10:25:23 GMT
Subject: [crac] RFR: 8344647: Make java.se participate in the preview language feature `requires transitive java.base`
Message-ID:

CRaC trunk (647105388b66b7acedf03d049dc60323912a8fe7) fails to compile for me the same way as `jdk-25+2`:

    === Output from failing command(s) repeated here ===
    * For target buildtools_depend__the.COMPILE_DEPEND_batch:
    error: cannot access module-info
    bad class file: /modules/java.se/module-info.class
    bad requires flag: ACC_TRANSITIVE (0x0020
    Please remove or make sure it appears in the correct subdirectory of the classpath.
    1 error
    * For target buildtools_jdk_tools_classes__the.BUILD_TOOLS_JDK_batch:
    error: cannot access module-info
    bad class file: /modules/java.se/module-info.class
    bad requires flag: ACC_TRANSITIVE (0x0020
    Please remove or make sure it appears in the correct subdirectory of the classpath.
    1 error

It has been fixed upstream which is what I am backporting.

-------------

Commit messages:
 - 8344647: Make java.se participate in the preview language feature `requires transitive java.base`

Changes: https://git.openjdk.org/crac/pull/226/files
  Webrev: https://webrevs.openjdk.org/?repo=crac&pr=226&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8344647
  Stats: 61 lines in 8 files changed: 51 ins; 4 del; 6 mod
  Patch: https://git.openjdk.org/crac/pull/226.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/226/head:pull/226

PR: https://git.openjdk.org/crac/pull/226

From jkratochvil at openjdk.org Tue Apr 22 10:30:09 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Tue, 22 Apr 2025 10:30:09 GMT
Subject: [crac] RFR: 8344647: Make java.se participate in the preview language feature `requires transitive java.base`
In-Reply-To:
References:
Message-ID:

On Tue, 22 Apr 2025 10:20:26 GMT, Jan Kratochvil wrote:

> CRaC trunk (647105388b66b7acedf03d049dc60323912a8fe7) fails to compile for me the same way as `jdk-25+2`:
>
> === Output from failing command(s) repeated here ===
> * For target buildtools_depend__the.COMPILE_DEPEND_batch:
> error: cannot access module-info
> bad class file: /modules/java.se/module-info.class
> bad requires flag: ACC_TRANSITIVE (0x0020
> Please remove or make sure it appears in the correct subdirectory of the classpath.
> 1 error
> * For target buildtools_jdk_tools_classes__the.BUILD_TOOLS_JDK_batch:
> error: cannot access module-info
> bad class file: /modules/java.se/module-info.class
> bad requires flag: ACC_TRANSITIVE (0x0020
> Please remove or make sure it appears in the correct subdirectory of the classpath.
> 1 error
>
> It has been fixed upstream which is what I am backporting.

It would also get fixed by #224.

-------------

PR Comment: https://git.openjdk.org/crac/pull/226#issuecomment-2820877098

From rvansa at openjdk.org Tue Apr 22 13:12:05 2025
From: rvansa at openjdk.org (Radim Vansa)
Date: Tue, 22 Apr 2025 13:12:05 GMT
Subject: [crac] RFR: Merge jdk:jdk-25+3
In-Reply-To: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com>
References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com>
Message-ID:

On Tue, 15 Apr 2025 08:11:19 GMT, Dmitry Cherepanov wrote:

> Merge with jdk-25:3
>
> There are several conflicts between CRaC specific changes (https://github.com/openjdk/crac/pull/10) and incoming JDK changes for https://bugs.openjdk.org/browse/JDK-8342995
>
> https://github.com/openjdk/crac/pull/10 moved some parts from `linux/attachListener_linux.cpp` to
> - `linux/linuxAttachOperation.hpp` which later were renamed to `posix/posixAttachOperation.hpp`
> - `linux/attachListener_linux.hpp` which later were renamed to `posix/attachListener_posix.hpp`
>
> As a part of this merge, I manually applied JDK changes for `posix/attachListener_posix.cpp` to `posix/posixAttachOperation.hpp` & `posix/attachListener_posix.hpp`
> - new `SocketChannel` class moved to `posix/posixAttachOperation.hpp`
> - changes in `PosixAttachOperation` class incorporated into `posix/posixAttachOperation.hpp`
> - added `#include "os_posix.hpp"` to define `RESTARTABLE`
> - kept `socket()` function in `PosixAttachOperation` class as it's used by [VM_Crac::is_socket_from_jcmd](https://github.com/openjdk/crac/blob/647105388b66b7acedf03d049dc60323912a8fe7/src/hotspot/os/linux/crac_linux.cpp#L279)
> - changes in `PosixAttachListener` class incorporated into `posix/attachListener_posix.hpp`
>
> Additional changes in `posix/attachListener_posix.cpp`
> - changes in `PosixAttachOperation::complete` incorporated into `write_operation_result`
>
>
>
> Conflicts
>
>
> commit c54dd827b39e7e0066959e4985e4aaefd5452a10 (HEAD -> merge-jdk, dmitry-crac/merge-jdk)
> Merge: 410d0e168c3 23d6f747824
> Author: Dmitry Cherepanov
> Date: Mon Apr 14 13:55:59 2025 +0400
>
> Merge with jdk:jdk-25+3
>
> diff --git a/.jcheck/conf b/.jcheck/conf
> remerge CONFLICT (content): Merge conflict in .jcheck/conf
> index 1d117b1d825..25bd8dd0b94 100644
> --- a/.jcheck/conf
> +++ b/.jcheck/conf
> @@ -4,12 +4,7 @@ jbs=JDK
> version=25
>
> [checks]
> -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help)
> error=whitespace
> -=======
> -error=author,committer,reviewers,merge,issues,executable,symlink,message,hg-tag,whitespace,problemlists,copyright
> -warning=issuestitle,binary
> ->>>>>>> 23d6f747824 (8346463: Add test coverage for deploying the default provider as a module)
>
> [checks "reviewers"]
> committers=1
> @@ -18,31 +13,3 @@ ignore=duke
> [census]
> version=0
> domain=openjdk.org
> -<<<<<<< 410d0e168c3 (8353243: [CRaC] Show all options in engine help)
> -=======
> -
> -[checks "whitespace"]
> -files=.*.cpp|.*.hpp|.*.c|.*.h|.*.java|.*.cc|.*.hh|.*.m|.*.mm|.*.S|.*.md|.*.properties|.*.gmk|....

I'll try to have a look this evening.
------------- PR Comment: https://git.openjdk.org/crac/pull/224#issuecomment-2821280171 From rvansa at openjdk.org Tue Apr 22 20:37:09 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 22 Apr 2025 20:37:09 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 In-Reply-To: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Tue, 15 Apr 2025 08:11:19 GMT, Dmitry Cherepanov wrote: > Merge with jdk-25:3 The changes seem to be applied correctly. I wonder, though, whether we should upstream our changes to those listeners (the refactoring) to stay synchronized with the JDK and not have to deal with conflicts all the time. However, we would still run into trouble during backports, as we would like to have CRaC changes synchronized across JDK majors (or not, if we don't expect further development in the Attach Listeners?) 
src/hotspot/os/posix/posixAttachOperation.hpp line 91: > 89: bool is_effectively_completed() { return _effectively_completed; } > 90: > 91: int socket() { the method should be `const` ------------- PR Review: https://git.openjdk.org/crac/pull/224#pullrequestreview-2785248837 PR Review Comment: https://git.openjdk.org/crac/pull/224#discussion_r2054802697 From dcherepanov at openjdk.org Wed Apr 23 07:12:22 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Wed, 23 Apr 2025 07:12:22 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 [v2] In-Reply-To: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: > Merge with jdk-25:3 
Dmitry Cherepanov has updated the pull request incrementally with one additional commit since the last revision: socket function should be const ------------- Changes: - all: https://git.openjdk.org/crac/pull/224/files - new: https://git.openjdk.org/crac/pull/224/files/c54dd827..a4b8b754 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=224&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=224&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/crac/pull/224.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/224/head:pull/224 PR: https://git.openjdk.org/crac/pull/224 From dcherepanov at openjdk.org Wed Apr 23 07:17:00 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Wed, 23 Apr 2025 07:17:00 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 [v2] In-Reply-To: References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Tue, 22 Apr 2025 20:34:24 GMT, Radim Vansa wrote: > The changes seem to be applied correctly. I wonder though, whether should we upstream our changes to those listeners (refactoring) to synchronize with JDK and not have to deal with conflicts all the time. However we would still run into trouble during backports, as we would like to have CRaC changes synchronized across JDK majors (or not, if we don't expect further development in the Attach Listeners?) Agree that it's better to upstream our changes. I'll follow up on it. I hope that upstreaming work shouldn't block the merge. > src/hotspot/os/posix/posixAttachOperation.hpp line 91: > >> 89: bool is_effectively_completed() { return _effectively_completed; } >> 90: >> 91: int socket() { > > the method should be `const` Fixed, thanks. 
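The `const` request in the review above can be illustrated with a minimal sketch. It uses a hypothetical `AttachOperationSketch` class, not the real `PosixAttachOperation` (whose full definition is not shown in this thread); only the shape of the `socket()` accessor is taken from the review. The point is that a `const` accessor can be called through a `const` reference, which a non-`const` one cannot.

```cpp
// Hypothetical stand-in for PosixAttachOperation; everything except the
// shape of the socket() accessor is illustrative.
class AttachOperationSketch {
 public:
  explicit AttachOperationSketch(int fd) : _socket(fd) {}

  // const: the accessor does not modify the object, so it can be called
  // on const objects and through const references or pointers.
  int socket() const { return _socket; }

 private:
  int _socket;
};

// Takes a const reference; this compiles only because socket() is const.
inline bool uses_socket(const AttachOperationSketch& op, int fd) {
  return op.socket() == fd;
}
```

A caller that only inspects an operation, as `VM_Crac::is_socket_from_jcmd` presumably does, could then accept it by `const` reference or pointer.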
------------- PR Comment: https://git.openjdk.org/crac/pull/224#issuecomment-2823291185 PR Review Comment: https://git.openjdk.org/crac/pull/224#discussion_r2055400176 From rvansa at openjdk.org Wed Apr 23 07:57:09 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 23 Apr 2025 07:57:09 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 [v2] In-Reply-To: References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Wed, 23 Apr 2025 07:12:22 GMT, Dmitry Cherepanov wrote: >> Merge with jdk-25:3 > > Dmitry Cherepanov has updated the pull request incrementally with one additional commit since the last revision: > > socket function should be const Marked as reviewed by rvansa (Committer). 
------------- PR Review: https://git.openjdk.org/crac/pull/224#pullrequestreview-2786314228 From duke at openjdk.org Wed Apr 23 08:19:00 2025 From: duke at openjdk.org (duke) Date: Wed, 23 Apr 2025 08:19:00 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 [v2] In-Reply-To: References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Wed, 23 Apr 2025 07:12:22 GMT, Dmitry Cherepanov wrote: >> Merge with jdk-25:3 > > Dmitry Cherepanov has updated the pull request incrementally with one additional commit since the last revision: > > socket function should be const @dimitryc Your change (at version a4b8b754bff8b366f15e02a9d027c9fdedaef35a) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/crac/pull/224#issuecomment-2823449069 From dcherepanov at openjdk.org Wed Apr 23 08:19:00 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Wed, 23 Apr 2025 08:19:00 GMT Subject: [crac] RFR: Merge jdk:jdk-25+3 In-Reply-To: References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Tue, 22 Apr 2025 13:08:51 GMT, Radim Vansa wrote: >> Merge with jdk-25:3 > > I'll try to have a look this evening. @rvansa Thanks for the review! Could you please sponsor this? 
------------- PR Comment: https://git.openjdk.org/crac/pull/224#issuecomment-2823449312 From dcherepanov at openjdk.org Wed Apr 23 09:32:22 2025 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Wed, 23 Apr 2025 09:32:22 GMT Subject: [crac] Integrated: Merge jdk:jdk-25+3 In-Reply-To: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> References: <8NoautXbgKmcu8gkl8vaxuNNkyt8rUSSgpdgR3ODhAE=.2a27e384-7564-4554-8096-f488f9673749@github.com> Message-ID: On Tue, 15 Apr 2025 08:11:19 GMT, Dmitry Cherepanov wrote: > Merge with jdk-25:3 This pull request has now been integrated. 
Changeset: ca1f5f67 Author: Dmitry Cherepanov Committer: Radim Vansa URL: https://git.openjdk.org/crac/commit/ca1f5f678d503dd26738c3fd733389f35bdf54d0 Stats: 14344 lines in 943 files changed: 9675 ins; 2416 del; 2253 mod Merge jdk:jdk-25+3 Reviewed-by: rvansa ------------- PR: https://git.openjdk.org/crac/pull/224 From jkratochvil at openjdk.org Fri Apr 25 09:33:21 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 25 Apr 2025 09:33:21 GMT Subject: [crac] Withdrawn: 8344647: Make java.se participate in the preview language feature `requires transitive java.base` In-Reply-To: References: Message-ID: On Tue, 22 Apr 2025 10:20:26 GMT, Jan Kratochvil wrote: > CRaC trunk (647105388b66b7acedf03d049dc60323912a8fe7) fails to compile for me the same way as `jdk-25+2`: > > === Output from failing command(s) repeated here === > * For target buildtools_depend__the.COMPILE_DEPEND_batch: > error: cannot access module-info > bad class file: /modules/java.se/module-info.class > bad requires flag: ACC_TRANSITIVE (0x0020 > Please remove or make sure it appears in the correct subdirectory of the classpath. > 1 error > * For target buildtools_jdk_tools_classes__the.BUILD_TOOLS_JDK_batch: > error: cannot access module-info > bad class file: /modules/java.se/module-info.class > bad requires flag: ACC_TRANSITIVE (0x0020 > Please remove or make sure it appears in the correct subdirectory of the classpath. > 1 error > > It has been fixed upstream which is what I am backporting. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/crac/pull/226 From jkratochvil at openjdk.org Fri Apr 25 09:33:21 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 25 Apr 2025 09:33:21 GMT Subject: [crac] RFR: 8344647: Make java.se participate in the preview language feature `requires transitive java.base` In-Reply-To: References: Message-ID: <4I97m5Tw652pp-iulXgl0kN8YEG5mwSobRDdKqgk9Vs=.1e7247bb-b707-4fa8-933e-18eb295eef02@github.com> On Tue, 22 Apr 2025 10:20:26 GMT, Jan Kratochvil wrote: > CRaC trunk (647105388b66b7acedf03d049dc60323912a8fe7) fails to compile for me the same way as `jdk-25+2`: > > === Output from failing command(s) repeated here === > * For target buildtools_depend__the.COMPILE_DEPEND_batch: > error: cannot access module-info > bad class file: /modules/java.se/module-info.class > bad requires flag: ACC_TRANSITIVE (0x0020 > Please remove or make sure it appears in the correct subdirectory of the classpath. > 1 error > * For target buildtools_jdk_tools_classes__the.BUILD_TOOLS_JDK_batch: > error: cannot access module-info > bad class file: /modules/java.se/module-info.class > bad requires flag: ACC_TRANSITIVE (0x0020 > Please remove or make sure it appears in the correct subdirectory of the classpath. > 1 error > > It has been fixed upstream which is what I am backporting. Superseded by #224. 
------------- PR Comment: https://git.openjdk.org/crac/pull/226#issuecomment-2829883350 From jkratochvil at openjdk.org Tue Apr 29 06:42:46 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 06:42:46 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM Message-ID: There was originally a mistake: - restoring JVM did restore the image - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. This patch changes it to: - restoring JVM checks `cpufeatures` user data in the image against current CPU Features - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. ------------- Commit messages: - whitespace fixes - Fix missing errno on OSX - Fix missing PATH_MAX on Windows - List Fedora 42 as a comment - patch cleanup - Revert "Fedora 41->42" - Fedora 41->42 - CPUFeatures.sh comment path fix - compilation fixes - Merge branch 'crac-builderror-cpufeaturesparent' into crac-cpufeaturesparent - ... 
and 3 more: https://git.openjdk.org/crac/compare/ca1f5f67...e65c3c9b Changes: https://git.openjdk.org/crac/pull/227/files Webrev: https://webrevs.openjdk.org/?repo=crac&pr=227&range=00 Stats: 1057 lines in 23 files changed: 861 ins; 100 del; 96 mod Patch: https://git.openjdk.org/crac/pull/227.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227 PR: https://git.openjdk.org/crac/pull/227 From tpushkin at openjdk.org Tue Apr 29 09:50:14 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 29 Apr 2025 09:50:14 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: References: Message-ID: On Sun, 27 Apr 2025 15:18:17 GMT, Jan Kratochvil wrote: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. I have not finished the review, mainly just looked at the `runtime` changes. Will continue later, or maybe someone else will. 
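The compatibility rule in the PR description ("CPU Features of the new host >= CPU Features of the checkpoint host") amounts to a superset test on feature sets. Below is a minimal sketch, assuming the features are encoded as a single 64-bit mask. The actual layout of the `cpufeatures` user data is not described in this thread, and `features_satisfied` is a hypothetical name, not a function from the patch.

```cpp
#include <cstdint>

// Returns true when every feature bit required by the image is also
// present on the restoring host, i.e. the host set is a superset of the
// image set. The single-mask encoding is an assumption of this sketch.
inline bool features_satisfied(std::uint64_t host_features,
                               std::uint64_t image_features) {
  // Bits that the image was checkpointed with but the host lacks.
  const std::uint64_t missing = image_features & ~host_features;
  return missing == 0;
}
```

Under the design described in the PR, a check of this kind would run in the restoring (parent) JVM, so the restore can be refused before any code from the image, including IFUNC-resolved glibc functions, is executed.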
src/hotspot/share/runtime/crac.hpp line 80:

> 78:
> 79: static bool cpufeatures_store();
> 80: static bool cpufeatures_restore();

These can be removed since they are not implemented.

src/hotspot/share/runtime/crac_engine.cpp line 414:

> 412: }
> 413:
> 414: const char CracEngine::userdata_name[] = "cpufeatures";

Let's name this `cpufeatures_userdata_name` since it only applies to CPU features. I would also make it a .cpp-level static, since it is only used in this .cpp file, and mark it `constexpr` (though in practice it probably makes no difference here), but it is up to you.

src/hotspot/share/runtime/crac_engine.cpp line 421:

> 419: log_error(crac)("Installed CRaC engine does not support user_data");
> 420: }
> 421: return api;

Please do the checks and error messages as in the other API-loading functions (e.g. `prepare_description_api`) for consistency.

src/hotspot/share/runtime/crac_engine.cpp line 439:

> 437:
> 438: // Return success.
> 439: bool CracEngine::cpufeatures_restore() {

To me "restore" means that the function restores some state of the restored VM, but here we just want to check whether the features are compatible. So I think a better name would be `cpufeatures_check` or `cpufeatures_load_and_check`. I would also place the verb first (`store_cpufeatures`, `check_cpufeatures`), but this is obviously a matter of preference.

src/hotspot/share/runtime/crac_engine.cpp line 444:

> 442: // s3->set_image_bitmask did handle it already, load_user_data() is too expensive for S3.
> 443: return true;
> 444: }

There is no special handling for `s3://` paths in this project.

src/hotspot/share/runtime/crac_engine.cpp line 452:

> 450: if (!(user_data = user_data_api->load_user_data(_conf))) {
> 451: return false;
> 452: }

Just for consistency with the rest of the code:

Suggestion:

    if (!(user_data = user_data_api->load_user_data(_conf))) {
      log_error(crac)("CRaC engine failed to load user data");
      return false;
    }

src/hotspot/share/runtime/crac_engine.cpp line 461:

> 459: return false;
> 460: }
> 461: assert(datap, "lookup_user_data should return non-null data pointer");

The CRaC engine can be a user-provided program, so maybe it is better to do a real check and log an error if it fails.

src/hotspot/share/runtime/crac_engine.hpp line 70:

> 68:
> 69: bool cpufeatures_store();
> 70: bool cpufeatures_restore();

These can be marked `const`.

src/hotspot/share/runtime/crac_engine.hpp line 82:

> 80: static const char userdata_name[];
> 81:
> 82: crlib_user_data_t *user_data_api_get();

Could you please rewrite this part to work the same way as the other extensions: a public `ApiStatus prepare_user_data_api()` loads and caches the API, then the dependent methods use the cached one?
- The API can be loaded multiple times when a repeated checkpoint (checkpoint -> restore -> checkpoint again) is performed
- We may later add more methods that use the user data API
- It makes `CracEngine` more consistent

src/hotspot/share/runtime/threads.cpp line 481:

> 479: // Output stream module should be already initialized for error reporting during restore.
> 480: // JDK version should also be intialized for arguments parsing.
> 481: if (check_for_restore(args) != JNI_OK) return JNI_ERR;

Now we do almost the same thing twice on restore: here we parse the args and check whether they are restore-settable, and then in `Arguments::parse` below we parse the args again. I'm not sure whether it breaks anything (`Arguments::parse` may do the parsing a bit differently and overwrite some values with different results). If not, it is worth at least adding a TODO here to clean this up in the future.

src/hotspot/share/runtime/threads.cpp line 606:

> 604:
> 605: // Output stream module should be already initialized for error reporting during restore.
> 606: // JDK version should also be intialized.

This comment should probably be updated to say that a lot of things (probably no point in listing them all) should be initialized to be able to check CPU features.

src/java.base/share/man/java.md line 1838:

> 1836: `-XX:CRaCCheckpointTo` when you get an error during `-XX:CRaCRestoreFrom`
> 1837: on a different machine. `-XX:CPUFeatures=native` is the default.
> 1838: `-XX:CPUFeatures=generic` is compatible with any CPU.

There is also `CPUFeatures=ignore`. What is the difference between it and `IgnoreCPUFeatures`? The fact that `IgnoreCPUFeatures` is to be set on restore?

-------------

PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2838145657
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065737258
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065789850
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065781726
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065818710
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065797780
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065801325
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065807680
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065794912
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065763641
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065926059
PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2065929561
PR Review Comment:
https://git.openjdk.org/crac/pull/227#discussion_r2065946444 From jkratochvil at openjdk.org Tue Apr 29 13:29:20 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 13:29:20 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: References: Message-ID: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> On Sun, 27 Apr 2025 15:18:17 GMT, Jan Kratochvil wrote: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. That one failure: `Pre-submit tests - linux-x64 / test - Build / test` = `jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java` will IMO disappear after the change gets accepted. The testcase has been changed so that if you run it from the freshly built JVM it works. But Github IIUC runs the parent JVM pre-built, not from the fresh build: `/home/runner/work/crac/crac/bundles/jdk/jdk-25/bin/java`. Or where is this `bundle` JVM from? 
------------- PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2838909237 From tpushkin at openjdk.org Tue Apr 29 13:34:13 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 29 Apr 2025 13:34:13 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: References: Message-ID: On Sun, 27 Apr 2025 15:18:17 GMT, Jan Kratochvil wrote: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. I believe the tests in CI are running on the JDK built from the source branch. The bundle is built and uploaded [here](https://github.com/openjdk/crac/blob/ca1f5f678d503dd26738c3fd733389f35bdf54d0/.github/workflows/build-linux.yml#L145) and is downloaded for testing [here](https://github.com/openjdk/crac/blob/ca1f5f678d503dd26738c3fd733389f35bdf54d0/.github/workflows/test.yml#L141). 
------------- PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2838933706 From jkratochvil at openjdk.org Tue Apr 29 15:16:08 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 15:16:08 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 08:35:52 GMT, Timofei Pushkin wrote: >> There was originally a mistake: >> - restoring JVM did restore the image >> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host >> >> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. >> >> This patch changes it to: >> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features >> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything >> >> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. > > src/hotspot/share/runtime/crac_engine.hpp line 70: > >> 68: >> 69: bool cpufeatures_store(); >> 70: bool cpufeatures_restore(); > > These can be marked `const` No longer: crac_engine.cpp:443:32: error: passing 'const CracEngine' as 'this'
argument discards qualifiers [-fpermissive] 443 | switch (prepare_user_data_api()) { ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2066792194 From jkratochvil at openjdk.org Tue Apr 29 15:24:56 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 15:24:56 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v2] In-Reply-To: References: Message-ID: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. 
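To illustrate the ordering described above, the check done by the restoring JVM boils down to a subset test on feature bits that runs before any restored code executes. This is only a minimal sketch under stated assumptions: `FeatureSet`, `features_satisfied` and `may_start_restored_jvm` are hypothetical names, and the real layout of the `cpufeatures` user data is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the feature data stored in the image;
// the real layout used by the JVM is not shown in this thread.
struct FeatureSet {
  uint64_t cpu;    // CPU feature bits
  uint64_t glibc;  // glibc (IFUNC-relevant) feature bits
};

// Returns true when every feature recorded at checkpoint is also
// available on the current host, i.e. host >= checkpoint as sets.
bool features_satisfied(const FeatureSet& checkpoint, const FeatureSet& host) {
  const uint64_t missing_cpu   = checkpoint.cpu   & ~host.cpu;
  const uint64_t missing_glibc = checkpoint.glibc & ~host.glibc;
  return missing_cpu == 0 && missing_glibc == 0;
}

// Meant to run in the restoring (parent) JVM, so the decision is made
// before any code from the image, and therefore any IFUNC target chosen
// on the checkpoint host, gets executed.
bool may_start_restored_jvm(const FeatureSet* stored, const FeatureSet& host) {
  if (stored == nullptr) {
    return false;  // assumption of this sketch: no data means refuse
  }
  return features_satisfied(*stored, host);
}
```

Because the comparison is plain bit arithmetic on data read from the image, it does not depend on how glibc was configured inside the checkpointed process.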
Jan Kratochvil has updated the pull request incrementally with four additional commits since the last revision: - Print load_user_data error - Rename userdata_name - Use the standardized prepare_user_data_api() style - Removed unused declaration ------------- Changes: - all: https://git.openjdk.org/crac/pull/227/files - new: https://git.openjdk.org/crac/pull/227/files/e65c3c9b..0e06582b Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=01 - incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=00-01 Stats: 54 lines in 3 files changed: 27 ins; 6 del; 21 mod Patch: https://git.openjdk.org/crac/pull/227.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227 PR: https://git.openjdk.org/crac/pull/227 From jkratochvil at openjdk.org Tue Apr 29 16:04:04 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 16:04:04 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v2] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 09:34:29 GMT, Timofei Pushkin wrote: >> Jan Kratochvil has updated the pull request incrementally with four additional commits since the last revision: >> >> - Print load_user_data error >> - Rename userdata_name >> - Use the standardized prepare_user_data_api() style >> - Removed unused declaration > > src/hotspot/share/runtime/threads.cpp line 481: > >> 479: // Output stream module should be already initialized for error reporting during restore. >> 480: // JDK version should also be intialized for arguments parsing. >> 481: if (check_for_restore(args) != JNI_OK) return JNI_ERR; > > Now we do almost the same thing twice on restore: here we parse the args and check whether they are restore-settable, and then in `Arguments::parse` below we parse the args again. I'm not sure if it breaks anything (`Arguments::parse` may do the parsing a bit differently and overwrite some values with different results). 
If not, it's worth to at least add a TODO here to clean this up in the future. It could be somehow cleaned up. But it is not so simple. Filed a TODO item internally as ZULU-75247. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2066889460 From jkratochvil at openjdk.org Tue Apr 29 16:09:08 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 16:09:08 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v2] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 09:36:07 GMT, Timofei Pushkin wrote: >> Jan Kratochvil has updated the pull request incrementally with four additional commits since the last revision: >> >> - Print load_user_data error >> - Rename userdata_name >> - Use the standardized prepare_user_data_api() style >> - Removed unused declaration > > src/hotspot/share/runtime/threads.cpp line 606: > >> 604: >> 605: // Output stream module should be already initialized for error reporting during restore. >> 606: // JDK version should also be intialized. > > This comment should probably be updated saying that a lot of things (probably no point in listing them all) should be initialized to be able to check CPU features Described in 03200e67c33bedf197c31dd574d9a202d0e5e40d. > src/java.base/share/man/java.md line 1838: > >> 1836: `-XX:CRaCCheckpointTo` when you get an error during `-XX:CRaCRestoreFrom` >> 1837: on a different machine. `-XX:CPUFeatures=native` is the default. >> 1838: `-XX:CPUFeatures=generic` is compatible with any CPU. > > There is also `CPUFeatures=ignore`, what is the difference between it and `IgnoreCPUFeatures`? The fact that `IgnoreCPUFeatures` is to be set on restore? Could we leave this for a different patch? IIRC I even have some patch done for it already. I have filed a TODO item internally as ZULU-75248. 
------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2066894752 PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2066897842 From jkratochvil at openjdk.org Tue Apr 29 16:17:50 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 29 Apr 2025 16:17:50 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v3] In-Reply-To: References: Message-ID: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. 
Jan Kratochvil has updated the pull request incrementally with three additional commits since the last revision: - Update required initialization comment - Rename cpufeatures_restore() to cpufeatures_check() - Use log error, not assert ------------- Changes: - all: https://git.openjdk.org/crac/pull/227/files - new: https://git.openjdk.org/crac/pull/227/files/0e06582b..03200e67 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=02 - incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=01-02 Stats: 11 lines in 4 files changed: 4 ins; 2 del; 5 mod Patch: https://git.openjdk.org/crac/pull/227.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227 PR: https://git.openjdk.org/crac/pull/227 From tpushkin at openjdk.org Tue Apr 29 17:46:21 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Tue, 29 Apr 2025 17:46:21 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v3] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 15:12:59 GMT, Jan Kratochvil wrote: >> src/hotspot/share/runtime/crac_engine.hpp line 70: >> >>> 68: >>> 69: bool cpufeatures_store(); >>> 70: bool cpufeatures_restore(); >> >> These can be marked `const` > > No longer: > > crac_engine.cpp:443:32: error: passing 'const CracEngine' as 'this' argument discards qualifiers [-fpermissive] > 443 | switch (prepare_user_data_api()) { I actually was imagining `prepare_user_data_api` to be called in `crac.cpp` like with the other `prepare_*_api` methods, not inside these functions. Then if we get `ApiStatus::UNSUPPORTED` we can do something else instead of failing like now. Sorry, I forgot to write that initially: API extensions, like user data, ideally should not be mandatory, i.e. we shouldn't fail if an engine does not support an extension.
In the case of CPU features I think this can be done meaningfully: if user data is not supported and `-XX:CPUFeatures` was not set we can emit a warning (not sure if only on checkpoint or also on every restore) and try to restore without verifying the features. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2067045799 From rvansa at openjdk.org Wed Apr 30 07:43:03 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 30 Apr 2025 07:43:03 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> References: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> Message-ID: On Tue, 29 Apr 2025 13:25:55 GMT, Jan Kratochvil wrote: >> There was originally a mistake: >> - restoring JVM did restore the image >> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host >> >> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. >> >> This patch changes it to: >> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features >> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything >> >> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. > > That one failure: `Pre-submit tests - linux-x64 / test - Build / test` = `jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java` will IMO disappear after the change gets accepted. 
The testcase has been changed so that if you run it from the freshly built JVM it works. But Github IIUC runs the parent JVM pre-built, not from the fresh build: `/home/runner/work/crac/crac/bundles/jdk/jdk-25/bin/java`. Or where is this `bundle` JVM from? @jankratochvil The test is failing for me locally, too - does it pass on your machine? Could you create JDK issue and refer to that in the title? ------------- PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2841090563 From rvansa at openjdk.org Wed Apr 30 08:26:07 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 30 Apr 2025 08:26:07 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM [v3] In-Reply-To: References: Message-ID: <2dvbT7umsHsuBU-RK8LK3LX5NG2kTrX7Zh_iZS8Q1Go=.8e17c865-5659-401c-b051-db0c0c0c9122@github.com> On Tue, 29 Apr 2025 16:17:50 GMT, Jan Kratochvil wrote: >> There was originally a mistake: >> - restoring JVM did restore the image >> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host >> >> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. >> >> This patch changes it to: >> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features >> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything >> >> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. 
> > Jan Kratochvil has updated the pull request incrementally with three additional commits since the last revision: > > - Update required initialization comment > - Rename cpufeatures_restore() to cpufeatures_check() > - Use log error, not assert src/hotspot/cpu/x86/vm_version_x86.cpp line 2654: > 2652: _glibc_features = GLIBCFeatures_x64; > 2653: > 2654: if (ShowCPUFeatures && !CRaCRestoreFrom) nitpick: please use parentheses even for single statement src/hotspot/cpu/x86/vm_version_x86.cpp line 2658: > 2656: > 2657: #ifdef LINUX > 2658: if (!glibc_not_using()) { If we are ignoring the result I would just move the comment and not use an empty if test/jdk/jdk/crac/CPUFeatures/SimpleCPUFeatures.sh line 36: > 34: unset GLIBC_TUNABLES > 35: $JAVA_HOME/bin/java -XX:CPUFeatures=generic -XX:+ShowCPUFeatures -version 2>&1 | tee /proc/self/fd/2 | grep -q 'openjdk version' > 36: Could we add some tests to assert parsing with `glibc.cpu.hwcaps` ? test/jdk/jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java line 79: > 77: } finally { > 78: assertFalse(Files.exists(logPathO)); > 79: if (scenario1) { The change of behaviour is not clear to me; could you explain this in a comment? 
------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068038414 PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068041593 PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068133735 PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068147831 From jkratochvil at openjdk.org Wed Apr 30 09:37:19 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 30 Apr 2025 09:37:19 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> References: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> Message-ID: On Tue, 29 Apr 2025 13:25:55 GMT, Jan Kratochvil wrote: >> There was originally a mistake: >> - restoring JVM did restore the image >> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host >> >> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. >> >> This patch changes it to: >> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features >> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything >> >> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. > > That one failure: `Pre-submit tests - linux-x64 / test - Build / test` = `jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java` will IMO disappear after the change gets accepted. The testcase has been changed so that if you run it from the freshly built JVM it works. 
But Github IIUC runs the parent JVM pre-built, not from the fresh build: `/home/runner/work/crac/crac/bundles/jdk/jdk-25/bin/java`. Or where is this `bundle` JVM from? > @jankratochvil The test is failing for me locally, too - does it pass on your machine? > Could you create JDK issue and refer to that in the title? https://bugs.openjdk.org/browse/JDK-8355973 ------------- PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2841385398 From tpushkin at openjdk.org Wed Apr 30 10:39:15 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 30 Apr 2025 10:39:15 GMT Subject: [crac] RFR: Move CPUFeatures verification to the parent process of JVM In-Reply-To: References: <367ZGD-U2Rx61ELf96qVcMf3Q1LAz5OERUhPHTWCvJ8=.7c955698-19e2-4769-a883-5bef04056878@github.com> Message-ID: On Wed, 30 Apr 2025 09:34:53 GMT, Jan Kratochvil wrote: >> That one failure: `Pre-submit tests - linux-x64 / test - Build / test` = `jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java` will IMO disappear after the change gets accepted. The testcase has been changed so that if you run it from the freshly built JVM it works. But Github IIUC runs the parent JVM pre-built, not from the fresh build: `/home/runner/work/crac/crac/bundles/jdk/jdk-25/bin/java`. Or where is this `bundle` JVM from? > >> @jankratochvil The test is failing for me locally, too - does it pass on your machine? >> Could you create JDK issue and refer to that in the title? > > https://bugs.openjdk.org/browse/JDK-8355973 @jankratochvil I think Radim meant creating a JBS issue for this PR itself and updating the title of the PR to be ": Move CPUFeatures...". The test should ideally be fixed as part of this PR, not separately, but I haven't looked into it. 
------------- PR Comment: https://git.openjdk.org/crac/pull/227#issuecomment-2841551050 From jkratochvil at openjdk.org Wed Apr 30 11:19:50 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 30 Apr 2025 11:19:50 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v4] In-Reply-To: References: Message-ID: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. 
Jan Kratochvil has updated the pull request incrementally with three additional commits since the last revision: - Comment the testcase - Reformat glibc_not_using() caller - Coding style ------------- Changes: - all: https://git.openjdk.org/crac/pull/227/files - new: https://git.openjdk.org/crac/pull/227/files/03200e67..7e912283 Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=03 - incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=02-03 Stats: 10 lines in 2 files changed: 4 ins; 1 del; 5 mod Patch: https://git.openjdk.org/crac/pull/227.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227 PR: https://git.openjdk.org/crac/pull/227 From jkratochvil at openjdk.org Wed Apr 30 11:19:51 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 30 Apr 2025 11:19:51 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v3] In-Reply-To: <2dvbT7umsHsuBU-RK8LK3LX5NG2kTrX7Zh_iZS8Q1Go=.8e17c865-5659-401c-b051-db0c0c0c9122@github.com> References: <2dvbT7umsHsuBU-RK8LK3LX5NG2kTrX7Zh_iZS8Q1Go=.8e17c865-5659-401c-b051-db0c0c0c9122@github.com> Message-ID: <2zE0S0Xd7asU5BTPlPiIhnUYtKaDV7RP_Iy6OfrP0hs=.c9a8405d-d2f7-42ae-b287-08f46780c309@github.com> On Wed, 30 Apr 2025 07:22:05 GMT, Radim Vansa wrote: >> Jan Kratochvil has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update required initialization comment >> - Rename cpufeatures_restore() to cpufeatures_check() >> - Use log error, not assert > > src/hotspot/cpu/x86/vm_version_x86.cpp line 2658: > >> 2656: >> 2657: #ifdef LINUX >> 2658: if (!glibc_not_using()) { > > If we are ignoring the result I would just move the comment and not use an empty if Changed in 35da8647272d9e1ec4376075c2100b4de7e173f0. 
> test/jdk/jdk/crac/fileDescriptors/LoggingVMlogOpenTestNegative.java line 79: > >> 77: } finally { >> 78: assertFalse(Files.exists(logPathO)); >> 79: if (scenario1) { > > The change of behaviour is not clear to me; could you explain this in a comment? Added comments in: 7e912283a6b30667b5da1b758fe4b257be8af6b5 ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068455809 PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068458833 From jkratochvil at openjdk.org Wed Apr 30 12:40:16 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 30 Apr 2025 12:40:16 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5] In-Reply-To: References: Message-ID: > There was originally a mistake: > - restoring JVM did restore the image > - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host > > That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. > > This patch changes it to: > - restoring JVM checks `cpufeatures` user data in the image against current CPU Features > - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything > > The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. 
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Refactor the usage of prepare_user_data_api() ------------- Changes: - all: https://git.openjdk.org/crac/pull/227/files - new: https://git.openjdk.org/crac/pull/227/files/7e912283..b0c8012b Webrevs: - full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=04 - incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=03-04 Stats: 55 lines in 3 files changed: 25 ins; 24 del; 6 mod Patch: https://git.openjdk.org/crac/pull/227.diff Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227 PR: https://git.openjdk.org/crac/pull/227 From jkratochvil at openjdk.org Wed Apr 30 12:40:16 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 30 Apr 2025 12:40:16 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5] In-Reply-To: References: Message-ID: <_mpVM5WkGXkf-SDVz1RLn5zv7mx8dqU3HPye3eJukfc=.91879bfa-de22-4381-99cc-936c762ef15b@github.com> On Tue, 29 Apr 2025 17:43:12 GMT, Timofei Pushkin wrote: >> No longer: >> >> crac_engine.cpp:443:32: error: passing 'const CracEngine' as 'this' argument discards qualifiers [-fpermissive] >> 443 | switch (prepare_user_data_api()) { > > I actually was imagining `prepare_user_data_api` to be called in `crac.cpp` like with the other `prepare_*_api` methods, not inside these functions. And also when getting `ApiStatus::UNSUPPORTED` do something else instead of failing like now. > > Sorry, I forgot to write that initially: API extensions, like user data, ideally should not be mandatory, i.e. we shouldn't fail if an engine does not support an extension.
In the case of CPU features I think this can be done meaningfully: if user data is not supported and `-XX:CPUFeatures` was not set we can emit a warning (on checkpoint and on every subsequent restore, `-XX:CPUFeatures=ignore`/`IgnoreCPUFeatures` can be used to silence the warning) and try to restore without verifying the features. Refactored in b0c8012b5173fd049844f0d4f33e2f58f10af96f but I am not sure it is as you intended. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068575308 From tpushkin at openjdk.org Wed Apr 30 13:00:17 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 30 Apr 2025 13:00:17 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5] In-Reply-To: <_mpVM5WkGXkf-SDVz1RLn5zv7mx8dqU3HPye3eJukfc=.91879bfa-de22-4381-99cc-936c762ef15b@github.com> References: <_mpVM5WkGXkf-SDVz1RLn5zv7mx8dqU3HPye3eJukfc=.91879bfa-de22-4381-99cc-936c762ef15b@github.com> Message-ID: On Wed, 30 Apr 2025 12:37:08 GMT, Jan Kratochvil wrote: >> I actually was imagining `prepare_user_data_api` to be called in `crac.cpp` like with the other `prepare_*_api` methods, not inside these functions. And also when getting `ApiStatus::UNSUPPORTED` do something else instead of failing like now. >> >> Sorry, I forgot to write that initially: API extensions, like user data, ideally should not be mandatory, i.e. we shouldn't fail if an engine does not support an extension. In the case of CPU features I think this can be done meaningfully: if user data is not supported and `-XX:CPUFeatures` was not set we can emit a warning (on checkpoint and on every subsequent restore, `-XX:CPUFeatures=ignore`/`IgnoreCPUFeatures` can be used to silence the warning) and try to restore without verifying the features. > > Refactored in b0c8012b5173fd049844f0d4f33e2f58f10af96f but I am not sure it is as you intended. 
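The fallback discussed in the quoted comment could look roughly like the sketch below. It is not the code from b0c8012b: only `ApiStatus::UNSUPPORTED` is named in this thread, so the other enumerators, the `CheckPlan` type and `plan_cpufeatures_check` are hypothetical. The case where `-XX:CPUFeatures` was set explicitly is not covered by the discussion and is left out here.

```cpp
#include <cassert>

// Only ApiStatus::UNSUPPORTED is named in the thread; OK and ERR are assumed.
enum class ApiStatus { OK, UNSUPPORTED, ERR };

// Hypothetical result of preparing the user data extension.
enum class CheckPlan { VERIFY, SKIP_WITH_WARNING, SKIP_SILENTLY, FAIL };

// ignore_requested models -XX:CPUFeatures=ignore / IgnoreCPUFeatures,
// which the thread proposes as the way to silence the warning.
CheckPlan plan_cpufeatures_check(ApiStatus user_data_status, bool ignore_requested) {
  switch (user_data_status) {
    case ApiStatus::OK:
      return CheckPlan::VERIFY;  // extension present: store or verify features
    case ApiStatus::UNSUPPORTED:
      // Extensions are optional: continue without verification.
      return ignore_requested ? CheckPlan::SKIP_SILENTLY
                              : CheckPlan::SKIP_WITH_WARNING;
    case ApiStatus::ERR:
      return CheckPlan::FAIL;    // a real error is still fatal
  }
  return CheckPlan::FAIL;        // not reached for valid enumerators
}
```

Keeping the decision in one place, outside the engine wrapper, would let both the checkpoint and the restore path emit the same warning.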
Yes, thank you, this is roughly how I envisioned it in this regard ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068612175 From tpushkin at openjdk.org Wed Apr 30 13:55:58 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 30 Apr 2025 13:55:58 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 12:40:16 GMT, Jan Kratochvil wrote: >> There was originally a mistake: >> - restoring JVM did restore the image >> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host >> >> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign. >> >> This patch changes it to: >> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features >> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything >> >> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Refactor the usage of prepare_user_data_api() src/hotspot/share/runtime/crac_engine.cpp line 473: > 471: datap = nullptr; > 472: } > 473: if (!VM_Version::cpu_features_binary_check(datap)) { From the architectural point of view, I don't like that the checking is performed inside `CracEngine` because the class represents the engine API (originally it just encapsulated handles for lib, APIs, conf to have RAII).
When calling `crac_engine.cpufeatures_check()` it's like we are asking the engine to check the features but this is not what the engine itself is doing. I would suggest: - `CracEngine::cpufeatures_store` receives a pre-filled `VM_Version::CPUFeaturesBinary` and just stores it. - `CracEngine::cpufeatures_load` loads `VM_Version::CPUFeaturesBinary`, validates the size and not-null-ness and returns a copy of it (copying is to be able to destroy the user data). This is just a suggestion, feel free to not implement this if you don't like it. ------------- PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068716468 From tpushkin at openjdk.org Wed Apr 30 14:01:59 2025 From: tpushkin at openjdk.org (Timofei Pushkin) Date: Wed, 30 Apr 2025 14:01:59 GMT Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 13:53:09 GMT, Timofei Pushkin wrote: >> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: >> >> Refactor the usage of prepare_user_data_api() > > src/hotspot/share/runtime/crac_engine.cpp line 473: > >> 471: datap = nullptr; >> 472: } >> 473: if (!VM_Version::cpu_features_binary_check(datap)) { > > From the architectural point of view, I don't like that the checking is performed inside `CracEngine` because the class represents the engine API (originally it just encapsulated handles for lib, APIs, conf to have RAII). When calling `crac_engine.cpufeatures_check()` it's like we are asking the engine to check the features but this is not what the engine itself is doing. > > I would suggest: > - `CracEngine::cpufeatures_store` receives a pre-filled `VM_Version::CPUFeaturesBinary` and just stores it. > - `CracEngine::cpufeatures_load` loads `VM_Version::CPUFeaturesBinary`, validates the size and not-null-ness and returns a copy of it (copying is to be able to destroy the user data). 
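The two suggested methods could be sketched as follows. This is a standalone illustration under assumptions, not HotSpot code: `FeaturesBinary` stands in for `VM_Version::CPUFeaturesBinary`, `EngineSketch` stands in for `CracEngine`, and the in-memory map replaces the engine's real user data API, whose signatures are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for VM_Version::CPUFeaturesBinary.
struct FeaturesBinary {
  uint64_t cpu;
  uint64_t glibc;
};

// Hypothetical stand-in for the engine wrapper: it only moves bytes in and
// out of the image and never interprets the features itself.
class EngineSketch {
 public:
  // Generic primitive standing in for the engine's user data storage.
  void store_user_data(const std::string& name, const void* data, size_t size) {
    const uint8_t* bytes = static_cast<const uint8_t*>(data);
    _user_data[name].assign(bytes, bytes + size);
  }

  // Receives a pre-filled structure and just stores it.
  bool cpufeatures_store(const FeaturesBinary& features) {
    store_user_data("cpufeatures", &features, sizeof(features));
    return true;
  }

  // Validates presence and size, then copies the data into *out so the
  // underlying user data could be released before the caller uses it.
  bool cpufeatures_load(FeaturesBinary* out) const {
    const auto it = _user_data.find("cpufeatures");
    if (out == nullptr || it == _user_data.end() ||
        it->second.size() != sizeof(FeaturesBinary)) {
      return false;
    }
    std::memcpy(out, it->second.data(), sizeof(FeaturesBinary));
    return true;
  }

 private:
  std::map<std::string, std::vector<uint8_t>> _user_data;
};
```

With this split, the comparison against the current host stays in the caller, so the engine wrapper remains a thin layer over the engine API.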
>
> This is just a suggestion, feel free to not implement this if you don't like it.

And sorry for asking to reimplement the same stuff the third time, I should've suggested that all together

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068727411

From jkratochvil at openjdk.org Wed Apr 30 14:52:49 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Wed, 30 Apr 2025 14:52:49 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v6]
In-Reply-To:
References:
Message-ID:

> There was originally a mistake:
> - restoring JVM did restore the image
> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host
>
> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign.
>
> This patch changes it to:
> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features
> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything
>
> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code.
Jan Kratochvil has updated the pull request incrementally with two additional commits since the last revision:

- Fix GLIBC_TUNABLES rewriting
- Update copyright years

-------------

Changes:
- all: https://git.openjdk.org/crac/pull/227/files
- new: https://git.openjdk.org/crac/pull/227/files/b0c8012b..e2b427a2

Webrevs:
- full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=05
- incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=04-05

Stats: 14 lines in 3 files changed: 9 ins; 0 del; 5 mod
Patch: https://git.openjdk.org/crac/pull/227.diff
Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227

PR: https://git.openjdk.org/crac/pull/227

From jkratochvil at openjdk.org Wed Apr 30 14:52:53 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Wed, 30 Apr 2025 14:52:53 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v3]
In-Reply-To: <2dvbT7umsHsuBU-RK8LK3LX5NG2kTrX7Zh_iZS8Q1Go=.8e17c865-5659-401c-b051-db0c0c0c9122@github.com>
References: <2dvbT7umsHsuBU-RK8LK3LX5NG2kTrX7Zh_iZS8Q1Go=.8e17c865-5659-401c-b051-db0c0c0c9122@github.com>
Message-ID:

On Wed, 30 Apr 2025 08:14:45 GMT, Radim Vansa wrote:

>> Jan Kratochvil has updated the pull request incrementally with three additional commits since the last revision:
>>
>> - Update required initialization comment
>> - Rename cpufeatures_restore() to cpufeatures_check()
>> - Use log error, not assert
>
> test/jdk/jdk/crac/CPUFeatures/SimpleCPUFeatures.sh line 36:
>
>> 34: unset GLIBC_TUNABLES
>> 35: $JAVA_HOME/bin/java -XX:CPUFeatures=generic -XX:+ShowCPUFeatures -version 2>&1 | tee /proc/self/fd/2 | grep -q 'openjdk version'
>> 36:
>
> Could we add some tests to assert parsing with `glibc.cpu.hwcaps` ?

Oops, you found a bug, thanks.
e2b427a28b383f5c2b10b29e92bb1ef2ca4e9882

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068835548

From jkratochvil at openjdk.org Wed Apr 30 14:59:39 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Wed, 30 Apr 2025 14:59:39 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v7]
In-Reply-To:
References:
Message-ID:

> There was originally a mistake:
> - restoring JVM did restore the image
> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host
>
> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign.
>
> This patch changes it to:
> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features
> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything
>
> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code.
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

whitespace fixes

-------------

Changes:
- all: https://git.openjdk.org/crac/pull/227/files
- new: https://git.openjdk.org/crac/pull/227/files/e2b427a2..e42aafff

Webrevs:
- full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=06
- incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=05-06

Stats: 10 lines in 1 file changed: 0 ins; 0 del; 10 mod
Patch: https://git.openjdk.org/crac/pull/227.diff
Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227

PR: https://git.openjdk.org/crac/pull/227

From jkratochvil at openjdk.org Wed Apr 30 15:23:46 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Wed, 30 Apr 2025 15:23:46 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 30 Apr 2025 13:58:29 GMT, Timofei Pushkin wrote:

>> src/hotspot/share/runtime/crac_engine.cpp line 473:
>>
>>> 471: datap = nullptr;
>>> 472: }
>>> 473: if (!VM_Version::cpu_features_binary_check(datap)) {
>>
>> From the architectural point of view, I don't like that the checking is performed inside `CracEngine` because the class represents the engine API (originally it just encapsulated handles for lib, APIs, conf to have RAII). When calling `crac_engine.cpufeatures_check()` it's like we are asking the engine to check the features but this is not what the engine itself is doing.
>>
>> I would suggest:
>> - `CracEngine::cpufeatures_store` receives a pre-filled `VM_Version::CPUFeaturesBinary` and just stores it.
>> - `CracEngine::cpufeatures_load` loads `VM_Version::CPUFeaturesBinary`, validates the size and not-null-ness and returns a copy of it (copying is to be able to destroy the user data).
>>
>> This is just a suggestion, feel free to not implement this if you don't like it.
>
> And sorry for asking to reimplement the same stuff the third time, I should've suggested that all together

Changed in 03ea8d2648c5465ab053d05c71e0a5c3dc7c7ba6 .

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2068903982

From jkratochvil at openjdk.org Wed Apr 30 15:23:45 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Wed, 30 Apr 2025 15:23:45 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v8]
In-Reply-To:
References:
Message-ID:

> There was originally a mistake:
> - restoring JVM did restore the image
> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host
>
> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign.
>
> This patch changes it to:
> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features
> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything
>
> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code.

Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

Refactor cpufeatures_check/cpufeatures_load etc.
-------------

Changes:
- all: https://git.openjdk.org/crac/pull/227/files
- new: https://git.openjdk.org/crac/pull/227/files/e42aafff..03ea8d26

Webrevs:
- full: https://webrevs.openjdk.org/?repo=crac&pr=227&range=07
- incr: https://webrevs.openjdk.org/?repo=crac&pr=227&range=06-07

Stats: 41 lines in 3 files changed: 14 ins; 11 del; 16 mod
Patch: https://git.openjdk.org/crac/pull/227.diff
Fetch: git fetch https://git.openjdk.org/crac.git pull/227/head:pull/227

PR: https://git.openjdk.org/crac/pull/227

From tpushkin at openjdk.org Wed Apr 30 18:08:06 2025
From: tpushkin at openjdk.org (Timofei Pushkin)
Date: Wed, 30 Apr 2025 18:08:06 GMT
Subject: [crac] RFR: 8355974: [CRaC] Move CPUFeatures verification to the parent process of JVM [v8]
In-Reply-To:
References:
Message-ID:

On Wed, 30 Apr 2025 15:23:45 GMT, Jan Kratochvil wrote:

>> There was originally a mistake:
>> - restoring JVM did restore the image
>> - the restored JVM started checking whether CPU Features of the new host >= CPU Features of the checkpoint host
>>
>> That is difficult as glibc is already configured (IFUNC) in the image for the checkpoint host and calling any such glibc functions in the restored image will crash (as the advanced instructions from misconfigured IFUNC are not available). Some glibc functions had to be reimplemented in a dummy way inside JVM due to this misdesign.
>>
>> This patch changes it to:
>> - restoring JVM checks `cpufeatures` user data in the image against current CPU Features
>> - the restored JVM is started only if the CPU Features are satisfied, restored JVM no longer has to verify anything
>>
>> The patch is a bit of a kitchen sink, there are various improvements of the CPU Features code.
>
> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:
>
> Refactor cpufeatures_check/cpufeatures_load etc.

src/hotspot/share/runtime/crac.cpp line 532:

> 530: }
> 531: if (!VM_Version::cpu_features_binary_check(present ? &data : nullptr)) {
> 532: log_error(crac)("Image %s has incompatible CPU features in its user data %s", CRaCRestoreFrom, engine.cpufeatures_userdata_name);

I think the user doesn't really need to know the name of the user data in this case, just that the image is incompatible with their CPU. Omitting this would also allow making `cpufeatures_userdata_name` a .cpp static as it was before.

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/227#discussion_r2069204397
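
Editor's illustration of the split discussed in this thread: the engine wrapper only stores and loads the CPU features blob, while the caller performs the compatibility check and reports an error that does not name the user data. This is a minimal standalone sketch, not the HotSpot code. `FakeEngine`, the single-field `CPUFeaturesBinary`, the `host_features` parameter and `check_image` are hypothetical stand-ins invented for the example, and treating missing user data as incompatible is an assumption, since the thread does not show what the real `cpu_features_binary_check(nullptr)` does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for VM_Version::CPUFeaturesBinary; the real layout
// is not shown in the thread, so one bitmask field is assumed here.
struct CPUFeaturesBinary {
  uint64_t features;
};

// Hypothetical stand-in for the engine wrapper. It only moves bytes in and
// out of "user data" and knows nothing about what the features mean.
class FakeEngine {
 public:
  // Receives a pre-filled structure and just stores it.
  void cpufeatures_store(const CPUFeaturesBinary& data) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&data);
    _user_data.assign(bytes, bytes + sizeof(data));
    _present = true;
  }

  // Validates presence and size, then copies the stored bytes into *out so
  // the underlying user data could be released afterwards.
  bool cpufeatures_load(CPUFeaturesBinary* out) const {
    assert(out != nullptr);
    if (!_present || _user_data.size() != sizeof(CPUFeaturesBinary)) {
      return false;
    }
    std::memcpy(out, _user_data.data(), sizeof(*out));
    return true;
  }

 private:
  std::vector<unsigned char> _user_data;
  bool _present = false;
};

// Caller-side check: every feature recorded in the image must be available
// on the host. ASSUMPTION: missing data (nullptr) counts as incompatible.
bool cpu_features_binary_check(const CPUFeaturesBinary* image, uint64_t host_features) {
  if (image == nullptr) {
    return false;
  }
  return (image->features & ~host_features) == 0;
}

// Returns an empty string on success, otherwise an error message that names
// only the image, not the user data entry.
std::string check_image(const FakeEngine& engine, uint64_t host_features,
                        const std::string& image_path) {
  CPUFeaturesBinary data{};
  const bool present = engine.cpufeatures_load(&data);
  if (!cpu_features_binary_check(present ? &data : nullptr, host_features)) {
    return "Image " + image_path + " has incompatible CPU features";
  }
  return "";
}
```

With this shape the name of the user data entry never leaves the engine wrapper, which is what would allow it to remain a file-local static as suggested in the review comment.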