From duke at openjdk.org  Mon Apr  1 15:28:56 2024
From: duke at openjdk.org (duke)
Date: Mon, 1 Apr 2024 15:28:56 GMT
Subject: [crac] Withdrawn: Add debug flag to -XX:CREngine.
In-Reply-To: <12wIGmSHSu_oP0NMgLhUEUdcSZtLL-8UKuoHNzybv0Y=.cc60f8c5-34ee-41fc-aa93-30912a5c10ec@github.com>
References: <12wIGmSHSu_oP0NMgLhUEUdcSZtLL-8UKuoHNzybv0Y=.cc60f8c5-34ee-41fc-aa93-30912a5c10ec@github.com>
Message-ID:

On Sun, 24 Dec 2023 23:45:29 GMT, Kimura Yukihiro wrote:

> Hello everyone,
>
> This is my first PR for the CRaC project. I propose three modifications to the criuengine.
>
> Firstly, I suggest adding a debug flag to the criuengine, as some of the JDK's native C/C++ code already has a debug flag.
> My proposal is to pass the debug flag using "-XX:CREngine=criuengine,debug,true".
>
> Secondly, the criuengine has a useful function, print_command_args_to_stderr(), which prints the criu command line.
> However, it is only called when an error occurs. I believe it would be beneficial to also print the criu command line when the debug flag is specified.
>
> The criu command line is built from the parameters of -XX:CREngine and environment variables such as CRAC_CRIU_PATH and CRAC_CRIU_OPT.
> It would be helpful to see how it is actually assembled.
>
> For example:
> $ export CRAC_CRIU_PATH=/work/criu-crac-release-1.4/sbin/criu
> $ ./jdk/bin/java -XX:CREngine=/work/jdk/lib/criuengine,-v,3,-o,output3.log,-d,true -XX:CRaCCheckpointTo=cr Test
>
> Command: /work/criu-crac-release-1.4/sbin/criu dump -t 3232214 -D cr --shell-job '--verbosity=3' -o output3.log
>
> Thirdly, I propose that the criuengine command line, which is executed by the JavaVM, be printed when "-XX:CREngine=criuengine,debug,true" is specified.
> For example,
> CRaC info executing: /work/jdk/lib/criuengine checkpoint -v 3 -o output3.log -d true cr
>
> Testing:
> I have verified the jdk/crac/VMOptionsTest.java, which is a test for -XX:CREngine. I believe it's unnecessary to add a test for the debug flag to it.
>
> Could you please review these modifications?
>
> Thank you,
> Kimura Yukihiro

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/crac/pull/151

From jkratochvil at openjdk.org  Tue Apr  9 12:13:30 2024
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Tue, 9 Apr 2024 12:13:30 GMT
Subject: [crac] RFR: Forbid compatibility of CRaC with BootJDK 22
Message-ID: <73PqS03UcInC-PfmmVE079PwhVe2qgUvaQuiYJuLsrE=.9443e0f5-fc5f-4b0a-9714-c1c6a8078cfb@github.com>

    warning: unknown enum constant Feature.STREAM_GATHERERS
    error: warnings found and -Werror specified

-------------

Commit messages:
 - Forbid compatibility of CRaC with BootJDK 22

Changes: https://git.openjdk.org/crac/pull/154/files
  Webrev: https://webrevs.openjdk.org/?repo=crac&pr=154&range=00
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/crac/pull/154.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/154/head:pull/154

PR: https://git.openjdk.org/crac/pull/154

From jkratochvil at openjdk.org  Tue Apr  9 12:14:32 2024
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Tue, 9 Apr 2024 12:14:32 GMT
Subject: [crac] RFR: Simplify CPUFeatures code [v2]
In-Reply-To:
References:
Message-ID:

> There is no functionality change. I guess the new code should be simpler and shorter. It was originally suggested by @AntonKozlov to use more functions than macros.
>
>      1 file changed, 44 insertions(+), 82 deletions(-)
>
> So I find it clearly an improvement.
> The readable sub-commit is: https://github.com/openjdk/crac/pull/112/commits/6d9cb72b7a838dd4c9f107c5b71c4275005a0c23
> Otherwise it gets messed up by the other commit just renaming things.
> As an explanation: Original code had two lists of the same CPU features: an `EXCESSIVE` list and a `GLIBC_DISABLE` list.
> They had to be kept in sync (their sets being equal), which was sanity checked:
>
>     if (PASTE_TOKENS(disable_handled_, kind) != PASTE_TOKENS(excessive_handled_, kind)) \
>
> So now there is the list just once (`EXCESSIVE`, the `GLIBC_DISABLE` one has been deleted). It looks stupid but when coding it I did not see it.
> Coding both variants of #136 was needlessly more difficult without this pull request applied and now I have to rebase this pull request.

Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits:

 - Fix disabling glibc features
 - Merge branch 'crac' into crac-cpufeaturessimplify-merge
 - Rename excessive/i to shouldnotuse/i
 - Simplify CPUFeatures code

-------------

Changes: https://git.openjdk.org/crac/pull/112/files
  Webrev: https://webrevs.openjdk.org/?repo=crac&pr=112&range=01
  Stats: 118 lines in 2 files changed: 24 ins; 63 del; 31 mod
  Patch: https://git.openjdk.org/crac/pull/112.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/112/head:pull/112

PR: https://git.openjdk.org/crac/pull/112

From jkratochvil at openjdk.org  Tue Apr  9 12:14:32 2024
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Tue, 9 Apr 2024 12:14:32 GMT
Subject: [crac] RFR: Simplify CPUFeatures code
In-Reply-To:
References:
Message-ID:

On Sat, 16 Sep 2023 15:41:32 GMT, Jan Kratochvil wrote:

> There is no functionality change. I guess the new code should be simpler and shorter. It was originally suggested by @AntonKozlov to use more functions than macros.
>
>      1 file changed, 44 insertions(+), 82 deletions(-)
>
> So I find it clearly an improvement.
> The readable sub-commit is: https://github.com/openjdk/crac/pull/112/commits/6d9cb72b7a838dd4c9f107c5b71c4275005a0c23
> Otherwise it gets messed up by the other commit just renaming things.
> As an explanation: Original code had two lists of the same CPU features: an `EXCESSIVE` list and a `GLIBC_DISABLE` list.
> They had to be kept in sync (their sets being equal), which was sanity checked:
>
>     if (PASTE_TOKENS(disable_handled_, kind) != PASTE_TOKENS(excessive_handled_, kind)) \
>
> So now there is the list just once (`EXCESSIVE`, the `GLIBC_DISABLE` one has been deleted). It looks stupid but when coding it I did not see it.
> Coding both variants of #136 was needlessly more difficult without this pull request applied and now I have to rebase this pull request.

Converted it to draft until #139 gets accepted.

-------------

PR Comment: https://git.openjdk.org/crac/pull/112#issuecomment-1814367990

From volker.simonis at gmail.com  Tue Apr 16 06:32:10 2024
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 16 Apr 2024 08:32:10 +0200
Subject: Managing sun.nio.ch.Poller with CRaC
In-Reply-To: <1a7b3ff8-056a-40d0-9d89-e5fc1458879d@oracle.com>
References: <1a7b3ff8-056a-40d0-9d89-e5fc1458879d@oracle.com>
Message-ID:

Forwarding to crac-dev...

Alan Bateman wrote on Tue, 16 Apr 2024, 08:28:

> On 15/04/2024 22:32, Cleber Muramoto wrote:
> > Hello!
> >
> > I am trying to take a checkpoint using Zulu's CRaC.
> >
> > When using VirtualThreads the checkpoint fails because of open file
> > descriptors created by EPollPoller.
> >
> > As of now, I think the only possible way to close the FD's is by
> > accessing the private read/write pollers to fetch the epfd's and
> > manually closing them. (I tried the jdk.crac.resource-policies but it
> > doesn't seem to pick up these FDs).
> >
> > While this works to capture the snapshot, restoring is another story,
> > since the poller threads don't expect the epfds to change.
> >
> > Are there any plans to add some sort of lifecycle API to Poller to
> > make it CRaC friendly?
>
> I assume this must be a build that uses code from the OpenJDK CRaC
> project as this is not a feature in the JDK main line. You may have to
> ask on crac-dev.
> There are literally dozens of places right across the
> JDK that would need attention in order to sanely snapshot and continue
> from arbitrary points like this. I haven't had cycles to track what they
> have in the current exploration/prototype but I'm sure the folks on
> crac-dev can help.
>
> -Alan
>

From akozlov at azul.com  Tue Apr 16 10:59:08 2024
From: akozlov at azul.com (Anton Kozlov)
Date: Tue, 16 Apr 2024 13:59:08 +0300
Subject: Managing sun.nio.ch.Poller with CRaC
In-Reply-To:
References: <1a7b3ff8-056a-40d0-9d89-e5fc1458879d@oracle.com>
Message-ID: <740748ab-4634-425e-ab5a-524a2fd3de8d@azul.com>

Thanks for forwarding. I created https://bugs.openjdk.org/browse/JDK-8330353 to track this.

It's indeed true that this and other places require changes for CRaC. However, I expect the changes in each one not to be that big, so it's worth trying to fix them all :)

Thanks,
Anton

On 4/16/24 9:32 AM, Volker Simonis wrote:
>
> Forwarding to crac-dev...
>
> Alan Bateman wrote on Tue, 16 Apr 2024, 08:28:
>
>     On 15/04/2024 22:32, Cleber Muramoto wrote:
>      > Hello!
>      >
>      > I am trying to take a checkpoint using Zulu's CRaC.
>      >
>      > When using VirtualThreads the checkpoint fails because of open file
>      > descriptors created by EPollPoller.
>      >
>      > As of now, I think the only possible way to close the FD's is by
>      > accessing the private read/write pollers to fetch the epfd's and
>      > manually closing them. (I tried the jdk.crac.resource-policies but it
>      > doesn't seem to pick up these FDs).
>      >
>      > While this works to capture the snapshot, restoring is another story,
>      > since the poller threads don't expect the epfds to change.
>      >
>      > Are there any plans to add some sort of lifecycle API to Poller to
>      > make it CRaC friendly?
>
>     I assume this must be a build that uses code from the OpenJDK CRaC
>     project as this is not a feature in the JDK main line. You may have to
>     ask on crac-dev.
>     There are literally dozens of places right across the
>     JDK that would need attention in order to sanely snapshot and continue
>     from arbitrary points like this. I haven't had cycles to track what they
>     have in the current exploration/prototype but I'm sure the folks on
>     crac-dev can help.
>
>     -Alan
>

From rvansa at openjdk.org  Wed Apr 17 06:33:09 2024
From: rvansa at openjdk.org (Radim Vansa)
Date: Wed, 17 Apr 2024 06:33:09 GMT
Subject: [crac] RFR: Improve C/R exception printout
In-Reply-To: <6VemRfb0K-22KP_EiIQsXQfThyDVxDS6TUU_G_XdVVg=.09cb087c-edd6-4e64-916e-cc388845d640@github.com>
References: <6VemRfb0K-22KP_EiIQsXQfThyDVxDS6TUU_G_XdVVg=.09cb087c-edd6-4e64-916e-cc388845d640@github.com>
Message-ID: <3DdPCOnLeU4CBYzJY7zqm18dkLb0_rmmgmc_BAhmOoc=.18cac61b-1735-44c7-9cdd-68e92b2bc0ee@github.com>

On Wed, 3 Jan 2024 15:18:22 GMT, Anton Kozlov wrote:

>> Some users might get confused by the inner exceptions reported during C/R as *suppressed* exceptions. This PR changes the printout to make it look as if the exception had multiple causes. For example the DryRunTest will report this:
>>
>>     jdk.crac.CheckpointException: Failed with 2 inner exceptions
>>     Cause 1/2: java.lang.RuntimeException: should not pass
>>             at DryRunTest$CRResource.beforeCheckpoint(DryRunTest.java:47)
>>             at java.base/jdk.crac.impl.AbstractContext.invokeBeforeCheckpoint(AbstractContext.java:44)
>>             ... (redacted)
>>     Cause 2/2: jdk.crac.impl.CheckpointOpenFileException: /tmp/jtreg-DryRunTest6956725915963168340.tmp
>>             at java.base/jdk.internal.crac.JDKFileResource.lambda$beforeCheckpoint$1(JDKFileResource.java:89)
>>             at java.base/jdk.crac.Core.checkpointRestore1(Core.java:174)
>>             ... (redacted)
>
> I also would prefer "nested" exceptions not to overlap with suppressed ones in the implementation, to make sure the correct interface is always called. Indeed, a common parent class makes sense, and sometimes when handling the exception I had to write
>
>     catch (CheckpointException | RestoreException e) {
>     ...
>     }
>
> Which is probably the correct way to indicate that both checkpoint and restore failures are handled there. But if you just want to print the "nested" exception, apparently you'll need a common base exception. I.e. a common base looks good.

Still relevant, waiting for @AntonKozlov's review.

-------------

PR Comment: https://git.openjdk.org/crac/pull/145#issuecomment-2060474927

From jkratochvil at openjdk.org  Thu Apr 18 13:59:29 2024
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Thu, 18 Apr 2024 13:59:29 GMT
Subject: [crac] RFR: Simplify CPUFeatures code [v3]
In-Reply-To:
References:
Message-ID:

> There is no functionality change. I guess the new code should be simpler and shorter. It was originally suggested by @AntonKozlov to use more functions than macros.
>
>      1 file changed, 44 insertions(+), 82 deletions(-)
>
> So I find it clearly an improvement.
> The readable sub-commit is: https://github.com/openjdk/crac/pull/112/commits/6d9cb72b7a838dd4c9f107c5b71c4275005a0c23
> Otherwise it gets messed up by the other commit just renaming things.
> As an explanation: Original code had two lists of the same CPU features: an `EXCESSIVE` list and a `GLIBC_DISABLE` list. They had to be kept in sync (their sets being equal), which was sanity checked:
>
>     if (PASTE_TOKENS(disable_handled_, kind) != PASTE_TOKENS(excessive_handled_, kind)) \
>
> So now there is the list just once (`EXCESSIVE`, the `GLIBC_DISABLE` one has been deleted). It looks stupid but when coding it I did not see it.
> Coding both variants of #136 was needlessly more difficult without this pull request applied and now I have to rebase this pull request.

Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

  Remove iomanip fix

-------------

Changes:
  - all: https://git.openjdk.org/crac/pull/112/files
  - new: https://git.openjdk.org/crac/pull/112/files/3f1c3d26..9a81fb15

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=crac&pr=112&range=02
 - incr: https://webrevs.openjdk.org/?repo=crac&pr=112&range=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod
  Patch: https://git.openjdk.org/crac/pull/112.diff
  Fetch: git fetch https://git.openjdk.org/crac.git pull/112/head:pull/112

PR: https://git.openjdk.org/crac/pull/112
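
The "list just once" idea from the pull request description above can be illustrated with a small sketch. This is not the code from PR 112: the `CpuFeature` struct, the `kFeatures` table, the feature names, the bit positions and both helper functions are hypothetical and chosen only for illustration. It assumes that both consumers (the check for features that should not be used and the string of features to disable) can be derived from one table, so there is no second list to keep in sync.

```cpp
#include <cstdint>
#include <string>

// Hypothetical feature table: every feature is listed exactly once.
// The names and bit positions are made up for this sketch; they are not
// taken from the CRaC sources.
struct CpuFeature {
  const char*   name;  // name used when asking for the feature to be disabled
  std::uint64_t mask;  // bit that identifies the feature in a feature word
};

static const CpuFeature kFeatures[] = {
  {"AVX2",    std::uint64_t{1} << 0},
  {"AVX512F", std::uint64_t{1} << 1},
  {"BMI2",    std::uint64_t{1} << 2},
};

// Features the running CPU offers but the checkpoint image must not rely on.
std::uint64_t should_not_use(std::uint64_t cpu_has, std::uint64_t image_allows) {
  return cpu_has & ~image_allows;
}

// Derive the "disable" string from the same table, so there is no second
// list that could drift out of sync with the first.
std::string disable_list(std::uint64_t features) {
  std::string out;
  for (const CpuFeature& f : kFeatures) {
    if (features & f.mask) {
      if (!out.empty()) out += ',';
      out += '-';
      out += f.name;
    }
  }
  return out;
}
```

With a CPU that offers all three hypothetical features (`0x7`) and an image that allows only `AVX512F` (`0x2`), `disable_list(should_not_use(0x7, 0x2))` yields `-AVX2,-BMI2`. Because both results come from the one table, the sanity check that compared two lists is no longer needed in this sketch.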