[crac] RFR: [CRaC] Support checkpointing with --patch-module by treating patch JARs as persistent
Timofei Pushkin
tpushkin at openjdk.org
Mon Aug 4 18:42:12 UTC 2025
On Thu, 10 Jul 2025 03:11:03 GMT, mazhen <duke at openjdk.org> wrote:
> #### Summary
>
> This change enhances CRaC's checkpointing capabilities to correctly handle applications launched with the `--patch-module` option. Previously, using a JAR file with `--patch-module` would cause checkpointing to fail due to unmanaged file descriptors. This patch resolves the issue by treating these JARs as persistent resources, similar to how classpath JARs are handled.
>
> #### Problem
>
> When a JAR file is specified via `--patch-module`, it is opened by two distinct mechanisms within the JVM:
>
> 1. **Native `ClassLoader`**: The C++ `ClassLoader` implementation opens the JAR file to build its internal class path structures. This results in a file descriptor that is not managed by CRaC's Java-level resource tracking, leading to it being flagged as a "BAD: opened by application" file during `check_fds`.
> 2. **Java `ModulePatcher`**: The Java-level `jdk.internal.module.ModulePatcher` also opens the same JAR to scan for packages and resources. While this could be made CRaC-aware, the native-level file descriptor remains an issue.
>
> This dual-opening mechanism leads to `CheckpointOpenFileException` being thrown for both file descriptors during a checkpoint, preventing a successful image creation. The log output clearly illustrates this problem:
>
>
> $ jcmd 3403046 JDK.checkpoint
> 3403046:
> ...
> JVM: FD fd=4 type=regular path="/home/mazhen/works/patch-demo/legacy-patch.jar" BAD: opened by application
> JVM: FD fd=5 type=regular path="/home/mazhen/works/patch-demo/legacy-patch.jar" OK: claimed by java code
> ...
> An exception during a checkpoint operation:
> jdk.internal.crac.mirror.CheckpointException
> Suppressed: jdk.internal.crac.mirror.impl.CheckpointOpenFileException: legacy-patch.jar
> at java.base/jdk.internal.crac.JDKFileResource.lambda$beforeCheckpoint$0(JDKFileResource.java:90)
> ...
> Caused by: java.lang.Exception: This file descriptor was created by main at epoch:1752027938791 here
> at java.base/jdk.internal.crac.JDKFdResource.<init>(JDKFdResource.java:40)
> ...
> at java.base/jdk.internal.module.ModulePatcher$JarResourceFinder.<init>(ModulePatcher.java:433)
> ...
> Suppressed: jdk.internal.crac.mirror.impl.CheckpointOpenFileException: FD fd=4 type=regular path=/home/mazhen/works/patch-demo/legacy-patch.jar
> at java.base/jdk.internal.crac.mirror.Core.translateJVMExceptions(Core.java:115)
> ...
>
>
> #### Solution
>
> The fundamental assumption behind this solution is that JARs provided via `--patch-module` ...
Small comments from me, nothing functionally important
src/hotspot/share/classfile/classLoader.cpp line 763:
> 761: GrowableArray<int> fds;
> 762: assert(Thread::current()->is_VM_thread(), "should be called from VM op");
> 763: if (_patch_mod_entries == NULL) {
Suggestion:
if (_patch_mod_entries == nullptr) {
src/hotspot/share/classfile/classLoader.cpp line 766:
> 764: return fds;
> 765: }
> 766: // we don't use mutexes here because it is called from VM op
As I see it, all patch module entries are opened once at VM initialization (in `ClassLoader::setup_patch_mod_entries()`) and checkpoint cannot be called before then. If that is so, there should be no VM thread / mutex requirements to read it.
test/jdk/jdk/crac/fileDescriptors/PatchModuleTest.java line 140:
> 138: return (String) patcherClass.getMethod("getInfo").invoke(null);
> 139: }
> 140: }
I haven't looked through the test but if you will be applying Radim's suggestions please also add a newline at the end of file
-------------
PR Review: https://git.openjdk.org/crac/pull/241#pullrequestreview-3023357209
PR Review Comment: https://git.openjdk.org/crac/pull/241#discussion_r2209395175
PR Review Comment: https://git.openjdk.org/crac/pull/241#discussion_r2209442941
PR Review Comment: https://git.openjdk.org/crac/pull/241#discussion_r2215534987
More information about the crac-dev
mailing list