[crac] RFR: Move more FD tracking to java layer
Anton Kozlov
akozlov at openjdk.org
Fri Jun 9 16:10:07 UTC 2023
On Wed, 7 Jun 2023 10:51:41 GMT, Anton Kozlov <akozlov at openjdk.org> wrote:
> The PR develops the idea of tracking file descriptors in Java that was started in #43. In general, that PR had two purposes. First, it provides CheckpointExceptions in terms that are clear to Java developers, improving the experience of developing for CRaC. So if a FileDescriptor causes an exception, it's possible to look at the heap dump and find references to the offending FD, or to look at the stack trace from when the FD was created. Second, Java FD tracking is independent of the platform, so that was the first step to bring CRaC to non-Linux platforms, but that is a bit longer road.
>
> We can eliminate manual heap inspection, and this is proposed in this PR. A FileDescriptor does not exist on its own; it is owned by some higher-level Java object implementation. So an object can "claim" a FileDescriptor and define whether and how to report the FD to the user. E.g. Socket can describe its port and address without deep inspection of the process internals. It turns out that Socket.toString() provides enough information (but the reporting can be extended later if required).
>
>
> Suppressed: jdk.crac.impl.CheckpointOpenSocketException: Socket[addr=localhost/127.0.0.1,port=39957,localport=41464]
> at java.base/java.net.SocketImpl$SocketResource.lambda$beforeCheckpoint$0(SocketImpl.java:123)
> at java.base/jdk.crac.Core.lambda$checkpointRestore1$0(Core.java:128)
> ... 7 more
>
>
> A FileDescriptor claims itself as a fallback, in case a bug in the JDK leaves the FD unclaimed by any higher-level object. The FD provides only a very short description, intended for debugging. Together with the stack trace of the FD's creation (which is a very nice debugging aid!), that should be enough to find the containing object and implement claiming.
>
> I believe this overlaps with #69, which at first glance would benefit a lot from being able to define policies in the domain objects. I'll comment on this after a closer look at the other PR.
Now I have spotted an issue in the current state that may be related:
anton@mercury:~/proj/crac$ git show -s
commit a282698d2bf01588172e8f54c4cfedf40f203a68 (HEAD -> crac, jdk/crac/crac)
Author: Radim Vansa <rvansa at azul.com>
Date: Fri Jun 9 12:59:18 2023 +0000
Make CheckpointException/RestoreException aggregate-only
Reviewed-by: akozlov
There is an unexpected exception when a jar file is provided on the classpath with a very simple .java application.
anton@mercury:~/proj/crac/test-jtreg$ ./../build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+CRPrintResourcesOnCheckpoint -XX:CREngine=simengine -XX:+CRHeapDumpOnCheckpointException -XX:CRaCCheckpointTo=cr -Djdk.crac.ResourceManager.debug=true -DcallCR -Djdk.crac.collect-fd-stacktraces=true -cp /home/anton/.m2/repository/org/crac/crac/1.3.0/crac-1.3.0.jar Test.java
beforeCheckpoint
Jun 09, 2023 5:17:28 PM jdk.internal.crac.LoggerContainer info
INFO: /home/anton/.m2/repository/org/crac/crac/1.3.0/crac-1.3.0.jar is recorded as always available on restore
Jun 09, 2023 5:17:28 PM jdk.internal.crac.LoggerContainer info
INFO: /home/anton/.m2/repository/org/crac/crac/1.3.0/crac-1.3.0.jar is recorded as always available on restore
JVM: FD fd=0 type=character path="/dev/pts/43"OK: claimed by java code
JVM: FD fd=1 type=character path="/dev/pts/43"OK: claimed by java code
JVM: FD fd=2 type=character path="/dev/pts/43"OK: claimed by java code
JVM: FD fd=3 type=regular path="/home/anton/proj/crac/build/linux-x86_64-server-release/images/jdk/lib/modules"OK: inherited from process env
JVM: FD fd=4 type=regular path="/home/anton/.m2/repository/org/crac/crac/1.3.0/crac-1.3.0.jar"OK: claimed by java code
Dumping heap to java_pid525343.hprof ...
Heap dump file created [9190694 bytes in 0.022 secs]
afterRestore
Exception in thread "main" org.crac.CheckpointException
at org.crac.Core$Compat.checkpointRestore(Core.java:144)
at org.crac.Core.checkpointRestore(Core.java:237)
at Test.main(Test.java:57)
Suppressed: jdk.crac.impl.CheckpointOpenFileException: FileDescriptor 6 left open: /home/anton/.m2/repository/org/crac/crac/1.3.0/crac-1.3.0.jar (regular)
at java.base/java.io.FileDescriptor.beforeCheckpoint(FileDescriptor.java:381)
at java.base/java.io.FileDescriptor$Resource.beforeCheckpoint(FileDescriptor.java:82)
at java.base/jdk.crac.impl.AbstractContext.invokeBeforeCheckpoint(AbstractContext.java:44)
at java.base/jdk.crac.impl.AbstractContext.beforeCheckpoint(AbstractContext.java:59)
at java.base/jdk.crac.impl.BlockingOrderedContext.beforeCheckpoint(BlockingOrderedContext.java:40)
at java.base/jdk.crac.impl.AbstractContext.invokeBeforeCheckpoint(AbstractContext.java:44)
at java.base/jdk.crac.impl.AbstractContext.beforeCheckpoint(AbstractContext.java:59)
at java.base/jdk.crac.impl.BlockingOrderedContext.beforeCheckpoint(BlockingOrderedContext.java:40)
at java.base/jdk.crac.Core.checkpointRestore1(Core.java:120)
at java.base/jdk.crac.Core.checkpointRestore(Core.java:268)
at java.base/jdk.crac.Core.checkpointRestore(Core.java:247)
at java.base/javax.crac.Core.checkpointRestore(Core.java:71)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.crac.Core$Compat.checkpointRestore(Core.java:141)
at org.crac.Core.checkpointRestore(Core.java:237)
at Test.main(Test.java:57)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at jdk.compiler/com.sun.tools.javac.launcher.Main.execute(Main.java:419)
at jdk.compiler/com.sun.tools.javac.launcher.Main.run(Main.java:192)
at jdk.compiler/com.sun.tools.javac.launcher.Main.main(Main.java:132)
Caused by: java.lang.Exception: This file descriptor was created by main at epoch:1686320248160 here
at java.base/java.io.FileDescriptor$Resource.<init>(FileDescriptor.java:72)
at java.base/java.io.FileDescriptor.<init>(FileDescriptor.java:97)
at java.base/sun.nio.fs.UnixChannelFactory.open(UnixChannelFactory.java:290)
at java.base/sun.nio.fs.UnixChannelFactory.newFileChannel(UnixChannelFactory.java:133)
at java.base/sun.nio.fs.UnixChannelFactory.newFileChannel(UnixChannelFactory.java:146)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:217)
at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
at java.base/java.nio.file.Files.newByteChannel(Files.java:432)
at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.<init>(ZipFileSystem.java:172)
at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.getZipFileSystem(ZipFileSystemProvider.java:125)
at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.newFileSystem(ZipFileSystemProvider.java:120)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager$ArchiveContainer.<init>(JavacFileManager.java:567)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager.getContainer(JavacFileManager.java:331)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager.pathsAndContainers(JavacFileManager.java:1075)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager.indexPathsAndContainersByRelativeDirectory(JavacFileManager.java:1030)
at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1219)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager.pathsAndContainers(JavacFileManager.java:1018)
at jdk.compiler/com.sun.tools.javac.file.JavacFileManager.list(JavacFileManager.java:774)
at java.compiler@17-internal/javax.tools.ForwardingJavaFileManager.list(ForwardingJavaFileManager.java:79)
at jdk.compiler/com.sun.tools.javac.code.ClassFinder.list(ClassFinder.java:737)
at jdk.compiler/com.sun.tools.javac.code.ClassFinder.scanUserPaths(ClassFinder.java:681)
at jdk.compiler/com.sun.tools.javac.code.ClassFinder.fillIn(ClassFinder.java:555)
at jdk.compiler/com.sun.tools.javac.code.ClassFinder.complete(ClassFinder.java:299)
at jdk.compiler/com.sun.tools.javac.code.Symtab.lambda$addRootPackageFor$8(Symtab.java:810)
at jdk.compiler/com.sun.tools.javac.code.Symbol.complete(Symbol.java:682)
at jdk.compiler/com.sun.tools.javac.comp.Enter.visitTopLevel(Enter.java:356)
at jdk.compiler/com.sun.tools.javac.tree.JCTree$JCCompilationUnit.accept(JCTree.java:544)
at jdk.compiler/com.sun.tools.javac.comp.Enter.classEnter(Enter.java:286)
at jdk.compiler/com.sun.tools.javac.comp.Enter.classEnter(Enter.java:301)
at jdk.compiler/com.sun.tools.javac.comp.Enter.complete(Enter.java:603)
at jdk.compiler/com.sun.tools.javac.comp.Enter.main(Enter.java:587)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.enterTrees(JavaCompiler.java:1042)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:917)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.lambda$doCall$0(JavacTaskImpl.java:104)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.invocationHelper(JavacTaskImpl.java:152)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.doCall(JavacTaskImpl.java:100)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.call(JavacTaskImpl.java:94)
at jdk.compiler/com.sun.tools.javac.launcher.Main.compile(Main.java:383)
at jdk.compiler/com.sun.tools.javac.launcher.Main.run(Main.java:189)
... 1 more
I think it may be related to the persistent JarFile: it had been claiming the FD and was then collected, so for some period only the FileDescriptor existed and had a chance to report the exception. For some reason I don't see the problem after this change, likely because discovering a Resource and generating the exception are now separated in time. I don't think this change completely fixes the problem, but I think a fix will rely on better claiming anyway.
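To illustrate the suspected window, here is a minimal sketch, not the code from this PR. All names in it (ClaimSketch, Fd, claim, simulateOwnerCollected) are hypothetical, and it assumes that a claim tracks its owner through a weak reference. While the owner is alive, its policy decides what is reported; once the owner is gone, the FD falls back to reporting itself.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names are the actual CRaC or JDK API.
final class ClaimSketch {
    // Stand-in for java.io.FileDescriptor.
    static final class Fd {
        final int number;
        Fd(int number) { this.number = number; }
    }

    // The owner is held weakly, so a collected owner silently drops its claim.
    private static final class Claim {
        final WeakReference<Object> owner;
        final Supplier<Exception> report; // null: the open FD is acceptable
        Claim(Object owner, Supplier<Exception> report) {
            this.owner = new WeakReference<>(owner);
            this.report = report;
        }
    }

    private final Map<Fd, Claim> claims = new HashMap<>();

    // A higher-level object claims the FD and supplies its own reporting policy.
    void claim(Fd fd, Object owner, Supplier<Exception> report) {
        claims.put(fd, new Claim(owner, report));
    }

    // Demo-only helper: clears the weak reference instead of waiting for a real GC.
    void simulateOwnerCollected(Fd fd) {
        claims.get(fd).owner.clear();
    }

    // Collects the exceptions that would be attached as suppressed on checkpoint.
    List<Exception> beforeCheckpoint(List<Fd> openFds) {
        List<Exception> problems = new ArrayList<>();
        for (Fd fd : openFds) {
            Claim c = claims.get(fd);
            if (c != null && c.owner.get() != null) {
                // A live owner decides whether and how the FD is reported.
                if (c.report != null) {
                    problems.add(c.report.get());
                }
            } else {
                // No live owner: the FD reports itself with a short description.
                problems.add(new IllegalStateException(
                        "FileDescriptor " + fd.number + " left open"));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        ClaimSketch registry = new ClaimSketch();
        Fd jarFd = new Fd(6);
        Object jarFile = new Object(); // stands in for the persistent JarFile
        registry.claim(jarFd, jarFile, null);
        System.out.println("owner alive: " + registry.beforeCheckpoint(List.of(jarFd)));
        Reference.reachabilityFence(jarFile); // keep the owner alive up to here
        registry.simulateOwnerCollected(jarFd);
        System.out.println("owner collected: " + registry.beforeCheckpoint(List.of(jarFd)));
    }
}
```

In this sketch the only thing that changes between the two checks is whether the owner is still reachable, which is the kind of timing dependence I suspect above. Better claiming would have to keep the claim valid for as long as the FD itself is open.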
-------------
PR Comment: https://git.openjdk.org/crac/pull/79#issuecomment-1584822600