RFR: 8352642: Set zipinfo-time=false when constructing zipfs FileSystem in com.sun.tools.javac.file.JavacFileManager$ArchiveContainer for better performance
Mickael Istria
mistria at redhat.com
Fri Mar 28 08:46:13 UTC 2025
Hello,
On Fri, Mar 28, 2025 at 12:56 AM Jason Zaugg <jzaugg at openjdk.org> wrote:
> A GitHub code search reveals this idea is also pursued in [Eclipse JDT](
> https://github.com/eclipse-jdtls/eclipse-jdt-core-incubator/blob/e3fd9a0c74374951d4c91663a6fd52f16f6ee9c6/org.eclipse.jdt.core.javac/src/org/eclipse/jdt/internal/javac/CachingJarsJavaFileManager.java#L45).
> So any change to `getJarFsProvider`.
>
I've been following this thread, as I'm always looking for ideas to improve
the performance of the Javac integration in JDT mentioned above. Thanks for
the discussion!
The general, and not new, idea is that Javac is currently optimized for a
single run from the CLI. By integrating it in Eclipse IDE/JDT, we want to
make it a long-lived framework (as do all IDEs). So we have to implement
several caches here and there to avoid spending too much CPU constantly
repeating the same tasks. Loading jars (FileManager/FileSystem) has been
by far the most important thing to cache. We currently cache the
FileSystems and have had good results with that. But as the discussion
mentions, we might also consider caching the FileObjects and some of their
commonly requested values, to save some more reading of the streams. I
actually believe that's the strategy used by NetBeans.
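To illustrate the FileSystem caching described above, here is a minimal
sketch. It is not the actual JDT code: the class name `JarFileSystemCache`
and its methods are hypothetical, and it assumes the `zipinfo-time` env
property from the subject of this thread is honored by the zipfs provider
in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache: one zipfs FileSystem per jar, kept open across compilations. */
final class JarFileSystemCache implements AutoCloseable {
    private final Map<Path, FileSystem> open = new ConcurrentHashMap<>();

    /** Returns the cached FileSystem for the jar, opening it on first request. */
    FileSystem get(Path jar) {
        Path key = jar.toAbsolutePath().normalize();
        return open.computeIfAbsent(key, p -> {
            try {
                // Assumption: "zipinfo-time" is the zipfs property discussed in this thread.
                return FileSystems.newFileSystem(p, Map.of("zipinfo-time", "false"));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Closes every cached FileSystem; callers must not close them individually. */
    @Override
    public void close() throws IOException {
        for (FileSystem fs : open.values()) {
            fs.close();
        }
        open.clear();
    }
}
```

A real cache would also need to invalidate an entry when the jar changes on
disk (for example by comparing its last-modified time), which this sketch
leaves out.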
But ideally, we'd go even further and allow caching the Symbols from the
jars and just reuse them. They are currently always recreated by
ClassReader re-reading the zips, and this isn't a cheap operation... I
believe some ClassReaderWithSymbolCache would make the biggest difference
now (basically storing a trimmed version of the symbols it has loaded, and
just copying the relevant info from these already-loaded symbols to
subsequently requested symbols instead of reading the jar again). Such a
cache would probably make all the lower-level optimizations that we have in
place useless, as the underlying filesystems used by the symbols would
remain across requests (we just have to take care not to close them!)...
I wish to investigate that in the near-ish future. I will try to report
back on whether this has proven to be a good idea or not.
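The "read once, then reuse a trimmed version" idea can be sketched
independently of javac internals. The example below is only an analogy, not
a ClassReaderWithSymbolCache: `ClassSummary` and `ClassSummaryCache` are
hypothetical names, and the "trimmed symbol" is reduced to the class-file
version read from the standard class-file header.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a "trimmed symbol": only the class-file version. */
record ClassSummary(int majorVersion, int minorVersion) {}

/** Hypothetical cache that reads each class file at most once. */
final class ClassSummaryCache {
    private final Map<Path, ClassSummary> summaries = new ConcurrentHashMap<>();
    final AtomicInteger reads = new AtomicInteger(); // counts actual stream reads

    /** Returns the cached summary, reading the class file only on first request. */
    ClassSummary get(Path classFile) {
        return summaries.computeIfAbsent(classFile, this::read);
    }

    private ClassSummary read(Path classFile) {
        reads.incrementAndGet();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(classFile))) {
            // Class files start with the magic number 0xCAFEBABE,
            // followed by the minor and major version as unsigned shorts.
            if (in.readInt() != 0xCAFEBABE) {
                throw new IOException("Not a class file: " + classFile);
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            return new ClassSummary(major, minor);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `Path` keys can come from a cached zipfs FileSystem, in which case the
jar is only touched on the first request for each class.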
Anyway, this is diverging quite far from the initial discussion, so I'll
just mention here that with a sufficient caching strategy, this
`zipinfo-time` tweak becomes much less critical (since the related
operations happen only once per jar, which easily becomes negligible
compared to re-reading zips several times for no new reason).
Cheers