RFR: 8352642: Set zipinfo-time=false when constructing zipfs FileSystem in com.sun.tools.javac.file.JavacFileManager$ArchiveContainer for better performance
Jason Zaugg
jzaugg at openjdk.org
Thu Mar 27 23:55:11 UTC 2025
On Tue, 25 Mar 2025 15:19:57 GMT, Jan Lahoda <jlahoda at openjdk.org> wrote:
>>> There are some more new jar FileSystems created in Locations, but those only typically read one file, so probably not that big deal w.r.t. this flag.
>>
>> Agreed. I initially changed those as well but when I realized they only read a single file I reverted.
>>
>>> I wonder if there's some suspicion of a particular problem in the other branch,
>>
>> The `then` branch, which I've modified, was introduced in e652402ed239e171a6f87f5ef88e07d505ad3fe8 (JDK-8149757: Implement Multi-Release JAR aware JavacFileManager for javac). It became necessary to ensure exact usage of `ZipFileSystemProvider` to be able to pass in the `env` map with the `multiRelease` config.
>>
>> I reviewed the discussion of that change. There was a mention of the introduction of this branch, but it didn't go as far to suggest removal of the `FileSystems.newFileSystem` call, it only justified the addition of the `jarFsProvider.newFileSystem` call.
>>
>> https://mail.openjdk.org/pipermail/compiler-dev/2016-April/thread.html
>>
>>
>>> -why is JavacFileManager.getJarFSProvider() needed? Shouldn't
>>> FileSystems.newFileSystem(realPath, env);
>>> be enough? (Calling an SPI directly from an API client seems
>>> suspicious to me.)
>>
>> If I recall correctly, the NIO2 API design forced this one. The method
>> you are referring to does not exist. There is only one that takes a URI,
>> and that has very different semantics. So we had to go with the method
>> on the provider, hence the use of getJarFSProvider.
>>
>>
>> The tests introduced in e652402ed239e171a6f87f5ef88e07d505ad3fe8 don't appear to exercise the else branch.
>>
>> It appears there are two ways to get there:
>> 1. Direct, programatic use of the a file manager in which the initialization sequence does not populate `multiReleaseValue`
>> 2. Use of javac with a `.zip` as a classpath entry (I had to check, but this does work.) Side question -- what is the expected behaviour with a classpath entry with a non-lowercase extension, `.JAR`? It is allowed by `javac` but would not respect the `multi-release` setting.
>
> I think that's all true, but I am not sure if it quite explains why not do the same for the same for the else section. There is `FileSystems.newFileSystem(Path path, Map<String,?> env, ClassLoader loader)` since JDK 13, so there shouldn't be a big problem to call that one instead of `FileSystems.newFileSystem(Path path, ClassLoader loader)` (the latter delegates to the former, so there shouldn't be any real change in semantics).
>
> I guess one concern could be that, in theory, this method works for other FS providers/types as well. But, presumably, those would ignore env keys they don't know, and would ignore it.
Thanks for pointing that out, I was not aware of the addition of that overload.
I agree that passing `zipinfo-time` to an unknown provider is likely quite safe.
There is an argument that `CacheInfo.jarFsProvider` is no longer necessary, AFAICT it was introduced to work around the absent overload. I would suggest this matter be considered independently.
Noting that I found `jarFsProvider` to be a useful extension point (albeit in undocumented API) for injecting performance improvements into existing `javac` versions (both the `zipinfo-time` change, and further changes to defer the `close` of `ZipFileSystem` over JARs known to be immutable).
A GitHub code search reveals this idea is also pursued in [Eclipse JDT](https://github.com/eclipse-jdtls/eclipse-jdt-core-incubator/blob/e3fd9a0c74374951d4c91663a6fd52f16f6ee9c6/org.eclipse.jdt.core.javac/src/org/eclipse/jdt/internal/javac/CachingJarsJavaFileManager.java#L45). So any change to `getJarFsProvider`.
Back to implementing your suggestion, here's how what I have prepared. If you like this I'll add to the PR. Maybe the comment would be better as a TODO link to a followup issue if you think that's worthwhile?
Map<String,String> env = new HashMap<>();
// ignores timestamps not stored in ZIP central directory, reducing I/O
// This key is handled by ZipFileSystem only.
env.put("zipinfo-time", "false");
if (multiReleaseValue != null && archivePath.toString().endsWith(".jar")) {
env.put("multi-release", multiReleaseValue);
FileSystemProvider jarFSProvider = fsInfo.getJarFSProvider();
Assert.checkNonNull(jarFSProvider, "should have been caught before!");
try {
this.fileSystem = jarFSProvider.newFileSystem(archivePath, env);
} catch (ZipException ze) {
throw new IOException("ZipException opening "" + archivePath.getFileName() + "": " + ze.getMessage(), ze);
}
} else {
// Less common case. Possible if the file manager was not initialized in JavacTask,
// OR if non "*.jar" files are on the classpath. At the time of writing, both `javac -cp a.zip`
// and `javac -cp x.JAR` file execute this branch.
this.fileSystem = FileSystems.newFileSystem(archivePath, env, (ClassLoader)null);
}
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24176#discussion_r2017732584
More information about the compiler-dev
mailing list