ZIP entry copy without recompression
Eirik Bjørsnøs
eirbjo at gmail.com
Thu Jan 19 19:03:32 UTC 2023
Hi,
A common use case for java.util.zip in build tools involves copying
entries from a ZipFile or ZipInputStream to a ZipOutputStream without
actually modifying the data.
Example use cases include minification (make a JAR with only the
reachable classes) and merging (combine several JAR files into one
uberjar).
Inflating an entry just to immediately deflate it again with no
modifications seems wasteful.
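For reference, here is a minimal sketch of the kind of copy loop this is about, using only the existing java.util.zip API. The class and method names (ZipCopyBaseline, copyEntries) are made up for illustration and are not taken from any build tool or from the PR. Each entry is inflated while being read from the ZipFile and deflated again while being written to the ZipOutputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Illustrative only: the names here are hypothetical.
public class ZipCopyBaseline {

    /** Copies every entry of src into a new ZIP file at dst and returns the entry count. */
    static int copyEntries(Path src, Path dst) throws IOException {
        int copied = 0;
        try (ZipFile in = new ZipFile(src.toFile());
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(dst))) {
            Enumeration<? extends ZipEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A fresh ZipEntry lets ZipOutputStream recompute sizes and CRC.
                out.putNextEntry(new ZipEntry(entry.getName()));
                try (InputStream data = in.getInputStream(entry)) {
                    // The source stream inflates; the ZipOutputStream deflates again.
                    data.transferTo(out);
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".zip");
        Path dst = Files.createTempFile("dst", ".zip");
        // Build a tiny source archive so the example runs on its own.
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(src))) {
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        System.out.println(copyEntries(src, dst) + " entries copied");
    }
}
```

The data.transferTo(out) call is the spot where the unmodified bytes take the inflate/deflate round trip.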
The following draft PR introduces
ZipFileInflaterInputStream.transferTo, which copies compressed ZipFile
data directly to ZipOutputStream's raw data stream:
https://github.com/openjdk/jdk/pull/12099
I'm typically seeing a 15x speedup when copying xalan.jar to a
ZipOutputStream backed by a buffered FileOutputStream, or 22x when
backed by OutputStream.nullOutputStream().
The PR currently stays completely on the happy path and is mostly there
to experiment and to show the potential performance benefits. So far
there is not much focus on validation or correctness. Copying files
to a different path is not supported, nor is copying from a
ZipInputStream.
I initially considered creating new methods for raw copies, but opted
to minimize changes to public APIs, which is why I'm overriding
transferTo.
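To make the design choice concrete, here is a rough sketch of the general pattern: an InputStream subclass overrides transferTo and takes a fast path when the destination is a type it recognizes, falling back to the inherited behavior otherwise. This is not the code from the PR. CompressedSource, CompressedSink and writeRaw are hypothetical stand-ins, and plain zlib streams are used in place of real ZIP entries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Illustrative only: all class and method names below are hypothetical.
public class TransferToFastPath {

    /** Sink that deflates ordinary writes on close, but also accepts pre-deflated bytes. */
    static final class CompressedSink extends OutputStream {
        private final ByteArrayOutputStream plain = new ByteArrayOutputStream();
        private byte[] compressed;
        boolean rawPathUsed;

        @Override public void write(int b) { plain.write(b); }

        void writeRaw(byte[] alreadyDeflated) {
            compressed = alreadyDeflated.clone();
            rawPathUsed = true;
        }

        @Override public void close() throws IOException {
            if (compressed == null) compressed = deflate(plain.toByteArray());
        }

        byte[] stored() { return compressed; }
    }

    /** Source that inflates on read, but hands over raw bytes to a CompressedSink. */
    static final class CompressedSource extends InflaterInputStream {
        private final byte[] raw;
        private boolean started;

        CompressedSource(byte[] raw) {
            super(new ByteArrayInputStream(raw));
            this.raw = raw;
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            started = true; // once data has been consumed, a raw copy is no longer valid
            return super.read(b, off, len);
        }

        @Override public long transferTo(OutputStream out) throws IOException {
            if (!started && out instanceof CompressedSink) {
                ((CompressedSink) out).writeRaw(raw);
                // This sketch reports the raw byte count; which count to report is a design question.
                return raw.length;
            }
            return super.transferTo(out); // ordinary inflating copy
        }
    }

    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bytes)) {
            dos.write(data);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = deflate("hello hello hello".getBytes(StandardCharsets.UTF_8));
        CompressedSink sink = new CompressedSink();
        new CompressedSource(raw).transferTo(sink);
        sink.close();
        System.out.println("raw path used: " + sink.rawPathUsed);
        System.out.println("bytes identical: " + Arrays.equals(sink.stored(), raw));
    }
}
```

Callers keep using the same transferTo call as before, so no new public method is needed. The trade-off is that the fast path is implicit and depends on the runtime types of both streams.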
The PR is not intended for regular review, but as a starting point for
a discussion about the usefulness of the idea and the general solution
space. If that discussion reaches consensus, I'd be happy to work on
a more complete PR.
Cheers,
Eirik.
More information about the core-libs-dev
mailing list