ZIP entry copy without recompression

Lance Andersen lance.andersen at oracle.com
Tue Jan 31 23:41:36 UTC 2023


I have not had a chance to look at or think about this yet, but wanted you to know I will.

On Jan 19, 2023, at 2:03 PM, Eirik Bjørsnøs <eirbjo at gmail.com> wrote:

Hi,

A common use case for java.util.zip in build tools involves copying
entries from a ZipFile or ZipInputStream to a ZipOutputStream without
actually modifying the data.

Example use cases include minification (make a JAR with only the
reachable classes) and merging (combine several JAR files into one
uberjar).

Inflating an entry just to immediately deflate it again with no
modifications seems wasteful.

The following draft PR introduces
ZipFileInflaterInputStream.transferTo which copies compressed ZipFile
data directly to ZipOutputStream's raw data stream:

https://github.com/openjdk/jdk/pull/12099
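
For illustration only, here is a minimal sketch of the kind of caller code this would affect. It is not taken from the PR; the class name ZipCopySketch and the method copyEntries are hypothetical. It uses only existing public APIs (ZipFile.getInputStream and InputStream.transferTo), so on a JDK without the proposed change it still inflates each entry and deflates it again. The assumption is that, with the change, the same transferTo call could move the compressed bytes directly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of the JDK or of the PR.
class ZipCopySketch {

    /** Copies every entry of src, unmodified, into a new ZIP file at dst. */
    public static int copyEntries(Path src, Path dst) throws IOException {
        int copied = 0;
        try (ZipFile zf = new ZipFile(src.toFile());
             OutputStream out = Files.newOutputStream(dst);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry source = entries.nextElement();
                // A fresh entry with the same name: sizes and CRC are
                // recomputed by ZipOutputStream as the data is written.
                // Other metadata (timestamps, comments) is omitted for brevity.
                zos.putNextEntry(new ZipEntry(source.getName()));
                try (InputStream in = zf.getInputStream(source)) {
                    // Today: inflate from the ZipFile, deflate into zos.
                    // Assumed with the PR: a raw copy of the compressed data.
                    in.transferTo(zos);
                }
                zos.closeEntry();
                copied++;
            }
        }
        return copied;
    }
}
```

A caller would invoke it as ZipCopySketch.copyEntries(Path.of("in.jar"), Path.of("out.jar")). Keeping the call site unchanged is what makes overriding transferTo attractive compared with a new public method.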

I'm typically seeing a 15x improvement when copying xalan.jar to a
ZipOutputStream backed by a buffered FileOutputStream, or 22x when
backed by OutputStream.nullOutputStream().

The PR currently stays completely on the happy path and is mostly there
to experiment and to show the potential performance benefits. There is
currently not much focus on validation or correctness. Copying files
to a different path is not supported, neither is copying from a
ZipInputStream.

I initially considered creating new methods for raw copies, but opted
to minimize changes to public APIs. That's why I'm overriding
transferTo.

The PR is not intended for regular review, but as a starting point for
a discussion about the usefulness of the idea and the general solution
space. If we can reach consensus on such a discussion, I'll probably
be happy to work on a more complete PR.

Cheers,
Eirik.

Lance Andersen | Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com


