ZIP entry copy without recompression

Lance Andersen lance.andersen at oracle.com
Fri Mar 10 11:31:30 UTC 2023


Morning Eirik,

Thank you Sean for creating the CR.  As of this week, Eirik has author status so he can now create JBS issues 😊

I believe Jai is going to follow up with you, as he started down a similar path earlier this week while looking into a jarsigner issue.

So please coordinate to avoid duplication.

Also, it would probably be worth looking at ZipFS for the same improvement.

Thank you again Eirik (and Jai) for looking into this issue.

Best
Lance

On Mar 10, 2023, at 5:39 AM, Sean Coffey <sean.coffey at oracle.com> wrote:


I think that's a fine idea Eirik. Definitely has its use cases like you mention.

Some jarsigner operations would also benefit from this. I've created https://bugs.openjdk.org/browse/JDK-8303960 to track it.

regards,
Sean.

On 31/01/2023 23:41, Lance Andersen wrote:
I have not had a chance to look at or think about this yet, but wanted to let you know I will.

On Jan 19, 2023, at 2:03 PM, Eirik Bjørsnøs <eirbjo at gmail.com> wrote:

Hi,

A common use case for java.util.zip in build tools involves copying
entries from a ZipFile or ZipInputStream to a ZipOutputStream without
actually modifying the data.

Example use cases include minification (make a JAR with only the
reachable classes) and merging (combine several JAR files into one
uberjar).

Inflating an entry just to immediately deflate it again with no
modifications seems wasteful.
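
For illustration, here is a minimal sketch of the conventional copy loop
using only the existing public java.util.zip API. The names ZipCopy and
copyEntries are made up for this example and do not come from the PR.
Each entry is inflated while it is read from the ZipFile and deflated
again while it is written, even though the bytes are unchanged:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipCopy {
    /** Copies every entry of source to out, inflating and re-deflating each one. */
    static void copyEntries(Path source, OutputStream out) throws IOException {
        try (ZipFile zf = new ZipFile(source.toFile());
             ZipOutputStream zos = new ZipOutputStream(out)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A fresh entry carrying only the name: sizes and CRC are
                // recomputed by ZipOutputStream for the recompressed data.
                zos.putNextEntry(new ZipEntry(entry.getName()));
                try (InputStream in = zf.getInputStream(entry)) {
                    // Reads inflated bytes; ZipOutputStream deflates them again.
                    in.transferTo(zos);
                }
                zos.closeEntry();
            }
        }
    }
}
```

Note that closing the ZipOutputStream also closes the underlying
stream passed in as out.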

The following draft PR introduces
ZipFileInflaterInputStream.transferTo which copies compressed ZipFile
data directly to ZipOutputStream's raw data stream:

https://github.com/openjdk/jdk/pull/12099

I'm typically seeing a 15 X improvement when copying xalan.jar to a
ZipOutputStream backed by a buffered FileOutputStream, or 22 X when
backed by OutputStream.nullOutputStream().
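
As a rough idea of how such a comparison could be set up, the sketch
below times a full copy of a JAR into a ZipOutputStream that discards
its output. This is not the benchmark behind the numbers above;
ZipCopyTimer, timeCopy and bestOf are hypothetical names, and the
best-of-N approach is only an assumption about what a quick
measurement might look like:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Path;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipCopyTimer {
    /** Copies all entries of jar into a ZipOutputStream over sink; returns elapsed nanoseconds. */
    static long timeCopy(Path jar, OutputStream sink) throws IOException {
        long start = System.nanoTime();
        try (ZipFile zf = new ZipFile(jar.toFile());
             ZipOutputStream zos = new ZipOutputStream(sink)) {
            for (ZipEntry entry : Collections.list(zf.entries())) {
                zos.putNextEntry(new ZipEntry(entry.getName()));
                try (InputStream in = zf.getInputStream(entry)) {
                    in.transferTo(zos);
                }
                zos.closeEntry();
            }
        }
        return System.nanoTime() - start;
    }

    /** Repeats the copy against a discarding sink and returns the fastest run in nanoseconds. */
    static long bestOf(Path jar, int runs) throws IOException {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            best = Math.min(best, timeCopy(jar, OutputStream.nullOutputStream()));
        }
        return best;
    }
}
```

Running bestOf on the same JAR with a baseline JDK and with a build
that includes the PR would give a rough before/after ratio. A harness
such as JMH would be more reliable for real measurements.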

The PR currently stays completely on the happy path and is mostly there
to experiment and to show the potential performance benefits. There is
currently not much focus on validation or correctness. Copying files
to a different path is not supported, neither is copying from a
ZipInputStream.

I initially considered creating new methods for raw copies, but opted
to minimize changes to public APIs, which is why I'm overriding
transferTo.
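
To illustrate the general pattern, here is a toy sketch: an InputStream
subclass overrides transferTo and takes a shortcut when the target is a
type that can accept the compressed bytes as they are. This is not the
code from the PR. RawSink, CompressedSource and writeRaw are
hypothetical names, and plain zlib streams stand in for ZIP entries:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.InflaterInputStream;

public class TransferToDispatch {
    /** Hypothetical sink that can accept already-compressed bytes unchanged. */
    static class RawSink extends ByteArrayOutputStream {
        boolean rawPathUsed;

        void writeRaw(byte[] compressed) throws IOException {
            rawPathUsed = true;
            write(compressed);
        }
    }

    /** Hypothetical source: holds zlib-compressed bytes and inflates them on read. */
    static class CompressedSource extends InflaterInputStream {
        private final byte[] compressed;

        CompressedSource(byte[] compressed) {
            super(new ByteArrayInputStream(compressed));
            this.compressed = compressed;
        }

        @Override
        public long transferTo(OutputStream out) throws IOException {
            if (out instanceof RawSink) {
                // Fast path: hand over the compressed bytes without inflating.
                ((RawSink) out).writeRaw(compressed);
                // On this path the count is the number of compressed bytes written.
                return compressed.length;
            }
            // Any other target gets the regular inflate-and-copy behavior.
            return super.transferTo(out);
        }
    }
}
```

Callers keep using the same transferTo call in both cases, so no new
public method is needed. The trade-off is that the shortcut is
invisible at the call site and depends on the runtime type of the
target stream.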

The PR is not intended for regular review, but as a starting point for
a discussion about the usefulness of the idea and the general solution
space. If we can reach consensus on such a discussion, I'll probably
be happy to work on a more complete PR.

Cheers,
Eirik.

Lance Andersen | Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com






More information about the core-libs-dev mailing list