ZipFileSystem performance regression

Claes Redestad claes.redestad at
Tue Apr 16 12:21:53 UTC 2019

Both before and after this regression, it seems the default behavior is
not to use a temporary file (until ZFS.sync(), which writes to a temp
file and then moves it in place, but that's different from what happens
with the useTempFile option enabled). Instead, entries (and the backing
zip file system) are kept in memory.
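
To make the two modes concrete, here is a minimal sketch of how a zip
file system is opened with and without useTempFile. The class name
ZipFsModes, the helper writeZip and the entry name /hello.txt are made
up for illustration. I pass Boolean.TRUE for useTempFile because, as far
as I can tell, older implementations only accept the Boolean value there,
while "create" is accepted as the string "true".

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZipFsModes {
    /**
     * Creates a new zip at zipPath with a single entry and returns the
     * size of the resulting file. (Hypothetical helper, for illustration.)
     */
    public static long writeZip(Path zipPath, boolean useTempFile) throws IOException {
        Map<String, Object> env = new HashMap<>();
        env.put("create", "true");
        if (useTempFile) {
            // Entries are spooled to temporary files instead of byte[]s.
            env.put("useTempFile", Boolean.TRUE);
        }
        URI uri = URI.create("jar:" + zipPath.toUri());
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, env)) {
            Files.write(zipfs.getPath("/hello.txt"),
                        "hello zipfs".getBytes(StandardCharsets.UTF_8));
        } // closing the file system is what writes the zip file out
        return Files.size(zipPath);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("zipfs-demo");
        System.out.println("default:     " + writeZip(dir.resolve("default.zip"), false));
        System.out.println("useTempFile: " + writeZip(dir.resolve("tempfile.zip"), true));
    }
}
```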

The cause of the issue here is instead that no deflation happens until
sync(), even when writing to entries in-memory. Previously, the
deflation happened eagerly, then the result of that was copied into
the zip file during sync().
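
The difference can be sketched outside of zipfs with plain
java.util.zip. This is not the ZipFileSystem code, only a model of the
two strategies: in the eager variant each writer deflates its own entry
(so the work can be spread over the writer threads), in the deferred
variant the writers only keep the raw bytes and one thread deflates
everything at "sync" time. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class EagerVsDeferredDeflate {
    /** Deflates data in raw ("nowrap") format, the format used for zip entry data. */
    public static byte[] deflate(byte[] data) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (DeflaterOutputStream dos = new DeflaterOutputStream(out, def)) {
                dos.write(data);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            def.end(); // we supplied the Deflater, so we release it ourselves
        }
    }

    /** Eager: entries are deflated by the (parallel) writers; "sync" has nothing left to compress. */
    public static List<byte[]> eager(List<byte[]> entries) {
        return entries.parallelStream()
                      .map(EagerVsDeferredDeflate::deflate)
                      .collect(Collectors.toList());
    }

    /** Deferred: writers only store raw bytes; a single thread deflates them all at "sync". */
    public static List<byte[]> deferred(List<byte[]> entries) {
        List<byte[]> raw = new ArrayList<>(entries);   // what the writers leave behind
        List<byte[]> compressed = new ArrayList<>();
        for (byte[] e : raw) {                          // the serial "sync" step
            compressed.add(deflate(e));
        }
        return compressed;
    }

    /** Generates compressible sample entries (made-up data). */
    public static List<byte[]> sampleEntries(int count, int size) {
        return IntStream.range(0, count).mapToObj(i -> {
            byte[] b = new byte[size];
            for (int j = 0; j < size; j++) {
                b[j] = (byte) ((i + j) % 64);
            }
            return b;
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<byte[]> entries = sampleEntries(64, 1 << 20);
        long t0 = System.nanoTime();
        eager(entries);
        long t1 = System.nanoTime();
        deferred(entries);
        long t2 = System.nanoTime();
        System.out.printf("eager: %d ms, deferred: %d ms%n",
                          (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```

Both variants produce the same bytes; only where (and on how many
threads) the deflation runs differs. The timings from main are a rough
indication only and will depend on the machine and on JIT warm-up.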

I've written a proof-of-concept patch that restores the behavior of
eagerly compressing entries when the method is METHOD_DEFLATED and the
target is to store byte[]s in-memory (the default scenario):

This restores performance of parallel zip to that of 11.0.2 for the
default case. It still has a similar regression for the case where
useTempFile is enabled, but that should be easily addressed if this
looks like a way forward?
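
For anyone who wants to try this, the kind of "parallel zip" workload
discussed here can be approximated with something like the following.
This is my own reduced sketch, not the original reproducer: the class
name ParallelZip and the helper zipDirectory are hypothetical, and it
only handles a flat directory of regular files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelZip {
    /**
     * Copies every regular file directly under srcDir into a new zip at
     * zipPath, using a parallel stream for the copies. Returns the number
     * of files copied. zipPath must not exist yet.
     */
    public static int zipDirectory(Path srcDir, Path zipPath) throws IOException {
        List<Path> files;
        try (Stream<Path> list = Files.list(srcDir)) {
            files = list.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        URI uri = URI.create("jar:" + zipPath.toUri());
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            // With eager deflation, the compression work happens inside these copies.
            files.parallelStream().forEach(f -> {
                try {
                    Files.copy(f, zipfs.getPath("/" + f.getFileName()));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } // close() writes the zip file; with deferred deflation the cost lands here
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ParallelZip <srcDir> <zipFile>");
            return;
        }
        long start = System.nanoTime();
        int n = zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.printf("zipped %d files in %d ms%n",
                          n, (System.nanoTime() - start) / 1_000_000);
    }
}
```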

(I've not yet created a bug as I got too caught up in trying to figure
out what was going on here...)



On 2019-04-16 09:29, Alan Bateman wrote:
> On 15/04/2019 14:32, Lennart Börjeson wrote:
>> :
>> Previously, the deflation was done in the call to Files.copy,
>> thus executed in parallel, and the final ZipFileSystem.close() didn't 
>> do anything much.
> Can you submit a bug? When creating/updating a zip file with zipfs,
> closing the file system creates the zip file. Someone needs to check
> but it may have been that the temporary files (on the file system 
> hosting the zip file) were deflated when writing (which is surprising 
> but may have been the case).
> -Alan

More information about the core-libs-dev mailing list