ZipFileSystem performance regression

Xueming Shen xueming.shen at gmail.com
Tue Apr 16 19:44:01 UTC 2019


One of the motivations back then was to speed up access to those
entries: you don't have to deflate/inflate new or updated entries
during the lifetime of the zip file system. Updated entries only get
compressed when they go to storage. So the regression is more of a
trade-off between the performance of different usages. (It also
simplifies the logic for handling different types of entries ...)
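To make the two phases concrete, here is a minimal sketch that times writing entries into a zip file system separately from closing it. The class and method names (ZipfsPhasesSketch, timePhases) are made up for illustration; only the standard java.nio.file API and the zipfs "create" property are used. Which phase ends up paying for deflation depends on the JDK version, as discussed in this thread, so the numbers are only indicative.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

class ZipfsPhasesSketch {

    // Returns { nanos spent writing entries, nanos spent in close() }.
    // The zip path must not exist yet; "create" makes zipfs create it.
    static long[] timePhases(Path zip, Map<String, byte[]> entries) throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        FileSystem zfs = FileSystems.newFileSystem(uri, Map.of("create", "true"));
        long start;
        long afterWrites;
        try {
            start = System.nanoTime();
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                // Phase 1: the entry is handed to the zip file system.
                Files.write(zfs.getPath(e.getKey()), e.getValue());
            }
            afterWrites = System.nanoTime();
        } catch (IOException | RuntimeException ex) {
            zfs.close();
            throw ex;
        }
        // Phase 2: closing the file system produces the zip file on disk.
        zfs.close();
        long afterClose = System.nanoTime();
        return new long[] { afterWrites - start, afterClose - afterWrites };
    }
}
```

Running this on 11.0.2 and on a later build with a few large, compressible entries should show the cost moving from the first number to the second, if the analysis in this thread is right.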


One idea I experimented with a long time ago for jartool was to
concurrently write out entries when they need compression ... It did
gain some performance on multi-core machines, but not a lot, as it ends
up coming back to the main thread to write out to the underlying
filesystem.
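A rough sketch of that pattern, not the actual jartool experiment: entries are deflated on a thread pool, and the calling thread then writes the compressed bytes out one after another. ParallelDeflateSketch, deflate and writeAll are hypothetical names, and the output is just concatenated raw deflate streams rather than a valid zip file; the point is only to show where the work is parallel and where it is serialized.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;

class ParallelDeflateSketch {

    // Deflate one entry's bytes into a raw deflate stream (nowrap = true).
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Compress entries in parallel, then write the results in submission
    // order on the calling thread. Returns the number of bytes written.
    static long writeAll(Map<String, byte[]> entries, OutputStream sink, int threads)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (byte[] data : entries.values()) {
                Callable<byte[]> task = () -> deflate(data);
                pending.add(pool.submit(task));
            }
            long written = 0;
            for (Future<byte[]> f : pending) {
                // The sequential part: every result funnels through here.
                byte[] compressed = f.get();
                sink.write(compressed);
                written += compressed.length;
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }
}
```

The second loop is the bottleneck described above: however many cores do the deflation, the writes to the underlying stream happen on one thread.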


-Sherman

On 4/16/19 5:21 AM, Claes Redestad wrote:
> Both before and after this regression, it seems the default behavior is
> not to use a temporary file (until ZFS.sync(), which writes to a temp
> file and then moves it in place, but that's different from what happens
> with the useTempFile option enabled). Instead entries (and the backing
> zip file system) are kept in-memory.
>
> The cause of the issue here is instead that no deflation happens until
> sync(), even when writing to entries in-memory. Previously, the
> deflation happened eagerly, then the result of that was copied into
> the zip file during sync().
>
> I've written a proof-of-concept patch that restores the behavior of
> eagerly compressing entries when the method is METHOD_DEFLATED and the
> target is to store byte[]s in-memory (the default scenario):
>
> http://cr.openjdk.java.net/~redestad/scratch/zfs.eager_deflation.00/
>
> This restores performance of parallel zip to that of 11.0.2 for the
> default case. It still has a similar regression for the case where
> useTempFile is enabled, but that should be easily addressed if this
> looks like a way forward?
>
> (I've not yet created a bug as I got too caught up in trying to figure
> out what was going on here...)
>
> Thanks!
>
> /Claes
>
> On 2019-04-16 09:29, Alan Bateman wrote:
>> On 15/04/2019 14:32, Lennart Börjeson wrote:
>>> :
>>>
>>> Previously, the deflation was done in the call to Files.copy, 
>>> thus executed in parallel, and the final ZipFileSystem.close() 
>>> didn't do anything much.
>>>
>> Can you submit a bug? When creating/updating a zip file with zipfs 
>> then closing the file system creates the zip file. Someone needs 
>> to check but it may have been that the temporary files (on the file 
>> system hosting the zip file) were deflated when writing (which is 
>> surprising but may have been the case).
>>
>> -Alan

