RFR: Faster ZipFile.getEntry()/entries()

Xueming Shen xueming.shen at oracle.com
Wed May 21 21:19:13 UTC 2014


Hi,

This one didn't make it into jdk8. Here is an updated webrev for the jdk9 repo:

http://cr.openjdk.java.net/~sherman/zipfile_j/webrev

And a pure java version of j.u.ZipFile is also available at

http://cr.openjdk.java.net/~sherman/zipfile_jj/webrev/

We do have incident reports and requests suggesting that a pure Java version
of j.u.ZipFile might be preferred, especially to eliminate the possibility of
a JVM crash at the native level, mostly triggered by the mmap usage and/or the
scenario where the target zip/jar file is overwritten while it is being read.

The Java implementation also brings the benefits of better memory usage
(all memory allocated in the Java heap) and no more expensive JNI invocations.
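
To give a rough idea of what handling zip entries at the Java level involves,
here is a minimal sketch. It is not taken from the webrev; the class and method
names (CenSketch, entryNames) are hypothetical. It assumes a plain zip file with
no ZIP64 records, no bytes prepended to the archive, a central directory smaller
than 2GB, and UTF-8 entry names. It locates the END header, reads the central
directory into a byte[] on the Java heap, and walks the entry names with no JNI
call per entry.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipException;

// Hypothetical illustration only; not the code in the webrev.
class CenSketch {
    static final long ENDSIG = 0x06054b50L; // END header signature "PK\005\006"
    static final long CENSIG = 0x02014b50L; // CEN header signature "PK\001\002"
    static final int ENDHDR = 22;           // fixed size of the END header
    static final int CENHDR = 46;           // fixed size of a CEN header

    // Little-endian unsigned 16-bit read.
    static int u16(byte[] b, int off) {
        return (b[off] & 0xff) | ((b[off + 1] & 0xff) << 8);
    }

    // Little-endian unsigned 32-bit read, returned as a long.
    static long u32(byte[] b, int off) {
        return u16(b, off) | ((long) u16(b, off + 2) << 16);
    }

    // Returns the entry names in central directory order.
    // Assumes no ZIP64, no prepended data, and UTF-8 names.
    static List<String> entryNames(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            long len = raf.length();
            // The END header sits within the last 22 + 65535 bytes of the file.
            int n = (int) Math.min(len, ENDHDR + 0xffff);
            byte[] tail = new byte[n];
            raf.seek(len - n);
            raf.readFully(tail);
            int end = -1;
            for (int i = n - ENDHDR; i >= 0; i--) { // scan backwards
                if (u32(tail, i) == ENDSIG) { end = i; break; }
            }
            if (end < 0) throw new ZipException("END header not found");

            int total = u16(tail, end + 10);    // total number of entries
            long cenLen = u32(tail, end + 12);  // central directory size
            long cenOff = u32(tail, end + 16);  // central directory offset

            // The whole central directory lives in a Java heap array.
            byte[] cen = new byte[(int) cenLen];
            raf.seek(cenOff);
            raf.readFully(cen);

            List<String> names = new ArrayList<>(total);
            int pos = 0;
            for (int i = 0; i < total; i++) {
                if (pos + CENHDR > cen.length || u32(cen, pos) != CENSIG)
                    throw new ZipException("bad CEN header");
                int nlen = u16(cen, pos + 28);  // file name length
                int elen = u16(cen, pos + 30);  // extra field length
                int clen = u16(cen, pos + 32);  // comment length
                names.add(new String(cen, pos + CENHDR, nlen,
                                     StandardCharsets.UTF_8));
                pos += CENHDR + nlen + elen + clen;
            }
            return names;
        }
    }
}
```

A real implementation would also need to handle ZIP64, entry lookup by name,
encodings other than UTF-8, and validation of corrupt headers.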

Opinions/comments are appreciated.

Thanks!
-Sherman


On 09/05/2013 04:16 PM, Xueming Shen wrote:
> Hi,
>
> The change proposed here is to bring the zip entry handling code from the native
> level to the java level. This effectively solves the performance issues of
> ZipFile.getEntry() and entries() that are caused by the multiple jni invocation
> steps needed to generate a single ZipEntry (see ZipFile.getZipEntry()). A simple
> non-scientific benchmark that just iterates the ZipFile via the Enumeration from
> entries() on rt.jar/charsets.jar suggests a 50%+ speed boost.
>
> http://cr.openjdk.java.net/~sherman/zipfile_j/webrev
>
> A couple of notes:
>
> (1) Ideally it might be desirable to go further and bring all the native code of
> ZipFile to the java level (which should help completely remove that mmap crash issue,
> give better file and memory management... ), but it was suggested that it might be
> better to limit the scope of the change this late in the release cycle.
>
> (2) JavaFile.read0() is the version that uses "getPrimitiveArrayCritical" to read file bits
> into the java array directly (instead of using a stack buffer and then copying into the
> java array), which appears to be 5% faster. But I can't make up my mind about which one
> would be better. Given (a) the trouble we had before in the De/Inflater code (when
> getPrimitiveArrayCritical was being heavily used) and (b) that FileInputStream uses the
> same "copy" approach, I'm staying with the "copy" approach, but opinions are appreciated.
>
> (3) We will have to keep the native implementation (zip_util.c) because the vm
> accesses it directly.
>
> Thanks!
> -Sherman
>
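
The non-scientific benchmark described in the quoted message (iterating the
Enumeration from entries()) could be approximated along these lines. This is a
rough sketch, not the test that produced the 50%+ figure: the class name
EntriesBench, the helper makeSampleZip and the pass count of 20 are assumptions
made for illustration, and a throwaway zip is generated when no path (such as
rt.jar) is given on the command line.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; not the original benchmark.
class EntriesBench {
    // Opens the zip, walks every entry via entries(), returns the count.
    static int countEntries(File f) throws IOException {
        try (ZipFile zf = new ZipFile(f)) {
            int n = 0;
            Enumeration<? extends ZipEntry> e = zf.entries();
            while (e.hasMoreElements()) {
                ZipEntry ze = e.nextElement();
                if (ze.getName() != null) n++; // touch the entry
            }
            return n;
        }
    }

    // Builds a temporary zip with the given number of small entries.
    static File makeSampleZip(int entries) throws IOException {
        File f = File.createTempFile("bench", ".zip");
        f.deleteOnExit();
        try (ZipOutputStream zos =
                 new ZipOutputStream(new FileOutputStream(f))) {
            for (int i = 0; i < entries; i++) {
                zos.putNextEntry(new ZipEntry("dir/entry" + i + ".txt"));
                zos.write(("data" + i).getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File f = args.length > 0 ? new File(args[0]) : makeSampleZip(1000);
        countEntries(f); // warm-up pass
        int passes = 20;
        int n = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            n = countEntries(f);
        }
        long t1 = System.nanoTime();
        System.out.println(n + " entries, "
            + (t1 - t0) / passes / 1000 + " us per pass");
    }
}
```

Running the same class against builds with and without the change gives a
comparable (if crude) number; a harness such as JMH would be needed for anything
more rigorous.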
