RFR: Faster ZipFile.getEntry()/entries()
Jeroen Frijters
jeroen at sumatra.nl
Thu May 22 07:07:28 UTC 2014
Hi Sherman,
As a (minor) data point, IKVM.NET has been using a pure Java ZipFile implementation since day one (based on the GNU Classpath version) and other than a few compat bugs in the early days people have never complained about it.
For obvious reasons, I'd certainly prefer the pure Java version (to minimize the amount of work I have to do ;-)), but I've also always thought that it was quite a hard sell that the native zip code was faster than pure Java code, given the overhead of JNI and the cost of native memory interop/pinning.
Regards,
Jeroen
> -----Original Message-----
> From: core-libs-dev [mailto:core-libs-dev-bounces at openjdk.java.net] On
> Behalf Of Xueming Shen
> Sent: Wednesday, May 21, 2014 23:19
> To: core-libs-dev at openjdk.java.net
> Subject: Re: RFR: Faster ZipFile.getEntry()/entries()
>
> Hi,
>
> This one didn't make it into jdk8. Here is an updated webrev for the jdk9
> repo:
>
> http://cr.openjdk.java.net/~sherman/zipfile_j/webrev
>
> And a pure java version of j.u.ZipFile is also available at
>
> http://cr.openjdk.java.net/~sherman/zipfile_jj/webrev/
>
> We do have incident reports and requests that suggest a pure Java
> version of j.u.ZipFile might be preferred, especially to eliminate the
> possibility of a JVM crash at the native level, mostly triggered by the
> mmap usage and/or the scenario in which the target zip/jar file is
> overwritten while it is being read.
>
> The Java implementation also brings the benefits of better memory
> usage (all memory is allocated on the Java heap) and no more expensive
> JNI invocations.
>
> Opinions/comments are appreciated.
>
> Thanks!
> -Sherman
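
As a rough illustration of what handling zip metadata at the Java level can look like, here is a minimal sketch that reads the entry count from the END (end of central directory) record using only RandomAccessFile, with no mmap and no JNI calls of its own. It is not taken from either webrev; the class name EndRecordSketch and the helper createSampleZip are hypothetical, and the sketch assumes a plain (non-ZIP64) archive.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not the code from the webrev.
final class EndRecordSketch {
    private static final int END_HEADER_SIZE = 22;       // fixed part of the END record
    private static final int MAX_COMMENT_SIZE = 0xFFFF;  // zip comment is at most 65535 bytes

    // Returns the total entry count stored in the END record.
    // Assumption: no ZIP64 extension, so the count fits in 16 bits.
    static int readEntryCount(Path zip) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(zip.toFile(), "r")) {
            long len = raf.length();
            int tailLen = (int) Math.min(len, END_HEADER_SIZE + MAX_COMMENT_SIZE);
            byte[] tail = new byte[tailLen];
            raf.seek(len - tailLen);
            raf.readFully(tail);  // everything lands in a Java heap array

            // Scan backwards for the END signature "PK\005\006".
            for (int i = tailLen - END_HEADER_SIZE; i >= 0; i--) {
                if (tail[i] == 'P' && tail[i + 1] == 'K'
                        && tail[i + 2] == 5 && tail[i + 3] == 6) {
                    // Total number of entries: little-endian u16 at offset 10.
                    return (tail[i + 10] & 0xFF) | ((tail[i + 11] & 0xFF) << 8);
                }
            }
            throw new IOException("END record not found");
        }
    }

    // Hypothetical helper: writes a throwaway zip with `count` empty entries.
    static Path createSampleZip(int count) throws IOException {
        Path zip = Files.createTempFile("endrecord", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (int i = 0; i < count; i++) {
                zos.putNextEntry(new ZipEntry("entry" + i + ".txt"));
                zos.closeEntry();
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = args.length > 0 ? Paths.get(args[0]) : createSampleZip(10);
        System.out.println("entries: " + readEntryCount(zip));
    }
}
```

A real implementation would go on to parse the central directory itself and would have to cope with ZIP64 and with files that change underneath it; the sketch only shows that the lookup needs nothing beyond ordinary file reads.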
>
>
> On 09/05/2013 04:16 PM, Xueming Shen wrote:
> > Hi,
> >
> > The change proposed here is to bring the zip entry handling code from
> > the native level to the java level. This effectively solves the
> > performance issues of ZipFile.getEntry and entries() that are caused by
> > the multiple jni invocation steps needed to generate a single ZipEntry
> > (see ZipFile.getZipEntry()). A simple non-scientific benchmark that
> > iterates the ZipFile via the Enumeration from entries() on
> > rt.jar/charsets.jar suggests a 50%+ speed boost.
> >
> > http://cr.openjdk.java.net/~sherman/zipfile_j/webrev
> >
> > A couple of notes:
> >
> > (1) Ideally it might be desirable to go further and bring all the native
> > code of ZipFile to the java level (which should help completely remove
> > the mmap crash issue, give better file and memory management... ),
> > but it has been suggested that it might be better to limit the scope
> > of the change this late in the release cycle.
> >
> > (2) JavaFile.read0() is the version that uses
> > "getPrimitiveArrayCritical" to read file bits into the java array
> > directly (instead of using a stack buffer and then copying into the
> > java array), which appears to be 5% faster. But I can't make up my
> > mind which one would be better. Given (a) the trouble we had before in
> > the De/Inflater code (when getPrimitiveArrayCritical was being heavily
> > used) and (b) that FileInputStream uses the same "copy" approach, I'm
> > staying with the "copy" approach, but opinions are appreciated.
> >
> > (3) We will have to keep the native implementation (zip_util.c) for
> > direct access by the vm.
> >
> > Thanks!
> > -Sherman
> >
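
The non-scientific benchmark described in the original post (iterating a ZipFile via the Enumeration from entries()) could be sketched roughly as follows. This is not the benchmark that was actually run; the class name ZipEntriesBench, the helper createSampleZip, and the round count are assumptions for illustration. Pass the path to a jar such as rt.jar as the first argument to time a real archive.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not the benchmark used for the numbers quoted above.
final class ZipEntriesBench {

    // Hypothetical helper: builds a throwaway zip with `count` small entries.
    static Path createSampleZip(int count) throws IOException {
        Path zip = Files.createTempFile("bench", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (int i = 0; i < count; i++) {
                zos.putNextEntry(new ZipEntry("pkg/Entry" + i + ".txt"));
                zos.write(("data " + i).getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return zip;
    }

    // Opens the zip, walks the Enumeration from entries() once,
    // and returns the number of entries seen.
    static int countEntries(Path zip) throws IOException {
        int n = 0;
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> e = zf.entries();
            while (e.hasMoreElements()) {
                ZipEntry ze = e.nextElement();
                if (ze.getName() != null) {  // touch the entry so it is really built
                    n++;
                }
            }
        }
        return n;
    }

    // Returns the elapsed nanoseconds for `rounds` full open-and-iterate passes.
    static long timeIterations(Path zip, int rounds) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            countEntries(zip);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path zip = args.length > 0 ? Paths.get(args[0]) : createSampleZip(1000);
        timeIterations(zip, 20);  // crude warm-up
        long nanos = timeIterations(zip, 100);
        System.out.println(countEntries(zip) + " entries, avg "
                + (nanos / 100 / 1000) + " us per pass");
    }
}
```

Running the same class against two JDK builds, one with and one without the patch, gives a rough comparison; a harness such as JMH would be needed for anything more rigorous.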