RFR: JDK-8142508: To bring j.u.z.ZipFile's native implementation to Java to remove the expensive jni cost and mmap crash risk

Xueming Shen xueming.shen at oracle.com
Mon Dec 7 19:03:48 UTC 2015


Hi Sean,

Thanks for the review. Webrev has been updated accordingly. The
only suggestion I did not follow is to append the file name into the
zip error/exception message. I would assume it might have vuln
concern if the file name is published in such exception. It might be
OK to only publish the "name" part (exclude the parent path) in such
scenario, but I would prefer to leave this to a separate rfe, if desired.

http://cr.openjdk.java.net/~sherman/8142508/webrev

Thanks,
Sherman

On 12/07/2015 08:51 AM, Seán Coffey wrote:
> Hi Sherman,
>
> Nice work. It'll certainly help protect the VM from bad application code. I've no major issues with the new code. Some comments below.
>
> src/java.base/share/classes/java/util/zip/ZipFile.java
> unused import :
>> import java.nio.file.Path;
>
> line 840 : This could be marked final
>> private Key key;
>
> line 914 :  Same here (final)
>
>> private static HashMap<Key, Source> files = new HashMap<>();
>
> While in this class, could you also mark inflaterCache as final (line 451)
>
>
> -        String prop = sun.misc.VM.getSavedProperty("sun.zip.disableMemoryMapping");
> With this property removed, I think you'll need to update src/java.base/share/classes/sun/misc/VM.java line 278 also. It becomes redundant code.
>
> We'll also need to log a sub-task to track doc updates to highlight that this property is no longer relevant.
>
> ZipFile : line 195 :
>
> Objects.requireNonNull(charset, "charset");
>
> You've changed the order of exception throwing here. To aid with a possible backport (and behaviour), could you consider moving the check to post the 'mode' checks ? If you can't, I'll just remember it when backporting!
>
> line 525 :
>
> byte[] cen = zsrc.cen;
>
> It might be no harm to insert a comment here saying that the caller method must check for 'ensureOpen' ?
>
> I think you need a ensureOpen check on the new initDataOffset() method. You could hit an NPE otherwise. E.g. :
>>     ZipFile zf = new ZipFile(new File("/tmp/jtreg.zip"));
>>     ZipEntry ze = zf.getEntry("jtreg/release");
>>     InputStream is = zf.getInputStream(ze);
>>     zf.close();
>>     is.skip(1);
>
> There might be room for a small improvement on the zerror method. Could
> you pre-append the name of the File to each exception message via this method ?
>
> On the new test, are you planning to add this to the jtreg tests or will it be
> a manual stand alone ? It would have to be tuned down in terms of resources
> if added to the auto-test list.
> Regards,
> Sean.
> On 11/11/15 20:22, Xueming Shen wrote:
>> Hi
>>
>> Please help review the changes for JDK-8142508 (third try)
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8142508
>> webrev: http://cr.openjdk.java.net/~sherman/8142508/webrev
>>
>> Mainly to address the issues in current j.u.z.ZipFile implementation as listed
>> below
>>
>> (1) The ZIP file format support code is in native C (shared with the VM via
>>      ZipFile.c -> zip_util.c). Any entry lookup, creation operation requires multiple
>>      round-trips of expensive jni calls.
>>
>> (2) The native C implementation which uses mmap to map in the central directory
>>      table appears to be a big risk of vm crash when the underlying jar file gets
>>      overwritten with new contents while it is still being used by other ZipFile. The
>>      crash reports keep coming in even after we have introduced in system flag
>>      to disable it (sun.zip.disableMemoryMapping).
>>
>> (3) The use of "filename + lastModified()" cache (zip_util.c) appears to be broken
>>      if the timestamp is in low resolution/precision (File.getModified() for example,
>>      might only have "second" ersolution on solaris/linux), and/or the file is being
>>      overwritten.
>>
>> The clean solution appears to bring the ZIP format support code completely from
>> native to Java to remove the jni invocation cost and the mmap risk. Also to use
>> the fileKey and lastModified from java.nio.file.attribute.BasicFileAttributes to have
>> better/correct cache matching.
>>
>> Benchmark:
>>
>> This simple jmh measurement suggests a big boost of the performance of
>> ZipFile.getEntry()/entries()/getStream() which are basically entry related
>> accesses (the "open only" has some regression, which is expected as we
>> switched from the mmap to simply read the cen table in into a byte[])
>>
>> http://cr.openjdk.java.net/~sherman/8142508/MyBenchmark.java
>>
>> # JDK9 base
>>
>> Benchmark                  Mode  Cnt    Score    Error  Units
>> MyBenchmark.testEntries    avgt   50   13.671 ±  0.385  ms/op
>> MyBenchmark.testGetEntry   avgt   50   17.414 ±  0.803  ms/op
>> MyBenchmark.testGetStream  avgt   50   42.398 ± 10.136  ms/op
>> MyBenchmark.testOpen       avgt   50    3.049 ±  0.094  ms/op
>> MyBenchmark.testRead       avgt   50  215.179 ±  9.926  ms/op
>> MyBenchmark.testReadAll    avgt   50  244.422 ± 19.375  ms/op
>> --------------------------------------------------------------------------------------
>> # JDK9 ZipFile without jni invocation
>>
>> Benchmark                  Mode  Cnt    Score    Error  Units
>> MyBenchmark.testEntries    avgt   50    6.436 ±  0.422  ms/op
>> MyBenchmark.testGetEntry   avgt   50   10.021 ±  0.699  ms/op
>> MyBenchmark.testGetStream  avgt   50   38.713 ± 16.687  ms/op
>> MyBenchmark.testOpen       avgt   50    3.288 ±  0.119  ms/op
>> MyBenchmark.testRead       avgt   50  220.653 ±  8.529  ms/op
>> MyBenchmark.testReadAll    avgt   50  249.798 ± 18.438  ms/op
>> ---------------------------------------------------------------------------------------
>>
>> Test:
>> http://cr.openjdk.java.net/~sherman/8142508/webrev/test/java/util/zip/ZipFile/TestZipFile.java.html
>>
>> Verified the new ZipFile runs as expected when the underlying jar/zip file get
>> deleted/overwritten when the zip still be used. The "old" ZipFile fails to continue
>> to work but no crash, and the "new" one works correctly on updated zip file
>> without problem (The test runs a little long, have not decided if it should be
>> checked in as unit test).
>>
>> -Sherman
>>
>




More information about the core-libs-dev mailing list