JDK7's java.util.zip breakage with very large files
pisymbol at gmail.com
Thu Feb 7 16:54:56 UTC 2013
What I am trying to do is generate Zip64 extensions within a JAR file
and then dissect the zip contents (end of directory records, file
However, when I use jar or a small program that I wrote which uses
java.util.zip to zip up a very large file >12G, I do not get the
Despite the fact that jar succeeds, the zip binary it creates does not
have an End of Directory (EoD) record at all (as if
ZipOutputStream.finish() was never called).
I am able to extract the large file and verify its MD5 which is correct.
So I am doing this (data is 12G):
- md5sum data
- jar cvf data.jar data
[wait a while, output is around 2.3G, return code is 0]
- bvi data.jar (look for the EoD at the end of the jar file, magic
0x06054B50, or even the zip64 EoD locator/record signatures)
Not found! (bummer)
- jar tvf data.jar -> I see the correct size, which means jar is
reading the 64-bit sizes correctly. With earlier builds (<b55, I think)
I would see -1.
- jar xvf data.jar
- md5sum data
- Matches original data
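The bvi step above can also be done programmatically. Here is a minimal sketch, not part of the original test: the class name EocdScan, the helper findLast, and the 1 MiB tail window are all hypothetical choices made for illustration. It searches the tail of a file for the EoD signature 0x06054B50 and for the zip64 EoD record and locator signatures (0x06064B50 and 0x07064B50), which are stored little-endian on disk. If the file really ends in a long run of zeros, the window may need to be larger.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class EocdScan {
    // Record signatures from the ZIP format (little-endian on disk).
    static final int EOCD_SIG = 0x06054b50;          // end of central directory record
    static final int ZIP64_EOCD_SIG = 0x06064b50;    // zip64 end of central directory record
    static final int ZIP64_LOCATOR_SIG = 0x07064b50; // zip64 end of central directory locator

    // Returns the file offset of the last occurrence of sig within the
    // final `window` bytes of the file, or -1 if it does not occur there.
    static long findLast(String path, int sig, int window) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long len = raf.length();
            int n = (int) Math.min(len, (long) window);
            byte[] tail = new byte[n];
            raf.seek(len - n);
            raf.readFully(tail);
            // Walk backwards so the match closest to the end of file wins.
            for (int i = n - 4; i >= 0; i--) {
                int v = (tail[i] & 0xff)
                        | (tail[i + 1] & 0xff) << 8
                        | (tail[i + 2] & 0xff) << 16
                        | (tail[i + 3] & 0xff) << 24;
                if (v == sig) {
                    return len - n + i;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args[0];
        int window = 1 << 20; // assumption: a 1 MiB tail is enough for this check
        System.out.println("EoD record:        " + findLast(path, EOCD_SIG, window));
        System.out.println("zip64 EoD record:  " + findLast(path, ZIP64_EOCD_SIG, window));
        System.out.println("zip64 EoD locator: " + findLast(path, ZIP64_LOCATOR_SIG, window));
    }
}
```

Run against data.jar, an offset of -1 on all three lines would match what the hex editor showed: no EoD records in the tail of the file.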
I noticed that after the deflate compressed blocks, the file is
appended with a lot of zeros (I originally thought it got truncated
but from the above extraction test, that is not the case).
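To put a number on "a lot of zeros", something like the following could be used. This is only a sketch; the class name TrailingZeros and the 64 KiB chunk size are arbitrary and not from my actual test program. It reads the file backwards in chunks and counts the run of 0x00 bytes at the very end.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class TrailingZeros {
    // Counts the run of 0x00 bytes at the very end of the file,
    // reading backwards in fixed-size chunks.
    static long count(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[64 * 1024]; // arbitrary chunk size
            long pos = raf.length();
            long zeros = 0;
            while (pos > 0) {
                int n = (int) Math.min(buf.length, pos);
                pos -= n;
                raf.seek(pos);
                raf.readFully(buf, 0, n);
                for (int i = n - 1; i >= 0; i--) {
                    if (buf[i] != 0) {
                        return zeros; // first non-zero byte from the end
                    }
                    zeros++;
                }
            }
            return zeros; // file is empty or consists entirely of zeros
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("trailing zero bytes: " + count(args[0]));
    }
}
```

Subtracting the result from the file length gives the offset where the zero padding starts, which is where the EoD records would be expected if they had been written before the padding.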
This is on an x86-64 Fedora 13 system using yesterday's JDK7 build tree
(I downloaded the build infrastructure and set it to download bundles
during the build - I had no build failures).
Why does jar (java.util.zip) output a non-standard zip file for very
large files, i.e. one with no EoD record and friends?
I have just begun to look at the actual code to see whether this is
pilot error on my part or something else is afoot (my code calls
zos.finish() explicitly, which has no effect; I am not sure yet where
jar ends up calling ZipOutputStream.finish()).
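For reference, here is a small-scale sketch of the pattern I mean. It is not my actual program: the class name FinishDemo and its helper methods are made up for illustration, and it uses a tiny in-memory entry, so it does not reproduce the >12G case. It only shows the expected behaviour: after an explicit finish(), the stream should already contain the central directory followed by a 22-byte EoD record (no archive comment is set).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class FinishDemo {
    // Writes a single entry, calls finish() explicitly, and returns the
    // bytes present in the underlying stream at that point.
    static byte[] zipOneEntry(String name, byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bos);
        zos.putNextEntry(new ZipEntry(name));
        zos.write(data);
        zos.closeEntry();
        zos.finish();                        // writes central directory and EoD
        byte[] afterFinish = bos.toByteArray();
        zos.close();                         // finish() is a no-op the second time
        return afterFinish;
    }

    // True if the last 22 bytes begin with the EoD signature "PK\005\006",
    // i.e. 0x06054B50 stored little-endian, with no trailing comment.
    static boolean endsWithEocd(byte[] zip) {
        int i = zip.length - 22;
        return i >= 0
                && zip[i] == 'P' && zip[i + 1] == 'K'
                && zip[i + 2] == 5 && zip[i + 3] == 6;
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = zipOneEntry("data", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("EoD present after finish(): " + endsWithEocd(zip));
    }
}
```

With a small entry this check should pass; the open question is why the same sequence leaves no EoD record when the entry is very large.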
More information about the core-libs-dev mailing list