<div dir="ltr"><div dir="ltr">On Fri, Feb 24, 2023 at 9:22 AM Alan Bateman &lt;<a href="mailto:Alan.Bateman@oracle.com" target="_blank">Alan.Bateman@oracle.com</a>&gt; wrote:</div><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
As a general point, the ZIP format can have redundant metadata and
there can be cases where the CRC-32 isn't available when writing a
LOC header.</div></blockquote><div><br></div><div>ZipInputStream throws an exception in both of these cases. If bit 3 of the general purpose bit flag is set, the CRC-32 in the LOC is zero and the actual CRC-32 is stored in the data descriptor immediately following the compressed data. With this format, the exception is thrown in ZipInputStream.readEnd:</div><div><br></div><div><a href="https://github.com/openjdk/jdk/blob/8f7c4969c28c58ae4b9adeed822707b28be16dd0/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L624-L626">https://github.com/openjdk/jdk/blob/8f7c4969c28c58ae4b9adeed822707b28be16dd0/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L624-L626</a><br></div><div><br></div><div>If the CRC-32 value is in the LOC, the exception is thrown when the read reaches the end of the data, in ZipInputStream.read:</div><div><br></div><div><a href="https://github.com/openjdk/jdk/blob/8f7c4969c28c58ae4b9adeed822707b28be16dd0/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L624-L626">https://github.com/openjdk/jdk/blob/8f7c4969c28c58ae4b9adeed822707b28be16dd0/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L624-L626</a><br></div><div> </div><div>(The test I linked to covers both of these cases.)</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div> At the same time, the APIs work differently in that
ZipFile opens a ZIP file so it has access to the CEN whereas
ZipInputStream is working on a stream of ZIP entries and does not
read the CEN. So some inconsistencies in the handling is not too
surprising.<br></div></blockquote><div><br></div><div>Indeed, but I found it a bit amusing that ZipFile (and ZipFileSystem), which both see the "full picture", are the ones that do not enforce the CRC. That does not make complete sense to me from a purely technical point of view.</div><div><br></div><div>Perhaps the CRC in the CEN is less trustworthy across implementations than the one found in the LOC or data descriptor.</div><div><br></div><div>Cheers,</div><div>Eirik. </div></div></div>
</div>
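For readers who want to reproduce the LOC case discussed above, here is a minimal sketch, not taken from the JDK test mentioned in the message. It assumes a single STORED entry, so that the CRC-32 is written into the LOC header rather than a data descriptor. The class name `CrcCheckSketch`, the helpers `storedZip` and `readAll`, and the constant `LOC_CRC_OFFSET` are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class CrcCheckSketch {

    // Offset of the CRC-32 field inside a LOC header:
    // signature (4) + version (2) + flags (2) + method (2) + time (2) + date (2).
    // The first LOC header starts at offset 0 of the archive built below.
    static final int LOC_CRC_OFFSET = 14;

    // Builds an in-memory ZIP with one STORED entry. STORED entries need
    // size and CRC set up front, so the CRC-32 ends up in the LOC header.
    static byte[] storedZip(String name, byte[] data) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(data);

        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(data.length);
        entry.setCompressedSize(data.length);
        entry.setCrc(crc.getValue());

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(entry);
            zos.write(data);
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    // Reads every entry to the end with ZipInputStream and returns the
    // total number of uncompressed bytes read.
    static long readAll(byte[] zip) throws IOException {
        long total = 0;
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            while (zis.getNextEntry() != null) {
                total += zis.readAllBytes().length;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = storedZip("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact archive: read " + readAll(zip) + " bytes");

        // Flip one bit of the CRC-32 stored in the LOC header.
        zip[LOC_CRC_OFFSET] ^= 0x01;
        try {
            readAll(zip);
            System.out.println("corrupted archive: no exception");
        } catch (ZipException e) {
            System.out.println("corrupted archive rejected: " + e.getMessage());
        }
    }
}
```

Only the LOC copy of the CRC-32 is modified here; the CEN copy stays intact, which ZipInputStream never reads. Running `main` on a recent JDK should report a ZipException for the corrupted archive.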