RFR: 8321156: Improve the handling of invalid UTF-8 byte sequences for ZipInputStream::getNextEntry and ZipFile::getComment [v3]

Alan Bateman alanb at openjdk.org
Sun Feb 25 15:29:54 UTC 2024


On Sun, 25 Feb 2024 14:17:05 GMT, Lance Andersen <lancea at openjdk.org> wrote:

>> Please review this PR which addresses the handling of invalid UTF-8 byte sequences in the entry name of a LOC file header and a Zip file comment which is returned via ZipFile::getComment.
>> 
>> As part of the change, `ZipFile::getComment` will now return `null` if an invalid UTF-8 byte sequence is encountered while converting the byte array to a String.  The CSR for this change has also been approved.
>> 
>> Mach5 tiers 1-3 are clean with this change.
>
> Lance Andersen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Updates based on 2nd round of feedback

Spec + implementation change look fine. I haven't spent time looking at the test.

src/java.base/share/classes/java/util/zip/ZipFile.java line 313:

> 311:      * Returns the zip file comment. If a comment does not exist or an error is
> 312:      * encountered decoding the comment using the charset specified
> 313:      * when opening the Zip file, then {@code null} is returned.

I've previously discussed options with Lance around this issue and I agree with the proposal to specify that it returns null when the decoding fails.
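
For callers, the practical effect is that a `null` return can mean either "no comment" or "the comment could not be decoded", and both can be handled on the same path. The following is only a minimal sketch of that calling pattern, not code from the PR: the class name `ZipCommentExample` and the helpers `readComment` and `writeZip` are hypothetical, and it assumes the archive is opened with UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipCommentExample {

    // Hypothetical helper: maps a null comment (absent, or per the proposed
    // spec, undecodable) to Optional.empty() so callers have a single path.
    public static Optional<String> readComment(Path zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile(), StandardCharsets.UTF_8)) {
            return Optional.ofNullable(zf.getComment());
        }
    }

    // Hypothetical helper: writes a one-entry archive to a temp file,
    // setting the zip file comment only when one is supplied.
    public static Path writeZip(String comment) throws IOException {
        Path zip = Files.createTempFile("comment-demo", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(
                Files.newOutputStream(zip), StandardCharsets.UTF_8)) {
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            if (comment != null) {
                zos.setComment(comment);
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path withComment = writeZip("demo comment");
        Path withoutComment = writeZip(null);
        try {
            System.out.println(readComment(withComment).orElse("<no usable comment>"));
            System.out.println(readComment(withoutComment).orElse("<no usable comment>"));
        } finally {
            Files.deleteIfExists(withComment);
            Files.deleteIfExists(withoutComment);
        }
    }
}
```

The sketch does not construct an archive with an invalid UTF-8 comment, since the result in that case depends on whether the JDK in use includes this change.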

(In passing, the casing of "zip file" is very inconsistent in the javadoc: it's "ZIP file" in some places and "zip file" in others, and the change here uses "Zip file".)

-------------

Marked as reviewed by alanb (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17995#pullrequestreview-1899700665
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501843252

