RFR: 8321156: Improve the handling of invalid UTF-8 byte sequences for ZipInputStream::getNextEntry and ZipFile::getComment [v2]

Lance Andersen lancea at openjdk.org
Sat Feb 24 18:54:09 UTC 2024


On Sat, 24 Feb 2024 16:57:50 GMT, Eirik Bjørsnøs <eirbjo at openjdk.org> wrote:

>> Lance Andersen has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Updates based on 1st round of feedback
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 331:
> 
>> 329:             try {
>> 330:                 return res.zsrc.zc.toString(res.zsrc.comment);
>> 331:             } catch(IllegalArgumentException iae) {
> 
> Suggestion:
> 
>             } catch (IllegalArgumentException iae) {

fixed

> src/java.base/share/classes/java/util/zip/ZipInputStream.java line 526:
> 
>> 524:             throw  (ZipException) new ZipException(
>> 525:                     "invalid LOC header (bad entry name)").initCause(ex);
>> 526:         }
> 
> There is a lot going on here:
> 
> * The creation of a ZipEntry
> * The interpretation of the general purpose bit flag
> * The split to call either a static or non-static toString depending on the flag interpretation
> * The exception handling, including the somewhat unusual initCause and cast
> 
> The comment at 517 seems even more detached from the logic it tries to describe after this change.
> 
> Perhaps we could benefit from introducing a `zipCoderFromFlag(int flag)` method, similar to that in ZipFile?
> 
> This would allow the section to be rewritten to something that hide the flag parsing details, leaves us with a single (polymorphic) toString invocation and extracts `createZipEntry` from the try/catch scope. 
> 
> Something like this: (I also changed the exception handling here, that's a bit orthogonal to the main points above)
> 
> Suggestion:
> 
>         String name;
>         try {
>             name = zipCoderForFlag(flag).toString(b, len);
>         } catch (IllegalArgumentException ex) {
>             ZipException e = new ZipException("invalid LOC header (bad entry name)");
>             e.initCause(ex);
>             throw e;
>         }
>         ZipEntry e = createZipEntry(name);

I believe you meant `ZipCoderFromPos(int flag)`.  I don't think it is needed in ZipInputStream but I did move the call to createZipEntry out of the try block per your suggestion

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 49:
> 
>> 47:  * @summary Validate that a ZipException is thrown when opening a ZIP file via
>> 48:  * ZipFile, traversing a Zip File via ZipInputStream, with invalid UTF-8
>> 49:  * byte sequences in the name or comment fields fails with ZipException.
> 
> The leading summary sentence needs a cleanup. Here's one suggestion:
> 
> Suggestion:
> 
>  * @summary Validate that a ZipException is thrown when a ZIP file with 
>  * invalid UTF-8 byte sequences in the name or comment fields is opened via
>  * ZipFile or traversed via ZipInputStream.

Updated

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 60:
> 
>> 58:     private static final byte[] INVALID_UTF8_BYTE_SEQUENCE = {(byte) 0xF0, (byte) 0xA4, (byte) 0xAD};
>> 59:     // Expected error message when an invalid entry name or entry comment is
>> 60:     // encountered when accessing a CEN Header
> 
> This two-line comment could benefit from an earlier line break.

The break makes sense as is in IntelliJ, so please clarify

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 64:
> 
>> 62: 
>> 63:     // Expected error message when an invalid entry name is encountered when
>> 64:     // accessing a LOC Header
> 
> This two-line comment could benefit from an earlier line break.

Same comment as above

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 65:
> 
>> 63:     // Expected error message when an invalid entry name is encountered when
>> 64:     // accessing a LOC Header
>> 65:     private static final String LOC_BAD_ENTRY_NAME_OR_COMMENT = "invalid LOC header (bad entry name)";
> 
> Should this be renamed `LOC_BAD_ENTRY_NAME`?

Yep, I thought I made that change but must have just thought about it and never went back to it

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 128:
> 
>> 126:      * 0x93: Comment     :   [ZipFileComment]
>> 127:      */
>> 128:     public static byte[] VALID_ZIP = {
> 
> I see this byte array encoding of ZIP files is a pattern used across tests. My preference would be to encode this in Base64 or hex, since that would make the consumption by external command line tools easier and the encoded representation somewhat prettier. But no big deal, this might just come down to personal preference. (This isn't human readable anyhow for the normal definition of "human" :)

Yes, this is a  style preference and plan to leave as is.

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 182:
> 
>> 180:     /**
>> 181:      * Validate that the original Zip file can be opened via ZipFile.
>> 182:      * @throws IOException if an entry occurs
> 
> Suggestion:
> 
>      * @throws IOException if an error occurs

Not sure how I missed that. fixed

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 196:
> 
>> 194:      * Validate that the original Zip file can be opened and traversed via
>> 195:      * ZipinputStream::getNextEntry.
>> 196:      * @throws IOException if an entry occurs
> 
> Suggestion:
> 
>      * @throws IOException if an error occurs

same comment above fixed

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 201:
> 
>> 199:     public void traverseZipWithZipInputStreamTest() throws IOException {
>> 200:         Files.write(ZIP_FILE, zipArray);
>> 201:         try( ZipInputStream zis = new ZipInputStream(new FileInputStream(ZIP_FILE.toFile()))) {
> 
> Suggestion:
> 
>         try (ZipInputStream zis = new ZipInputStream(new FileInputStream(ZIP_FILE.toFile()))) {

fixed

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 203:
> 
>> 201:         try( ZipInputStream zis = new ZipInputStream(new FileInputStream(ZIP_FILE.toFile()))) {
>> 202:             ZipEntry ze;
>> 203:             while((ze = zis.getNextEntry()) != null ) {
> 
> Suggestion:
> 
>             while ((ze = zis.getNextEntry()) != null) {

fixed

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 243:
> 
>> 241: 
>> 242:     /**
>> 243:      * Validate that a ZipException is thrown when an entry name is encountered
> 
> Is the "is" here out of place or are we missing a "which"?
> 
> Suggestion:
> 
>      * Validate that a ZipException is thrown when an entry name encountered

Modified the comment

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 267:
> 
>> 265:      * @throws IOException if an error occurs
>> 266:      */
>> 267:     private void createInvalidUTFEntryInZipFile(int offset) throws IOException {
> 
> Is the "In" in this method name helpful? (To me it suggests a file argument, not a result) 
> Suggestion:
> 
>     private void createInvalidUTFEntryInZipFile(int offset) throws IOException {

Well,  your change is the same but to your comment we are modifying an entry in the ZipFile.  So I don't mind changing the method name but not sure its is really needed

> test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 304:
> 
>> 302:      *        System.out.println(result);
>> 303:      *      }
>> 304:      * </pre>
> 
> Can we use `@snippet` in tests? Would allow syntax highlighting in IDEs

Sure changed to use `@snippet`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501653167
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501677114
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501670123
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501655822
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501662492
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501656632
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501657036
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501657953
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501658140
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501658526
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501658974
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501663806
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501660807
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501666735


More information about the core-libs-dev mailing list