RFR: 8316141: Improve CEN header validation checking
    Eirik Bjorsnos 
    duke at openjdk.org
       
    Wed Nov  8 21:27:58 UTC 2023
    
    
  
On Wed, 8 Nov 2023 19:59:34 GMT, Lance Andersen <lancea at openjdk.org> wrote:
> Please review this  PR which enhances the existing CEN header validation checking to ensure that the
>  size of the CEN Header + name length + comment length + extra length do not exceed 65,535 bytes per the PKWare APP.NOTE 4.4.10, 4.4.11, & 4.4.12. Also check that current CEN header will not exceed the length of the CEN array.
> 
> Mach 5 tiers 1-3 are clean with this change.
LGTM, although I'm not sure I follow the underlying motivation of this stricter validation. 
What problem is being solved here? APPNOTE.TXT uses the phrase `SHOULD NOT`. Even if the spec is not an RFC, RFC2119 defines `SHOULD NOT` as:
```This phrase, or the phrase "NOT RECOMMENDED" mean that
   there may exist valid reasons in particular circumstances when the
   particular behavior is acceptable or even useful, but the full
   implications should be understood and the case carefully weighed
   before implementing any behavior described with this label.
I would expect our producer `ZipOutputStream` to be stricter than our consumers in this case, honoring Postel's law. From a implementation robustness perspective, the individual lengths are already validated, it's just the combined clause that is now enforced in this PR.
That said, here are some comments inline:
src/java.base/share/classes/java/util/zip/ZipFile.java line 1225:
> 1223:             int elen = CENEXT(cen, pos);
> 1224:             int clen = CENCOM(cen, pos);
> 1225:             long headerSize = (long)CENHDR + nlen + clen + elen;
Since CENHDR is 46 and nlen, clen, elen are all unsigned shorts, this sum cannot possibly overflow an int. Is the long conversion necessary?
The specification uses the term "combined length", would it be better to name the local variable `combinedLength` instead?
src/java.base/share/classes/java/util/zip/ZipFile.java line 1235:
> 1233: 
> 1234:             if (elen > 0 && !DISABLE_ZIP64_EXTRA_VALIDATION) {
> 1235:                 checkExtraFields(pos, entryPos + nlen, elen);
The naming of `entryPos` confused me here. The name indicates it is the position where the CEN header starts, but we already have `pos` for that. (It actually contains the position where the encoded name starts)
So perhaps it should be renamed to `namePos` or `npos` to make future maintenance less confusing?
Also, I actually liked the `extraStartingOffset` local variable, having a name makes the code easier to follow than just `entryPos + nlen`. But perhaps `extraPos` is shorter and more consistent with other uses of `pos`?
So perhaps: `long extraPos = pos + CENHDR + nlen` ?
src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java line 1596:
> 1594:                 throw new ZipException("invalid CEN header (unsupported compression method: " + method + ")");
> 1595:             }
> 1596:             long headerSize = (long)CENHDR + nlen + clen + elen;
If the corresponding ZipFile local variable is renamed, this should also be updated.
src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java line 1668:
> 1666:             int tagBlockSize = SH(cen, currentOffset);
> 1667:             currentOffset += Short.BYTES;
> 1668:             long tagBlockEndingOffset = (long)currentOffset + tagBlockSize;
I think my ZipFile comment also applies here.
-------------
PR Review: https://git.openjdk.org/jdk/pull/16570#pullrequestreview-1721212606
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1387163396
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1387177874
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1387194439
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1387195085
    
    
More information about the core-libs-dev
mailing list