RFR: 8316141: Improve CEN header validation checking

Lance Andersen lancea at openjdk.org
Thu Nov 9 17:25:02 UTC 2023


On Wed, 8 Nov 2023 20:26:32 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:

>> Please review this  PR which enhances the existing CEN header validation checking to ensure that the
>>  size of the CEN Header + name length + comment length + extra length do not exceed 65,535 bytes per the PKWare APP.NOTE 4.4.10, 4.4.11, & 4.4.12. Also check that current CEN header will not exceed the length of the CEN array.
>> 
>> Mach 5 tiers 1-3 are clean with this change.
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1225:
> 
>> 1223:             int elen = CENEXT(cen, pos);
>> 1224:             int clen = CENCOM(cen, pos);
>> 1225:             long headerSize = (long)CENHDR + nlen + clen + elen;
> 
> Since CENHDR is 46 and nlen, clen, elen are all unsigned shorts, this sum cannot possibly overflow an int. Is the long conversion necessary?
> 
> The specification uses the term "combined length", would it be better to name the local variable `combinedLength` instead?

I can remove the cast as that was a holdover.  I chose to  make this a long knowing that it would not overflow but  an overflow while unlikely could occur depending on the value of `pos`   in the statement below


if (headerSize > 0xFFFF || pos + headerSize > cen.length - ENDHDR) {
                zerror("invalid CEN header (bad header size)");
 }


I could keep headerSize an int and then move the cast to the if statement:


if (headerSize > 0xFFFF || (long)pos + headerSize > cen.length - ENDHDR) {
                zerror("invalid CEN header (bad header size)");
 }

I  decided making headerSize a long  might be clearer but do not have a strong preference and will go with the consensus 

As far as the name,  I don't have a strong preference, but not sure combinedLength is any better

> src/java.base/share/classes/java/util/zip/ZipFile.java line 1235:
> 
>> 1233: 
>> 1234:             if (elen > 0 && !DISABLE_ZIP64_EXTRA_VALIDATION) {
>> 1235:                 checkExtraFields(pos, entryPos + nlen, elen);
> 
> The naming of `entryPos` confused me here. The name indicates it is the position where the CEN header starts, but we already have `pos` for that. (It actually contains the position where the encoded name starts)
> 
> So perhaps it should be renamed to `namePos` or `npos` to make future maintenance less confusing?
> 
> Also, I actually liked the `extraStartingOffset` local variable, having a name makes the code easier to follow than just `entryPos + nlen`. But perhaps `extraPos` is shorter and more consistent with other uses of `pos`?
> 
> So perhaps: `long extraPos = pos + CENHDR + nlen` ?

`entryPos` was the name of the field from a previous PR so I did not see a need to change it and decided there was no need to keep extraStartingOffset given the change in validation above.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1388318963
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1388313503


More information about the core-libs-dev mailing list