RFR: 8316141: Improve CEN header validation checking
Lance Andersen
lancea at openjdk.org
Thu Nov 9 17:25:02 UTC 2023
On Wed, 8 Nov 2023 20:26:32 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:
>> Please review this PR which enhances the existing CEN header validation checking to ensure that the
>> size of the CEN Header + name length + comment length + extra length do not exceed 65,535 bytes per the PKWare APP.NOTE 4.4.10, 4.4.11, & 4.4.12. Also check that current CEN header will not exceed the length of the CEN array.
>>
>> Mach 5 tiers 1-3 are clean with this change.
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1225:
>
>> 1223: int elen = CENEXT(cen, pos);
>> 1224: int clen = CENCOM(cen, pos);
>> 1225: long headerSize = (long)CENHDR + nlen + clen + elen;
>
> Since CENHDR is 46 and nlen, clen, elen are all unsigned shorts, this sum cannot possibly overflow an int. Is the long conversion necessary?
>
> The specification uses the term "combined length", would it be better to name the local variable `combinedLength` instead?
I can remove the cast as that was a holdover. I chose to make this a long knowing that it would not overflow but an overflow while unlikely could occur depending on the value of `pos` in the statement below
if (headerSize > 0xFFFF || pos + headerSize > cen.length - ENDHDR) {
zerror("invalid CEN header (bad header size)");
}
I could keep headerSize an int and then move the cast to the if statement:
if (headerSize > 0xFFFF || (long)pos + headerSize > cen.length - ENDHDR) {
zerror("invalid CEN header (bad header size)");
}
I decided making headerSize a long might be clearer but do not have a strong preference and will go with the consensus
As far as the name, I don't have a strong preference, but not sure combinedLength is any better
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1235:
>
>> 1233:
>> 1234: if (elen > 0 && !DISABLE_ZIP64_EXTRA_VALIDATION) {
>> 1235: checkExtraFields(pos, entryPos + nlen, elen);
>
> The naming of `entryPos` confused me here. The name indicates it is the position where the CEN header starts, but we already have `pos` for that. (It actually contains the position where the encoded name starts)
>
> So perhaps it should be renamed to `namePos` or `npos` to make future maintenance less confusing?
>
> Also, I actually liked the `extraStartingOffset` local variable, having a name makes the code easier to follow than just `entryPos + nlen`. But perhaps `extraPos` is shorter and more consistent with other uses of `pos`?
>
> So perhaps: `long extraPos = pos + CENHDR + nlen` ?
`entryPos` was the name of the field from a previous PR so I did not see a need to change it and decided there was no need to keep extraStartingOffset given the change in validation above.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1388318963
PR Review Comment: https://git.openjdk.org/jdk/pull/16570#discussion_r1388313503
More information about the core-libs-dev
mailing list