RFR: 8328995: Launcher can't open jar files where the offset of the manifest is >4GB [v8]

Liam Miller-Cushon cushon at openjdk.org
Fri Jul 19 21:10:33 UTC 2024


On Fri, 19 Jul 2024 15:18:26 GMT, Lance Andersen <lancea at openjdk.org> wrote:

>> src/java.base/share/native/libjli/parse_manifest.c line 507:
>> 
>>> 505:                   || censiz == ZIP64_MAGICVAL
>>> 506:                   || cenoff == ZIP64_MAGICVAL)
>>> 507:                 && cenext > 0) {
>> 
>> I went through these changes and here's my first round of review.
>> 
>> Before the changes proposed in this PR, this part of the code which is responsible for finding and returning a constructed zip entry instance would blindly use the local header offset value from the central directory of the entry being looked up. It would then "seek" to that position and read the metadata of that entry (details like uncompressed length, compressed length, the compression method) and return back that entry instance. Clearly this isn't right when zip64 entries are involved since various details of such an entry can reside in the zip64 extra field block of that entry instead of being in the central directory.
>> 
>> This change proposes that this part of the code first identify that a zip64 extra block exists for a particular entry and then read that zip64 block, validate some parts of the zip64 block and if that validation succeeds, then use the entry metadata from the zip64 block for constructing and returning the entry instance.
>> 
>> The idea to identify and support zip64 entries in this part of the code I think is agreed as the right one.
>> Coming to the implementation, I think we need to come to an agreement on what we want to do here. Specifically:
>> 
>> - What logic do we use to decide when to look for zip64 extra block for an entry? The changes in this PR (at this line here) proposes that we look for zip64 extra block for an entry if any of the compressed size, uncompressed size or the local header offset value is `0xFFFFFFFFL` and the extra field size noted in this central directory entry is greater than 0. This however doesn't match with what we do in the implementation of `java.util.zip.ZipFile` which too does similar checks for zip64 entry presence when parsing the central directory. Very specifically, in the ZipFile where this logic is implemented is here https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipFile.java#L1254. That code in the ZipFile has gone through the necessary vetting for dealing with various possibilities with the zip files. I think we should implement that same logic here while checking for zip64 entries.
>> - The next one to decide is what kind of validations do we want to do in this code fo...
>
> I have not had an opportunity for a deep dive of the changes in the PR, only a quick skim., but I agree we should add some additional validation similar to what was added to ZipFile

Thanks for the reviews! I have some initial notes on the questions raised:

I reviewed how `ZipFile` validates the extra fields in [`checkExtraFields`](https://github.com/openjdk/jdk/blob/c25c4896ad9ef031e3cddec493aef66ff87c48a7/src/java.base/share/classes/java/util/zip/ZipFile.java#L1254-L1255) if `jdk.util.zip.disableZip64ExtraFieldValidation` is enabled, and how the values are read and used by `ZipEntry` in [`setExtra0`](https://github.com/openjdk/jdk/blob/c25c4896ad9ef031e3cddec493aef66ff87c48a7/src/java.base/share/classes/java/util/zip/ZipEntry.java#L558-L585).

`zip_util.c` also handles zip64 extra fields [here](https://github.com/openjdk/jdk/blob/c25c4896ad9ef031e3cddec493aef66ff87c48a7/src/java.base/share/native/libzip/zip_util.c#L1048-L1061). It looks for a zip64 extra field if extra fields are present, and if any of the size/offset/length are set to the magic values. It does not appear to do any of the additional validation performed by `ZipFile`.

I think there are some questions to consider about the approach:

* Should all of the zip implementations being doing that validation, including `zip_util.c`? One of the last rounds of substantial changes to `parse_manifest`'s zip implementation in [JDK-8073158](https://bugs.openjdk.org/browse/JDK-8073158) the [review thread](https://mail.openjdk.org/pipermail/core-libs-dev/2015-March/032546.html) mentioned that it would be nice to have a single c zip implementation. At that point there were three including pack200, and now we're down to two, but consolidating `zip_util.c` and `parse_manifest.c` still seems like a big job.

* Should the validation be configurable, via `jdk.util.zip.disableZip64ExtraFieldValidation` or a separate option? There are uses of `jdk.util.zip.disableZip64ExtraFieldValidation` in the wild ([github search](https://github.com/search?type=code&q=jdk.util.zip.disableZip64ExtraFieldValidation)), adding that validation could cause the launcher to start rejecting jars that are currently accepted.

* Regarding what to do if the validation fails, `parse_manifest` currently takes the approach that if it sees something that looks like a zip64 end header but that fails checks, it could e.g. be seeing non-zip64 data from the last entry (see [here](https://github.com/openjdk/jdk/blob/c25c4896ad9ef031e3cddec493aef66ff87c48a7/src/java.base/share/native/libjli/parse_manifest.c#L172-L173)). Failing instead of ignoring ill-formed zip64 data is a departure from that approach. I wonder if the use-cases for `parse_manifest` and `ZipFile` are slightly different here, and it's more important for `ZipFile` to be hardened? If an untrusted jar is being launched by the launcher, hardening the launcher's zip implementation may not provide meaningful guarantees.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18479#discussion_r1684972451


More information about the core-libs-dev mailing list