RFR: 8328995: Launcher can't open jar files where the offset of the manifest is >4GB [v8]

Fri Jul 19 15:21:37 UTC 2024

On Fri, 19 Jul 2024 14:16:21 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:

>> Liam Miller-Cushon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:
>> 
>>  - Add a missing `break`
>>  - Merge branch 'master' into JDK-8328995
>>  - Merge remote-tracking branch 'origin/master' into JDK-8328995
>>  - Move test to test/jdk/tools/launcher
>>  - Add some more comments
>>  - Maximum Zip64 extra field length is 32
>>  - Make cendsk an unsigned short
>>  - Fix disk number size
>>  - Improvements
>>    
>>    * don't rely on variable length arrays
>>    * only run the test of 64 bit machines, since it requires >4GB of heap
>>  - 8328995: launcher can't open jar files where the offset of the manifest is >4GB
>
> src/java.base/share/native/libjli/parse_manifest.c line 507:
> 
>> 505:                   || censiz == ZIP64_MAGICVAL
>> 506:                   || cenoff == ZIP64_MAGICVAL)
>> 507:                 && cenext > 0) {
> 
> I went through these changes and here's my first round of review.
> 
> Before the changes proposed in this PR, this part of the code which is responsible for finding and returning a constructed zip entry instance would blindly use the local header offset value from the central directory of the entry being looked up. It would then "seek" to that position and read the metadata of that entry (details like uncompressed length, compressed length, the compression method) and return back that entry instance. Clearly this isn't right when zip64 entries are involved since various details of such an entry can reside in the zip64 extra field block of that entry instead of being in the central directory.
> 
> This change proposes that this part of the code first identify that a zip64 extra block exists for a particular entry and then read that zip64 block, validate some parts of the zip64 block and if that validation succeeds, then use the entry metadata from the zip64 block for constructing and returning the entry instance.
> 
> The idea to identify and support zip64 entries in this part of the code I think is agreed as the right one.
> Coming to the implementation, I think we need to come to an agreement on what we want to do here. Specifically:
> 
> - What logic do we use to decide when to look for zip64 extra block for an entry? The changes in this PR (at this line here) proposes that we look for zip64 extra block for an entry if any of the compressed size, uncompressed size or the local header offset value is `0xFFFFFFFFL` and the extra field size noted in this central directory entry is greater than 0. This however doesn't match with what we do in the implementation of `java.util.zip.ZipFile` which too does similar checks for zip64 entry presence when parsing the central directory. Very specifically, in the ZipFile where this logic is implemented is here https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipFile.java#L1254. That code in the ZipFile has gone through the necessary vetting for dealing with various possibilities with the zip files. I think we should implement that same logic here while checking for zip64 entries.
> - The next one to decide is what kind of validations do we want to do in this code for zip64 extra field block. I th...

I have not had an opportunity for a deep dive of the changes in the PR, only a quick skim., but I agree we should add some additional validation similar to what was added to ZipFile

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18479#discussion_r1684534283