RFR: 8342040: Further improve entry lookup performance for multi-release JARs [v3]

Eirik Bjørsnøs eirbjo at openjdk.org
Fri Oct 18 16:06:58 UTC 2024


On Fri, 18 Oct 2024 15:43:18 GMT, Chen Liang <liach at openjdk.org> wrote:

>> Eirik Bjørsnøs has updated the pull request incrementally with four additional commits since the last revision:
>> 
>>  - Map versions by entry name hashcode instead of by entry name. This avoids String allocation and storage
>>  - Merge pull request #1 from cl4es/bitset_versions
>>    
>>    Use BitSet to streamline construction
>>  - Fix traversal, traverse backwards to pick latest applicable version
>>  - Use BitSet to streamline construction
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1796:
> 
>> 1794:                                 if (metaVersions == null)
>> 1795:                                     metaVersions = new HashMap<>();
>> 1796:                                 metaVersions.computeIfAbsent(hashCode, _ -> new BitSet()).set(version);
> 
> Does this hashCode ever collide? Tried reading ZipCoder and can't rule that out. If yes, we need a fallback mechanism if this `computeIfAbsent` encountered an existing entry.

Yes, this is an overapproximation, so a hash collision may result in an extra lookup being performed for a versioned entry which does not exist for that version. See the comment on the `metaVersions` field declaration.

This is a trade-off against the initial alternative proposed in this PR which was to map using the entry name string. That would minimize lookups, but it would also require allocating the String name and storing those strings in the map. So we decided a few extra lookups on hash collisions was an okay trade-off for faster parsing and less memory used.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21489#discussion_r1806726866


More information about the security-dev mailing list