Improving ZipFile.getEntry performance using Bloom filters

Eirik Bjørsnøs eirbjo at gmail.com
Tue Apr 14 13:19:33 UTC 2020


On Tue, Apr 14, 2020 at 2:06 PM Eirik Bjørsnøs <eirbjo at gmail.com> wrote:


> The remaining 33% should be explained by the bloom filter providing a
> faster negative lookup than the hash table.
>

This made me wonder if a hash map is the optimal data structure for this
kind of read-only lookups. There's good amount of pointer chasing and
scanning going on and it's probably not very cache friendly.

My intuition is that an array sorted by lookup key plus a binary search for
lookups could be faster.

Eirik.


More information about the core-libs-dev mailing list