RFR: 8243469: Lazily encode name in ZipFile.getEntryPos
Claes Redestad
claes.redestad at oracle.com
Sun Apr 26 21:36:32 UTC 2020
Hi again,
On 2020-04-24 21:22, Claes Redestad wrote:
>> It seems that 'getEntryHitUncached' is getting slightly slower with
>> your change while all the other variants get significantly faster. I
>> don't think that's a problem, but do you have an explanation why
>> that's the case?
>
> I've noticed it swing a bit either way, and have been asking myself the
> same thing. After a little analysis I think it's actually a bug in my
> microbenchmark: I'm always looking up the same entry, and thus hitting
> the same bucket in the hash table. If that one has a collision, we'll do
> a few extra passes. If not, we won't. This might be reflected as a
> significant swing in either direction.
>
> I'm going to try rewriting it to consider more (if not all) entries in
> the zip file. That should mean the cost averages out a bit.
After I improved my micro to root out sources of variance, the
performance issue for hits persisted.
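For illustration, the lookup pattern described in the quoted text (rotating over all entries instead of always hitting the same one) could be sketched as below. This is not the actual JMH micro from the webrev: the class name RotatingLookup, the "entry-N" naming scheme and both helper methods are hypothetical, and a plain loop stands in for the JMH harness.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: look up every entry in rotation so that a single
// colliding hash bucket cannot dominate the measurement.
final class RotatingLookup {

    // Precompute the names so the lookup loop does no string concatenation.
    static String[] entryNames(int size) {
        String[] names = new String[size];
        for (int i = 0; i < size; i++) {
            names[i] = "entry-" + i;
        }
        return names;
    }

    // Write a temporary zip file containing one empty entry per name.
    static Path createZip(String[] names) {
        try {
            Path zip = Files.createTempFile("rotating-lookup", ".zip");
            zip.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(zip);
                 ZipOutputStream zos = new ZipOutputStream(out)) {
                for (String name : names) {
                    zos.putNextEntry(new ZipEntry(name));
                    zos.closeEntry();
                }
            }
            return zip;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Perform 'lookups' calls to getEntry, cycling through all names,
    // and return how many of them found an entry.
    static int countHits(Path zip, String[] names, int lookups) {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            int hits = 0;
            for (int i = 0; i < lookups; i++) {
                if (zf.getEntry(names[i % names.length]) != null) {
                    hits++;
                }
            }
            return hits;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = entryNames(512);
        Path zip = createZip(names);
        System.out.println(countHits(zip, names, 4 * names.length));
    }
}
```

With every name visited equally often, the cost of an unlucky bucket is averaged over the whole table rather than paid on each call.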
Luckily Eirik had a brilliant idea: Why not decode the bytes in the
CEN to a String and compare that, rather than the other way around?
Somewhat surprisingly, this gives us a ~1.2x speedup for
getEntryHit and getEntryHitUncached over open.00, and now lands
just ahead of the baseline on getEntryHitUncached[1]. It also leads to
slightly cleaner code[2].
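To make the two directions concrete, here is a minimal sketch of both comparisons. It is not the code in the webrev: the class CenNameMatch and its methods are hypothetical, and it assumes UTF-8 names and a plain byte array standing in for the CEN.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the two ways to match a lookup name against
// name bytes stored at cen[off .. off+len).
final class CenNameMatch {

    // "The other way around": encode the lookup name to bytes and
    // compare against the stored bytes with a hand-written loop.
    static boolean matchesByEncoding(byte[] cen, int off, int len, String name) {
        byte[] encoded = name.getBytes(StandardCharsets.UTF_8);
        if (encoded.length != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (cen[off + i] != encoded[i]) {
                return false;
            }
        }
        return true;
    }

    // The suggested direction: decode the stored bytes to a String and
    // let String.equals do the comparison.
    static boolean matchesByDecoding(byte[] cen, int off, int len, String name) {
        String decoded = new String(cen, off, len, StandardCharsets.UTF_8);
        return decoded.equals(name);
    }

    public static void main(String[] args) {
        // Two bytes of padding on each side stand in for surrounding CEN data.
        byte[] cen = "xxMETA-INF/MANIFEST.MFyy".getBytes(StandardCharsets.UTF_8);
        String name = "META-INF/MANIFEST.MF";
        System.out.println(matchesByEncoding(cen, 2, name.length(), name));
        System.out.println(matchesByDecoding(cen, 2, name.length(), name));
    }
}
```

Both methods agree on well-formed input; the difference is only in which side gets converted and which comparison routine does the work.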
Webrev: http://cr.openjdk.java.net/~redestad/8243469/open.01/
The speed-up appears to come from String.equals, which is intrinsified
and significantly faster than the loop it replaces. I profiled allocation
per operation and it stays the same (escape analysis removes the String).
Testing: tier1-4
Thanks!
/Claes
[1]
Baseline:
Benchmark                            (size)  Mode  Cnt    Score   Error  Units
ZipFileGetEntry.getEntryHit             512  avgt   15  126.264 ± 5.297  ns/op
ZipFileGetEntry.getEntryHit            1024  avgt   15  130.823 ± 7.212  ns/op
ZipFileGetEntry.getEntryHitUncached     512  avgt   15  152.149 ± 4.978  ns/op
ZipFileGetEntry.getEntryHitUncached    1024  avgt   15  151.527 ± 4.054  ns/op
open.01:
Benchmark                            (size)  Mode  Cnt    Score   Error  Units
ZipFileGetEntry.getEntryHit             512  avgt   15   84.450 ± 5.474  ns/op
ZipFileGetEntry.getEntryHit            1024  avgt   15   85.224 ± 3.776  ns/op
ZipFileGetEntry.getEntryHitUncached     512  avgt   15  140.448 ± 4.667  ns/op
ZipFileGetEntry.getEntryHitUncached    1024  avgt   15  145.046 ± 7.363  ns/op
[2] I stopped short of taking the cleanup a step further by decoding to
String even in initCEN, which sadly isn't performance neutral:
http://cr.openjdk.java.net/~redestad/8243469/open.01.init_decode/
Something for the future to consider, maybe.