RFR: JDK-8031043 ClassValue's backing map should have a smaller initial size
Peter Levart
peter.levart at gmail.com
Mon Jun 13 17:18:17 UTC 2016
Hi,
I explored various strategies to minimize worst-case lookup performance
for MethodType keys in LinearProbeHashtable. One idea is from the
"Hopscotch hashing" algorithm [1] which tries to optimize placement of
keys by moving them around at each insertion or deletion. While a
concurrent Hopscotch hashtable is possible, it requires additional
"metadata" about buckets which complicates it and does not make it
practical for implementing in Java until Java gets value types and
arrays of them. The simplest idea until then is to optimize placement of
keys when the table is rehashed. Normally when table is rehashed the old
table is scanned and entries from it inserted into new table. To achieve
similar effect to "Hopscotch hashing", the order in which keys are taken
from the old table and inserted into new table is changed. Keys are
ordered by increasing bucket index as it would be computed for the key
in the new table. Inserting in this order minimizes the worst-case
lookup performance. Doing this when rehashing and not at every insertion
or deletion guarantees that at least half of keys are optimally placed.
Another strategy to minimize worst-case lookup performance is to use
quadratic probe sequence instead of linear probe sequence. Normally,
when looking up a key, slots in the table are probed in the following
sequence (seq = 0, 1, 2 ...):
index(seq) = (hashCode + seq) % tableLength
Quadratic probing uses the following probe sequence:
index(seq) = (hashCode + seq * (seq + 1) / 2) % tableLength
Those two strategies can be combined. Here's a simulation of using those
two strategies in an open-addressing hashtable:
http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_probe_sequence.txt
Using those strategies does not affect the average length of probing
sequence much (length of 0 means that the key was found at its home
location, length of 1 means that one non-equal key was probed before
finding the equal one, etc ...), but worst-case lookup performance is
greatly impacted. Combining both strategies minimizes the worst-case
lookup performance.
Benchmarking using those strategies reveals the average lookup
performance is consistently better than using CHM:
http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_bench.pdf
The last trick to make this happen is stolen from CHM. The method type's
key is a WeakReference<MethodType> which caches the hashCode of
MethodType. By using cached hashCode in the key's equals()
implementation as a means of optimization, we achieve similar effect
that CHM achieves when it caches key's hashCode(s) in its Entry objects.
Here's the source of above benchmark:
http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_bench_src.tgz
3 variations of LinearProbeHashtable are compared with CHM:
LinearProbeHashtable - the plain one from webrev.04.4
LinearProbeHashtable1 - using sorting of keys when rehashing to
optimize their placement
LinearProbeHashtable2 - combines sorting of keys with quadratic
probe sequence
I think LinearProbeHashtable2 could be used in MethodType interning
without fear of degrading lookup performance.
Regards, Peter
[1] https://en.wikipedia.org/wiki/Hopscotch_hashing
More information about the core-libs-dev
mailing list