HashMap collision speed (regression 7->8)

Mon Jan 12 08:26:36 UTC 2015

On 01/12/2015 12:26 AM, Peter Levart wrote:
> With tree bins we don't need heavy bit-smearing to get decent 
> performance in speed, but the table gets quite sparse anyway (although 
> this is the smaller of the space overheads - tree nodes are bigger). 
> For example, for 1M integral Floats, we get the following:
>
> >>> Float ...
>                  Capacity: 2097152
>               Load factor: 0.75
>                      Size: 1000000
>                 Bin sizes: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 6*0 
> 7*0 8*0 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000
>                Empty bins: 93.8 %
> Unique hash codes per bin: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 6*0 
> 7*0 8*0 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000 

Hi Doug,

Did bit smearing function used in JDK 7 have negative impact on any 
hashCodes or was it replaced with simpler one just because it is not 
needed when tree bins are used?

At least on my i7 PC the old bit smearing compared to new one does not 
show that much difference in relative speed (jdk7's is about 20% slower) 
while absolute time is in range of a few ns:

Benchmark                         Mode  Samples  Score   Error Units
j.t.HashSmearingTest.testHash7    avgt       60  3.472 ± 0.105 ns/op
j.t.HashSmearingTest.testHash8    avgt       60  2.881 ± 0.087 ns/op

Using the following JMH benchmark:

@State(Scope.Thread)
@Fork(6)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 7)
@Measurement(iterations = 10)
public class HashSmearingTest {

     static int hash7(Object key) {
         if (key == null) return 0;
         int h = key.hashCode();
         h ^= (h >>> 20) ^ (h >>> 12);
         return h ^ (h >>> 7) ^ (h >>> 4);
     }

     static int hash8(Object key) {
         int h;
         return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
     }

     private int cnt = ThreadLocalRandom.current().nextInt();

     @Benchmark
     public int testHash7() {
         return hash7(cnt++);
     }

     @Benchmark
     public int testHash8() {
         return hash8(cnt++);
     }
}

Regards, Peter