Faster HashMap implementation

Doug Lea dl at cs.oswego.edu
Wed Jul 1 12:27:50 UTC 2009


While I'm posting in-progress stuff, here are some other notes on hash
maps:

First, some observations about usage (repeating some
already posted):

Via some dated but probably still valid dynamic measurements:
    Ops: Approx 84% read, 15% add/modify, 1% remove
    Sizes: Peak at 0, 1, 2, then mostly flat distribution across all others
Static guesses from frequencies of code forms, mainly via google code search
   Key types
     50-60% Strings
     10-20% Numbers -- mainly Integer, then Long, then others
     10-20% Identity-based (no override of equals/hashCode)
     10-30% everything else
   Reads might be around 70% get, 30% traverse?
   get: At least 80% successful (no null checks on get)
   Iterators: Probably less than 20% Entry, others about half key vs value
   Puts: typically add, not modify
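The observed mix above can be replayed as a tiny workload driver. This is a hedged sketch of my own (class and method names are hypothetical, not from MapMicrobenchmark or MapCheck): roughly 84% reads, 15% adds/modifies, 1% removes over a bounded key space.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, not from the post: replays roughly the observed
// op mix -- ~84% reads, ~15% adds/modifies, ~1% removes.
public class OpMixSketch {
    static Map<String, Integer> runMix(int ops) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        Random rnd = new Random(42);
        for (int i = 0; i < ops; i++) {
            String key = "k" + rnd.nextInt(1000); // at most 1000 distinct keys
            int op = rnd.nextInt(100);
            if (op < 84) {                // ~84% read (mostly successful gets)
                map.get(key);
            } else if (op < 99) {         // ~15% add/modify
                map.put(key, i);
            } else {                      // ~1% remove
                map.remove(key);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println("final size: " + runMix(100000).size());
    }
}
```

With a small key space and this mix, the map quickly reaches a steady size near the number of distinct keys, so steady-state behavior (not growth) dominates, matching the flat size distribution noted above.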

The tables below (from MapMicrobenchmark) show that cache misses
completely dominate performance for large maps. You'd like to both
push out the sizes for which they come into play, and lessen the
number of indirections (thus misses) even when they do. HashMapV2
mostly does the latter. Of course, for cache misses to dominate, you
also need good key hashing and collision strategies to cope with
possibly poor hashCode()'s.
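One standard way of coping with poor hashCode()'s is a supplemental bit-spreading step applied before masking into a power-of-two table, so that keys differing only in high bits don't all land in the same slot. The constants below are the ones JDK 6's HashMap uses in its static hash(int) method; the demo keys are my own illustration.

```java
// Bit-spreading of the kind HashMap applies on top of a key's hashCode()
// so that power-of-two index masking doesn't ignore the high bits.
// Shift constants match JDK 6 HashMap.hash(int).
public class SpreadSketch {
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        int mask = 64 - 1; // table of 64 slots
        // Keys whose hashCodes differ only in high bits: without
        // spreading, plain masking sends all of them to index 0.
        for (int i = 0; i < 4; i++) {
            int h = i << 16; // poor hashCode: low 16 bits all zero
            System.out.printf("raw index=%d, spread index=%d%n",
                              h & mask, spread(h) & mask);
        }
    }
}
```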

Times are approximate nanoseconds per element/op.  Here are runs on an
Intel x86 Linux 2.6.24-23 with jdk6 -server.
Absolute times vary on other platforms but show the same patterns (*).

HashMap:
Type/Size:      36     144     576    2304    9216   36864  147456  589824
BigDecimal      40      37      43      47      57     174     267     317
BigInteger      39      34      38      39      48     187     247     288
String          37      34      38      42      54     149     227     253
Double          41      36      39      44      49      80     219     250
Float           40      38      41      44      48      95     216     267
Long            34      29      33      34      43      74     194     242
Integer         38      33      36      39      46      70     199     233
Object          36      33      35      39      47      64     215     275

average         38      34      37      41      49     111     223     265

HashMapV2:
Type/Size:      36     144     576    2304    9216   36864  147456  589824
BigDecimal      34      34      40      44      58     146     255     286
BigInteger      28      25      29      34      44     151     220     243
String          34      29      34      40      59     136     208     238
Double          37      32      32      44      46      55     180     192
Float           34      37      38      42      42      67     165     246
Long            26      21      22      29      32      40     123     147
Integer         29      25      30      37      46      60     137     157
Object          34      31      33      41      49      64     196     250

average         32      29      32      38      47      89     185     219

Here are some more details across several operation categories (via an
updated MapCheck), using English words (Strings) from a big dictionary
as keys.  Times are scaled differently than above; these runs are from
AMD-x86/Solaris, also typical of others:

                         HashMap  :     V2
Access Present         :    213  :    200
Add    Absent          :    265  :    245
Modify Present         :    231  :    199
Remove Present         :    111  :    101
Search Absent          :    119  :     93
Traverse access entry  :    173  :    137
Traverse access key/val:    147  :     87


(*) At least on Solaris jdk6u14, adding the -XX:+AggressiveOpts option
also seems to enable the specialized Integer version of HashMap, which
improves performance on some tests but worsens on others. On all
platforms, the -XX:+AggressiveOpts switch seems to lead to more
run-to-run variation.

Here's a synopsis of MapMicrobenchmark pasted from its javadoc:

  * A micro-benchmark with key types and operation mixes roughly
  * corresponding to some real programs.
  *
  * The main results are a table of approximate nanoseconds per
  * element-operation (averaged across get, put etc) for each type,
  * across a range of map sizes.
  *
  * The program includes a bunch of microbenchmarking safeguards that
  * might underestimate typical performance. For example, by using many
  * different key types and exercising them in warmups it disables most
  * dynamic type specialization.  Some test classes, like Float and
  * BigDecimal, are included not because they are commonly used as keys,
  * but because they can be problematic for some map implementations.
  *
  * By default, it creates and inserts in order dense numerical keys
  * and searches for keys in scrambled order. Use "r" as second arg to
  * instead use random numerical values, and "s" as third arg to search
  * in insertion order.
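A miniature version of the default setup described above might look like the following. This is a hedged sketch under my own names (it is not the real MapMicrobenchmark, and its timing includes setup, so it is rougher than the harness's numbers): dense Integer keys inserted in order, then all-successful gets in scrambled order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical miniature of the described setup, not the real harness:
// insert dense numerical keys in order, search in scrambled order.
public class MiniMapBench {
    // Returns the checksum of all retrieved values, so the gets
    // cannot be optimized away.
    static long run(int n) {
        List<Integer> keys = new ArrayList<Integer>();
        for (int i = 0; i < n; i++) keys.add(i);        // dense, in order
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        for (Integer k : keys) map.put(k, k);           // insertion in order
        Collections.shuffle(keys, new Random(42));      // scrambled searches
        long sum = 0;
        for (Integer k : keys) sum += map.get(k);       // all successful gets
        return sum;
    }

    public static void main(String[] args) {
        int n = 1 << 16;
        long start = System.nanoTime();
        long sum = run(n);
        long perOp = (System.nanoTime() - start) / n;   // rough: includes setup
        System.out.println("~" + perOp + " ns/op (checksum " + sum + ")");
    }
}
```

Note the checksum: without consuming the result of each get, a JIT could dead-code-eliminate the loop being measured, which is one of the microbenchmarking hazards the real harness guards against.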



More information about the core-libs-dev mailing list