Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps
Ulf Zibis
Ulf.Zibis at gmx.de
Sat May 26 01:43:40 UTC 2012
Am 23.05.2012 07:16, schrieb Mike Duigou:
> Webrevs for the Java 7u6 and 8 changes are available for download at [2] and [3] for your review. There are some important differences between the Java 7 and 8 implementations of this enhancement. Most specifically in the Java 8 implementation alternative string hashing is always enabled--no threshold is used for enablement and alternative hashing cannot be disabled. (The Java 8 implementation completely ignores the jdk.map.althashing.threshold system property). The Java 8 implementation is also subject to additional refinement as Java 8 develops.
To me it seems, that computing the murmur hash is more expensive, especially on short strings, than
the old hash algorithm.
So I think, a developer should have an influence on the hash algorithm to be used by a hash map,
especially for strings, see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862.
So I propose a public accessible field in String:
public int customHash;
Then the HashMap should have an overrideable hash method:
protected int hash(K key) {
h = key.hashCode();
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
Then we should have a new class:
public class MurmurHashMap<V> extends HashMap<String,V> {
protected final int hash(String key) {
int h = key.customHash;
if (h == 0) {
// harmless data race on customHash here.
h = Hashing.murmur3_32(HASHING_SEED, key);
// ensure result is not zero to avoid recalculating
h = (h != 0) ? h : 1;
key.customHash = h;
}
return hashMask ^ h;
}
In class Hashing we need:
public static int murmur3_32(int seed, String key) {
int h1 = seed;
int count = key.length();
// body
for (int off = 0; count >= 2;) {
int k1 = (key.charAt(off++) & 0xFFFF) | key.charAt(off++) << 16);
// alternative:
for (int off = 0; count >= 4;) {
int k1 = (key.charAt(off) & 0x0FF)
| (key.charAt(off + 1) & 0x0FF) << 8
| (key.charAt(off + 2) & 0x0FF) << 16
| key.charAt(off + 3) << 24;
...
}
...
}
For other use cases see:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862
-Ulf
More information about the core-libs-dev
mailing list