Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps
Ulf Zibis
Ulf.Zibis at gmx.de
Wed May 23 23:58:07 UTC 2012
Hi,
What about making this approach a little bit more general?
See: Bug 6812862 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862> - provide customizable
hash() algorithm in HashMap for speed tuning, plus all later comments.
Then you could additionally save the check:
if ((0 != h) && (k instanceof String))
Looking at the codes of many charsets, most of the variance seems to be in the lower 8 bits of a
character, especially if the strings belong to the same language. So if we composed each initial
32-bit value from the low bytes of 4 chars, the murmur3 algorithm could run almost twice as fast.
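To make the idea concrete, here is a minimal sketch of how the 32-bit input blocks could be composed. The class and method names are hypothetical and not taken from the patch under review; the byte order is an arbitrary choice for illustration. It assumes the conventional approach packs two 16-bit chars per block, so packing four low bytes halves the number of blocks to mix.

```java
final class CharPacking {
    // Conventional block (assumed): two full 16-bit chars per 32-bit value.
    static int packTwoChars(CharSequence s, int off) {
        return (s.charAt(off) & 0xFFFF) | (s.charAt(off + 1) << 16);
    }

    // Proposed block: only the low 8 bits of four consecutive chars.
    // The caller must guarantee that off + 3 is a valid index.
    static int packLowBytes(CharSequence s, int off) {
        return (s.charAt(off) & 0xFF)
             | ((s.charAt(off + 1) & 0xFF) << 8)
             | ((s.charAt(off + 2) & 0xFF) << 16)
             | ((s.charAt(off + 3) & 0xFF) << 24);
    }
}
```

Note the trade-off: chars that differ only in their upper 8 bits produce the same block, so such strings would have to be distinguished some other way, or accepted as collisions.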
If you alter all hash maps in the JDK to use a new hash value, which noteworthy use cases remain for
the legacy hashCode()? Do we really need 2 hash fields in String?
In Project Coin, we have set in stone the use of compile-time hashes for the Strings_in_switch
extension. So it can never profit from the murmur3 optimization. IMO: what a pity!
(Prominent people have said, it will never make sense to change the String's hash algorithm.)
See: http://markmail.org/message/ig3nzmfinfuvgbwz
http://markmail.org/message/h3nlhhae5qlmf37a
On 23.05.2012 21:03, Mike Duigou wrote:
>> Also, this change
>>
>> - return h ^ (h>>> 7) ^ (h>>> 4);
>> + h ^= (h>>> 7) ^ (h>>> 4);
>> +
>> + return h;
>>
>> will make the compiler generate an additional iload/istore pair.
>> While the Jitted code will be the same, it may bother the inlining heuristic.
Wouldn't
return (h ^= (h >>> 7) ^ (h >>> 4));
have the same effect?
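For comparison, the three variants under discussion can be put side by side. This is only a sketch with hypothetical method names, showing that they compute the same value; it says nothing about the bytecode javac emits for each, which could be checked with javap -c on the compiled class.

```java
final class SpreadVariants {
    // Original form: a single expression, no local store.
    static int singleExpression(int h) {
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Form from the patch: compound assignment, then a separate return.
    static int assignThenReturn(int h) {
        h ^= (h >>> 7) ^ (h >>> 4);
        return h;
    }

    // Suggested form: return the value of the compound assignment.
    static int returnAssignment(int h) {
        return (h ^= (h >>> 7) ^ (h >>> 4));
    }
}
```

In a compound assignment the left-hand value is read before the right-hand side is evaluated, so all three methods return the same result for every input.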
Anyway, please add a comment for later readers.
-Ulf