Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps

Ulf Zibis Ulf.Zibis at gmx.de
Thu May 24 00:46:31 UTC 2012


On 24.05.2012 02:22, Mike Duigou wrote:
>
> On May 23 2012, at 16:58 , Ulf Zibis wrote:
>
>> Hi,
>>
>> What about making this approach a little bit more general?
>> See: Bug 6812862 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862> - provide
>> customizable hash() algorithm in HashMap for speed tuning
>
> It is possible for JDK 8. One option is to provide the hash() method as a virtual extension method 
> on Map allowing implementations to override it.
Oh, this is good news.
But I don't think all Maps need such a method, e.g. TreeMap.

>> If you alter all hash maps in the JDK to use a new hash value, which noteworthy use cases remain 
>> for the legacy hashCode()?
>
> Compatibility (which is of paramount importance). Part of the difficulty is that the 
> String.hashCode() calculation method is part of the specification. Changing the specification is 
> possible but probably not practical.
>
>> Do we really need 2 hash fields in String?
>
> Performance without caching the hash code result is unacceptable. (I've tried it).
I wasn't thinking about dropping the caching. In the context of my question above, I was thinking 
about caching the new hash in the existing old field. Maybe we could drop caching the legacy one, if 
it is now rarely used.
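
To make the two-field question concrete, here is a minimal sketch of a String-like class that caches 
both hashes. The class and method names are hypothetical, and the alternative hash is a simple 
FNV-1a-style stand-in, not the murmur3 variant from the patch under review.

```java
// Hypothetical stand-in for java.lang.String; not JDK source.
final class CachedHashText {
    private final char[] value;
    private int hash;     // cached legacy hash; 0 means "not computed yet"
    private int altHash;  // cached alternative hash; 0 means "not computed yet"

    CachedHashText(String s) {
        this.value = s.toCharArray();
    }

    // Legacy polynomial hash, as specified for String.hashCode().
    int legacyHash() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    // Assumption: FNV-1a-style mixing used only as a placeholder for murmur3.
    int alternativeHash() {
        int h = altHash;
        if (h == 0 && value.length > 0) {
            h = 0x811c9dc5;
            for (char c : value) {
                h ^= c;
                h *= 0x01000193;
            }
            altHash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHashText t = new CachedHashText("hello");
        System.out.println(t.legacyHash() == "hello".hashCode());
        System.out.println(Integer.toHexString(t.alternativeHash()));
    }
}
```

Reusing the single existing field for the new hash would mean dropping the second field and 
recomputing the legacy value on each hashCode() call.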

>
>> In Project Coin, we set in stone the use of compile-time hashes for the Strings_in_switch extension. 
>> So it can never profit from the murmur3 optimization. IMO: what a pity!
>> (Prominent people have said, it will never make sense to change the String's hash algorithm.)
>> See: http://markmail.org/message/ig3nzmfinfuvgbwz
>> http://markmail.org/message/h3nlhhae5qlmf37a
>
> Yes, there are other reasons as well.
I was one of those who objected to the compile-time hashes O:-)
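
For later readers, a rough sketch of why the compile-time hashes pin down the algorithm: a switch on 
a String is translated into a switch on s.hashCode() with the hash values embedded as constants in 
the class file. The class and method names below are hypothetical, and the second method only 
approximates what the compiler generates.

```java
final class SwitchOnStringSketch {
    // Source-level form of a switch on a String.
    static int viaSwitch(String s) {
        switch (s) {
            case "foo": return 1;
            case "bar": return 2;
            default:    return 0;
        }
    }

    // Approximate hand-written equivalent of the compiled form.
    // 101574 and 97299 are "foo".hashCode() and "bar".hashCode().
    static int desugared(String s) {
        int index = -1;
        switch (s.hashCode()) {
            case 101574:
                if (s.equals("foo")) index = 0;
                break;
            case 97299:
                if (s.equals("bar")) index = 1;
                break;
            default:
                break;
        }
        switch (index) {
            case 0:  return 1;
            case 1:  return 2;
            default: return 0;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"foo", "bar", "baz"}) {
            System.out.println(s + ": " + viaSwitch(s) + " / " + desugared(s));
        }
    }
}
```

Because the constants live in already-compiled classes, a different String.hashCode() result at 
run time would make such a switch miss its cases.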

-Ulf

P.S.: Have you seen my question + suggestion at the very bottom ...

>
> Mike
>
>>
>>
>> On 23.05.2012 21:03, Mike Duigou wrote:
>>>> Also, this change
>>>>
>>>> -        return h ^ (h>>>   7) ^ (h>>>   4);
>>>> +        h ^= (h>>>   7) ^ (h>>>   4);
>>>> +
>>>> +        return h;
>>>>
>>>> will make the compiler generate an additional iload/istore pair.
>>>> While the Jitted code will be the same, it may bother the inlining heuristic.
>> Wouldn't
>>     return (h ^= (h>>>  7) ^ (h>>>  4));
>> have the same effect ?
>>
>> Anyway, please add a comment for later readers.
>>
>> -Ulf
>>
>
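
The three spellings of the bit-spreading step discussed at the bottom can be put side by side as 
follows. This is only a sketch with hypothetical class and method names. It shows that all three 
return the same value; which bytecode each one produces would have to be checked with javap -c.

```java
final class HashSpreadVariants {
    // Form before the change: one expression, returned directly.
    static int original(int h) {
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Form after the change: assign back to h, then return it.
    static int twoStatement(int h) {
        h ^= (h >>> 7) ^ (h >>> 4);

        return h;
    }

    // Suggested form: compound assignment used as the return expression.
    static int compoundReturn(int h) {
        return (h ^= (h >>> 7) ^ (h >>> 4));
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : samples) {
            boolean same = original(h) == twoStatement(h)
                    && original(h) == compoundReturn(h);
            System.out.println(Integer.toHexString(h) + " -> " + same);
        }
    }
}
```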


