Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps

Ulf Zibis Ulf.Zibis at gmx.de
Thu May 24 01:03:15 UTC 2012


On 24.05.2012 02:22, Mike Duigou wrote:
>> Looking at the codes of many charsets, the main variance seems to be in the lower 8 bits of a
>> character, especially if the strings belong to the same language. So if we composed the
>> initial 32-bit values from 4 chars, the murmur3 algorithm could run almost twice as fast.
>
> Interesting idea. The description specifically avoids defining the behaviour so implementations 
> can do this kind of optimization. For 7u6, unless there are technical problems, the proposed
> implementation is sufficient. The important message though is that it can be changed later.
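To make the 4-chars-per-int idea concrete, here is a rough sketch. The class and method names (PackChars, packLow8) are made up for illustration and are not part of the proposed patch. It only shows the packing step, not the murmur3 mixing, and it deliberately ignores the upper 8 bits of each char, so strings that differ only there would feed identical values into the hash:

```java
public class PackChars {
    // Hypothetical helper: packs the low 8 bits of four consecutive chars
    // into one 32-bit value, lowest char in the least significant byte.
    // The upper 8 bits of every char are discarded.
    static int packLow8(char[] data, int offset) {
        return (data[offset] & 0xFF)
                | (data[offset + 1] & 0xFF) << 8
                | (data[offset + 2] & 0xFF) << 16
                | (data[offset + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        char[] chars = "abcd".toCharArray();
        // 'a' = 0x61, 'b' = 0x62, 'c' = 0x63, 'd' = 0x64
        System.out.println(Integer.toHexString(packLow8(chars, 0)));
    }
}
```

With this, each murmur3 block would consume 4 chars instead of 2, which is where the "almost twice" estimate comes from. Whether the lost upper bits matter for collision resistance would need to be checked.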
And as an improvement for the following:
            int k1 = (data[offset] & 0x0FF)
                    | (data[offset + 1] & 0x0FF) << 8
                    | (data[offset + 2] & 0x0FF) << 16
                    | data[offset + 3] << 24;
See:
Bug 6914113 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6914113> - Copy int to byte[] in 1 step
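
Until something along the lines of that bug exists, the existing java.nio API can at least express the same little-endian read as a single call. The sketch below is only an illustration: the class and method names are invented, and I have not measured whether the ByteBuffer variant is actually faster than the shifts.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianRead {
    // Manual composition, same shape as the quoted lines:
    // four bytes combined into one little-endian int.
    static int manual(byte[] data, int offset) {
        return (data[offset] & 0x0FF)
                | (data[offset + 1] & 0x0FF) << 8
                | (data[offset + 2] & 0x0FF) << 16
                | data[offset + 3] << 24;
    }

    // The same read as one absolute getInt on a little-endian view.
    // Wrapping allocates a small buffer object; a real implementation
    // would likely wrap once and reuse the buffer.
    static int viaBuffer(byte[] data, int offset) {
        return ByteBuffer.wrap(data)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt(offset);
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, (byte) 0x84};
        System.out.println(manual(data, 0) == viaBuffer(data, 0));
    }
}
```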

-Ulf


