Benefit from computing String Hash at compile time?

Sun Dec 20 13:40:35 PST 2009

Osvaldo Pinali Doederlein wrote:
> Em 20/12/2009 17:25, Mark Thornton escreveu:
>> Osvaldo Pinali Doederlein wrote:
>>> attached in the end) - to no avail. I believe one significant 
>>> problem is that the hashing function is not friendly to 
>>> optimization; something like str[0] ^ str[1] ^ str[2]... would be 
>>> much better (i.e., any function that allows loop unrolling and SIMD 
>>> tricks). But, as we've already discussed, String.hashCode()'s 
>>> algorithm cannot be changed so it's pointless considering this.
>>
>> from JavaDoc: s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
>>
>> = s[0]*31^(n-1) + s[2]*31^(n-3) + ... +
>>  s[1]*31^(n-2) + s[4]*31^(n-4) + ...
>
> Good idea, I didn't consider this and I believe the JIT is not that 
> smart. I did this change to the code, but with bad results again. For 
> the simple String(char[]), only regressions (server: 1704->1749ns; 
> client: 1609->1630ns). The String(int[],int,int) constructor can't use 
> the 2x-unrolled code because it must proceed one char at a time due to 
> supplementary code points (I could work around this - but then, that 
> would add even more code).
>
> As usual, it's dangerous to pollute very tight loops with extra 
> variables and calculations; we risk losing performance due to factors 
> like register allocation. The eager hashcode code was bad enough, and 
> the more sophisticated version is even worse. Things are different 
> when the code is really bound by the FSB; the benchmark allocated 
> 500-char strings (roughly 1Kb) but this is too small, the new operator 
> zero-fills the char[] so everything in in the L1 when it is 
> initialized. Eager hashcode computation would probably look better for 
> huge strings that don't fit in the caches, then the extra instructions 
> could be completely hidden behind cache misses, but this is an 
> irrelevant scenario for String.
>
For something important like String.hashCode the JIT doesn't have to be 
smart because we can ask the JVM authors to handcode intrinsic 
alternatives. The fastest code is likely to depend on the hardware --- 
x64 has more registers available, the number of integer multiplier units 
varies, etc.

Regards,
Mark Thornton