Tune String's hashCode() + equals()
Ulf Zibis
Ulf.Zibis at gmx.de
Thu Mar 4 21:36:05 UTC 2010
Much thanks for your effort.
Am 04.03.2010 21:31, schrieb Marek Kozieł:
> @Ulf
> Few explanations:
> 1.
>
>> Intersting alternative, but I'm afraid, this is against the spec.
>> Shifting all 0's to 1 would break String's hash definition: h = 31 * h + val[i++].
>>
> Yes it does, any way i think spec is to tight here. Do we really need
> hash of each value even if String have length like 600000?
> there is noting good coming from it in my opinion.
> Did any one saw at least one code relaying on that ?
> Btw. Same come with .compare
>
See discussion on project Coin list, subject "Benefit from computing
String Hash at compile time?"
> 2.
> private static final long isHashBijection= 0x8000000000000000L;
> should be fine
>
Now I understand how it should work.
Your algorithm will guarantee only 2^63 bijectional values. So strings
of lenght >= 4 can't have bijectional hashes as 4 chars count 2^64
variations.
> 3.
> Second sample would work only if hash would be set in constructor so
> even 0 would be valid hash.
>
> 4.
> Maybe I'm wrong but if most cases when hash should be count is String
> concatenation then we could make it +/-
>
>> a.hash+b.hash*powderOf31[a.length()]
>>
> so it would not consume so much time.
>
Interesting idea. I suggest to file an RFE.
> 5.
> Intern string do not need hash codes co comparing cos they have same
> address, so first loop would return true if they are equal, after this
> we need only to check if they are not equal:
>
>> if (isIntern()&& anotherString.isIntern()) return false;
>>
You are right, but
if (h1 != 0 && h2 != 0 && h1 != h2) return false;
would perform same (if already computed internal hash would be
back-propagated to the Java object).
> 6.
> Have in mind that adding additional fields to string might be not an
> option, because memory lost this way may have great impact on average
> application efficiency.
>
+ would break compatibility if objects are serialized.
-Ulf
More information about the core-libs-dev
mailing list