RFR: 8221723: Avoid storing zero to String.hash
Peter Levart
peter.levart at gmail.com
Tue Apr 2 10:53:04 UTC 2019
Hi Claes,
On 4/2/19 10:29 AM, Claes Redestad wrote:
> Hi Peter,
>
> On 2019-04-01 23:54, Peter Levart wrote:
>>
>> In addition, this might not be easy from another point of view. You
>> all know that there's no synchronization performed when caching the
>> hash value. This works, because 32 bits are always read or written
>> atomically. If there was another byte to be read or written together
>> with 32 bit value, it would be tricky to get the ordering right and
>> still keep the performance.
>
> I won't do this for the current RFE, but it's an interesting topic for
> a future one. :-)
Right. And after I posted this message yesterday, it occurred to me that
it is possible to get away without fences if this additional boolean
flag is not called "hashCalculated" but "hashIsZero". Like in the
following example:
private int hash;
private boolean hashIsZero;
public int hashCode() {
int h = hash;
if (h == 0 && !hashIsZero) {
h = isLatin1() ? StringLatin1.hashCode(value)
: StringUTF16.hashCode(value);
if (h == 0) {
hashIsZero = true;
} else {
hash = h;
}
}
return h;
}
The trick here is that the logic writes to either 'hashIsZero' or to
'hash' field but never to both, so there's no ordering to be performed...
Regards, Peter
>
> Disclaimer: As I'm trying to be clever about concurrency I'm likely
> missing something...
>
> ... but I think a couple of barriers would be enough to ensure
> sufficient visibility/atomicity for this case, since the only
> catastrophic situation would be if another thread observes hash = 0 and
> hashCalculated = true when the real hash value is != 0.
>
> private boolean hashCalculated;
> private static Unsafe U = Unsafe.getUnsafe();
>
> public int hashCode() {
> int h = hash;
> if (h == 0 && !hashCalculated) {
> if (!hashCalculated) {
> hash = h = isLatin1() ? StringLatin1.hashCode(value)
> : StringUTF16.hashCode(value);
> // Ensure the store of the hashCalculated value is not
> // reordered with the store of the hash value
> U.storeStoreFence();
> hashCalculated = true;
> } else {
> // We could have lost a race where we read hash = 0,
> // then another thread stored hash and hashCalculated,
> // then we read hashCalculated = true when the real hash
> // value is not 0.
> // A re-read behind a load fence should be sufficient
> U.loadLoadFence();
> h = hash;
> }
> }
> return h;
> }
>
> This is performance neutral on the fast-path (hash is calculated and
> non-zero), and the extra fences cost ~1.2ns/op on my machine for cases
> where the calculated hash code is 0, which is more or less the same
> overhead as the extra branches we have now. The benefit of course is
> that for non-empty Strings with hash 0 then repeated hashCode()
> invocation is now as fast as "".hashCode()
>
> (One performance risk is that the added calls to U.*Fence will mess up
> inlining heuristics.)
>
> Thanks!
>
> /Claes
More information about the core-libs-dev
mailing list