RFR: 8221836: Avoid recalculating String.hash when zero

John Rose john.r.rose at oracle.com
Mon Apr 8 21:31:40 UTC 2019


I agree that this is a good change, and you can use me as a reviewer.

I disagree with Aleksey; it's a new technique but not complex
to document or understand.  The two state components are
independent in their action; there is no race between their
state changes.
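
A minimal sketch of the idea, for readers who have not opened the
webrev. This is not the patch itself: the class name HashCacheSketch
and the computations counter are hypothetical, added only for
illustration, and the field names hash and hashIsZero are assumed
here rather than quoted from the patch.

```java
// Hypothetical stand-in for java.lang.String; not the JDK source.
final class HashCacheSketch {
    private final char[] value;
    private int hash;            // cached hash; 0 = not yet cached, or genuinely zero
    private boolean hashIsZero;  // true once the hash was computed and found to be 0
    int computations;            // instrumentation for this sketch only

    HashCacheSketch(char[] value) {
        this.value = value.clone();
    }

    @Override
    public int hashCode() {
        int h = hash;                 // read the cached value once
        if (h == 0 && !hashIsZero) {  // nothing cached yet
            h = compute();
            if (h == 0) {
                hashIsZero = true;    // zero hash: write only the flag
            } else {
                hash = h;             // non-zero hash: write only the hash
            }
        }
        return h;
    }

    // Same polynomial as String.hashCode(): s[0]*31^(n-1) + ... + s[n-1]
    private int compute() {
        computations++;
        int h = 0;
        for (char c : value) {
            h = 31 * h + c;
        }
        return h;
    }

    public static void main(String[] args) {
        // 1000 NUL chars: the hash is genuinely zero.
        HashCacheSketch zero = new HashCacheSketch(new char[1000]);
        zero.hashCode();
        zero.hashCode();
        System.out.println("computations = " + zero.computations);
    }
}
```

Each call writes at most one of the two fields, so a racing reader
can only observe (0, false), (h, false) or (0, true). The first
recomputes and the other two return the right answer.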

Meanwhile, there are two reasons I want the change:

1. Less risk of spurious updates to copy-on-write (COW) memory
segments in shared archives.

2. No risk of hashcode recomputation when the hash is genuinely
zero (the 2^-32 case). This might seem laughable, until you
remember that it's exactly those cases that DoS attackers like
to create.

Both are defense in depth, against performance potholes and
intentional attacks.

If we spent as much time documenting this change as we
spent complaining about its supposed uselessness, we'd
be done.

— John

On Apr 8, 2019, at 8:24 AM, Claes Redestad <claes.redestad at oracle.com> wrote:
> 
> Right, this and possibly reducing latency when running with String
> deduplication enabled might be the more tangible benefits. Removing
> a cause for spurious performance degradations is nice, but mainly
> theoretical. There's likely a pre-existing negative interaction
> between string dedup and String archiving that would need to be
> resolved either way.
> 
> I've simplified the patch somewhat and folded set_hash/hash into
> hash_code (since direct manipulation of the hash field should be
> avoided), along with a comment to try and explain and caution about the
> data race:
> 
> http://cr.openjdk.java.net/~redestad/8221836/open.02/
> 
> Thanks!
> 
> /Claes
> 
> On 2019-04-08 12:24, Peter Levart wrote:
>> I think the main benefit of this patch is the emptyString.hashCode() speedup. By holding a boolean flag in the String object itself, there is one less dereference to make on the fast path for an empty string. This shows in the microbenchmark and would show even more if code iterated over many different empty String instances that don't share the underlying array, invoking .hashCode() on each. I admit that is not a frequent case in practice, but hey, it is a speedup after all.
>> Regards, Peter




More information about the hotspot-gc-dev mailing list