[15] RFR (xs) 8046362 IdentityHashMap.hash comments should be clarified

Stuart Marks stuart.marks at oracle.com
Tue Feb 11 18:42:24 UTC 2020


Hi Roger,

I don't want to have a detailed discussion of the performance tradeoffs in the 
comments. My comments changes are intended to address two specific points: the 
"multiply by -127" phrase and the arrangement of alternating keys and values in 
the array. Both have been sources of confusion in the past and have even 
resulted in bug reports.

I'll change "JRE" to "Java" per your suggestion.

s'marks


On 2/11/20 6:41 AM, Roger Riggs wrote:
> Hi Stuart,
> 
> The implNote compares the performance impact without mentioning
> that HashMap uses 'equals()' vs '==' which might have as big an impact
> on performance as sequential vs chained references.
> Detailed performance comparisons are very difficult given the factors
> that impact performance, cache sizes, data structure sizes, etc.
> I'm not sure the wordsmithing adds much to the doc.
> 
> "JRE" -> "Java"
> 
> $.02, Roger
> 
> On 2/10/20 7:09 PM, Stuart Marks wrote:
>>
>>
>> On 2/10/20 7:52 AM, Martin Buchholz wrote:
>>> On Fri, Feb 7, 2020 at 2:47 PM David Holmes <david.holmes at oracle.com> wrote:
>>>
>>>> Hi Martin,
>>>>
>>>> On 8/02/2020 3:10 am, Martin Buchholz wrote:
>>>>> I explored System.identityHashCode (see below).
>>>>
>>>> System.identityHashCode is a VM call that is potentially very expensive.
>>>> In the worst-case it triggers monitor inflation.
>>>>
>>>
>>> I can believe that.  The VM can't simply return the address of an object
>>> because ... the object might move and an address is a poor hash code
>>> because low bits will tend to be zero.
>>>
>>> But  IdentityHashMap already calls System.identityHashCode - that price
>>> must be paid.
>>>
>>> My proposal is still "Let's trust System.identityHashCode to produce
>>> acceptable hash codes"
>>
>> OK... I appreciate that there are a bunch of subtle issues here, from whether 
>> it's necessary to do mixing of the identity hash code to whether an 
>> open-addressed table really is faster than separate chaining in this case. 
>> Before we go on too long, though, I'd like to ask what to do about my proposed 
>> comments changes.
>>
>> Should I proceed with pushing my comments changes (which I've appended below 
>> for convenience)? Or is somebody going to dive in right away and make changes 
>> that will invalidate my proposed comments changes, and thus I should hold off 
>> pushing them?
>>
>> Thanks,
>>
>> s'marks
>>
>>
>>
>>
>>
>> # HG changeset patch
>> # User smarks
>> # Date 1580167577 28800
>> #      Mon Jan 27 15:26:17 2020 -0800
>> # Node ID 0e08ce23484ca42597105225ffa3dd0827cb4b60
>> # Parent  981f6982717a7df4a2a43dd0ae6b2c2389e683f9
>> 8046362: IdentityHashMap.hash comments should be clarified
>> Reviewed-by: XXX
>>
>> diff -r 981f6982717a -r 0e08ce23484c 
>> src/java.base/share/classes/java/util/IdentityHashMap.java
>> --- a/src/java.base/share/classes/java/util/IdentityHashMap.java Mon Jan 27 
>> 14:03:58 2020 -0800
>> +++ b/src/java.base/share/classes/java/util/IdentityHashMap.java Mon Jan 27 
>> 15:26:17 2020 -0800
>> @@ -1,5 +1,5 @@
>>  /*
>> - * Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
>> + * Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>   *
>>   * This code is free software; you can redistribute it and/or modify it
>> @@ -115,17 +115,19 @@
>>   * exception for its correctness: <i>fail-fast iterators should be used only
>>   * to detect bugs.</i>
>>   *
>> - * <p>Implementation note: This is a simple <i>linear-probe</i> hash table,
>> - * as described for example in texts by Sedgewick and Knuth.  The array
>> - * alternates holding keys and values.  (This has better locality for large
>> - * tables than does using separate arrays.)  For many JRE implementations
>> - * and operation mixes, this class will yield better performance than
>> - * {@link HashMap} (which uses <i>chaining</i> rather than linear-probing).
>> - *
>>   * <p>This class is a member of the
>>   * <a 
>> href="{@docRoot}/java.base/java/util/package-summary.html#CollectionsFramework">
>>   * Java Collections Framework</a>.
>>   *
>> + * @implNote
>> + * <p>This is a simple <i>linear-probe</i> hash table,
>> + * as described for example in texts by Sedgewick and Knuth.  The array
>> + * contains alternating keys and values, with keys at even indexes and values
>> + * at odd indexes. (This arrangement has better locality for large
>> + * tables than does using separate arrays.)  For many JRE implementations
>> + * and operation mixes, this class will yield better performance than
>> + * {@link HashMap}, which uses <i>chaining</i> rather than linear-probing.
>> + *
>>   * @see     System#identityHashCode(Object)
>>   * @see     Object#hashCode()
>>   * @see     Collection
>> @@ -293,7 +295,7 @@
>>       */
>>      private static int hash(Object x, int length) {
>>          int h = System.identityHashCode(x);
>> -        // Multiply by -127, and left-shift to use least bit as part of hash
>> +        // Multiply by -254 to use the hash LSB and to ensure index is even
>>          return ((h << 1) - (h << 8)) & (length - 1);
>>      }
>>
> 


More information about the core-libs-dev mailing list