[9] Review request for 8140503: JavaFX "Pair" Hash Code Collisions
Kevin Rushforth
kevin.rushforth at oracle.com
Tue Nov 3 22:43:46 UTC 2015
Inlining it seems like a fine way to go to me, too. As another point of
reference, we use 7/31 in other places in JavaFX.
-- Kevin
Jim Graham wrote:
> Absolutely, this report uncovered an unrelated NPE bug which is
> helpful, and apparently different constants might affect the collision
> probabilities (which were already very unlikely to begin with), but
> the bug report was definitely filed primarily as a misunderstanding.
>
> One thing to note - perhaps small numbers are more likely to be
> encountered so the larger the prime, the less likelihood of getting a
> collision. I wouldn't recommend using too large of a prime since the
> probability is already so low even with 13, and larger primes may
> increase the size of the byte code and compiled code since larger
> constants take more complicated instructions to deal with. Josh Bloch
> recommended a base of 17 and a multiplier of 37 in Chapter 3 of
> Effective Java, but he also admitted that the numbers were arbitrary
> except for the multiplier needing to be an odd prime.
>
> One reason 13 might be too low depends on the probability that Pair is
> used in hash cases with lots of very small numbers. Adding in a base
> prime and using a little bit larger prime multiplier quickly turns
> lots of small numbers into massively distributed hashes - even with 31
> or 17/37.
>
> I kind of feel like using the existing utility is overkill for just 2
> objects since it requires constructing an array object to use it. I'd
> be just as happy (perhaps even more so) if we just upgraded the code
> to check key for null and use a little larger prime, but keep it
> inlined...
>
> ...jim
>
> On 11/3/15 1:43 PM, Alexander Kouznetsov wrote:
>> Moreover, the following two sentences:
>>
>> "However, this is an incorrect way to compute a hash code of two
>> values."
>> "This can lead to hard-to-find bugs anywhere that instances of Pair are
>> used in a data structure like a HashSet or HashTable."
>>
>> seem to indicate misunderstanding of what hashcode is and how it is to
>> be used.
>>
>> Best regards,
>> Alexander Kouznetsov
>> (408) 276-0387
>>
>> On 3 ноя 2015 13:42, Alexander Kouznetsov wrote:
>>> After the fix, you should expect another incident report of
>>>
>>> Objects.hash(1, 0) == Objects.hash(0, 31)
>>>
>>> always true :-)
>>>
>>> I'd rather file another bug on key == null causing NPE and closing
>>> this one as incomplete or not an issue.
>>>
>>> Best regards,
>>> Alexander Kouznetsov
>>> (408) 276-0387
>>>
>>> On 3 ноя 2015 12:07, Vadim Pakhnushev wrote:
>>>> Hmm, yeah, the actual difference is in the prime number only (that is
>>>> changing the algorithm only doesn't improve anything), so the only
>>>> remaining reason to fix this is that Objects.hash guards against null
>>>> values (and I forgot to mention it in the review).
>>>> The key in Pair could actually be null and in this case hashCode will
>>>> throw NPE.
>>>>
>>>> Vadim
>>>>
>>>> On 03.11.2015 23:01, Vadim Pakhnushev wrote:
>>>>> Well, not exactly... Previously it was 13*hash(a) + hash(b) and now
>>>>> it's 31*(31 + hash(a)) + hash(b).
>>>>> And apparently it improves the quality somehow. I did a test with
>>>>> 100^4 combinations and collision probability dropped by the factor
>>>>> of 3 from 0.065% to 0.022%.
>>>>> Not really impressive, but still, and it uses well-defined utility
>>>>> method.
>>>>> Yeah, I know it's not really a bug since you don't want to rely on
>>>>> the hashCode at all...
>>>>>
>>>>> Thanks,
>>>>> Vadim
>>>>>
>>>>> On 03.11.2015 22:35, Jim Graham wrote:
>>>>>> All this does is change the prime constant used to produce the hash
>>>>>> value.
>>>>>>
>>>>>> Objects.hash(a, b) uses 31*hash(a) + hash(b) instead of the
>>>>>> 13*hash(a) + hash(b) that the embedded implementation uses.
>>>>>>
>>>>>> I don't really think this is a bug. The fact that Integer objects
>>>>>> make it easy to reverse engineer and compute collisions of any
>>>>>> reasonable hash combination computation don't mean that the
>>>>>> technique has a bug, it just means that the submitter can read the
>>>>>> code and think of a counter-example.
>>>>>>
>>>>>> If there are practical problems being caused for some particular
>>>>>> and popular use case by the use of this particular constant "13",
>>>>>> then we need to understand those issues and come up with a more
>>>>>> comprehensive solution than to simply hand off to another mechanism
>>>>>> which uses the same procedure with a different prime constant...
>>>>>>
>>>>>> ...jim
>>>>>>
>>>>>> On 11/3/15 3:06 AM, Vadim Pakhnushev wrote:
>>>>>>> Hi Chien,
>>>>>>>
>>>>>>> Could you please review the fix:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8140503
>>>>>>> http://cr.openjdk.java.net/~vadim/8140503/webrev.00/
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vadim
>>>>>
>>>>
>>>
>>
More information about the openjfx-dev
mailing list