Value type hash code

Daniel Latrémolière daniel.latremoliere at gmail.com
Thu Apr 19 20:13:17 UTC 2018


Why? If you want to use value types as keys in a HashMap, you will probably need an enhancement to the java.util.HashMap implementation that allows a custom hashing strategy. This is already possible with SortedMap, where you can provide your own Comparator, and it is perfectly doable for HashMap, as Eclipse Collections shows:

https://www.eclipse.org/collections/javadoc/9.1.0/org/eclipse/collections/api/block/HashingStrategy.html
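
To make the idea concrete, here is a minimal sketch of a hashing strategy layered on top of the unmodified java.util.HashMap through a wrapper key. The names HashingStrategy, Key, IntPair, AS_FRACTION and lookup are hypothetical and defined only for this illustration; this is not the Eclipse Collections API, and IntPair is an ordinary class standing in for a value type. The strategy assumes a non-zero second component.

```java
import java.util.HashMap;
import java.util.Map;

public class HashingStrategyDemo {

    /** Hypothetical strategy interface: the hashing counterpart of Comparator. */
    interface HashingStrategy<T> {
        int computeHashCode(T object);
        boolean equals(T a, T b);
    }

    /** Ordinary class standing in for a value type: a plain pair of ints. */
    static final class IntPair {
        final int first;
        final int second;

        IntPair(int first, int second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Wrapper key that delegates hashing and equality to the strategy. */
    static final class Key<T> {
        private final T value;
        private final HashingStrategy<T> strategy;

        Key(T value, HashingStrategy<T> strategy) {
            this.value = value;
            this.strategy = strategy;
        }

        @Override
        public int hashCode() {
            return strategy.computeHashCode(value);
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean equals(Object other) {
            // Assumes every key in a given map was built with the same strategy.
            return other instanceof Key && strategy.equals(value, ((Key<T>) other).value);
        }
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /** Treats a pair as the fraction first/second (assumes second != 0). */
    static final HashingStrategy<IntPair> AS_FRACTION = new HashingStrategy<IntPair>() {
        @Override
        public int computeHashCode(IntPair p) {
            // Reduce to lowest terms with a positive denominator before hashing.
            long g = gcd(Math.abs((long) p.first), Math.abs((long) p.second));
            long sign = p.second < 0 ? -1 : 1;
            long num = sign * p.first / g;
            long den = sign * p.second / g;
            return 31 * Long.hashCode(num) + Long.hashCode(den);
        }

        @Override
        public boolean equals(IntPair a, IntPair b) {
            // Cross multiplication in long arithmetic avoids int overflow.
            return (long) a.first * b.second == (long) b.first * a.second;
        }
    };

    /** Stores one pair under the fraction strategy, then looks up another. */
    static String lookup(IntPair stored, IntPair queried) {
        Map<Key<IntPair>, String> map = new HashMap<>();
        map.put(new Key<>(stored, AS_FRACTION), "found");
        return map.get(new Key<>(queried, AS_FRACTION));
    }

    public static void main(String[] args) {
        System.out.println(lookup(new IntPair(1, 2), new IntPair(2, 4))); // found
        System.out.println(lookup(new IntPair(1, 2), new IntPair(1, 3))); // null
    }
}
```

The pair itself carries no equals/hashCode of its own; the meaning of equality is chosen per map, just as a Comparator chooses the ordering per SortedMap. A strategy-aware HashMap would avoid allocating the wrapper.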

---

Honestly, I can perfectly understand that you dislike this implementation of equals/hashCode, but (for me) the root cause of the problem is the existence of these methods in Object. Because they are mandatory in every class, clients expect them to exist and to honour their declared contracts, but in practice their implementation is frequently missing, erroneous or obsolete.

I would prefer that these methods did not exist in new code, such as value types or records. The algorithm using an identity test for reference fields would then be used only to test whether two values containing these references have the same identity (a == b), which would be clear behaviour.

But this would require Oracle to change the root "class" of all instances and create a superclass of Object for, at least, value types and records, with only the really defined "methods" (getClass(), ==), removing all the other methods:
- hashCode(), equals(Object), toString(), clone(), finalize(): to stop mandating their existence even when the developer has not actually implemented them for a specific class.
- notify* and wait*: because their behaviour would be problematic for value types.

Given that this does not seem to be Oracle's direction, I try to avoid the biggest issue with these methods (hashCode, equals): they are frequently untrustworthy, because they exist even when the developer did not want, or did not have time, to implement them. In one sentence: I prefer a trustworthy method that does something of little use to an untrustworthy method.
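
As a rough illustration of this trade-off, the sketch below uses an ordinary final class as a stand-in for a value type with one primitive field and one reference field. The class Holder and both helper methods are hypothetical names invented for this example. It shows what the identity-based approach guarantees (the hash never changes when the referenced object mutates) and also what Jonas objects to (two holders of equal but distinct Strings are not equal).

```java
import java.util.ArrayList;
import java.util.List;

public class IdentityHashDemo {

    /** Ordinary class standing in for a value type (hypothetical example). */
    static final class Holder {
        final int id;
        final Object ref;

        Holder(int id, Object ref) {
            this.id = id;
            this.ref = ref;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Holder)) {
                return false;
            }
            Holder h = (Holder) other;
            // The reference field is compared as an address, not by content.
            return id == h.id && ref == h.ref;
        }

        @Override
        public int hashCode() {
            // System.identityHashCode(null) is 0, so no null check is needed.
            return 31 * id + System.identityHashCode(ref);
        }
    }

    /** The hash is unaffected by mutation of the referenced object. */
    static boolean hashStableUnderMutation() {
        List<String> list = new ArrayList<>();
        Holder holder = new Holder(1, list);
        int before = holder.hashCode();
        list.add("x"); // changes list.hashCode(), but not the list's identity
        return before == holder.hashCode();
    }

    /** Equal content behind distinct references does not make holders equal. */
    static boolean equalContentDistinctIdentity() {
        Holder a = new Holder(1, new String("key"));
        Holder b = new Holder(1, new String("key"));
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(hashStableUnderMutation());      // true
        System.out.println(equalContentDistinctIdentity()); // false
    }
}
```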

Daniel.

On 19 April 2018 18:04:50 GMT+02:00, Jonas Konrad <me at yawk.at> wrote:
>That would make it useless in almost all cases where reference types
>are present in the value. Even something simple like a String field
>(which is definitely immutable) would use identity hash and make this
>hashcode useless.
>
>It might be nice conceptually but if we do that we might as well throw 
>an exception when a reference type is present. It'd be useful in the 
>same cases (meaning none) and it would at least be fail-fast and users 
>wouldn't be bitten by behaviour like identityHashCode of strings.
>
>- Jonas
>
>On 04/19/2018 12:38 PM, Daniel Latrémolière wrote:
>> A value type is explicitly immutable by design, not merely by
>> documentation, so its equals/hashCode methods need the same
>> immutable behaviour, by design and not just by documentation. These
>> methods cannot be delegated to the user, because that would allow
>> immutability bugs in the value type's design: the hash code would
>> change if the user implemented it with a random function or by
>> following a reference to a mutable object.
>> 
>> The only really immutable information available for a reference
>> field is System.identityHashCode(), so it needs to be used.
>> 
>> NB: A value type is (for me) pure data designed for performance: it
>> can be a pair of integers, but it cannot be a fraction. If you want
>> the comparison behaviour of a fraction, you can write a fast
>> equalsAsFraction method using cross multiplication, but you cannot
>> implement a hash code that is both fast and correct (you would need
>> to reduce the fraction so that numerator and denominator have no
>> common divisor).
>> 
>> 
>> Daniel.
>> 
>> 
>> On 19/04/2018 at 10:04, David Simms wrote:
>>>
>>> Summary of points raised:
>>>
>>>  * Implementation specification: besides the general contract [1],
>>>    implementation doesn't need to be specified
>>>      o this has advantages to JDK and JVM developers to enable
>>>        change, protects users from said future changes
>>>  * Default implementation
>>>      o Javac could provide the default implementation
>>>          + bloats class file forever
>>>      o BSM mechanism described by John enables more flexibility,
>>>        efficiency and better optimization opportunities
>>>          + e.g. BSM may read annotations at lookup time, allowing
>>>            users to declaratively specify which fields and which
>>>            method for handling references
>>>          + may have bootstrapping issues, if so, said JDK classes
>>>            need to implement hashCode themselves
>>>      o Even if the JVM doesn't implement it directly, it shouldn't
>>>        crash or behave erratically
>>>          + JVM: Return 0, -1, 4711, or throw exception (doesn't
>>>            matter given the point above, 0 for argument's sake) ?
>>>  * On the topic of calling reference fields, calling "hashCode()" or
>>>    using "System.identityHashCode()"
>>>      o "System.identityHashCode()" is consistent with
>>>        "Arrays.hashCode(Object[])" [2]
>>>          + Almost meaningless to the user, many think it is a
>>>            mistake
>>>      o Calling "hashCode()" is consistent with
>>>        "List.hashCode()" [3]
>>>          + may result in recursion or costly traversal
>>>              # This is fine, user needs to decide what to do by
>>>                supplying their own...
>>>      o BSM method can help user to declare what they prefer
>>>
>>> Obviously a similar discussion can be had for "equals()", except
>>> this issue doesn't really involve the JVM (as hashCode does).
>>>
>>> Clearly being able to declaratively control hash/equals deep vs
>>> identity is very powerful...we'll be prototyping looking for further
>>> technical issues.
>>>
>>>
>>> Feel free to call shenanigans if I have something wrong. Agreeing to
>>> disagree is also an option, and nothing is set in stone, still
>>> prototyping.
>>>
>>> Thanks for all the feedback !
>>>
>>> /David Simms
>>>
>>>
>>> [1] https://docs.oracle.com/javase/10/docs/api/java/lang/Object.html#hashCode()
>>>
>>> [2] https://docs.oracle.com/javase/10/docs/api/java/util/Arrays.html#hashCode(java.lang.Object%5B%5D)
>>>
>>> [3] https://docs.oracle.com/javase/10/docs/api/java/util/List.html#hashCode()
>>>
>>>
>> 
>> ---
>> 
>> For the record, here is my response to Rémi's mail of last week on
>> this subject in amber. It was not permitted there, because
>> discussion is not allowed on the corresponding mailing list, and his
>> follow-up copy [1] of his mail was on a restricted mailing list, so
>> my administrative phobia got the better of me:
>>
>> [1]: http://mail.openjdk.java.net/pipermail/amber-spec-experts/2018-April/000557.html
>> 
>> 
>> 
>> -------- Forwarded message --------
>> Subject:     Re: Record design (and ancillary fields)
>> Date:     Sun, 15 Apr 2018 06:19:44 +0200
>> From:     Daniel Latrémolière <daniel.latremoliere at gmail.com>
>> To:     Remi Forax <forax at univ-mlv.fr>
>> Cc:     amber-spec-comments <amber-spec-comments at openjdk.java.net>
>> 
>> 
>> On 14/04/2018 at 23:18, Remi Forax wrote:
>>> I do not think we have to do something specific for supporting 
>>> relational database mapping,[...]
>> I'm not asking for object-relational mapping, only that experience
>> from database design not be forgotten.
>> MapReduce in Google's index database is not the same as map/reduce in
>> java.util.stream, but they are the same design pattern.
>> 
>>>> PS: Given that primitive/value types disallow cyclical references,
>>>> this will prevent StackOverflowError in equals/hashCode methods.
>>> only if an equals on a value type that contains an object doesn't
>>> call equals on that object.
>> Any other design would probably be a bug in these compiler-generated
>> methods for value types: value types use the pass-by-value
>> convention for methods.
>> 
>> They are like primitives (boolean, int, float, ... and address!). For
>> a value type, a field targeting an object is opaque: the value type
>> knows only the address, not the object pointed to.
>> 
>> From the compiler's point of view, the equals/hashCode methods of
>> value types would use all fields, but these fields can only be
>> primitives or value types. Recursion is then only possible between
>> value types, and always descending, hence finite (like Fermat's
>> descent). After flattening all levels of value types inside a value
>> type, it becomes non-recursive and uses only primitives.
>> 
>> Given the pass-by-value design, these generated methods would have
>> to treat such a field as containing an address value (not as a field
>> targeting an object):
>> - equals uses an identity test on the field (not an equality test).
>> - hashCode uses the result of System.identityHashCode on the targeted
>> object (not the virtual hashCode method, which could also throw a
>> NullPointerException for a null address if not checked beforehand).
>> 
>> Daniel.



More information about the valhalla-dev mailing list