Strings in Switch

Joe Darcy Joe.Darcy at Sun.COM
Mon Dec 7 20:06:19 PST 2009


Reinier Zwitserloot wrote:
> String.hashCode()'s exact algorithm is codified in the official javadoc. It
> is therefore canon. Thus, changing String.hashCode breaks backwards
> compatibility. Java has never broken backwards compatibility in such a core
> feature. Hell freezes over before hashCode() will change comes to mind.
>   
Back in the dawn of time, the JLS also contained the javadoc of the 
platform classes.  JLSv1 had a hashing algorithm for String that 
only sampled 8 or 9 characters of the string!  The actual javadoc had 
evolved to specify the current algorithm, which is a function of all the 
characters.  When the irresistible force of the platform javadoc met the 
immovable object of the JLS, in this case the javadoc won and became the 
canonical specification (and the platform javadoc was quite sensibly 
removed from the JLS as of JLSv2).
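
For reference, the algorithm the String.hashCode javadoc specifies is the 
polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], computed in int 
arithmetic.  A minimal sketch of that formula follows; the class and method 
names are made up for illustration, and this is not the JDK source:

```java
final class HashSketch {
    // Hypothetical helper: evaluates the javadoc formula with Horner's rule,
    // so every character of the string contributes to the result.
    static int documentedHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow wraps, as the spec intends
        }
        return h;
    }
}
```

On a conforming platform this should agree with s.hashCode() for any 
string s.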

Such discrepancies and changes were long ago in a Java platform far, far 
away.  It is vanishingly unlikely that String.hashCode will change again 
in the SE platform because the "behavioral compatibility" impact would 
be too large; see

"JDK Release Types and Compatibility Regions"
http://blogs.sun.com/darcy/entry/release_types_compatibility_regions

> If Strings are ever going to get a different hashCode algorithm, I expect it
> will be an internal affair, with special-casing code in e.g. HashMap to use
> the more efficient one, leaving the public-facing hashCode() intact, lest
> tons of existing code that relies on string hashCodes breaks.
>   

As I understand it, some sophisticated collection implementations like 
ConcurrentHashMap already have internal re-hashing logic to cope with 
poor-quality hashCode implementations.
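
A rough sketch of what such internal re-hashing can look like is below.  It 
is an assumption-laden illustration, not the ConcurrentHashMap code: the 
names are hypothetical, and the bit-mixing real implementations use differs 
between JDK versions.  The idea is only that the collection mixes the bits 
of hashCode() privately before picking a bucket, so the public hashCode() 
contract is untouched.

```java
final class SpreadSketch {
    // Hypothetical supplemental hash: fold the high 16 bits into the low 16
    // so that hash codes differing only in their upper bits still land in
    // different buckets of a small table.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose size is a power of two.
    static int indexFor(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }
}
```

With a 16-entry table, the raw hash codes 0x10000 and 0x20000 both map to 
bucket 0; after spread() they map to buckets 1 and 2.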

The hashing algorithm of String.hashCode is certainly not wonderful and 
by default I'm against specifying the hashing algorithm of a class.  
However, given the distinguished role of String, I don't foresee its 
hashing algorithm changing and I believe it is reasonable for strings in 
switch to rely on that algorithm being used.
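
To make the dependency concrete, here is a hand-written sketch of one way a 
compiler could translate a switch over the labels "Aa", "BB" and "C".  It 
is an assumed desugaring for illustration, not necessarily what any 
particular compiler emits, and the class and method names are hypothetical.  
Case labels must be compile-time constants, so the hash values are baked 
into the class file as int literals; that only works if the runtime 
computes the same values.

```java
final class SwitchSketch {
    // Stands in for:
    //   switch (s) { case "Aa": return 1; case "BB": return 2;
    //                case "C":  return 3; default:   return 0; }
    static int classify(String s) {
        switch (s.hashCode()) {       // throws NullPointerException for null
            case 2112:                // "Aa" and "BB" share this hash code
                if (s.equals("Aa")) return 1;
                if (s.equals("BB")) return 2;
                break;
            case 67:                  // "C"
                if (s.equals("C")) return 3;
                break;
        }
        return 0;                     // default branch
    }
}
```

The equals() checks are still needed because distinct strings can collide, 
as "Aa" and "BB" do here.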

-Joe


