Tune String's hashCode() + equals() [was: Need reviewer for forward port of 6815768 (File.getXXXSpace) and 6815768 (String.hashCode)]

Ulf Zibis Ulf.Zibis at gmx.de
Thu Mar 4 20:04:24 UTC 2010


Am 04.03.2010 19:33, schrieb Marek Kozieł:
> Hello,
> I would suggest:
> public int hashCode() {
>          int h = hash;
>         if (h == 0) {
>             h = 0;
>             char[] val = value;
>             for (int i = offset, limit = count + i; i != limit; )
>                 h = 31 * h + val[i++];
>             if (h == 0)
>                 h++;
>             hash = h;
>         }
>         return h;
>     }
>
>    

Intersting alternative, but I'm afraid, this is against the spec.
Shifting all 0's to 1 would break String's hash definition:  h = 31 * h 
+ val[i++].


> But personally I would consider:
> 1. make hash long
> 2. change method of it's  generation to ensure that:
>    -- in most cases String.concat(...) would be able to determine new
> hash from substring hashes so it would be available to set it in
> constructor always (with little effort it's possible now).
>    -- would contains flag (bit) that would tell us if hash is bijection
>
>   public boolean equals(Object anObject) {
>      if (this == anObject) {
>          return true;
>      }
>      if (anObject instanceof String) {
>          String anotherString = (String)anObject;
>
>                  if (hash!=anotherString.hash) return false;
>    

only valid if (hash != 0 && anotherString.hash != 0)


>                  if (hash&isHashBijection!=0) return true;
>    

Which integer value should isHashBijection have ?

>          int n = count;
>          if (n == anotherString.count) {
>          char v1[] = value;
>          char v2[] = anotherString.value;
>          int i = offset;
>          int j = anotherString.offset;
>          while (n-- != 0) {
>              if (v1[i++] != v2[j++])
>              return false;
>          }
>          return true;
>          }
>      }
>      return false;
>      }
>
> As you know this would require a lot of work and probably it's not
> worth it's effect.
>
>
> Notice one more thing if we would be able to knew if String is in
> intern version, equal could look like:
>
> public boolean equals(Object anObject) {
>      if (this == anObject) {
>          return true;
>      }
>      if (anObject instanceof String) {
>          String anotherString = (String)anObject;
>          if (isIntern()&&  anotherString.isIntern()) return false;// we
> checked it at first line already
>
>          int n = count;
>          if (n == anotherString.count) {
>          char v1[] = value;
>          char v2[] = anotherString.value;
>          int i = offset;
>          int j = anotherString.offset;
>          while (n-- != 0) {
>              if (v1[i++] != v2[j++])
>              return false;
>          }
>          return true;
>          }
>      }
>      return false;
>      }
>    

Interned strings have their hashes already computed to organize them in 
internal hash map. Unfortunately those hashes are not back-propagated to 
the Java object, so equals() can't benefit from them for now.

-Ulf





More information about the core-libs-dev mailing list