Need reviewer for forward port of 6815768 (File.getXXXSpace) and 6815768 (String.hashCode)

Marek Kozieł develop4lasu at gmail.com
Thu Mar 4 18:33:43 UTC 2010


2010/2/28 Ulf Zibis <Ulf.Zibis at gmx.de>:
> On 25.02.2010 23:07, Alan Bateman wrote:
>>
>> Kelly O'Hair wrote:
>>>
>>> Yup.  My eyes must be tired, I didn't see that. :^(
>>
>> Too many repositories in the air at the same time. The webrev has been
>> refreshed. Thanks Ulf.
>>
>>
>
> Another thought:
>
> In the constructors of String we could initialize hash = Integer.MIN_VALUE
> except if length == 0.
> Then we could stay at the fastest version:
>
>    public int hashCode() {
>        int h = hash;
>        if (h == Integer.MIN_VALUE) {
>            h = 0;
>            char[] val = value;
>            for (int i = offset, limit = count + i; i != limit; )
>                h = 31 * h + val[i++];
>            hash = h;
>        }
>        return h;
>    }
>
> As an alternative we could use:
> private static final int UNKNOWN_HASH = 1;
> Justification:
> Using a small value results in a slightly shorter byte code and machine code
> footprint after compilation.
> Additionally, on some CPUs this will likely perform slightly better, but never
> worse.
>
> Please note:
> Original loop causes 2 values to increment:
>            for (int i = 0; i < len; i++) {
>                h = 31*h + val[off++];
>            }
> This is inefficient as I have proved in a little micro-benchmark.
>
> -Ulf
>
>
>
>

Hello,
I would suggest:
public int hashCode() {
    int h = hash;
    if (h == 0) {                 // 0 means "not computed yet"
        char[] val = value;
        for (int i = offset, limit = count + i; i != limit; )
            h = 31 * h + val[i++];
        if (h == 0)
            h++;                  // never cache 0, so the hash is computed only once
        hash = h;
    }
    return h;
}


But personally I would consider:
1. making hash a long
2. changing the way it is generated to ensure that:
  -- in most cases String.concat(...) would be able to determine the new
hash from the hashes of its parts, so it could always be set in the
constructor (with a little effort this is already possible now).
  -- it would contain a flag (bit) that tells us whether the hash is a
bijection, i.e. whether the hash alone identifies the string:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;

        // Assumes hash is always set in the constructor.
        // Different hashes can never belong to equal strings.
        if (hash != anotherString.hash) return false;
        // Equal hashes with the bijection flag set identify the same string.
        if ((hash & isHashBijection) != 0) return true;

        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
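
The isHashBijection flag is not defined anywhere in this message, so here
is only a rough sketch of one way it could work. Everything in it is an
assumption for illustration: the class name BijectiveHash, the choice of
bit 63 as the flag, and the packing of strings of up to three chars into
the long. It does not try to satisfy the concat requirement as well.

```java
final class BijectiveHash {
    // Hypothetical flag bit: set when the hash encodes the entire string.
    static final long IS_HASH_BIJECTION = 1L << 63;

    static long hash(CharSequence s) {
        int n = s.length();
        if (n <= 3) {
            // Pack the length (bits 48-49) and up to three 16-bit chars
            // (bits 0-47); this encoding is fully reversible.
            long packed = (long) n << 48;
            for (int i = 0; i < n; i++) {
                packed |= (long) s.charAt(i) << (16 * i);
            }
            return packed | IS_HASH_BIJECTION;
        }
        // Longer strings: ordinary 31-based polynomial hash, flag bit cleared.
        long h = 0;
        for (int i = 0; i < n; i++) {
            h = 31 * h + s.charAt(i);
        }
        return h & ~IS_HASH_BIJECTION;
    }

    static boolean equalsViaHash(CharSequence a, CharSequence b) {
        long ha = hash(a), hb = hash(b);
        if (ha != hb) return false;                      // different hashes: never equal
        if ((ha & IS_HASH_BIJECTION) != 0) return true;  // hash is the whole string
        return a.toString().contentEquals(b);            // fall back to a full comparison
    }
}
```

With this layout only very short strings get the shortcut; longer ones
still need the char-by-char comparison when the hashes match.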

As you know, this would require a lot of work, and it's probably not
worth the effort.
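
To illustrate the concat point above: with the current 31-based hash, the
hash of a concatenation follows from the two hashes and the length of the
second part, because h(a + b) = h(a) * 31^len(b) + h(b) in wrapping int
arithmetic. A minimal sketch, where the class HashConcat and its method
names are made up for this example:

```java
final class HashConcat {
    // 31^n with int wrap-around, matching the overflow in String.hashCode().
    static int pow31(int n) {
        int result = 1;
        int base = 31;
        while (n > 0) {
            if ((n & 1) != 0) result *= base;
            base *= base;
            n >>>= 1;
        }
        return result;
    }

    // Hash of a.concat(b), derived only from the two hashes and b's length.
    static int concatHash(int hashA, int hashB, int lengthB) {
        return hashA * pow31(lengthB) + hashB;
    }

    public static void main(String[] args) {
        String a = "core-libs", b = "-dev";
        int combined = concatHash(a.hashCode(), b.hashCode(), b.length());
        System.out.println(combined == a.concat(b).hashCode()); // prints true
    }
}
```

This only works when both hashes are already known, which is why it ties
in with setting the hash in the constructor.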


Notice one more thing: if we were able to know whether a String is
interned, equals could look like:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        // Two interned strings are equal only if they are the same object,
        // and we already checked that in the first line.
        if (isIntern() && anotherString.isIntern()) return false;

        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}

This solution would empower .intern(): once someone optimises an
application to use interned strings, it would improve both speed and
memory usage. It also has no negative impact such as calculating the hash
in the constructor. The problem is: where should this information be stored?
(I have some ideas about it, but I doubt they would be accepted.)
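
The property this relies on is documented for String.intern(): for any
two strings s and t, s.intern() == t.intern() is true if and only if
s.equals(t). A small sketch of that guarantee, where the class
InternIdentity and the helper equalsInterned are names invented for this
example:

```java
final class InternIdentity {
    // For two interned strings, equality reduces to reference identity.
    static boolean equalsInterned(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String x = new String("core-libs-dev").intern();
        String y = new StringBuilder("core-libs").append("-dev").toString().intern();
        String z = "core-libs".intern();
        System.out.println(equalsInterned(x, y)); // true: same contents, same object
        System.out.println(equalsInterned(x, z)); // false: different contents
    }
}
```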




-- 
Pozdrowionka. / Regards.
Lasu aka Marek Kozieł

http://lasu2string.blogspot.com/


