Scaling problem of HashMap (introduced with alternative hashing)
Eric McCorkle
eric.mccorkle at oracle.com
Tue Jan 1 23:13:41 UTC 2013
Was the original purpose security-oriented? It seemed to me that the purpose was to make performance pathologies less likely.
Consider, for example, a bad hash function that maps ip addresses directly to their integer representation. If an application puts entire subnets into a map. All the addresses will go into contiguous buckets, which increases probability of a collision considerably. Adding in randomness helps to smooth over these sorts of pathologies.
On Jan 1, 2013, at 7:53 AM, Peter Levart <peter.levart at gmail.com> wrote:
> Hi and happy new year, 2013!
>
> I don't know what the main purpose of "hashSeed" actually is. If it is to fight against predictable hashCode to bucket-index mappings, it does it's job somehow, but not very good. For example, it is very easy to predict the next hashSeed numbers that will be allocated:
>
> import sun.misc.Hashing;
>
> public class HashSeedCrack {
> private static final long multiplier = 0x5DEECE66DL;
> private static final long addend = 0xBL;
> private static final long mask = (1L << 48) - 1;
>
> static class Rnd {
> private long rnd;
>
> Rnd(long seed) { this.rnd = seed; }
>
> int nextInt() {
> rnd = (rnd * multiplier + addend) & mask;
> return (int) (rnd >>> 16);
> }
> }
>
> public static void main(String[] args) throws Exception {
> int r0 = Hashing.randomHashSeed(null);
> int r1 = Hashing.randomHashSeed(null);
>
> long s0 = ((long) r0 << 16) & mask;
> long s1 = ((long) r1 << 16) & mask;
> long mx = mask - ((1L << 16) - 1);
> Rnd rnd = null;
> for (int i = 0; i < 65536; i++) {
> long sx = s0 + i;
> if (((sx * multiplier + addend) & mx) == s1) {
> rnd = new Rnd(sx);
> int rx = rnd.nextInt();
> assert rx == r1;
> break;
> }
> }
> assert rnd != null;
>
> System.out.println("Next HashMap.hashSeed will be: " + rnd.nextInt());
> System.out.println("... and it is: " + Hashing.randomHashSeed(null));
> }
> }
>
>
> For this purpuse, the Hashing.randomHashSeed() should not be publicly visible. If ThreadLocalRandom is to be used instead of a private instance of a java.util.Random(), then it would defeat the "secrecy" since ThreadLocalRandom.current() is a shared per-thread instance accessible to anyone.
>
> As far as standard ThreadLocalRandom is concerned, it's usability is impaired by it's design. It would be a better API if a plain ThreadUnsafeRandom with standard constructors was part of standard JDK API. ThreadLocal binding would then be a matter of each particular (possibly private) usage...
>
> Regards, Peter
>
> On 12/27/2012 08:38 PM, Aleksey Shipilev wrote:
>> Looks very like dumb inlined java.util.Random?
>> Is there a security threat to use ThreadLocalRandom instead there?
>>
>> -Aleksey.
>>
>> On 27.12.2012, at 23:16, Zhong Yu <zhong.j.yu at gmail.com> wrote:
>>
>>> Reported by the SO question
>>>
>>> http://stackoverflow.com/questions/14010906
>>>
>>> the HashMap constructor contains a CAS, which is kind of surprising.
>>> Could it be removed?
>>>
>>> transient final int hashSeed =
>>> sun.misc.Hashing.randomHashSeed(this); // CAS
>>>
>>> Zhong Yu
>
More information about the core-libs-dev
mailing list