HashMap bug for large map sizes
Kasper Nielsen
kasperni at gmail.com
Tue Jul 13 17:13:31 UTC 2010
Hi,
I don't know if this has been discussed before, but I was looking at the
HashMap implementation today and noticed that there are some issues with
very large HashMaps holding more than Integer.MAX_VALUE elements.
1.
The Map contract for size() says: "If the map contains more than
Integer.MAX_VALUE elements, returns Integer.MAX_VALUE."
The current implementation just increments an int field (size++), so once
you add more than Integer.MAX_VALUE elements the counter wraps around and
size() returns negative numbers.
2.
If the size of a HashMap has wrapped around and become negative, you
cannot deserialize the map properly, because of this loop in readObject:
for (int i=0; i<size; i++) {
    K key = (K) s.readObject();
    V value = (V) s.readObject();
    putForCreate(key, value);
}
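Both points can be seen in a small standalone demo; the stream layout used
here (a single int size followed by key/value pairs) is a simplification,
not the real HashMap serialization format:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class WrappedSizeDemo {
    public static void main(String[] args) throws Exception {
        int size = Integer.MAX_VALUE;
        size++;                            // the size++ in put(): wraps to -2147483648

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeInt(size);                // the wrapped-around size is written out
        out.writeObject("key");            // ...and so is the entry
        out.writeObject("value");
        out.close();

        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        int readSize = in.readInt();       // -2147483648
        int entriesRead = 0;
        for (int i = 0; i < readSize; i++) {   // same shape as the readObject loop
            in.readObject();                   // key
            in.readObject();                   // value
            entriesRead++;
        }
        in.close();
        System.out.println("size = " + readSize + ", entries read = " + entriesRead);
        // prints: size = -2147483648, entries read = 0
    }
}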
Either these issues should be stated somewhere in the documentation for
HashMap, or they should be fixed.
I don't think there are any pretty fixes for this, though. We could keep
the size as a long inside the HashMap and then make a small change to the
serialization format of HashMap. For example, if size > Integer.MAX_VALUE
we would write:
s.writeInt(Integer.MAX_VALUE);
for (long i = 0; i < Integer.MAX_VALUE; i++) {
    ...write the first Integer.MAX_VALUE elements
}
s.writeLong(size); // write the real size
for (long i = Integer.MAX_VALUE; i < size; i++) {
    ...write the remaining elements
}
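To make that layout concrete, here is a rough standalone sketch of both the
write and the read side. It is not a patch against the real writeObject/
readObject: writeEntry/readEntry are just placeholders for the key/value
handling, and the sketch writes the long whenever the int field is capped
(size >= Integer.MAX_VALUE) so the reader can tell unambiguously whether a
long size follows:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class LongSizeLayout {

    // Write side: the old int field is capped for compatibility; the real
    // long size and the remaining entries follow only when the cap was hit.
    static void write(ObjectOutputStream s, long size) throws IOException {
        int capped = size >= Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) size;
        s.writeInt(capped);
        for (long i = 0; i < capped; i++) {
            writeEntry(s, i);                  // first Integer.MAX_VALUE entries
        }
        if (capped == Integer.MAX_VALUE) {
            s.writeLong(size);                 // the real size
            for (long i = capped; i < size; i++) {
                writeEntry(s, i);              // remaining entries
            }
        }
    }

    // Read side mirrors the write side; in HashMap this is where
    // putForCreate(key, value) would be called for each entry.
    static long read(ObjectInputStream s) throws IOException, ClassNotFoundException {
        long size = s.readInt();
        for (long i = 0; i < size; i++) {
            readEntry(s);
        }
        if (size == Integer.MAX_VALUE) {       // cap hit: a long size follows
            long realSize = s.readLong();
            for (long i = size; i < realSize; i++) {
                readEntry(s);
            }
            size = realSize;
        }
        return size;
    }

    // Placeholders for the per-entry work the real code would do.
    private static void writeEntry(ObjectOutputStream s, long i) throws IOException {
        s.writeObject(Long.valueOf(i));        // key
        s.writeObject("value" + i);            // value
    }

    private static void readEntry(ObjectInputStream s) throws IOException, ClassNotFoundException {
        s.readObject();                        // key
        s.readObject();                        // value
    }
}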
Obviously this won't break anybody's code, since it is already impossible
to deserialize HashMaps with more than Integer.MAX_VALUE elements properly
as the code is right now.
If someone wants to play around with the size limits of HashMap, I suggest
taking the source code of HashMap and changing the type of the size field
from an int to a short, in which case you can test this with less than
xx GB of heap.
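As a rough miniature of that experiment (tracking the size in a separate
short alongside an ordinary HashMap, rather than actually patching the JDK
source), the same wrap-around shows up after Short.MAX_VALUE insertions:

import java.util.HashMap;
import java.util.Map;

public class ShortSizeExperiment {
    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        short size = 0;                 // stand-in for a short-typed HashMap.size
        for (int i = 0; i <= Short.MAX_VALUE; i++) {
            map.put(i, i);
            size++;                     // the same size++ as in put()
        }
        System.out.println(map.size()); // 32768
        System.out.println(size);       // -32768: wrapped around
    }
}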
There are probably other map implementations in the JDK with the same
issues.
Cheers
Kasper