HashMap bug for large sizes

Eamonn McManus eamonn at mcmanus.net
Fri Jun 1 19:12:05 UTC 2012


It seems to me that since the serialization of HashMaps with more than
Integer.MAX_VALUE entries produces an output that cannot be
deserialized, nobody can be using it, and we are free to change it.
For example, we could say that if the size read is -1 then the next
item in the stream is a long giving the true size, and have
writeObject emit -1 followed by that long whenever there are more
than Integer.MAX_VALUE entries.
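
Roughly, the writeObject/readObject changes might look like this (just
a sketch, not the real HashMap code; "count" stands for a hypothetical
long-valued element count, "s" is the object stream, and putForCreate
is the helper already used in readObject today):

    // writeObject: keep the existing int slot, but use -1 as a
    // sentinel when the real count does not fit in an int.
    if (count <= Integer.MAX_VALUE) {
        s.writeInt((int) count);
    } else {
        s.writeInt(-1);        // sentinel: true size follows as a long
        s.writeLong(count);
    }

    // readObject: accept both the old and the new form.
    long size = s.readInt();
    if (size == -1)
        size = s.readLong();
    for (long i = 0; i < size; i++) {
        K key = (K) s.readObject();
        V value = (V) s.readObject();
        putForCreate(key, value);
    }

Streams written by existing JDKs never contain -1 in that slot, so old
serialized forms would still read back unchanged.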

Whether there really are people who have HashMaps with billions of
entries that they want to serialize with Java serialization is another
question.

Éamonn


On 1 June 2012 06:36, Doug Lea <dl at cs.oswego.edu> wrote:
>
> On 06/01/12 05:29, Kasper Nielsen wrote:
>>
>> Hi,
>>
>> I don't know if this has been discussed before. But I was looking at
>> the HashMap implementation today and noticed that there are some
>> issues with very large HashMaps with more than Integer.MAX_VALUE
>> elements.
>
>
> I think this arose on this list (or possibly some side-exchanges)
> three years ago when discussing possible HashMap improvements. It
> seems impossible to fix this without breaking serialization
> compatibility, so people seemed resigned to let the limitations
> remain until there was some change forcing incompatibility anyway.
> At the least though, an implementation note could be added to
> the javadocs. I think several other java.util classes also have
> this problem, but none of the java.util.concurrent ones do.
> So as a workaround, people can use ConcurrentHashMap even if
> they aren't using it concurrently.
>
> -Doug
>
>
>>
>> 1.
>> The Map contract says that "If the map contains more than
>> Integer.MAX_VALUE elements, returns Integer.MAX_VALUE." The current
>> implementation will just wrap around and return negative
>> numbers when you add elements (size++).
>>
>> 2.
>> If the size of a HashMap has wrapped around and returns a negative size,
>> you cannot deserialize it, because of this loop in readObject:
>> for (int i=0; i<size; i++) {
>>    K key = (K) s.readObject();
>>    V value = (V) s.readObject();
>>    putForCreate(key, value);
>> }
>>
>> If someone wants to play around with the size limits of HashMap, I
>> suggest taking the source code of HashMap and changing the type of the
>> size field from an int to a short, in which case you can test this
>> with less than xx GB of heap.
>>
>> There are probably other map implementations in the JDK with the same issues.
>>
>> Cheers
>>   Kasper
>>
>
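
(Following up on Doug's workaround: ConcurrentHashMap is just another
Map implementation, so only the construction site needs to change,
e.g.

    // was: new HashMap<Long, Value>()
    Map<Long, Value> m = new ConcurrentHashMap<Long, Value>();

where Value stands in for whatever the real value type is.)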


