Proposal: Better HashMap.resize() when memory is tight

Peter Arrenbrecht peter.arrenbrecht at gmail.com
Wed Nov 21 20:58:16 UTC 2007


Hi Martin

> I would prefer to see engineering work go into something
> like auto-reduction of the table array when many elements
> have been removed, but that's a hard problem.

Would you care to elaborate? At first glance, this does not seem
especially hard to me, particularly since tightResize() could switch
to a smaller array without first causing a short spike in memory
demand. But I'm sure I have overlooked something here.

-peter


On Nov 21, 2007 6:37 PM, Martin Buchholz <Martin.Buchholz at sun.com> wrote:
> Hi Peter,
>
> It's true that under low memory conditions your code would
> allow execution to continue under some circumstances.
> However, I'm not sure this would be an improvement to the JDK.
> Recovery from OOME is fraught with hazards.  We do occasionally
> try, but an application becomes much less reliable once OOMEs
> start appearing.  Perhaps it's better to fail than to pretend
> that the JDK has been bullet-proofed against OOME.
> OOME recovery code is rarely executed and hard to test.
> The new code would have to be maintained indefinitely,
> making future maintenance just a little bit harder for
> the maintainers.
>
> If the hashmap is fully populated, most of the memory is tied
> up in the Entry objects themselves, not in the table array.
> Each Entry object takes about 5 words of memory, while each entry
> accounts for only about one word within the table array.
> So I don't think we'll see anything close to the factor-of-two
> maximum memory saving that we might otherwise expect.
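>
> For example (rough numbers, using the 5-word estimate above): a map
> holding about 98,000 entries at the default load factor of 0.75 sits
> in a table of 131072 slots, and the next put doubles that to 262144.
> The entries take roughly 490,000 words, the old table 131072 words,
> and the new table 262144 words. The fast path therefore peaks at
> about 880,000 words, and freeing the old table first saves only its
> 131072 words, i.e. roughly 15% of the peak, nowhere near half.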
>
> I would prefer to see engineering work go into something
> like auto-reduction of the table array when many elements
> have been removed, but that's a hard problem.
>
> Martin
>
>
> Peter Arrenbrecht wrote:
> > Hi all,
> >
> > I recently thought about how to resize hashmaps. When looking at the
> > JDK 6 source, I saw that java.util.HashMap always transfers old
> > entries from the old table. When memory is tight, it might be better
> > to first concatenate the entry lists into one big list, throw away the
> > old table, allocate the new one, and then fill it from the
> > concatenated list. So:
> >
> >       @Override
> >       void resize( int _newCapacity )
> >       {
> >               try {
> >                       fastResize( _newCapacity ); // This would be the current resize code.
> >               }
> >               catch (OutOfMemoryError e) {
> >                       tightResize( _newCapacity );
> >               }
> >       }
> >
> >       @SuppressWarnings( "unchecked" )
> >       private void tightResize( int newCapacity )
> >       {
> >               Entry head = joinLists();
> >               table = new Entry[ newCapacity ];
> >               threshold = (int) (newCapacity * loadFactor);
> >               transfer( head, table );
> >       }
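> >
> > (As the follow-up below shows, this version still keeps the old
> > table reachable through the "table" field while the new array is
> > allocated; the corrected tightResize() there nulls the field first.)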
> >
> >       @SuppressWarnings("unchecked")
> >       @SuppressWarnings("unchecked")
> >       private Entry joinLists()
> >       {
> >               final Entry[] src = table;
> >               final int n = src.length;
> >
> >               // Find the first non-empty bucket; its first entry
> >               // becomes the head of the joined list.
> >               Entry head = null;
> >               int i = 0;
> >               while (i < n && null == head) {
> >                       head = src[ i++ ];
> >               }
> >
> >               // Advance tail to the end of the head chain first;
> >               // appending at "head" itself would drop the rest of
> >               // its bucket's entries.
> >               Entry tail = head;
> >               while (null != tail && null != tail.next) {
> >                       tail = tail.next;
> >               }
> >               assert i >= n || null != tail;
> >
> >               // Append each remaining chain, advancing tail to its end.
> >               while (i < n) {
> >                       Entry e = src[ i++ ];
> >                       if (null != e) {
> >                               tail.next = e;
> >                               do {
> >                                       tail = e;
> >                                       e = e.next;
> >                               } while (null != e);
> >                       }
> >               }
> >               return head;
> >       }
> >
> >       @SuppressWarnings("unchecked")
> >       private void transfer( Entry head, Entry[] tgt )
> >       {
> >               // Reinsert every entry from the joined list, prepending
> >               // each one to its new bucket's chain.
> >               final int n = tgt.length;
> >               while (head != null) {
> >                       Entry e = head;
> >                       head = head.next;
> >                       int i = indexFor( e.hash, n );
> >                       e.next = tgt[ i ];
> >                       tgt[ i ] = e;
> >               }
> >       }
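> >
> > For context, the hook point in the JDK 6 source is addEntry(), which
> > doubles the table once size reaches threshold. Quoted from memory, so
> > treat it as a sketch rather than the exact source:
> >
> >       void addEntry( int hash, K key, V value, int bucketIndex )
> >       {
> >               Entry<K,V> e = table[ bucketIndex ];
> >               table[ bucketIndex ] = new Entry<K,V>( hash, key, value, e );
> >               if (size++ >= threshold) resize( 2 * table.length );
> >       }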
> >
> >
> > What do you think?
> > -peo
>
>
> ---------- Forwarded message ----------
> From: Peter Arrenbrecht <peter.arrenbrecht at gmail.com>
> To: core-libs-dev at openjdk.java.net
> Date: Wed, 21 Nov 2007 16:51:28 +0100
> Subject: Re: Proposal: Better HashMap.resize() when memory is tight
> Hi again. I have gone over my code again and a) discovered a very
> stupid mistake that nullified the desired effect, and b) developed a
> test that demonstrates the effect of the improvement. Here's the
> improved code:
>
>         private void tightResize( int newCapacity )
>         {
>                 Entry head = joinLists();
>                 table = null; // free it first
>                 table = new Entry[ newCapacity ]; // then reallocate
>                 threshold = (int) (newCapacity * loadFactor);
>                 transfer( head, table );
>         }
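>
> The point of nulling the field before allocating is that the
> allocation of the new array can then trigger a collection that
> actually reclaims the old table; in the previous version the old
> table stayed reachable at exactly that moment, so the peak memory
> demand was unchanged.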
>
> The test code is below. It demonstrates the problem here on
> Ubuntu Linux 7.04 with JRE 1.6.0:
>
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>
> The command-line for the improved map is:
>
>   java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new
>
> and for the old map:
>
>   java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap
>
> I have only managed to demonstrate the effect on the server VM. Also,
> it turns out to be necessary to leave an Object array of size
> initial*2+something free, rather than just initial+something as I had
> expected. Maybe that is an effect of the generational collector.
> There have also been spurious cases where the test failed even with
> the new map; I have no idea why.
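>
> (For reference, the arithmetic: with initial = 131072 and loadFactor
> = 0.5, the threshold is 65536, so the put after priming triggers a
> resize from Entry[131072] to Entry[262144]. If the freed old table's
> space could be reused immediately, initial+something free references
> should suffice; the observed initial*2+something suggests the new
> array had to be allocated without credit for the old one, which
> would fit the generational-collector guess.)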
>
> Here is the test code:
>
>         public static void main( String[] args )
>         {
>                 final int initial = 131072;
>                 final float loadFactor = 0.5F;
>                 final HashMap<Integer, Integer> m;
>                 if (args.length > 0) {
>                         System.out.println( "Creating better map..." );
>                         m = new BetterHashMap<Integer, Integer>( initial, loadFactor );
>                 }
>                 else {
>                         System.out.println( "Creating standard map..." );
>                         m = new HashMap<Integer, Integer>( initial, loadFactor );
>                 }
>
>                 System.out.println( "Priming map (should see no resize here)..." );
>                 for (int i = 0; i < initial / 2; i++) {
>                         Integer o = i;
>                         m.put( o, o );
>                 }
>                 Integer o = initial; // box the next key now, before memory fills up
>
>                 Entry head = blockMemExcept( initial * 2 + initial / 4 );
>                 System.out.println( "Filled with " + n + " entries." );
>
>                 System.out.println( "Adding next element (should see resize here)..." );
>                 m.put( o, o );
>
>                 if (head == null) System.out.println( "Bad." ); // force "head" to remain in scope
>                 System.out.println( "Done." );
>         }
>
>         /**
>          * Done separately so memBlock goes out of scope cleanly,
>          * leaving no local stack copies pointing to it.
>          */
>         private static Entry blockMemExcept( int exceptObjs )
>         {
>                 System.out.println( "Reserving memory..." );
>                 Object[] memBlock = new Object[ exceptObjs ];
>
>                 System.out.println( "Filling rest of memory..." );
>                 int i = 0;
>                 Entry head = null;
>                 try {
>                         while (true) {
>                                 head = new Entry( 0, null, null, head );
>                                 i++;
>                         }
>                 }
>                 catch (OutOfMemoryError e) {
>                         // ignore
>                 }
>
>                 if (memBlock[ 0 ] != null) return null; // never true; keeps memBlock live until here
>                 n = i;
>                 return head;
>         }
>         private static int n = 0;
>
> Cheers,
> -peo
>
>
> ps. This all runs on copies of HashMap and AbstractMap in
> ch.arrenbrecht.java.util.
>
>
> On Nov 21, 2007 9:52 AM, Peter Arrenbrecht <peter.arrenbrecht at gmail.com> wrote:
> > Hi all,
> >
> > I recently thought about how to resize hashmaps. When looking at the
> > JDK 6 source, I saw that java.util.HashMap always transfers old
> > entries from the old table. When memory is tight, it might be better
> > to first concatenate the entry lists into one big list, throw away the
> > old table, allocate the new one, and then fill it from the
> > concatenated list. So:
> >
> >         @Override
> >         void resize( int _newCapacity )
> >         {
> >                 try {
> >                         fastResize( _newCapacity ); // This would be the current resize code.
> >                 }
> >                 catch (OutOfMemoryError e) {
> >                         tightResize( _newCapacity );
> >                 }
> >         }
> >
> >         @SuppressWarnings( "unchecked" )
> >         private void tightResize( int newCapacity )
> >         {
> >                 Entry head = joinLists();
> >                 table = new Entry[ newCapacity ];
> >                 threshold = (int) (newCapacity * loadFactor);
> >                 transfer( head, table );
> >         }
> >
> >         @SuppressWarnings("unchecked")
> >         private Entry joinLists()
> >         {
> >                 final Entry[] src = table;
> >                 final int n = src.length;
> >
> >                 // Find the first non-empty bucket; its first entry
> >                 // becomes the head of the joined list.
> >                 Entry head = null;
> >                 int i = 0;
> >                 while (i < n && null == head) {
> >                         head = src[ i++ ];
> >                 }
> >
> >                 // Advance tail to the end of the head chain first;
> >                 // appending at "head" itself would drop the rest of
> >                 // its bucket's entries.
> >                 Entry tail = head;
> >                 while (null != tail && null != tail.next) {
> >                         tail = tail.next;
> >                 }
> >                 assert i >= n || null != tail;
> >
> >                 // Append each remaining chain, advancing tail to its end.
> >                 while (i < n) {
> >                         Entry e = src[ i++ ];
> >                         if (null != e) {
> >                                 tail.next = e;
> >                                 do {
> >                                         tail = e;
> >                                         e = e.next;
> >                                 } while (null != e);
> >                         }
> >                 }
> >                 return head;
> >         }
> >
> >         @SuppressWarnings("unchecked")
> >         private void transfer( Entry head, Entry[] tgt )
> >         {
> >                 // Reinsert every entry from the joined list, prepending
> >                 // each one to its new bucket's chain.
> >                 final int n = tgt.length;
> >                 while (head != null) {
> >                         Entry e = head;
> >                         head = head.next;
> >                         int i = indexFor( e.hash, n );
> >                         e.next = tgt[ i ];
> >                         tgt[ i ] = e;
> >                 }
> >         }
> >
> >
> > What do you think?
> > -peo
> >
>
>


