Proposal: Better HashMap.resize() when memory is tight
Peter Arrenbrecht
peter.arrenbrecht at gmail.com
Wed Nov 21 20:58:16 UTC 2007
Hi Martin
> I would prefer to see engineering work go into something
> like auto-reduction of the table array when many elements
> have been removed, but that's a hard problem.
Would you care to elaborate? At first glance, this does not seem
especially hard to me, in particular since tightResize() could switch
to a smaller array without first causing a short spike of extra memory
demand. But I may well have overlooked something here.
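
For concreteness, here is a minimal sketch of the kind of shrinking I
was picturing, reusing tightResize() from my patch below. maybeShrink(),
MIN_CAPACITY and the divide-by-four trigger are made-up names and
numbers, untuned and untested:

    // Hypothetical hook, called at the end of remove(): shrink once
    // the map has become sparse. tightResize() frees the old table
    // before allocating the smaller one, so no extra memory is needed.
    private void maybeShrink()
    {
        if (table.length > MIN_CAPACITY && size * 4 < table.length) {
            tightResize( table.length / 2 );
        }
    }
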
-peter
On Nov 21, 2007 6:37 PM, Martin Buchholz <Martin.Buchholz at sun.com> wrote:
> Hi Peter,
>
> It's true that under low memory conditions your code would
> allow execution to continue under some circumstances.
> However, I'm not sure this would be an improvement to the JDK.
> Recovery from OOME is fraught with hazards. We do occasionally
> try, but an application becomes much less reliable once OOMEs
> start appearing. Perhaps it's better to fail than to pretend
> that the JDK has been bullet-proofed against OOME.
> OOME recovery code is rarely executed and hard to test.
> The new code would have to be maintained indefinitely,
> making future maintenance just a little bit harder for
> the maintainers.
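>
> Recovery hazards are not hypothetical here: in the patch below, if
> the allocation inside tightResize() itself fails, the OOME propagates
> after table has already been set to null,
>
>     table = null;                      // old table gone
>     table = new Entry[ newCapacity ];  // if this throws, table stays null
>
> and every later get() or put() on that map would fail with a
> NullPointerException.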
>
> If the hashmap is fully populated, most of the memory is tied
> up in the Entry objects themselves, not in the table array.
> Each Entry object should take about 5 words of memory, while
> the table array accounts for only about one word per entry.
> So I don't think we'll see anything close to the factor-of-two
> peak-memory saving that we might expect.
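>
> Taking those estimates at face value, for n mappings at the default
> load factor of 0.75 that is about 5n words of Entry objects versus
> about n / 0.75 = 1.3n words of table, so the table is roughly a
> fifth of the footprint, and freeing it early can trim the peak by
> at most about 20%.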
>
> I would prefer to see engineering work go into something
> like auto-reduction of the table array when many elements
> have been removed, but that's a hard problem.
>
> Martin
>
>
> Peter Arrenbrecht wrote:
> > Hi all,
> >
> > I recently thought about how to resize hashmaps. Looking at the
> > JDK 6 source, I saw that java.util.HashMap always transfers the old
> > entries while still holding a reference to the old table. When memory
> > is tight, it might be better to first concatenate the entry lists into
> > one big list, throw away the old table, allocate the new one, and then
> > fill it from the concatenated list. So:
> >
> > @Override
> > void resize( int _newCapacity )
> > {
> >     try {
> >         fastResize( _newCapacity ); // This would be the current resize code.
> >     }
> >     catch (OutOfMemoryError e) {
> >         tightResize( _newCapacity );
> >     }
> > }
> >
> > @SuppressWarnings( "unchecked" )
> > private void tightResize( int newCapacity )
> > {
> >     Entry head = joinLists();
> >     table = new Entry[ newCapacity ];
> >     threshold = (int) (newCapacity * loadFactor);
> >     transfer( head, table );
> > }
> >
> > @SuppressWarnings("unchecked")
> > private Entry joinLists()
> > {
> >     final Entry[] src = table;
> >     final int n = src.length;
> >     Entry head = null;
> >     int i = 0;
> >     while (i < n && null == head) {
> >         head = src[ i++ ];
> >     }
> >     Entry tail = head;
> >     assert i >= n || null != tail;
> >     // Walk tail to the end of head's chain first; appending at head
> >     // itself would cut off the rest of that first chain.
> >     while (null != tail && null != tail.next) {
> >         tail = tail.next;
> >     }
> >     while (i < n) {
> >         Entry e = src[ i++ ];
> >         if (null != e) {
> >             tail.next = e;
> >             do {
> >                 tail = e;
> >                 e = e.next;
> >             } while (null != e);
> >         }
> >     }
> >     return head;
> > }
> >
> > @SuppressWarnings("unchecked")
> > private void transfer( Entry head, Entry[] tgt )
> > {
> >     final int n = tgt.length;
> >     while (head != null) {
> >         Entry e = head;
> >         head = head.next;
> >         int i = indexFor( e.hash, n );
> >         e.next = tgt[ i ];
> >         tgt[ i ] = e;
> >     }
> > }
> >
> >
> > What do you think?
> > -peo
>
>
> ---------- Forwarded message ----------
> From: Peter Arrenbrecht <peter.arrenbrecht at gmail.com>
> To: core-libs-dev at openjdk.java.net
> Date: Wed, 21 Nov 2007 16:51:28 +0100
> Subject: Re: Proposal: Better HashMap.resize() when memory is tight
> Hi again. I have gone over my code and a) discovered a very stupid
> mistake that rendered the desired effect null and void, and b)
> developed a test that demonstrates the effect of the improvement.
> Here is the improved code:
>
> private void tightResize( int newCapacity )
> {
>     Entry head = joinLists();
>     table = null; // free the old table first
>     table = new Entry[ newCapacity ]; // then reallocate
>     threshold = (int) (newCapacity * loadFactor);
>     transfer( head, table );
> }
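>
> In rough numbers, this is the point of the null assignment: with an
> old table of length 2^k, the fast path holds old and new arrays at
> once, about 2^k + 2^(k+1) = 3 * 2^k words of table, while
> tightResize() now peaks at about 2^(k+1), since joinLists() merely
> relinks existing entries and allocates nothing.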
>
> Below is the test code. It demonstrates the problem here on Ubuntu
> Linux 7.04 with JRE 1.6.0:
>
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>
> The command-line for the improved map is:
>
> java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new
>
> and for the old map:
>
> java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap
>
> I have only managed to demonstrate the effect on the server VM. Also,
> it is necessary to leave an Object array of size initial*2+something
> free, rather than just initial+something as I had expected; maybe that
> is an effect of the generational collector. And there have been
> spurious cases where the test failed even with the new map. I have no
> idea why.
>
> Here is the test code:
>
> public static void main( String[] args )
> {
>     final int initial = 131072;
>     final float loadFactor = 0.5F;
>     final HashMap<Integer, Integer> m;
>     if (args.length > 0) {
>         System.out.println( "Creating better map..." );
>         m = new BetterHashMap<Integer, Integer>( initial, loadFactor );
>     }
>     else {
>         System.out.println( "Creating standard map..." );
>         m = new HashMap<Integer, Integer>( initial, loadFactor );
>     }
>
>     System.out.println( "Priming map (should see no resize here)..." );
>     for (int i = 0; i < initial / 2; i++) {
>         Integer o = i;
>         m.put( o, o );
>     }
>     Integer o = initial;
>
>     Entry head = blockMemExcept( initial * 2 + initial / 4 );
>     System.out.println( "Filled with " + n + " entries." );
>
>     System.out.println( "Adding next element (should see resize here)..." );
>     m.put( o, o );
>
>     if (head == null) System.out.println( "Bad." ); // force "head" to remain in scope
>     System.out.println( "Done." );
> }
>
> /**
>  * Done separately so memBlock goes out of scope cleanly, leaving no
>  * local stack copies pointing to it.
>  */
> private static Entry blockMemExcept( int exceptObjs )
> {
>     System.out.println( "Reserving memory..." );
>     Object[] memBlock = new Object[ exceptObjs ];
>
>     System.out.println( "Filling rest of memory..." );
>     int i = 0;
>     Entry head = null;
>     try {
>         while (true) {
>             // (hash, key, value, next) -- constructor of the copied HashMap.Entry
>             head = new Entry( 0, null, null, head );
>             i++;
>         }
>     }
>     catch (OutOfMemoryError e) {
>         // ignore; the heap is now full except for the reserved slack
>     }
>
>     if (memBlock[ 0 ] != null) return null; // never true; keeps memBlock live until here
>     n = i;
>     return head;
> }
> private static int n = 0; // number of filler entries allocated by blockMemExcept()
>
> Cheers,
> -peo
>
>
> ps. This all runs on copies of HashMap and AbstractMap in
> ch.arrenbrecht.java.util.
>
>
> On Nov 21, 2007 9:52 AM, Peter Arrenbrecht <peter.arrenbrecht at gmail.com> wrote:
> > [original proposal snipped -- quoted in full in Martin's reply above]