Proposal: Better HashMap.resize() when memory is tight
Roman Kennke
roman at kennke.org
Wed Nov 21 20:23:39 UTC 2007
Hi there,
Why not implement such a thing as a separate library/class? After all,
Map is an interface which can be implemented in many ways and for many
different purposes. I think there are a couple of efforts that go in
this direction, for example javolution:
http://javolution.org/
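
For example, javolution's FastMap can be dropped in wherever a
java.util.Map is expected. Roughly (untested, written from memory of
the javolution API, so take it only as a sketch):

import java.util.Map;
import javolution.util.FastMap;

// use in place of new HashMap<Integer, Integer>()
Map<Integer, Integer> m = new FastMap<Integer, Integer>();
m.put( 1, 2 );
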
Cheers, Roman
On Wednesday, 21 Nov 2007 at 09:37 -0800, Martin Buchholz wrote:
> Hi Peter,
>
> It's true that under low memory conditions your code would
> allow execution to continue under some circumstances.
> However, I'm not sure this would be an improvement to the JDK.
> Recovery from OOME is fraught with hazards. We do occasionally
> try, but an application becomes much less reliable once OOMEs
> start appearing. Perhaps it's better to fail than to pretend
> that the JDK has been bullet-proofed against OOME.
> OOME recovery code is rarely executed and hard to test.
> The new code would have to be maintained indefinitely,
> making future maintenance just a little bit harder for
> the maintainers.
>
> If the hashmap is fully populated, most of the memory is tied
> up in the Entry objects themselves, not in the table array.
> Each Entry object should be about 5 words of memory, while
> the table array uses approximately one word per entry.
> So I don't think we'll see anything close to the factor-of-two
> maximum memory saving that we might expect.
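>
> A back-of-envelope with the default 0.75 load factor makes this
> concrete (still assuming ~5 words per Entry and one word per table
> slot). At the point of a doubling resize, a table of capacity c holds
> about 0.75*c entries:
>
>   entries:    5 * 0.75*c  ~  3.75*c words
>   old table:  1 * c       =       c words
>   new table:  1 * 2*c     =     2*c words
>
> Freeing the old table before allocating the new one therefore saves at
> most about c words out of a peak of roughly 6.75*c, i.e. around 15%,
> not 50%.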
>
> I would prefer to see engineering work go into something
> like auto-reduction of the table array when many elements
> have been removed, but that's a hard problem.
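>
> (For illustration only, a rough sketch of the shape such auto-reduction
> could take inside a HashMap copy like Peter's; the constant and the
> method name below are invented, and iterator/modCount interactions are
> ignored:
>
> private static final int MIN_SHRINK_CAPACITY = 16; // assumed floor; never shrink below this
>
> // Hypothetical hook to be called after a successful remove(): halve the
> // table once the map has become very sparse. It relies on the existing
> // resize()/transfer() working for a smaller power-of-two capacity,
> // which the JDK 6 code appears to allow.
> private void maybeShrink()
> {
>     if (table.length > MIN_SHRINK_CAPACITY && size < threshold / 4) {
>         resize( table.length / 2 );
>     }
> }
>
> The hard parts, such as picking thresholds that avoid shrink/grow
> thrashing and not penalizing remove()-heavy workloads, are not
> addressed by this sketch.)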
>
> Martin
>
> Peter Arrenbrecht wrote:
> > Hi all,
> >
> > I recently thought about how to resize hashmaps. When looking at the
> > JDK 6 source, I saw that java.util.HashMap always transfers old
> > entries from the old table. When memory is tight, it might be better
> > to first concatenate the entry lists into one big list, throw away the
> > old table, allocate the new one, and then fill it from the
> > concatenated list. So:
> >
> > @Override
> > void resize( int _newCapacity )
> > {
> > try {
> > fastResize( _newCapacity ); // This would be the current resize code.
> > }
> > catch (OutOfMemoryError e) {
> > tightResize( _newCapacity );
> > }
> > }
> >
> > @SuppressWarnings( "unchecked" )
> > private void tightResize( int newCapacity )
> > {
> > Entry head = joinLists();
> > table = new Entry[ newCapacity ];
> > threshold = (int) (newCapacity * loadFactor);
> > transfer( head, table );
> > }
> >
> > @SuppressWarnings("unchecked")
> > private Entry joinLists()
> > {
> > final Entry[] src = table;
> > final int n = src.length;
> > Entry head = null;
> > int i = 0;
> > while (i < n && null == head) {
> > head = src[ i++ ];
> > }
> > Entry tail = head;
> > // advance tail to the end of head's chain; otherwise appending the
> > // next bucket's list below would overwrite head.next and drop entries
> > while (null != tail && null != tail.next) {
> > tail = tail.next;
> > }
> > assert i >= n || null != tail;
> > while (i < n) {
> > Entry e = src[ i++ ];
> > if (null != e) {
> > tail.next = e;
> > do {
> > tail = e;
> > e = e.next;
> > } while (null != e);
> > }
> > }
> > return head;
> > }
> >
> > @SuppressWarnings("unchecked")
> > private void transfer( Entry head, Entry[] tgt )
> > {
> > int n = capacity();
> > while (head != null) {
> > Entry e = head;
> > head = head.next;
> > int i = indexFor( e.hash, n );
> > e.next = tgt[ i ];
> > tgt[ i ] = e;
> > }
> > }
> >
> >
> > What do you think?
> > -peo
> E-mail message attachment (Attached Message)
> > -------- Forwarded Message --------
> > From: Peter Arrenbrecht <peter.arrenbrecht at gmail.com>
> > Reply-To: peter.arrenbrecht at gmail.com
> > To: core-libs-dev at openjdk.java.net
> > Subject: Re: Proposal: Better HashMap.resize() when memory is tight
> > Date: Wed, 21 Nov 2007 16:51:28 +0100
> >
> > Plain text document attachment (Attached Message)
> > Hi again, I have gone over my code again and a) discovered a very
> > stupid mistake rendering the desired effect null and void, and b)
> > developed a test that demos the effect of the improvement. Here's the
> > improved code:
> >
> > private void tightResize( int newCapacity )
> > {
> > Entry head = joinLists();
> > table = null; // free it first
> > table = new Entry[ newCapacity ]; // then reallocate
> > threshold = (int) (newCapacity * loadFactor);
> > transfer( head, table );
> > }
> >
> > Below you can find the test code. This shows the problem here on
> > Ubuntu Linux 7.04 with jre 1.6.0:
> >
> > java version "1.6.0"
> > Java(TM) SE Runtime Environment (build 1.6.0-b105)
> > Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
> >
> > The command-line for the improved map is:
> >
> > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new
> >
> > and for the old map:
> >
> > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap
> >
> > I have only managed to demo the effect on the server VM. And it is
> > necessary to leave an Object array of size initial*2+something free,
> > rather than just initial+something, as I had expected. Maybe that is
> > an effect of the generational collector. Also, there have been
> > spurious cases where the test failed even with the new map. No idea why.
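> >
> > A simple way to investigate both effects would be to watch the
> > collector while the test runs, using the standard -verbose:gc switch
> > (it only adds GC logging and does not change the test), e.g.:
> >
> > java -server -verbose:gc -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new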
> >
> > Here is the test code:
> >
> > public static void main( String[] args )
> > {
> > final int initial = 131072;
> > final float loadFactor = 0.5F;
> > final HashMap<Integer, Integer> m;
> > if (args.length > 0) {
> > System.out.println( "Creating better map..." );
> > m = new BetterHashMap<Integer, Integer>( initial, loadFactor );
> > }
> > else {
> > System.out.println( "Creating standard map..." );
> > m = new HashMap<Integer, Integer>( initial, loadFactor );
> > }
> >
> > System.out.println( "Priming map (should see no resize here)..." );
> > for (int i = 0; i < initial / 2; i++) {
> > Integer o = i;
> > m.put( o, o );
> > }
> > Integer o = initial; // box the key now, before memory gets filled up below
> >
> > Entry head = blockMemExcept( initial * 2 + initial / 4 );
> > System.out.println( "Filled with " + n + " entries." );
> >
> > System.out.println( "Adding next element (should see resize here)..." );
> > m.put( o, o );
> >
> > if (head == null) System.out.println( "Bad." ); // force "head" to remain in scope
> > System.out.println( "Done." );
> > }
> >
> > /**
> > * Done separately so memBlock goes out of scope cleanly, leaving no
> > * local stack copies pointing to it.
> > */
> > private static Entry blockMemExcept( int exceptObjs )
> > {
> > System.out.println( "Reserving memory..." );
> > Object[] memBlock = new Object[ exceptObjs ];
> >
> > System.out.println( "Filling rest of memory..." );
> > int i = 0;
> > Entry head = null;
> > try {
> > while (true) {
> > head = new Entry( 0, null, null, head );
> > i++;
> > }
> > }
> > catch (OutOfMemoryError e) {
> > // ignore
> > }
> >
> > if (memBlock[ 0 ] != null) return null; // always false; keeps memBlock reachable until here
> > n = i;
> > return head;
> > }
> > private static int n = 0;
> >
> > Cheers,
> > -peo
> >
> >
> > ps. This all runs on copies of HashMap and AbstractMap in
> > ch.arrenbrecht.java.util.
> >
> >
--
http://kennke.org/blog/