RFR: 8035584: (s) ArrayList(c) should avoid inflation if c is empty

Mike Duigou mike.duigou at oracle.com
Wed Apr 23 21:35:52 UTC 2014


On Mar 13 2014, at 08:56 , Jason Mehrens <jason_mehrens at hotmail.com> wrote:

> Mike,
>  
> The constructor modification looks good.  The only other alternative I can think would be to use 'c.toArray(EMPTY_ELEMENTDATA)' to avoid creating an extra array.  I'm guessing that version would not perform as well as your current version.

Good idea, but you're also correct that it didn't perform as well (for either Arrays.ArrayList or ArrayList itself)

>  
> The modification for toArray violates the List contract because the returned array must be safe (must use 'new' and must not retain reference).   I think you'll have to back out that change.  Contract violation aside, the only case that I can see that might unsafe for an empty array would be acquiring the monitor of EMPTY_ELEMENTDATA array.  When we patched the Collections$EmptyXXX classes we avoided returning a cached empty array for this reason.

You are of course correct. Yet another reminder to avoid unnecessary promises when creating specifications. <sigh> The Collection.toArray() javadoc could have been written to allow for a shared empty array but isn't and can't be changed to  be allow for this. We did eliminate the requirement to return a "new" instance some cases for String/CharSequence/StringBuilder/StringBuffer in https://bugs.openjdk.java.net/browse/JDK-7174936 and https://bugs.openjdk.java.net/browse/JDK-8028757 that were similar to this but those changes for the most part were to sync the javadoc to the longstanding actual behaviour. 

Mike
>  
> Jason
>  
> Subject: Re: RFR: 8035584: (s) ArrayList(c) should avoid inflation if c is empty
> From: mike.duigou at oracle.com
> Date: Wed, 12 Mar 2014 12:41:00 -0700
> CC: martinrb at google.com; core-libs-dev at openjdk.java.net
> To: jason_mehrens at hotmail.com
> 
> Hi Jason,'
> 
> Using isEmpty() was intended to save the array creation but you make a valid point regarding correctness. I have amended the webrev. This handling suggests that it would useful for implementation to cache the empty result for toArray() and I have also added this to ArrayList's toArray.
> 
> http://cr.openjdk.java.net/~mduigou/JDK-8035584/3/webrev/
> 
> Mike
> 
> On Mar 12 2014, at 07:06 , Jason Mehrens <jason_mehrens at hotmail.com> wrote:
> 
> Mike,
>  
> In the constructor do you think you should skip the call to isEmpty and just check the length of toArray?  Seems like since the implementation of the given collection is unknown it is probably best to perform as little interaction with it as possible.  Also it would be possible for isEmpty to return false followed by toArray returning a zero length array that is different from cached copies.
>  
> Jason
>  
> > Subject: Re: RFR: 8035584: (s) ArrayList(c) should avoid inflation if c is	empty
> > From: mike.duigou at oracle.com
> > Date: Tue, 11 Mar 2014 18:18:26 -0700
> > To: martinrb at google.com
> > CC: core-libs-dev at openjdk.java.net
> > 
> > 
> > On Mar 11 2014, at 17:42 , Martin Buchholz <martinrb at google.com> wrote:
> > 
> > > I'm hoping y'all have evidence that empty ArrayLists are common in the wild.
> > Yes, certainly. From the original proposal:
> > > [This change] based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. 
> > 
> > > It is curious that empty lists grow immediately to 10, while ArrayList with capacity 1 grows to 2, then 3... Some people might think that a bug.
> > Yes, that's unfortunate. I've made another version that uses a second sentinel for the default sized but empty case.
> > 
> > Now I want to reduce the ensureCapacity reallocations! It seems like insane churn to replace the arrays that frequently.
> > > There are more   changes that need to be reverted.
> > > Else looks good.
> > > - * More formally, returns the lowest index <tt>i</tt> such that
> > > - * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
> > > + * More formally, returns the lowest index {@code i} such that
> > > + * {@code (o==null ? get(i)==null : o.equals(get(i)))},
> > 
> > Corrected. Thank you.
> > 
> > http://cr.openjdk.java.net/~mduigou/JDK-8035584/2/webrev/
> > 
> > 
> > Mike
> > 
> > 
> > 
> > > 
> > > 
> > > On Tue, Mar 11, 2014 at 5:20 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> > > I've actually always used scp. :-)
> > > 
> > > Since I accepted all of your changes as suggested and had no other changes I was just going to go ahead and push once testing was done.
> > > 
> > > I've now prepared a revised webrev and can still accept feedback.
> > > 
> > > http://cr.openjdk.java.net/~mduigou/JDK-8035584/1/webrev/
> > > 
> > > (Note: The webrev also contains https://bugs.openjdk.java.net/browse/JDK-8037097 since I am testing/pushing the two issues together.)
> > > 
> > > Mike
> > > 
> > > On Mar 11 2014, at 16:42 , Martin Buchholz <martinrb at google.com> wrote:
> > > 
> > >> I don't see the updated webrev. Maybe you also fell victim to "rsync to cr no longer working"?
> > >> 
> > >> 
> > >> On Tue, Mar 11, 2014 at 4:34 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> > >> 
> > >> On Feb 21 2014, at 14:56 , Martin Buchholz <martinrb at google.com> wrote:
> > >> 
> > >>> You should do <tt> -> code conversion separately, and do it pervasively across the entire JDK.
> > >> 
> > >> From your lips to God's ears.... I keep suggesting this along with a restyle to official style every time we create new repos. Seems unlikely unfortunately as it makes backports harder. 
> > >> 
> > >>> This is not right.
> > >>> + * {@code (o==null ? get(i)==null : o.equals(get(i)))}
> > >> 
> > >> Corrected.
> > >> 
> > >>> You accidentally deleted a stray space here?
> > >>> 
> > >>> -        this.elementData = EMPTY_ELEMENTDATA;
> > >>> +       this.elementData = EMPTY_ELEMENTDATA;
> > >> 
> > >> Corrected.
> > >> 
> > >>> public ArrayList(int initialCapacity) {
> > >>> - super();
> > >>> if (initialCapacity < 0)
> > >>> throw new IllegalArgumentException("Illegal Capacity: "+
> > >>>                                                 initialCapacity);
> > >>> -        this.elementData = new Object[initialCapacity];
> > >>> +        this.elementData = (initialCapacity > 0)
> > >>> + ? new Object[initialCapacity]
> > >>> + : EMPTY_ELEMENTDATA;
> > >>> }
> > >>> 
> > >>> When optimizing for special cases, we should try very hard minimize overhead in the common case. In the above, we now have two branches in the common case. Instead,
> > >>> 
> > >>> if (initialCapacity > 0) this.elementData = new Object[initialCapacity];
> > >>> else if (initialCapacity == 0) this.elementData = EMPTY_ELEMENTDATA;
> > >>> else barf
> > >> 
> > >> Corrected.
> > >> 
> > >> Thanks as always for the feedback.
> > >> 
> > >> Mike
> > >> 
> > >>> 
> > >>> 
> > >>> 
> > >>> On Fri, Feb 21, 2014 at 2:41 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> > >>> Hello all;
> > >>> 
> > >>> This changeset consists of two small performance improvements for ArrayList. Both are related to the lazy initialization introduced in JDK-8011200.
> > >>> 
> > >>> The first change is in the ArrayList(int capacity) constructor and forces lazy initialization if the requested capacity is zero. It's been observed that in cases where zero capacity is requested that it is very likely that the list never receives any elements. For these cases we permanently avoid the allocation of an element array.
> > >>> 
> > >>> The second change, noticed by Gustav Åkesson, involves the ArrayList(Collection c) constructor. If c is an empty collection then there is no reason to inflate the backing array for the ArrayList.
> > >>> 
> > >>> http://cr.openjdk.java.net/~mduigou/JDK-8035584/0/webrev/
> > >>> 
> > >>> I also took the opportunity to the <tt></tt> -> {@code } conversion for the javadoc.
> > >>> 
> > >>> Enjoy!
> > >>> 
> > >>> Mike
> > >>> 
> > >> 
> > >> 
> > > 
> > > 
> > 
> 




More information about the core-libs-dev mailing list