A behavior mismatch in AbstractCollection.toArray(T[] )

Sean Chou zhouyx at linux.vnet.ibm.com
Wed Dec 14 07:05:14 UTC 2011


Hi Mike, David,

    I reported this as a bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7121314  .

On Wed, Dec 14, 2011 at 1:03 PM, Mike Duigou <mike.duigou at oracle.com> wrote:

> I agree that most people probably won't check whether the array returned
> is the array they provided. It is possible that they incorrectly assume it
> is. ie.
>
> Collection<Integer> collection;
> ...
> Integer[] foo = new Integer[collection.size()]; // of sufficient size for
> all elements
> // at this point collection grows.
> collection.toArray(foo); // this will work, right?
> for(Integer each : foo) {
>   // wait! foo is empty, why?
> }
>
> If collection is mutated and increased in size between the creation of foo
> and the toArray operation then foo will be empty and the user probably
> won't know why the elements weren't copied. Actually, the elements were
> copied but the result of toArray containing those elements was ignored.
>
> This usage seems like just a bad idiom though and is avoidable.
>
> More worrying is the toArray() case where the resultant array is currently
> documented to hold only result elements and no extra nulls. If the
> collection shrinks then it is currently possible that the resulting array
> could very unexpectedly hold nulls.
>
> Mike
>
> On Dec 13 2011, at 05:30 , Sean Chou wrote:
>
> > Sorry for the confuse. By "ok", I mean "compare the size of array which
> is
> > going to be
> > returned and the size of the specified array, and copy the elements
> > into the specified
> > array if it is larger and return the specified array."
> >
> > Nothing is causing problem for now, I just found a mismatch. I think most
> > guys will
> > just use the returned array without checking if it's the specified one;
> and
> > this is also
> > why I think it may be possible to modify the behavior without causing
> > problems.
> >
> > And I think modifying ConcurrentHashMap is as dangerous as modifying
> > AbstractCollection
> > if people are relying on implementation, is this right? So it seems we
> can
> > do nothing
> > to the mismatch now...
> >
> >
> > On Tue, Dec 13, 2011 at 8:06 PM, David Holmes <david.holmes at oracle.com
> >wrote:
> >
> >> On 13/12/2011 9:18 PM, Sean Chou wrote:
> >>
> >>> Hi ,
> >>>
> >>> Is it possible to change the spec ? I found it is defined in
> >>> java.utils.Collection interface. It would be easy for
> >>> AbstractCollection to state that it is not designed for concurrent
> >>> operations, and its subclass should take care of them.
> >>>
> >>
> >> Such a disclaimer might be added to the spec for AbstractCollection but
> >> that doesn't really change anything - it just makes observed behaviour
> less
> >> surprising.
> >>
> >>
> >>     However, I think the simplest way may be modifying toArray(T[])
> >>> method for an additional check, which would work for most subclasses of
> >>> AbstractCollection...
> >>> Is that ok ?
> >>>
> >>
> >> "ok" in what sense? Do you want to change the spec or just change the
> >> current behaviour? If you do the latter and people rely on that
> >> implementation rather than the spec then code will not be portable
> across
> >> different implementations of the platform.
> >>
> >> I would not want to see a change in behaviour without a change in
> >> specification and I do not think the specification for
> AbstractCollection
> >> can, or should be, changed. Just my opinion of course.
> >>
> >> What is the concrete concurrent collection that you have a problem with?
> >> If it is ConcurrentHashMap, as per the example, then perhaps
> >> ConcurrentHashMap should be where a change is considered.
> >>
> >> David
> >> -----
> >>
> >>
> >>> On Tue, Dec 13, 2011 at 3:41 PM, David Holmes <david.holmes at oracle.com
> >>> <mailto:david.holmes at oracle.**com <david.holmes at oracle.com>>> wrote:
> >>>
> >>>   Hi Sean,
> >>>
> >>>
> >>>   On 13/12/2011 5:21 PM, Sean Chou wrote:
> >>>
> >>>       When I was reading the code of AbstractCollection.toArray(T[] ),
> I
> >>>       found its behavior maybe different from the spec in multithread
> >>>       environment. The spec says "If the collection fits in the
> specified
> >>>       array, it is returned therein. Otherwise, a new array is
> allocated
> >>>       with the runtime type of the specified array and the size of this
> >>>       collection."  However, in multithread environment, it is not easy
> >>> to
> >>>       tell if the collection fits in the specified array, because the
> >>>       items may be removed when toArray is copying.
> >>>
> >>>
> >>>   Right. The problem is that AbstractCollection doesn't address
> >>>   thread-safety or any other concurrency issues so doesn't account for
> >>>   the collection growing or shrinking while the toArray snapshot is
> >>>   being taken. Really the collection implementations that are designed
> >>>   to support multiple threads should override toArray to make it clear
> >>>   how it should behave. As it stands, in my opinion, it is more a
> >>>   "quality of implementation" issue as to whether AbstractCollection
> >>>   expends effort after creating the array to see if the array is
> >>>   actually full or not; or whether after creating an array it turns
> >>>   out it could have fit in the original.
> >>>
> >>>   For a concurrent collection  I would write the spec for toArray
> >>>   something like:
> >>>
> >>>   "The current size of the collection is examined and if the
> >>>   collection fits in the specified array it will be the target array,
> >>>   else a new array is allocated based on that current size and it
> >>>   becomes the target array. If the collection grows such that the
> >>>   target array no longer fits then extra elements will not be copied
> >>>   into the target array. If the collection shrinks then the target
> >>>   array will contain null elements."
> >>>
> >>>   Or for the last part "then the target array will be copied to a new
> >>>   array that exactly fits the number of elements returned".
> >>>
> >>>   David Holmes
> >>>   ------------
> >>>
> >>>
> >>>
> >>>            Here is a testcase:
> >>>       //////////////////////////////**__////////////////////////////**
> >>> //__//////////////
> >>>
> >>>       import java.util.Map;
> >>>       import java.util.concurrent.__**ConcurrentHashMap;
> >>>
> >>>
> >>>       public class CollectionToArrayTest {
> >>>
> >>>            static volatile Map<String, String>  map = new
> >>>       TConcurrentHashMap<String, String>();
> >>>            static volatile boolean gosleep = true;
> >>>
> >>>            static class TConcurrentHashMap<K, V>  extends
> >>>       ConcurrentHashMap<K, V>  {
> >>>                public int size() {
> >>>                    int oldresult = super.size();
> >>>                    System.out.println("map size before concurrent
> >>>       remove is "
> >>>                            + oldresult);
> >>>                    while (gosleep) {
> >>>                        try {
> >>>                            // Make sure the map is modified during
> >>>       toArray is
> >>>       called,
> >>>                            // between getsize and being iterated.
> >>>                            Thread.sleep(1000);
> >>>                            // System.out.println("size called, size is
> "
> >>> +
> >>>       oldresult +
> >>>                            // " take a sleep to make sure the element
> >>>       is deleted
> >>>       before size is returned.");
> >>>                        } catch (Exception e) {
> >>>                        }
> >>>                    }
> >>>                    return oldresult;
> >>>                }
> >>>            }
> >>>
> >>>            static class ToArrayThread implements Runnable {
> >>>                public void run() {
> >>>                    for (int i = 0; i<  5; i++) {
> >>>                        String str = Integer.toString(i);
> >>>                        map.put(str, str);
> >>>                    }
> >>>                    String[] buffer = new String[4];
> >>>                    String[] strings = map.values().toArray(buffer);
> >>>                    // System.out.println("length is " +
> strings.length);
> >>>                    if (strings.length<= buffer.length) {
> >>>                        System.out.println("given array size is "
> >>>                                        + buffer.length
> >>>                                        + " \nreturned array size is "
> >>>                                        + strings.length
> >>>                                        + ", \nbuffer should be used
> >>>       according to
> >>>       spec. Is buffer used : "
> >>>                                        + (strings == buffer));
> >>>                    }
> >>>                }
> >>>            }
> >>>
> >>>            static class RemoveThread implements Runnable {
> >>>                public void run() {
> >>>                    String str = Integer.toString(0);
> >>>                    map.remove(str);
> >>>                    gosleep = false;
> >>>                }
> >>>            }
> >>>
> >>>            public static void main(String args[]) {
> >>>                CollectionToArrayTest app = new CollectionToArrayTest();
> >>>                app.test_concurrentRemove();
> >>>            }
> >>>
> >>>            public void test_concurrentRemove() {
> >>>
> >>>       System.out.println("//////////**__////////////////////////////**
> >>> //__//////\n"
> >>>
> >>>       +
> >>>       "The spec says if the given array is large\n "  +
> >>>       "enough to hold all elements, the given array\n" +
> >>>       "should be returned by toArray. This \n" +
> >>>       "testcase checks this case. \n" +
> >>>       "/////////////////////////////**__/////////////////");
> >>>
> >>>
> >>>                Thread[] threads = new Thread[2];
> >>>                threads[0] = new Thread(new ToArrayThread());
> >>>                threads[1] = new Thread(new RemoveThread());
> >>>
> >>>                threads[0].start();
> >>>
> >>>                try {
> >>>                    // Take a sleep to make sure toArray is already
> >>> called.
> >>>                    Thread.sleep(1200);
> >>>                } catch (Exception e) {
> >>>                }
> >>>
> >>>                threads[1].start();
> >>>            }
> >>>       }
> >>>
> >>>       //////////////////////////////**__////////////////////////////**
> >>> //__//////////////
> >>>
> >>>
> >>>       TConcurrentHashMap is used to make sure the collection is
> modified
> >>>       during toArray is invoked. So the returned array fits in the
> >>>       specified
> >>>       array, but a new array is used because toArray checks the size
> >>>       before copying.
> >>>
> >>>       Is this a bug ?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Sean Chou
> >>>
> >>>
> >
> >
> > --
> > Best Regards,
> > Sean Chou
>
>


-- 
Best Regards,
Sean Chou



More information about the core-libs-dev mailing list