A behavior mismatch in AbstractCollection.toArray(T[] )

David Holmes david.holmes at oracle.com
Wed Dec 14 08:02:21 UTC 2011


On 14/12/2011 5:05 PM, Sean Chou wrote:
> Hi Mike, David,
>
>      I reported this as a bug:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7121314  .

Thanks - I've turned  the initial incident report into a "real" bug report.

David

> On Wed, Dec 14, 2011 at 1:03 PM, Mike Duigou <mike.duigou at oracle.com
> <mailto:mike.duigou at oracle.com>> wrote:
>
>     I agree that most people probably won't check whether the array
>     returned is the array they provided. It is possible that they
>     incorrectly assume it is. ie.
>
>     Collection<Integer> collection;
>     ...
>     Integer[] foo = new Integer[collection.size()]; // of sufficient
>     size for all elements
>     // at this point collection grows.
>     collection.toArray(foo); // this will work, right?
>     for(Integer each : foo) {
>        // wait! foo is empty, why?
>     }
>
>     If collection is mutated and increased in size between the creation
>     of foo and the toArray operation then foo will be empty and the user
>     probably won't know why the elements weren't copied. Actually, the
>     elements were copied but the result of toArray containing those
>     elements was ignored.
>
>     This usage seems like just a bad idiom though and is avoidable.
>
>     More worrying is the toArray() case where the resultant array is
>     currently documented to hold only result elements and no extra
>     nulls. If the collection shrinks then it is currently possible that
>     the resulting array could very unexpectedly hold nulls.
>
>     Mike
>
>     On Dec 13 2011, at 05:30 , Sean Chou wrote:
>
>      > Sorry for the confuse. By "ok", I mean "compare the size of array
>     which is
>      > going to be
>      > returned and the size of the specified array, and copy the elements
>      > into the specified
>      > array if it is larger and return the specified array."
>      >
>      > Nothing is causing problem for now, I just found a mismatch. I
>     think most
>      > guys will
>      > just use the returned array without checking if it's the
>     specified one; and
>      > this is also
>      > why I think it may be possible to modify the behavior without causing
>      > problems.
>      >
>      > And I think modifying ConcurrentHashMap is as dangerous as modifying
>      > AbstractCollection
>      > if people are relying on implementation, is this right? So it
>     seems we can
>      > do nothing
>      > to the mismatch now...
>      >
>      >
>      > On Tue, Dec 13, 2011 at 8:06 PM, David Holmes
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>>wrote:
>      >
>      >> On 13/12/2011 9:18 PM, Sean Chou wrote:
>      >>
>      >>> Hi ,
>      >>>
>      >>> Is it possible to change the spec ? I found it is defined in
>      >>> java.utils.Collection interface. It would be easy for
>      >>> AbstractCollection to state that it is not designed for concurrent
>      >>> operations, and its subclass should take care of them.
>      >>>
>      >>
>      >> Such a disclaimer might be added to the spec for
>     AbstractCollection but
>      >> that doesn't really change anything - it just makes observed
>     behaviour less
>      >> surprising.
>      >>
>      >>
>      >>     However, I think the simplest way may be modifying toArray(T[])
>      >>> method for an additional check, which would work for most
>     subclasses of
>      >>> AbstractCollection...
>      >>> Is that ok ?
>      >>>
>      >>
>      >> "ok" in what sense? Do you want to change the spec or just
>     change the
>      >> current behaviour? If you do the latter and people rely on that
>      >> implementation rather than the spec then code will not be
>     portable across
>      >> different implementations of the platform.
>      >>
>      >> I would not want to see a change in behaviour without a change in
>      >> specification and I do not think the specification for
>     AbstractCollection
>      >> can, or should be, changed. Just my opinion of course.
>      >>
>      >> What is the concrete concurrent collection that you have a
>     problem with?
>      >> If it is ConcurrentHashMap, as per the example, then perhaps
>      >> ConcurrentHashMap should be where a change is considered.
>      >>
>      >> David
>      >> -----
>      >>
>      >>
>      >>> On Tue, Dec 13, 2011 at 3:41 PM, David Holmes
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>      >>> <mailto:david.holmes at oracle. <mailto:david.holmes at oracle.>**com
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>>>> wrote:
>      >>>
>      >>>   Hi Sean,
>      >>>
>      >>>
>      >>>   On 13/12/2011 5:21 PM, Sean Chou wrote:
>      >>>
>      >>>       When I was reading the code of
>     AbstractCollection.toArray(T[] ), I
>      >>>       found its behavior maybe different from the spec in
>     multithread
>      >>>       environment. The spec says "If the collection fits in the
>     specified
>      >>>       array, it is returned therein. Otherwise, a new array is
>     allocated
>      >>>       with the runtime type of the specified array and the size
>     of this
>      >>>       collection."  However, in multithread environment, it is
>     not easy
>      >>> to
>      >>>       tell if the collection fits in the specified array,
>     because the
>      >>>       items may be removed when toArray is copying.
>      >>>
>      >>>
>      >>>   Right. The problem is that AbstractCollection doesn't address
>      >>>   thread-safety or any other concurrency issues so doesn't
>     account for
>      >>>   the collection growing or shrinking while the toArray snapshot is
>      >>>   being taken. Really the collection implementations that are
>     designed
>      >>>   to support multiple threads should override toArray to make
>     it clear
>      >>>   how it should behave. As it stands, in my opinion, it is more a
>      >>> "quality of implementation" issue as to whether AbstractCollection
>      >>>   expends effort after creating the array to see if the array is
>      >>>   actually full or not; or whether after creating an array it turns
>      >>>   out it could have fit in the original.
>      >>>
>      >>>   For a concurrent collection  I would write the spec for toArray
>      >>>   something like:
>      >>>
>      >>> "The current size of the collection is examined and if the
>      >>>   collection fits in the specified array it will be the target
>     array,
>      >>>   else a new array is allocated based on that current size and it
>      >>>   becomes the target array. If the collection grows such that the
>      >>>   target array no longer fits then extra elements will not be
>     copied
>      >>>   into the target array. If the collection shrinks then the target
>      >>>   array will contain null elements."
>      >>>
>      >>>   Or for the last part "then the target array will be copied to
>     a new
>      >>>   array that exactly fits the number of elements returned".
>      >>>
>      >>>   David Holmes
>      >>>   ------------
>      >>>
>      >>>
>      >>>
>      >>>            Here is a testcase:
>      >>>
>     //////////////////////////////**__////////////////////////////**
>      >>> //__//////////////
>      >>>
>      >>>       import java.util.Map;
>      >>>       import java.util.concurrent.__**ConcurrentHashMap;
>      >>>
>      >>>
>      >>>       public class CollectionToArrayTest {
>      >>>
>      >>>            static volatile Map<String, String>  map = new
>      >>>       TConcurrentHashMap<String, String>();
>      >>>            static volatile boolean gosleep = true;
>      >>>
>      >>>            static class TConcurrentHashMap<K, V>  extends
>      >>>       ConcurrentHashMap<K, V>  {
>      >>>                public int size() {
>      >>>                    int oldresult = super.size();
>      >>>                    System.out.println("map size before concurrent
>      >>>       remove is "
>      >>>                            + oldresult);
>      >>>                    while (gosleep) {
>      >>>                        try {
>      >>>                            // Make sure the map is modified during
>      >>>       toArray is
>      >>>       called,
>      >>>                            // between getsize and being iterated.
>      >>>                            Thread.sleep(1000);
>      >>>                            // System.out.println("size called,
>     size is "
>      >>> +
>      >>>       oldresult +
>      >>>                            // " take a sleep to make sure the
>     element
>      >>>       is deleted
>      >>>       before size is returned.");
>      >>>                        } catch (Exception e) {
>      >>>                        }
>      >>>                    }
>      >>>                    return oldresult;
>      >>>                }
>      >>>            }
>      >>>
>      >>>            static class ToArrayThread implements Runnable {
>      >>>                public void run() {
>      >>>                    for (int i = 0; i<  5; i++) {
>      >>>                        String str = Integer.toString(i);
>      >>>                        map.put(str, str);
>      >>>                    }
>      >>>                    String[] buffer = new String[4];
>      >>>                    String[] strings = map.values().toArray(buffer);
>      >>>                    // System.out.println("length is " +
>     strings.length);
>      >>>                    if (strings.length<= buffer.length) {
>      >>>                        System.out.println("given array size is "
>      >>>                                        + buffer.length
>      >>>                                        + " \nreturned array
>     size is "
>      >>>                                        + strings.length
>      >>>                                        + ", \nbuffer should be used
>      >>>       according to
>      >>>       spec. Is buffer used : "
>      >>>                                        + (strings == buffer));
>      >>>                    }
>      >>>                }
>      >>>            }
>      >>>
>      >>>            static class RemoveThread implements Runnable {
>      >>>                public void run() {
>      >>>                    String str = Integer.toString(0);
>      >>>                    map.remove(str);
>      >>>                    gosleep = false;
>      >>>                }
>      >>>            }
>      >>>
>      >>>            public static void main(String args[]) {
>      >>>                CollectionToArrayTest app = new
>     CollectionToArrayTest();
>      >>>                app.test_concurrentRemove();
>      >>>            }
>      >>>
>      >>>            public void test_concurrentRemove() {
>      >>>
>      >>>
>     System.out.println("//////////**__////////////////////////////**
>      >>> //__//////\n"
>      >>>
>      >>>       +
>      >>> "The spec says if the given array is large\n "  +
>      >>> "enough to hold all elements, the given array\n" +
>      >>> "should be returned by toArray. This \n" +
>      >>> "testcase checks this case. \n" +
>      >>> "/////////////////////////////**__/////////////////");
>      >>>
>      >>>
>      >>>                Thread[] threads = new Thread[2];
>      >>>                threads[0] = new Thread(new ToArrayThread());
>      >>>                threads[1] = new Thread(new RemoveThread());
>      >>>
>      >>>                threads[0].start();
>      >>>
>      >>>                try {
>      >>>                    // Take a sleep to make sure toArray is already
>      >>> called.
>      >>>                    Thread.sleep(1200);
>      >>>                } catch (Exception e) {
>      >>>                }
>      >>>
>      >>>                threads[1].start();
>      >>>            }
>      >>>       }
>      >>>
>      >>>
>     //////////////////////////////**__////////////////////////////**
>      >>> //__//////////////
>      >>>
>      >>>
>      >>>       TConcurrentHashMap is used to make sure the collection is
>     modified
>      >>>       during toArray is invoked. So the returned array fits in the
>      >>>       specified
>      >>>       array, but a new array is used because toArray checks the
>     size
>      >>>       before copying.
>      >>>
>      >>>       Is this a bug ?
>      >>>
>      >>>
>      >>>
>      >>>
>      >>>
>      >>> --
>      >>> Best Regards,
>      >>> Sean Chou
>      >>>
>      >>>
>      >
>      >
>      > --
>      > Best Regards,
>      > Sean Chou
>
>
>
>
> --
> Best Regards,
> Sean Chou
>



More information about the core-libs-dev mailing list