A behavior mismatch in AbstractCollection.toArray(T[] )
Sean Chou
zhouyx at linux.vnet.ibm.com
Wed Dec 14 07:05:14 UTC 2011
Hi Mike, David,
I reported this as a bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7121314 .
On Wed, Dec 14, 2011 at 1:03 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> I agree that most people probably won't check whether the array returned
> is the array they provided. It is possible that they incorrectly assume it
> is. ie.
>
> Collection<Integer> collection;
> ...
> Integer[] foo = new Integer[collection.size()]; // of sufficient size for
> all elements
> // at this point collection grows.
> collection.toArray(foo); // this will work, right?
> for(Integer each : foo) {
> // wait! foo is empty, why?
> }
>
> If collection is mutated and increased in size between the creation of foo
> and the toArray operation then foo will be empty and the user probably
> won't know why the elements weren't copied. Actually, the elements were
> copied but the result of toArray containing those elements was ignored.
>
> This usage seems like just a bad idiom though and is avoidable.
>
> More worrying is the toArray() case where the resultant array is currently
> documented to hold only result elements and no extra nulls. If the
> collection shrinks then it is currently possible that the resulting array
> could very unexpectedly hold nulls.
>
> Mike
>
> On Dec 13 2011, at 05:30 , Sean Chou wrote:
>
> > Sorry for the confuse. By "ok", I mean "compare the size of array which
> is
> > going to be
> > returned and the size of the specified array, and copy the elements
> > into the specified
> > array if it is larger and return the specified array."
> >
> > Nothing is causing problem for now, I just found a mismatch. I think most
> > guys will
> > just use the returned array without checking if it's the specified one;
> and
> > this is also
> > why I think it may be possible to modify the behavior without causing
> > problems.
> >
> > And I think modifying ConcurrentHashMap is as dangerous as modifying
> > AbstractCollection
> > if people are relying on implementation, is this right? So it seems we
> can
> > do nothing
> > to the mismatch now...
> >
> >
> > On Tue, Dec 13, 2011 at 8:06 PM, David Holmes <david.holmes at oracle.com
> >wrote:
> >
> >> On 13/12/2011 9:18 PM, Sean Chou wrote:
> >>
> >>> Hi ,
> >>>
> >>> Is it possible to change the spec ? I found it is defined in
> >>> java.utils.Collection interface. It would be easy for
> >>> AbstractCollection to state that it is not designed for concurrent
> >>> operations, and its subclass should take care of them.
> >>>
> >>
> >> Such a disclaimer might be added to the spec for AbstractCollection but
> >> that doesn't really change anything - it just makes observed behaviour
> less
> >> surprising.
> >>
> >>
> >> However, I think the simplest way may be modifying toArray(T[])
> >>> method for an additional check, which would work for most subclasses of
> >>> AbstractCollection...
> >>> Is that ok ?
> >>>
> >>
> >> "ok" in what sense? Do you want to change the spec or just change the
> >> current behaviour? If you do the latter and people rely on that
> >> implementation rather than the spec then code will not be portable
> across
> >> different implementations of the platform.
> >>
> >> I would not want to see a change in behaviour without a change in
> >> specification and I do not think the specification for
> AbstractCollection
> >> can, or should be, changed. Just my opinion of course.
> >>
> >> What is the concrete concurrent collection that you have a problem with?
> >> If it is ConcurrentHashMap, as per the example, then perhaps
> >> ConcurrentHashMap should be where a change is considered.
> >>
> >> David
> >> -----
> >>
> >>
> >>> On Tue, Dec 13, 2011 at 3:41 PM, David Holmes <david.holmes at oracle.com
> >>> <mailto:david.holmes at oracle.**com <david.holmes at oracle.com>>> wrote:
> >>>
> >>> Hi Sean,
> >>>
> >>>
> >>> On 13/12/2011 5:21 PM, Sean Chou wrote:
> >>>
> >>> When I was reading the code of AbstractCollection.toArray(T[] ),
> I
> >>> found its behavior maybe different from the spec in multithread
> >>> environment. The spec says "If the collection fits in the
> specified
> >>> array, it is returned therein. Otherwise, a new array is
> allocated
> >>> with the runtime type of the specified array and the size of this
> >>> collection." However, in multithread environment, it is not easy
> >>> to
> >>> tell if the collection fits in the specified array, because the
> >>> items may be removed when toArray is copying.
> >>>
> >>>
> >>> Right. The problem is that AbstractCollection doesn't address
> >>> thread-safety or any other concurrency issues so doesn't account for
> >>> the collection growing or shrinking while the toArray snapshot is
> >>> being taken. Really the collection implementations that are designed
> >>> to support multiple threads should override toArray to make it clear
> >>> how it should behave. As it stands, in my opinion, it is more a
> >>> "quality of implementation" issue as to whether AbstractCollection
> >>> expends effort after creating the array to see if the array is
> >>> actually full or not; or whether after creating an array it turns
> >>> out it could have fit in the original.
> >>>
> >>> For a concurrent collection I would write the spec for toArray
> >>> something like:
> >>>
> >>> "The current size of the collection is examined and if the
> >>> collection fits in the specified array it will be the target array,
> >>> else a new array is allocated based on that current size and it
> >>> becomes the target array. If the collection grows such that the
> >>> target array no longer fits then extra elements will not be copied
> >>> into the target array. If the collection shrinks then the target
> >>> array will contain null elements."
> >>>
> >>> Or for the last part "then the target array will be copied to a new
> >>> array that exactly fits the number of elements returned".
> >>>
> >>> David Holmes
> >>> ------------
> >>>
> >>>
> >>>
> >>> Here is a testcase:
> >>> //////////////////////////////**__////////////////////////////**
> >>> //__//////////////
> >>>
> >>> import java.util.Map;
> >>> import java.util.concurrent.__**ConcurrentHashMap;
> >>>
> >>>
> >>> public class CollectionToArrayTest {
> >>>
> >>> static volatile Map<String, String> map = new
> >>> TConcurrentHashMap<String, String>();
> >>> static volatile boolean gosleep = true;
> >>>
> >>> static class TConcurrentHashMap<K, V> extends
> >>> ConcurrentHashMap<K, V> {
> >>> public int size() {
> >>> int oldresult = super.size();
> >>> System.out.println("map size before concurrent
> >>> remove is "
> >>> + oldresult);
> >>> while (gosleep) {
> >>> try {
> >>> // Make sure the map is modified during
> >>> toArray is
> >>> called,
> >>> // between getsize and being iterated.
> >>> Thread.sleep(1000);
> >>> // System.out.println("size called, size is
> "
> >>> +
> >>> oldresult +
> >>> // " take a sleep to make sure the element
> >>> is deleted
> >>> before size is returned.");
> >>> } catch (Exception e) {
> >>> }
> >>> }
> >>> return oldresult;
> >>> }
> >>> }
> >>>
> >>> static class ToArrayThread implements Runnable {
> >>> public void run() {
> >>> for (int i = 0; i< 5; i++) {
> >>> String str = Integer.toString(i);
> >>> map.put(str, str);
> >>> }
> >>> String[] buffer = new String[4];
> >>> String[] strings = map.values().toArray(buffer);
> >>> // System.out.println("length is " +
> strings.length);
> >>> if (strings.length<= buffer.length) {
> >>> System.out.println("given array size is "
> >>> + buffer.length
> >>> + " \nreturned array size is "
> >>> + strings.length
> >>> + ", \nbuffer should be used
> >>> according to
> >>> spec. Is buffer used : "
> >>> + (strings == buffer));
> >>> }
> >>> }
> >>> }
> >>>
> >>> static class RemoveThread implements Runnable {
> >>> public void run() {
> >>> String str = Integer.toString(0);
> >>> map.remove(str);
> >>> gosleep = false;
> >>> }
> >>> }
> >>>
> >>> public static void main(String args[]) {
> >>> CollectionToArrayTest app = new CollectionToArrayTest();
> >>> app.test_concurrentRemove();
> >>> }
> >>>
> >>> public void test_concurrentRemove() {
> >>>
> >>> System.out.println("//////////**__////////////////////////////**
> >>> //__//////\n"
> >>>
> >>> +
> >>> "The spec says if the given array is large\n " +
> >>> "enough to hold all elements, the given array\n" +
> >>> "should be returned by toArray. This \n" +
> >>> "testcase checks this case. \n" +
> >>> "/////////////////////////////**__/////////////////");
> >>>
> >>>
> >>> Thread[] threads = new Thread[2];
> >>> threads[0] = new Thread(new ToArrayThread());
> >>> threads[1] = new Thread(new RemoveThread());
> >>>
> >>> threads[0].start();
> >>>
> >>> try {
> >>> // Take a sleep to make sure toArray is already
> >>> called.
> >>> Thread.sleep(1200);
> >>> } catch (Exception e) {
> >>> }
> >>>
> >>> threads[1].start();
> >>> }
> >>> }
> >>>
> >>> //////////////////////////////**__////////////////////////////**
> >>> //__//////////////
> >>>
> >>>
> >>> TConcurrentHashMap is used to make sure the collection is
> modified
> >>> during toArray is invoked. So the returned array fits in the
> >>> specified
> >>> array, but a new array is used because toArray checks the size
> >>> before copying.
> >>>
> >>> Is this a bug ?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Sean Chou
> >>>
> >>>
> >
> >
> > --
> > Best Regards,
> > Sean Chou
>
>
--
Best Regards,
Sean Chou
More information about the core-libs-dev
mailing list