RFR: 8001647: In-place methods on Collection/List

Fri Apr 19 07:59:24 PDT 2013

On Apr 19, 2013, at 2:42 PM, David Holmes <david.holmes at oracle.com> wrote:

> On 19/04/2013 10:14 PM, Paul Sandoz wrote:
>> 
>> On Apr 19, 2013, at 1:15 PM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>> 
>>> On 18/04/2013 19:49, Akhil Arora wrote:
>>>> Looks like the stars are aligning on getting on this into TL... the
>>>> refreshed webrev is -
>>>> 
>>>> http://cr.openjdk.java.net/~akhil/8001647.8/webrev/
>>>> 
>>> A minor comment on Collection.removeIf is "that satisifies the given predicate" might be better than "which matches the provided predicate". Also for completeness, you could say "RuntimeExceptions and Errors thrown by the predicate are propagated ...".
>>> 
>>> In List.replaceAll then @throws NullPointerException is listed twice, which is okay, but might be better to combine them. A typo in the second NPE description "if the an element".
>>> 
>>> In the implementation then the only thing that puzzled me is checking the modification count in legacy Vector, that seems unnecessary.
>>> 
>> 
>> The function value could structurally modify the Vector instance.
> 
> So this came through while I was writing my similar comments ...
> 
> My reaction to this is simply "well don't do that".

FWIW in the past this functionality has helped me detect bugs, so yes i agree "don't do it" but sometimes it unintentionally happens and it can be harder to track down the source of the problem without a CME.

> If the function/predicate/comparator is mutating the Vector then the user gets what they deserve in my opinion. Trying to account for that is somewhat futile. As per my other email the loop check for modCount==expectedModCount will get hoisted from the loop. Further in removeIf you need to be a lot more defensive during the first iteration as you haven't kept a reference to the original size and array. That aside the second part of removeIf (the actual removal) doesn't invoke any external code so no concurrent modification is possible then.
> 
> This seems like overkill to me.
> 

I think the mod count checking should be explicitly hoisted outside the loops and performed at the end of the loop/method (see the spliterator.forEachReamining implementations) since a CME should only be used to detect bugs. Doug's comments on ArrayList's spliterator [*] are relevant.

Probably the most important methods for such checking are forEach, to be consistent with the default method implementation, Iterator/Spliterator.forEachRemaining, and Stream.forEach. 

If we can do such checking easily and efficiently for the other methods, which seems possible, then i think it worth doing and is consistent with the default methods, since they use iterator (List.sort defers to Collections.sort uses the iterator to update the list).

Paul.

[*]
        /*
         * If ArrayLists were immutable, or structurally immutable (no
         * adds, removes, etc), we could implement their spliterators
         * with Arrays.spliterator. Instead we detect as much
         * interference during traversal as practical without
         * sacrificing much performance. We rely primarily on
         * modCounts. These are not guaranteed to detect concurrency
         * violations, and are sometimes overly conservative about
         * within-thread interference, but detect enough problems to
         * be worthwhile in practice. To carry this out, we (1) lazily
         * initialize fence and expectedModCount until the latest
         * point that we need to commit to the state we are checking
         * against; thus improving precision.  (This doesn't apply to
         * SubLists, that create spliterators with current non-lazy
         * values).  (2) We perform only a single
         * ConcurrentModificationException check at the end of forEach
         * (the most performance-sensitive method). When using forEach
         * (as opposed to iterators), we can normally only detect
         * interference after actions, not before. Further
         * CME-triggering checks apply to all other possible
         * violations of assumptions for example null or too-small
         * elementData array given its size(), that could only have
         * occurred due to interference.  This allows the inner loop
         * of forEach to run without any further checks, and
         * simplifies lambda-resolution. While this does entail a
         * number of checks, note that in the common case of
         * list.stream().forEach(a), no checks or other computation
         * occur anywhere other than inside forEach itself.  The other
         * less-often-used methods cannot take advantage of most of
         * these streamlinings.
         */