RFR: jsr166 jdk integration 2018-05
Time to do this, since Claes is also touching ArrayList. http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.h... 8202685: Improve ArrayList replaceAll http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll... https://bugs.openjdk.java.net/browse/JDK-8202685 8201386: Miscellaneous changes imported from jsr166 CVS 2018-05 http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/miscellane... https://bugs.openjdk.java.net/browse/JDK-8201386
On 2018-05-09 20:17, Martin Buchholz wrote:
8202685: Improve ArrayList replaceAll http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll... <http://cr.openjdk.java.net/%7Emartin/webrevs/jdk/jsr166-integration/replaceAll/index.html> https://bugs.openjdk.java.net/browse/JDK-8202685
This looks reasonable to me. Removing modCount increments from replaceAll seems like the right call as it doesn't change the structure of the list (compare ArrayList#set(int, E)) Thanks! /Claes
On May 9, 2018, at 11:17 AM, Martin Buchholz <martinrb@google.com> wrote:
Time to do this, since Claes is also touching ArrayList.
http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.h... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.html>
8202685: Improve ArrayList replaceAll http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll/index.html> https://bugs.openjdk.java.net/browse/JDK-8202685 <https://bugs.openjdk.java.net/browse/JDK-8202685>
ArrayList.java — @Override public void replaceAll(UnaryOperator<E> operator) { Objects.requireNonNull(operator); final int expectedModCount = modCount; final Object[] es = elementData; ! final int size = this.size; ! for (int i = 0; modCount == expectedModCount && i < size; i++) es[i] = operator.apply(elementAt(es, i)); if (modCount != expectedModCount) throw new ConcurrentModificationException(); - modCount++; } A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
8201386: Miscellaneous changes imported from jsr166 CVS 2018-05 http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/miscellane... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/miscellaneous/index.html> https://bugs.openjdk.java.net/browse/JDK-8201386 <https://bugs.openjdk.java.net/browse/JDK-8201386>
PriorityQueue — 587 public E poll() { 588 final Object[] es; 589 final E result; 590 591 if ((result = (E) ((es = queue)[0])) != null) { 592 modCount++; 593 final int n; 594 final E x = (E) es[(n = --size)]; 595 es[n] = null; 596 if (n > 0) { 597 final Comparator<? super E> cmp; 598 if ((cmp = comparator) == null) 599 siftDownComparable(0, x, es, n); 600 else 601 siftDownUsingComparator(0, x, es, n, cmp); 602 } 603 } 604 return result; 605 } Why inline the siftDown method? Paul.
On Fri, May 11, 2018 at 9:06 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
On May 9, 2018, at 11:17 AM, Martin Buchholz <martinrb@google.com> wrote:
Time to do this, since Claes is also touching ArrayList.
http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166- integration/overview.html
8202685: Improve ArrayList replaceAll http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166- integration/replaceAll/index.html https://bugs.openjdk.java.net/browse/JDK-8202685
ArrayList.java —
@Override public void replaceAll(UnaryOperator<E> operator) { Objects.requireNonNull(operator); final int expectedModCount = modCount; final Object[] es = elementData;! final int size = this.size;! for (int i = 0; modCount == expectedModCount && i < size; i++) es[i] = operator.apply(elementAt(es, i)); if (modCount != expectedModCount) throw new ConcurrentModificationException();- modCount++; }
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument. You can argue that the very idea of structural modifications was a bad idea, but it's still there in https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/ut... (Personally, I think checking for concurrent modification at all is not worth the small improvement in safety)
8201386: Miscellaneous changes imported from jsr166 CVS 2018-05 http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166- integration/miscellaneous/index.html https://bugs.openjdk.java.net/browse/JDK-8201386
PriorityQueue —
587 public E poll() { 588 final Object[] es; 589 final E result; 590 591 if ((result = (E) ((es = queue)[0])) != null) { 592 modCount++; 593 final int n; 594 final E x = (E) es[(n = --size)]; 595 es[n] = null; 596 if (n > 0) { 597 final Comparator<? super E> cmp; 598 if ((cmp = comparator) == null) 599 siftDownComparable(0, x, es, n); 600 else 601 siftDownUsingComparator(0, x, es, n, cmp); 602 } 603 } 604 return result; 605 }
Why inline the siftDown method?
There is an effort here to remove gratuitous implementation differences between PriorityQueue and PriorityBlockingQueue. Maybe this one should go the other way - refactor the corresponding code in PBQ into siftDown. But can we trust the VM to not re-read the queue and size fields?
On May 11, 2018, at 9:33 AM, Martin Buchholz <martinrb@google.com> wrote:
On Fri, May 11, 2018 at 9:06 AM, Paul Sandoz <paul.sandoz@oracle.com <mailto:paul.sandoz@oracle.com>> wrote:
On May 9, 2018, at 11:17 AM, Martin Buchholz <martinrb@google.com <mailto:martinrb@google.com>> wrote:
Time to do this, since Claes is also touching ArrayList.
http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.h... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.html>
8202685: Improve ArrayList replaceAll http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/replaceAll/index.html> https://bugs.openjdk.java.net/browse/JDK-8202685 <https://bugs.openjdk.java.net/browse/JDK-8202685>
ArrayList.java — @Override public void replaceAll(UnaryOperator<E> operator) { Objects.requireNonNull(operator); final int expectedModCount = modCount; final Object[] es = elementData; ! final int size = this.size; ! for (int i = 0; modCount == expectedModCount && i < size; i++) es[i] = operator.apply(elementAt(es, i)); if (modCount != expectedModCount) throw new ConcurrentModificationException(); - modCount++; }
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
You can argue that the very idea of structural modifications was a bad idea, but it's still there in https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/ut... <https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/util/List.html>
(Personally, I think checking for concurrent modification at all is not worth the small improvement in safety)
I tend to agree, the cost of specifying and implementing is quite high (i have been saved once by this functionality).
8201386: Miscellaneous changes imported from jsr166 CVS 2018-05 http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/miscellane... <http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/miscellaneous/index.html> https://bugs.openjdk.java.net/browse/JDK-8201386 <https://bugs.openjdk.java.net/browse/JDK-8201386>
PriorityQueue —
587 public E poll() { 588 final Object[] es; 589 final E result; 590 591 if ((result = (E) ((es = queue)[0])) != null) { 592 modCount++; 593 final int n; 594 final E x = (E) es[(n = --size)]; 595 es[n] = null; 596 if (n > 0) { 597 final Comparator<? super E> cmp; 598 if ((cmp = comparator) == null) 599 siftDownComparable(0, x, es, n); 600 else 601 siftDownUsingComparator(0, x, es, n, cmp); 602 } 603 } 604 return result; 605 }
Why inline the siftDown method?
There is an effort here to remove gratuitous implementation differences between PriorityQueue and PriorityBlockingQueue. Maybe this one should go the other way - refactor the corresponding code in PBQ into siftDown. But can we trust the VM to not re-read the queue and size fields?
If it inlines it probably won’t :-) Paul.
On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says: """protected transient int modCount The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.""" replaceAll doesn't qualify as a structural modification. A user can semi-reasonably decide to call ArrayList.replaceAll while an iteration is in progress (in the same thread). Or for Vector, imagine occasional gc-like vector.replaceAll(x -> optimizedDropInReplacementFor(x)) If we implicitly revised the definition of modCount, that seems like a bug to me.
On May 14, 2018, at 12:43 PM, Martin Buchholz <martinrb@google.com> wrote:
On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com <mailto:Paul.Sandoz@oracle.com>> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”. Paul.
A user can semi-reasonably decide to call ArrayList.replaceAll while an iteration is in progress (in the same thread).
Or for Vector, imagine occasional gc-like vector.replaceAll(x -> optimizedDropInReplacementFor(x))
If we implicitly revised the definition of modCount, that seems like a bug to me.
On Mon, May 14, 2018 at 1:45 PM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
On May 14, 2018, at 12:43 PM, Martin Buchholz <martinrb@google.com> wrote:
On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification. For ArrayList and Vector in particular, List.replaceAll is as safe as List.set(int). And if there was a List implementation for which that was not the case, it would probably be a bug in replaceAll implementation. Think ordinary array assignment or AtomicReferenceArray assignment.
On May 14, 2018, at 2:04 PM, Martin Buchholz <martinrb@google.com> wrote:
On Mon, May 14, 2018 at 1:45 PM, Paul Sandoz <paul.sandoz@oracle.com <mailto:paul.sandoz@oracle.com>> wrote:
On May 14, 2018, at 12:43 PM, Martin Buchholz <martinrb@google.com <mailto:martinrb@google.com>> wrote:
On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com <mailto:Paul.Sandoz@oracle.com>> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification.
I get that, but I would argue that (placing the implementation aside) the bulk operation as a whole is compatible with the “otherwise” part of the modCount definition, since it can perturb the list and effect the yet to be traversed elements. Such usage is a likely source of hard to track down bugs that CMEs are designed to help flag. I doubt i am gonna change your mind on this :-) So we may just have to agree to disagree on the interpretation of the definition and move on. I would prefer it remains but it's your call. Paul.
For ArrayList and Vector in particular, List.replaceAll is as safe as List.set(int). And if there was a List implementation for which that was not the case, it would probably be a bug in replaceAll implementation.
Think ordinary array assignment or AtomicReferenceArray assignment.
On 15/05/2018 7:56 AM, Paul Sandoz wrote:
On May 14, 2018, at 2:04 PM, Martin Buchholz <martinrb@google.com> wrote: On Mon, May 14, 2018 at 1:45 PM, Paul Sandoz <paul.sandoz@oracle.com <mailto:paul.sandoz@oracle.com>> wrote:
On May 14, 2018, at 12:43 PM, Martin Buchholz <martinrb@google.com <mailto:martinrb@google.com>> wrote: On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com <mailto:Paul.Sandoz@oracle.com>> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification.
I get that, but I would argue that (placing the implementation aside) the bulk operation as a whole is compatible with the “otherwise” part of the modCount definition, since it can perturb the list and effect the yet to be traversed elements. Such usage is a likely source of hard to track down bugs that CMEs are designed to help flag.
I doubt i am gonna change your mind on this :-) So we may just have to agree to disagree on the interpretation of the definition and move on. I would prefer it remains but it's your call.
FWIW I agree with Martin - sorry Paul :) CME is not about changing the contents, it's about changing the structure. I can also see the case for wanting operations that invalidate all existing iterators, but that's not how CME is defined. Cheers, David -----
Paul.
For ArrayList and Vector in particular, List.replaceAll is as safe as List.set(int). And if there was a List implementation for which that was not the case, it would probably be a bug in replaceAll implementation.
Think ordinary array assignment or AtomicReferenceArray assignment.
On May 14, 2018, at 5:51 PM, David Holmes <david.holmes@oracle.com> wrote:
On 15/05/2018 7:56 AM, Paul Sandoz wrote:
On May 14, 2018, at 2:04 PM, Martin Buchholz <martinrb@google.com> wrote: On Mon, May 14, 2018 at 1:45 PM, Paul Sandoz <paul.sandoz@oracle.com <mailto:paul.sandoz@oracle.com>> wrote:
On May 14, 2018, at 12:43 PM, Martin Buchholz <martinrb@google.com <mailto:martinrb@google.com>> wrote: On Mon, May 14, 2018 at 12:18 PM, Paul Sandoz <Paul.Sandoz@oracle.com <mailto:Paul.Sandoz@oracle.com>> wrote:
A CME is not necessarily associated with just structural modifications it could, on a best effort basis, be associated with any modification, which is cheaper to do for bulk operations rather than individual operations, and this operation can be used to perturb all the elements of the list (just like sort) causing strange and hard to track bugs while in the middle of iterating. IMHO its better to fail under such circumstances rather than be silent.
It's tempting to treat modCount as a count of ALL modifications, especially given its name, but that's different from the historical behavior and design of these classes. Consistency with existing spec and implementations is the biggest argument.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification. I get that, but I would argue that (placing the implementation aside) the bulk operation as a whole is compatible with the “otherwise” part of the modCount definition, since it can perturb the list and effect the yet to be traversed elements. Such usage is a likely source of hard to track down bugs that CMEs are designed to help flag. I doubt i am gonna change your mind on this :-) So we may just have to agree to disagree on the interpretation of the definition and move on. I would prefer it remains but it's your call.
FWIW I agree with Martin - sorry Paul :)
That’s ok. Its clear i am out numbered here, and overspent my budget on debating CME for the month :-) Paul.
I mentally revised the history when doing the collections/stream API work since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification. I get that, but I would argue that (placing the implementation aside) the bulk operation as a whole is compatible with the “otherwise” part of the modCount definition, since it can perturb the list and effect the yet to be traversed elements. Such usage is a likely source of hard to track down bugs that CMEs are designed to help flag. I doubt i am gonna change your mind on this :-) So we may just have to agree to disagree on the interpretation of the definition and move on. I would prefer it remains but it's your call.
FWIW I agree with Martin - sorry Paul :)
That’s ok. Its clear i am out numbered here, and overspent my budget on debating CME for the month :-)
Don't give up so quickly, Paul. :-) I remember working on the bulk operations back in JDK 8 (with Mike Duigou?) and we discussed the issue of bulk operations vs. concurrent modification. Even though operations like replaceAll() and sort() don't "structurally modify" the list, our interpretation was that they did perturb in-progress iterations. These operations potentially change every element, thus an in-progress iteration is likely to observe some unspecified mixture of the "before" and "after" states of the list. That sounds like a programming error. The intent of CME is to prevent programming errors, so we focused on that instead of on "structural modification." The "perturb" clause seemed to cover this behavior, so we left that part of the spec unchanged. Note that in JDK 8, both ArrayList's and Vector's replaceAll() both will cause CME to be thrown by an in-progress iteration. I think the onus should be on Martin to show why replaceAll() should no longer trigger CME. An alternative would be to adjust the specification of modCount to clarify that it accommodates bulk modifications that aren't strictly structural modifications. s'marks
On Tue, May 15, 2018 at 10:19 AM, Stuart Marks <stuart.marks@oracle.com> wrote:
I mentally revised the history when doing the collections/stream API work
since we added more bulk operations, since this is on a “best effort” basis and if it’s cheap to do then there is no real harm in it and it might help.
Spec says:
"""protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results."""
replaceAll doesn't qualify as a structural modification.
Why not? It can "perturb it in such a fashion that iterations in progress may yield incorrect results”.
Why? It replaces every element "inplace" in the style of List.set(i) which is also not a structural modification.
I get that, but I would argue that (placing the implementation aside) the bulk operation as a whole is compatible with the “otherwise” part of the modCount definition, since it can perturb the list and effect the yet to be traversed elements. Such usage is a likely source of hard to track down bugs that CMEs are designed to help flag. I doubt i am gonna change your mind on this :-) So we may just have to agree to disagree on the interpretation of the definition and move on. I would prefer it remains but it's your call.
FWIW I agree with Martin - sorry Paul :)
That’s ok. Its clear i am out numbered here, and overspent my budget on debating CME for the month :-)
Don't give up so quickly, Paul. :-)
I remember working on the bulk operations back in JDK 8 (with Mike Duigou?) and we discussed the issue of bulk operations vs. concurrent modification.
Even though operations like replaceAll() and sort() don't "structurally modify" the list, our interpretation was that they did perturb in-progress iterations. These operations potentially change every element, thus an in-progress iteration is likely to observe some unspecified mixture of the "before" and "after" states of the list. That sounds like a programming error. The intent of CME is to prevent programming errors, so we focused on that instead of on "structural modification." The "perturb" clause seemed to cover this behavior, so we left that part of the spec unchanged.
Note that in JDK 8, both ArrayList's and Vector's replaceAll() both will cause CME to be thrown by an in-progress iteration. I think the onus should be on Martin to show why replaceAll() should no longer trigger CME.
(TL;DR - replaceAll incrementing modCount is a bug.) Hmmm ... my previous convincing arguments have failed to convince ?! Your argument above applies to List.set just as much as List.repladeAll, because the latter is nothing more semantically than a bunch of calls to the former. They should have the same behavior. Not having the same behavior leads to inconsistency, seen today in subList operations on ArrayList and Vector having different modCount behavior than on the root list. Again, imagine this use case: there is a periodic background task that optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress. (I also lean to removing the modCount increment in sort, but there the argument is much weaker, since that likely WILL perturb any iteration in progress)
An alternative would be to adjust the specification of modCount to clarify that it accommodates bulk modifications that aren't strictly structural modifications.
s'marks
On Tue, May 15, 2018 at 11:35 AM, Martin Buchholz <martinrb@google.com> wrote:
Your argument above applies to List.set just as much as List.repladeAll, because the latter is nothing more semantically than a bunch of calls to the former. They should have the same behavior. Not having the same behavior leads to inconsistency, seen today in subList operations on ArrayList and Vector having different modCount behavior than on the root list.
To strengthen that, the default method List.replaceAll is *specified* to be equivalent to final ListIterator<E> li = list.listIterator(); while (li.hasNext()) { li.set(operator.apply(li.next())); } https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/ut...) and incrementing modCount breaks "equivalent to"
(TL;DR - replaceAll incrementing modCount is a bug.)
I acknowledge that statement is the one in dispute.
Hmmm ... my previous convincing arguments have failed to convince ?!
Your argument above applies to List.set just as much as List.repladeAll, because the latter is nothing more semantically than a bunch of calls to the former. They should have the same behavior. Not having the same behavior leads to inconsistency, seen today in subList operations on ArrayList and Vector having different modCount behavior than on the root list.
Right, I read those arguments, and I'm not convinced. Just because an individual operation has some characteristic doesn't necessarily imply that an aggregate operation must have the same characteristic. (This is variously called a fallacy of composition, or in U.S. tax law, the step transaction doctrine.) Bringing replaceAll() to the public API exposes it as a single operation, with its own semantics. Note that Vector.replaceAll() holds the lock for the entire operation, not merely around individual set() operations. It really is different from performing individual set() operations.
Again, imagine this use case: there is a periodic background task that optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress.
I don't think it's possible to do that correctly without holding a lock around the entire iteration. If the lock is held, CME can't occur, as a concurrent replaceAll() will occur before or after the iteration, never in the middle. If an iteration over a Vector doesn't hold a lock, any read-modify-write operations (consider a loop with a ListIterator on which set() is called) can be interleaved with bulk operations (like replaceAll) which is clearly incorrect. In such cases, CME should be thrown. Also, this use case cannot be written today, because CME is thrown. I'm not aware of an actual use case that's been prevented because CME is thrown. Of course, as API designers we have to anticipate needs and not just react to complaints. My view of this kind of situation (interleaving of bulk operation(s) with an iteration loop) is much more likely to be a programming error than an actual use case.
To strengthen that, the default method List.replaceAll is specified to be equivalent to
final ListIterator<E> li = list.listIterator(); while (li.hasNext()) { li.set(operator.apply(li.next())); }
and incrementing modCount breaks "equivalent to".
Note carefully: that is an "Implementation Requirement" on the default implementation of List.replaceAll(). It doesn't govern implementations in implementing classes such as ArrayList. s'marks
I'm still with Martin on this. It makes no sense to me to allow replacing one element to not cause CME, but replacing more than one does cause CME. This is inconsistent and confusing and the end result is the programmer won't know what to expect when or why. The original intent of CME, from my recollections back in lead-up-to-Java-5 days, was to prevent iterators from breaking i.e. throwing exceptions, due to the occurrence of the "concurrent" operation that changed the structure. It was not intended as an indicator of a semantic programming error. Replacing one element whilst there is a live iterator can be just as semantically wrong as replacing them all. Cheers, David ----- On 16/05/2018 10:34 AM, Stuart Marks wrote:
(TL;DR - replaceAll incrementing modCount is a bug.)
I acknowledge that statement is the one in dispute.
Hmmm ... my previous convincing arguments have failed to convince ?!
Your argument above applies to List.set just as much as List.repladeAll, because the latter is nothing more semantically than a bunch of calls to the former. They should have the same behavior. Not having the same behavior leads to inconsistency, seen today in subList operations on ArrayList and Vector having different modCount behavior than on the root list.
Right, I read those arguments, and I'm not convinced.
Just because an individual operation has some characteristic doesn't necessarily imply that an aggregate operation must have the same characteristic. (This is variously called a fallacy of composition, or in U.S. tax law, the step transaction doctrine.)
Bringing replaceAll() to the public API exposes it as a single operation, with its own semantics. Note that Vector.replaceAll() holds the lock for the entire operation, not merely around individual set() operations. It really is different from performing individual set() operations.
Again, imagine this use case: there is a periodic background task that optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress.
I don't think it's possible to do that correctly without holding a lock around the entire iteration. If the lock is held, CME can't occur, as a concurrent replaceAll() will occur before or after the iteration, never in the middle.
If an iteration over a Vector doesn't hold a lock, any read-modify-write operations (consider a loop with a ListIterator on which set() is called) can be interleaved with bulk operations (like replaceAll) which is clearly incorrect. In such cases, CME should be thrown.
Also, this use case cannot be written today, because CME is thrown. I'm not aware of an actual use case that's been prevented because CME is thrown. Of course, as API designers we have to anticipate needs and not just react to complaints. My view of this kind of situation (interleaving of bulk operation(s) with an iteration loop) is much more likely to be a programming error than an actual use case.
To strengthen that, the default method List.replaceAll is specified to be equivalent to final ListIterator<E> li = list.listIterator(); while (li.hasNext()) { li.set(operator.apply(li.next())); }
and incrementing modCount breaks "equivalent to".
Note carefully: that is an "Implementation Requirement" on the default implementation of List.replaceAll(). It doesn't govern implementations in implementing classes such as ArrayList.
s'marks
On 5/15/18 7:37 PM, David Holmes wrote:
I'm still with Martin on this. It makes no sense to me to allow replacing one element to not cause CME, but replacing more than one does cause CME. This is inconsistent and confusing and the end result is the programmer won't know what to expect when or why.
This is the fallacy of composition, that if a single operation has some property, then an arbitrary composition of those operations must also have that property. This is decidedly not true for many things. Consider atomicity for example. My claim is that the setting of an individual element is qualitatively different from a bulk operation.
The original intent of CME, from my recollections back in lead-up-to-Java-5 days, was to prevent iterators from breaking i.e. throwing exceptions, due to the occurrence of the "concurrent" operation that changed the structure. It was not intended as an indicator of a semantic programming error. Replacing one element whilst there is a live iterator can be just as semantically wrong as replacing them all.
CME, iterators, and the fail-fast concept were introduced in JDK 1.2 with the Collections Framework. The platform has evolved a LOT since then, so I don't think it's worthwhile to make a stand on original intent. The wording around CME and fail-fast is very pragmatic, so it sense to be pragmatic about this issue. s'marks
On Wed, May 16, 2018 at 11:25 AM, Stuart Marks <stuart.marks@oracle.com> wrote:
On 5/15/18 7:37 PM, David Holmes wrote:
I'm still with Martin on this. It makes no sense to me to allow replacing one element to not cause CME, but replacing more than one does cause CME. This is inconsistent and confusing and the end result is the programmer won't know what to expect when or why.
This is the fallacy of composition, that if a single operation has some property, then an arbitrary composition of those operations must also have that property. This is decidedly not true for many things. Consider atomicity for example. My claim is that the setting of an individual element is qualitatively different from a bulk operation.
(We don't seem to be moving towards consensus ...) atomicity is "obviously" a non-composable property of an operation, but contrarily "does not modify" is "obviously" composable. Having replaceAll increment modCount is almost as wrong as having forEach increment modCount, even though """forEach is qualitatively different from get().""" Incrementing modCount is introducing a subtle race into plausible user code that would otherwise be correct. At the very least, having replaceAll increment modCount (inconsistently!) is surprising to users and is not documented anywhere.
On 17/05/2018 4:25 AM, Stuart Marks wrote:
On 5/15/18 7:37 PM, David Holmes wrote:
I'm still with Martin on this. It makes no sense to me to allow replacing one element to not cause CME, but replacing more than one does cause CME. This is inconsistent and confusing and the end result is the programmer won't know what to expect when or why.
This is the fallacy of composition, that if a single operation has some property, then an arbitrary composition of those operations must also have that property. This is decidedly not true for many things. Consider atomicity for example. My claim is that the setting of an individual element is qualitatively different from a bulk operation.
No it's not "the fallacy of composition". Just because some operations can not compose it does not follow that no operations can ever compose.
The original intent of CME, from my recollections back in lead-up-to-Java-5 days, was to prevent iterators from breaking i.e. throwing exceptions, due to the occurrence of the "concurrent" operation that changed the structure. It was not intended as an indicator of a semantic programming error. Replacing one element whilst there is a live iterator can be just as semantically wrong as replacing them all.
CME, iterators, and the fail-fast concept were introduced in JDK 1.2
Yes but believe me it was discussed a LOT when we introduced the concurrency utilities! :)
with the Collections Framework. The platform has evolved a LOT since then, so I don't think it's worthwhile to make a stand on original intent.
That seems an attempt at legitimising straying from the original intent, rather than recognizing that straying away from that intent should have been considered a bug. Besides, ArrayList clearly documents that original intent: "if the list is structurally modified ...".
The wording around CME and fail-fast is very pragmatic, so it sense to be pragmatic about this issue.
I think I am being pragmatic. A change in content is not a change in structure. CME is inappropriate in such circumstances IMHO. It seems to me that inconsistencies have been allowed to creep in wrt. concurrent "modifications" and throwing CME. Now that is recognized we have the dilemma of undoing past mistakes or embracing a stronger (undocumented) notion of "concurrent modification" and applying it uniformly. Neither path is smooth unfortunately. David -----
s'marks
On Tue, May 15, 2018 at 5:34 PM, Stuart Marks <stuart.marks@oracle.com> wrote:
Bringing replaceAll() to the public API exposes it as a single operation, with its own semantics.
Note that Vector.replaceAll() holds the lock for the entire operation, not merely around individual set() operations. It really is different from performing individual set() operations.
How many times the lock is acquired is an implementation detail, a non-obvious tradeoff even. vector.replaceAll holds the lock throughout, but vector.subList(0, size).replaceAll holds the lock for each element (sigh ... racily (really a bug!)).
Again, imagine this use case: there is a periodic background task that
optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress.
I don't think it's possible to do that correctly without holding a lock around the entire iteration.
I don't see why.
If the lock is held, CME can't occur, as a concurrent replaceAll() will occur before or after the iteration, never in the middle.
I don't see why - iteration is inherently non-atomic, so replaceAll could be called in the middle.
If an iteration over a Vector doesn't hold a lock, any read-modify-write operations (consider a loop with a ListIterator on which set() is called) can be interleaved with bulk operations (like replaceAll) which is clearly incorrect. In such cases, CME should be thrown.
Imagine your iterators are all read-only, and don't care whether they see an element or its replacement.
Also, this use case cannot be written today, because CME is thrown.
?? Imagine there's only one writer thread, with some Iterating reader threads. Every midnight, the writer thread optimizes all the elements for (int i = 0; i < size; i++) vector.set(i, optimized(vector.get(i)) This code has worked well since the '90s without CME. One day the maintainer notices shiny new Vector.replaceAll and "fixes" the code. """It's perfectly safe""". The CME is rare and so only caught when the maintainer has gone on to the next job. The CMEs only happen at midnight!
On 5/15/18 8:02 PM, Martin Buchholz wrote:
How many times the lock is acquired is an implementation detail, a non-obvious tradeoff even. vector.replaceAll holds the lock throughout, but vector.subList(0, size).replaceAll holds the lock for each element (sigh ... racily (really a bug!)).
In Vector's case it's specified, not an implementation detail. You can't perform certain bulk operations on a Vector without holding the lock externally.
Again, imagine this use case: there is a periodic background task that optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress.
I don't think it's possible to do that correctly without holding a lock around the entire iteration.
I don't see why.
If the lock is held, CME can't occur, as a concurrent replaceAll() will occur before or after the iteration, never in the middle.
I don't see why - iteration is inherently non-atomic, so replaceAll could be called in the middle.
If an iteration over a Vector doesn't hold a lock, any read-modify-write operations (consider a loop with a ListIterator on which set() is called) can be interleaved with bulk operations (like replaceAll) which is clearly incorrect. In such cases, CME should be thrown.
Imagine your iterators are all read-only, and don't care whether they see an element or its replacement.
You're imagining an incredibly narrow use case. You're choosing Vector, not ArrayList; the replaceAll() operation must an optimization that doesn't affect the semantics of the object (say, the outcome of any logic); the iteration must be read-only, not read-modify-write; and the logic within the iteration "doesn't care" whether it gets old or new values. I don't find it compelling that it's possible to imagine such a case. Most code won't conform to it. And in fact it's really hard to tell whether it does. Consider an example like this: for (T t : vectorOfT) { print(t); } Suppose that a replaceAll() on another thread occurs, and that this is allowed. Does the application care whether the eventual printout contains partly new values and partly old values? How can you tell? It seems to me that this is more likely a programming error than a valid use case.
Also, this use case cannot be written today, because CME is thrown.
?? Imagine there's only one writer thread, with some Iterating reader threads. Every midnight, the writer thread optimizes all the elements for (int i = 0; i < size; i++) vector.set(i, optimized(vector.get(i)) This code has worked well since the '90s without CME. One day the maintainer notices shiny new Vector.replaceAll and "fixes" the code. """It's perfectly safe""". The CME is rare and so only caught when the maintainer has gone on to the next job. The CMEs only happen at midnight!
By "cannot be written today" I mean that the following: for (T t: arrayList) { if (condition) { list.replaceAll(); } } has ALWAYS thrown CME, since the introduction of replaceAll(). Sure, there are cases such as you describe where CME gets thrown only rarely. That's preferable to getting incorrect results equally rarely. That's the point of fail-fast. ** (subsequent email)
(We don't seem to be moving towards consensus ...)
Apparently....
At the very least, having replaceAll increment modCount (inconsistently!) is surprising to users and is not documented anywhere.
Maybe it should be documented then. ** Why are you placing so much emphasis on *removing* the CME behavior? It's been this way since Java 8 was delivered. Is this causing a problem? What will be solved by removing it? s'marks
On Wed, May 16, 2018 at 2:21 PM, Stuart Marks <stuart.marks@oracle.com> wrote:
On 5/15/18 8:02 PM, Martin Buchholz wrote:
How many times the lock is acquired is an implementation detail, a non-obvious tradeoff even. vector.replaceAll holds the lock throughout, but vector.subList(0, size).replaceAll holds the lock for each element (sigh ... racily (really a bug!)).
In Vector's case it's specified, not an implementation detail. You can't perform certain bulk operations on a Vector without holding the lock externally.
What is specified? Vector is synchronized, but it's unspecified whether external synchronization via synchronized (vector) {...} works Vector.replaceAll simply inherits spec from List.replaceAll - it seems unspecified whether bulk operations like replaceAll on Vector are atomic. (In fact they are not atomic on the subList).
Again, imagine this use case: there is a periodic background task
that optimizes all the elements of a Vector vector.replaceAll(x -> optimized(x)) That should not break any iterations in progress.
I don't think it's possible to do that correctly without holding a lock around the entire iteration.
I don't see why.
If the lock is held, CME can't occur, as a concurrent replaceAll() will occur before or after the iteration, never in the middle.
I don't see why - iteration is inherently non-atomic, so replaceAll could be called in the middle.
If an iteration over a Vector doesn't hold a lock, any read-modify-write operations (consider a loop with a ListIterator on which set() is called) can be interleaved with bulk operations (like replaceAll) which is clearly incorrect. In such cases, CME should be thrown.
Imagine your iterators are all read-only, and don't care whether they see an element or its replacement.
You're imagining an incredibly narrow use case. You're choosing Vector, not ArrayList; the replaceAll() operation must an optimization that doesn't affect the semantics of the object (say, the outcome of any logic); the iteration must be read-only, not read-modify-write; and the logic within the iteration "doesn't care" whether it gets old or new values.
I don't find it compelling that it's possible to imagine such a case. Most code won't conform to it. And in fact it's really hard to tell whether it does.
Consider an example like this:
for (T t : vectorOfT) { print(t); }
Suppose that a replaceAll() on another thread occurs, and that this is allowed. Does the application care whether the eventual printout contains partly new values and partly old values? How can you tell? It seems to me that this is more likely a programming error than a valid use case.
Vector is unloved and there are few good uses for it at all - the List interface is concurrency-hostile. BUT one good use case for concurrency here seems to be when there are no structural modifications, when an index unambiguously identifies a "place". Sort of like AtomicReferenceArray. We could aim towards making Vector and AtomicReferenceArray and ConcurrentHashMap more useful and cross-compatible (e.g. Vector.accumulateAndGet0
Also, this use case cannot be written today, because CME is thrown.
?? Imagine there's only one writer thread, with some Iterating reader threads. Every midnight, the writer thread optimizes all the elements for (int i = 0; i < size; i++) vector.set(i, optimized(vector.get(i)) This code has worked well since the '90s without CME. One day the maintainer notices shiny new Vector.replaceAll and "fixes" the code. """It's perfectly safe""". The CME is rare and so only caught when the maintainer has gone on to the next job. The CMEs only happen at midnight!
By "cannot be written today" I mean that the following:
for (T t: arrayList) { if (condition) { list.replaceAll(); } }
has ALWAYS thrown CME, since the introduction of replaceAll().
But that's a bug (!) and the variant below has never thrown CME, and I don't want to compound the existing bug. for (T t: arrayList) { if (condition) { list.subList(0, list.size()).replaceAll(); } } Sure, there are cases such as you describe where CME gets thrown only
rarely. That's preferable to getting incorrect results equally rarely. That's the point of fail-fast.
**
(subsequent email)
(We don't seem to be moving towards consensus ...)
Apparently....
At the very least, having replaceAll increment modCount (inconsistently!)
is surprising to users and is not documented anywhere.
Maybe it should be documented then.
**
Why are you placing so much emphasis on *removing* the CME behavior? It's been this way since Java 8 was delivered. Is this causing a problem? What will be solved by removing it?
I started out "simply" optimizing ArrayList subList replaceAll Then I noticed the obvious implementation would increment modCount. Which would be introducing a new bug....
I owe all of you replies on this, but I don't have time to continue the discussion, and I suspect you all want to get on with things. Let me have a quick chat with Paul tomorrow and then I'll propose a path forward. s'marks
Hello, On 5/17/2018 8:51 PM, Stuart Marks wrote:
I owe all of you replies on this, but I don't have time to continue the discussion, and I suspect you all want to get on with things.
Let me have a quick chat with Paul tomorrow and then I'll propose a path forward.
I've spoken about this proposed change with Stuart and Paul. Given that the replaceAll methods have incremented modCount since they were introduced in JDK 8, changing that behavior in widely-used classes merits a CSR to be filed for JDK 11. The modCount aspect of the changeset could be split out if you want to proceed with the rest. Separately, it would be worthwhile to have a general discussion and review of the desired semantics for modCount and consistency of the various replaceAll implementations, including reviewing Map.replaceAll. Thanks, -Joe
On Fri, May 18, 2018 at 4:30 PM, joe darcy <joe.darcy@oracle.com> wrote:
Hello,
On 5/17/2018 8:51 PM, Stuart Marks wrote:
Given that the replaceAll methods have incremented modCount since they
were introduced in JDK 8, changing that behavior in widely-used classes merits a CSR to be filed for JDK 11.
OK, we can file a CSR but we should achieve "rough consensus" first.
The modCount aspect of the changeset could be split out if you want to proceed with the rest.
I'll do that if consensus remains elusive.
Sorry Stuart, but I'm joining the no-modCount barrage. To pick on the main issue... On 05/16/2018 05:21 PM, Stuart Marks wrote:
Suppose that a replaceAll() on another thread occurs, and that this is allowed. Does the application care whether the eventual printout contains partly new values and partly old values? How can you tell? It seems to me that this is more likely a programming error than a valid use case.
There are many data-races that are never detected via CME. CME was designed to trap only some of them (and even at that, only heuristically because modCount operations are not strictly ordered). ModCount mechanics only deal with insertions and removals. Failing to catch a race in replaceAll is no different than failing to catch one with multiple concurrent setValues. It might be "nice" to adjust modCount on any operation that might contribute to a datarace, but no reason to single out replaceAll. Having said this, I don't think that anything in the spec (vs default implementation) prohibits an implementation of replaceAll from removing and then re-inserting each element, in which case an implementation would modify modCounts. -Doug
Hi, My CME budget is definitely in the red now :-) but i would like to attempt to summarize the different positions and suggest a solution. If that does not get traction i believe a graceful retreat is in order. We appear to be in two different philosophical camps with regards to the interpretation of modCount (i suspect what both camps believe is compatible with what CME specifies [*]). One camp believes AbstractList.modCount should only (on a best-effort basis) be incremented on structural modifications that change the size (even if the size change cannot be externally observed, like the example Doug presents). The other camp believes there are other forms of modification that perturb the list leading to incorrect results since those forms can affect yet to be traversed elements. It is argued that bulk operations such as replaceAll and sort fit in this category, regardless of the implementation. I hope that is an accurate representation, and if that is so here is a rough proposal bring those camps together. The specification of AbstractList.modCount could be modified as follows: /** * The number of times this list has been <i>structurally modified</i> * or <i>globally modified</i>. * Structural modifications are those that change the size of the * list, or otherwise perturb it in such a fashion that iterations in * progress may yield incorrect results. * Global modifications are those where the size may not change but the * list is perturbed as a whole in such a fashion that iterations in progress * may yield incorrect results. Thoughts? Paul. [*] on CME * Note that this exception does not always indicate that an object has * been concurrently modified by a <i>different</i> thread. If a single * thread issues a sequence of method invocations that violates the * contract of an object, the object may throw this exception. For * example, if a thread modifies a collection directly while it is * iterating over the collection with a fail-fast iterator, the iterator * will throw this exception. * * <p>Note that fail-fast behavior cannot be guaranteed as it is, generally * speaking, impossible to make any hard guarantees in the presence of * unsynchronized concurrent modification. Fail-fast operations * throw {@code ConcurrentModificationException} on a best-effort basis. * Therefore, it would be wrong to write a program that depended on this * exception for its correctness: <i>{@code ConcurrentModificationException} * should be used only to detect bugs.</i>
On May 16, 2018, at 4:04 PM, Doug Lea <dl@cs.oswego.edu> wrote:
Sorry Stuart, but I'm joining the no-modCount barrage. To pick on the main issue...
On 05/16/2018 05:21 PM, Stuart Marks wrote:
Suppose that a replaceAll() on another thread occurs, and that this is allowed. Does the application care whether the eventual printout contains partly new values and partly old values? How can you tell? It seems to me that this is more likely a programming error than a valid use case.
There are many data-races that are never detected via CME. CME was designed to trap only some of them (and even at that, only heuristically because modCount operations are not strictly ordered). ModCount mechanics only deal with insertions and removals. Failing to catch a race in replaceAll is no different than failing to catch one with multiple concurrent setValues. It might be "nice" to adjust modCount on any operation that might contribute to a datarace, but no reason to single out replaceAll.
Having said this, I don't think that anything in the spec (vs default implementation) prohibits an implementation of replaceAll from removing and then re-inserting each element, in which case an implementation would modify modCounts.
-Doug
Sorry Paul but all this does is endorse the second camp: they want structural and content modifications to trigger CME. Further it assumes that doing a replaceAll while an iterator is active is always an error which need not be the case. CME is about the safety of the implementation, not the semantics of concurrent observation. Also it is not just a "belief" that CME is only for structural modifications on ArrayList, it is clearly specified as such. Cheers, David On 19/05/2018 7:25 AM, Paul Sandoz wrote:
Hi,
My CME budget is definitely in the red now :-) but i would like to attempt to summarize the different positions and suggest a solution. If that does not get traction i believe a graceful retreat is in order.
We appear to be in two different philosophical camps with regards to the interpretation of modCount (i suspect what both camps believe is compatible with what CME specifies [*]).
One camp believes AbstractList.modCount should only (on a best-effort basis) be incremented on structural modifications that change the size (even if the size change cannot be externally observed, like the example Doug presents).
The other camp believes there are other forms of modification that perturb the list leading to incorrect results since those forms can affect yet to be traversed elements. It is argued that bulk operations such as replaceAll and sort fit in this category, regardless of the implementation.
I hope that is an accurate representation, and if that is so here is a rough proposal bring those camps together. The specification of AbstractList.modCount could be modified as follows:
/** * The number of times this list has been <i>structurally modified</i> * or <i>globally modified</i>. * Structural modifications are those that change the size of the * list, or otherwise perturb it in such a fashion that iterations in * progress may yield incorrect results. * Global modifications are those where the size may not change but the * list is perturbed as a whole in such a fashion that iterations in progress * may yield incorrect results.
Thoughts?
Paul.
[*] on CME * Note that this exception does not always indicate that an object has * been concurrently modified by a <i>different</i> thread. If a single * thread issues a sequence of method invocations that violates the * contract of an object, the object may throw this exception. For * example, if a thread modifies a collection directly while it is * iterating over the collection with a fail-fast iterator, the iterator * will throw this exception. * * <p>Note that fail-fast behavior cannot be guaranteed as it is, generally * speaking, impossible to make any hard guarantees in the presence of * unsynchronized concurrent modification. Fail-fast operations * throw {@code ConcurrentModificationException} on a best-effort basis. * Therefore, it would be wrong to write a program that depended on this * exception for its correctness: <i>{@code ConcurrentModificationException} * should be used only to detect bugs.</i>
On May 16, 2018, at 4:04 PM, Doug Lea <dl@cs.oswego.edu> wrote:
Sorry Stuart, but I'm joining the no-modCount barrage. To pick on the main issue...
On 05/16/2018 05:21 PM, Stuart Marks wrote:
Suppose that a replaceAll() on another thread occurs, and that this is allowed. Does the application care whether the eventual printout contains partly new values and partly old values? How can you tell? It seems to me that this is more likely a programming error than a valid use case.
There are many data-races that are never detected via CME. CME was designed to trap only some of them (and even at that, only heuristically because modCount operations are not strictly ordered). ModCount mechanics only deal with insertions and removals. Failing to catch a race in replaceAll is no different than failing to catch one with multiple concurrent setValues. It might be "nice" to adjust modCount on any operation that might contribute to a datarace, but no reason to single out replaceAll.
Having said this, I don't think that anything in the spec (vs default implementation) prohibits an implementation of replaceAll from removing and then re-inserting each element, in which case an implementation would modify modCounts.
-Doug
I like thinking of ArrayList as just an optimized HashMap, Lua-style, and HashMap.replaceAll also does not increment modCount, supporting our "structural modification" position. The thing that bothers me most about the status quo is the *inconsistency* - between root list and subList, between default implementation and concrete implementation, and between ArrayList and HashMap. And there seems to be no reason for users to expect any such inconsistency.
On May 19, 2018, at 9:42 AM, Martin Buchholz <martinrb@google.com> wrote:
I like thinking of ArrayList as just an optimized HashMap, Lua-style, and HashMap.replaceAll also does not increment modCount, supporting our "structural modification" position.
The thing that bothers me most about the status quo is the *inconsistency* - between root list and subList, between default implementation and concrete implementation, and between ArrayList and HashMap. And there seems to be no reason for users to expect any such inconsistency.
FWIW my $0.02. Sadly I don’t have time to read this whole CME thread, but it seems to me that the written spec. has to lead here, not the de facto spec. of how stuff throws CME today. And we have to start with the javadoc in CME.java. Sadly, it is more allusive than definitive but we have to work with what we have. When read closely, as if it were a true spec instead of a mere handful of examples, supports a pretty narrow interpretation of when CME can be thrown: A. When race conditions are detected (not done today at all, but would be with thread-confined data!). B. When a backing collection is modified between calls to an iterator over it. C. Overlapping cases A and B, when an iterator may experience “arbitrary, non-deterministic behavior” at some later point. (This points, at least, to dangling pointers to inactive backing storage after a resize, and also to buffering of values in the iterator which duplicates the backing state.) D. Finally, a catch-all “sequence of method invocations that violates the contract of an object”. The catch-all case D has (maybe) been treated as grounds for adding lots of additional ad hoc checks in various implementations, but (to follow the written spec) those ad hoc checks must be clearly documented as “violating the contract of the object”. And that means that the object itself, in its subclass, must have documented a contract that could be violated by an invalid sequence of calls to various methods on that object. At the same time, List (for example) is a very general-purpose object and as such has a very simple (I’d say “categorical”) contract. A subtype of List might have a more rigorous contact and throw more CMEs but the default for List as a whole the criteria should be as narrow as possible, subject to the goals A, B, C, above. This CME thread (the parts on the side of more CMEs) appeals to the List specification where it says “otherwise perturb it in such a fashion that iterations in progress may yield incorrect results”, saying that when we spot a result that seems obviously incorrect, we are permitted to throw a CME. But I think in order to follow the spec. we need to insist that the phrase “incorrect results” has a definite and documented meaning, or else that it refers (as CME.java does) to some hypothetical subtype of List that has additional contracts. The phrase “incorrect results” in List cannot be carte blanche to add checks where we are afraid the programmer might have made an error even if we cannot prove one; that would be just officious mind-reading. I don’t know whose side this favors more, but I think we all can agree that a careful and disciplined reading of the spec. will sort us out, and that a narrow construction of the spec. evidence is probably safer than a broad one. Final point: Since the CME spec. says it is a best-effort bug detection, we have nothing to lose from removing CME throws, if consistency is the most important goal. Adding new CME throws surely must be justified by a careful, conservative reading of the existing spec., which is what I have tried to provide in this note. HTH — John
On 05/20/18 00:54, John Rose wrote:
The phrase “incorrect results” in List cannot be carte blanche to add checks where we are afraid the programmer might have made an error even if we cannot prove one; that would be just officious mind-reading. +1
I'm all for adding checks that prevent programming errors, but not for the price of preventing a legal and perfectly logical usecase, such as the one Martin has demonstrated. Regards, Peter
Hi John, On 20/05/2018 8:54 AM, John Rose wrote:
On May 19, 2018, at 9:42 AM, Martin Buchholz <martinrb@google.com> wrote:
I like thinking of ArrayList as just an optimized HashMap, Lua-style, and HashMap.replaceAll also does not increment modCount, supporting our "structural modification" position.
The thing that bothers me most about the status quo is the *inconsistency* - between root list and subList, between default implementation and concrete implementation, and between ArrayList and HashMap. And there seems to be no reason for users to expect any such inconsistency.
FWIW my $0.02.
Just to add ...
Sadly I don’t have time to read this whole CME thread, but it seems to me that the written spec. has to lead here, not the de facto spec. of how stuff throws CME today.
And we have to start with the javadoc in CME.java. Sadly, it is more allusive than definitive but we have to work with what we have. When
My understanding from Java 2 days is that CME was not intended to specify exactly when it should be thrown. It was provided as an exception type to be used when the various kinds of example conditions were encountered. It was then up to specific concrete collection implementations (not the abstract interfaces - as these are implementation concerns) to specify when that implementation would attempt to detect such problems and throw CME. Cheers, David -----
read closely, as if it were a true spec instead of a mere handful of examples, supports a pretty narrow interpretation of when CME can be thrown:
A. When race conditions are detected (not done today at all, but would be with thread-confined data!).
B. When a backing collection is modified between calls to an iterator over it.
C. Overlapping cases A and B, when an iterator may experience “arbitrary, non-deterministic behavior” at some later point. (This points, at least, to dangling pointers to inactive backing storage after a resize, and also to buffering of values in the iterator which duplicates the backing state.)
D. Finally, a catch-all “sequence of method invocations that violates the contract of an object”.
The catch-all case D has (maybe) been treated as grounds for adding lots of additional ad hoc checks in various implementations, but (to follow the written spec) those ad hoc checks must be clearly documented as “violating the contract of the object”. And that means that the object itself, in its subclass, must have documented a contract that could be violated by an invalid sequence of calls to various methods on that object.
At the same time, List (for example) is a very general-purpose object and as such has a very simple (I’d say “categorical”) contract. A subtype of List might have a more rigorous contact and throw more CMEs but the default for List as a whole the criteria should be as narrow as possible, subject to the goals A, B, C, above.
This CME thread (the parts on the side of more CMEs) appeals to the List specification where it says “otherwise perturb it in such a fashion that iterations in progress may yield incorrect results”, saying that when we spot a result that seems obviously incorrect, we are permitted to throw a CME.
But I think in order to follow the spec. we need to insist that the phrase “incorrect results” has a definite and documented meaning, or else that it refers (as CME.java does) to some hypothetical subtype of List that has additional contracts. The phrase “incorrect results” in List cannot be carte blanche to add checks where we are afraid the programmer might have made an error even if we cannot prove one; that would be just officious mind-reading.
I don’t know whose side this favors more, but I think we all can agree that a careful and disciplined reading of the spec. will sort us out, and that a narrow construction of the spec. evidence is probably safer than a broad one.
Final point: Since the CME spec. says it is a best-effort bug detection, we have nothing to lose from removing CME throws, if consistency is the most important goal. Adding new CME throws surely must be justified by a careful, conservative reading of the existing spec., which is what I have tried to provide in this note.
HTH — John
We seem to have enough consensus to change the modCount behavior of replaceAll. But let's leave that to a future change and get this integration in by (temporarily) reverting to the status quo: Index: src/main/java/util/ArrayList.java =================================================================== RCS file: /export/home/jsr166/jsr166/jsr166/src/main/java/util/ArrayList.java,v retrieving revision 1.61 diff -u -r1.61 ArrayList.java --- src/main/java/util/ArrayList.java 18 May 2018 03:48:34 -0000 1.61 +++ src/main/java/util/ArrayList.java 22 May 2018 15:40:27 -0000 @@ -1737,6 +1737,7 @@ @Override public void replaceAll(UnaryOperator<E> operator) { replaceAllRange(operator, 0, size); + modCount++; } private void replaceAllRange(UnaryOperator<E> operator, int i, int end) { Index: src/main/java/util/Vector.java =================================================================== RCS file: /export/home/jsr166/jsr166/jsr166/src/main/java/util/Vector.java,v retrieving revision 1.50 diff -u -r1.50 Vector.java --- src/main/java/util/Vector.java 5 May 2018 18:29:53 -0000 1.50 +++ src/main/java/util/Vector.java 22 May 2018 15:40:27 -0000 @@ -1411,6 +1411,7 @@ es[i] = operator.apply(elementAt(es, i)); if (modCount != expectedModCount) throw new ConcurrentModificationException(); + modCount++; // checkInvariants(); } Index: src/test/tck/Collection8Test.java =================================================================== RCS file: /export/home/jsr166/jsr166/jsr166/src/test/tck/Collection8Test.java,v retrieving revision 1.54 diff -u -r1.54 Collection8Test.java --- src/test/tck/Collection8Test.java 6 May 2018 22:33:06 -0000 1.54 +++ src/test/tck/Collection8Test.java 22 May 2018 15:40:27 -0000 @@ -952,7 +952,7 @@ } catch (java.io.NotSerializableException acceptable) {} } - public void testReplaceAllIsNotStructuralModification() { + public void DISABLED_testReplaceAllIsNotStructuralModification() { Collection c = impl.emptyCollection(); if (!(c instanceof List)) return;
Hi Martin, On 5/22/18 8:50 AM, Martin Buchholz wrote:
We seem to have enough consensus to change the modCount behavior of replaceAll.
Sorry, I don't agree that there is such consensus. There are about half a dozen different issues all at play, spread across several messages throughout the thread. Instead of responding to the individual messages piecemeal, I'm trying to gather them into a unified reply. And I'm still digesting the emails that came in over the weekend. More to come on this.
But let's leave that to a future change and get this integration in by (temporarily) reverting to the status quo: [...]
Thanks, I appreciate your taking this approach. I've filed JDK-8203662 to cover the future change, and I've assigned it to you. The CSR request can also be linked to that issue. I've also filed JDK-8203663 to cover the broader issue of any spec changes and the inconsistency of CME policy across the different collection fail-fast collections in the JDK. s'marks
On 5/22/18 2:58 PM, Stuart Marks wrote:
On 5/22/18 8:50 AM, Martin Buchholz wrote:
We seem to have enough consensus to change the modCount behavior of replaceAll.
Sorry, I don't agree that there is such consensus. There are about half a dozen different issues all at play, spread across several messages throughout the thread. Instead of responding to the individual messages piecemeal, I'm trying to gather them into a unified reply. And I'm still digesting the emails that came in over the weekend. More to come on this.
I've collected my thoughts on this issue into a document I've placed here: http://cr.openjdk.java.net/~smarks/misc/comod0.html I've also placed a link into the CSR request, JDK-8203704. s'marks
I agree with "tempest in a teapot". The practical effect either way will be small. We seem unlikely to convince each other with new arguments. My thinking may be influenced by my growing belief that having ConcurrentModificationException in the core libraries at all was a mistake - they belong in a special "debug" version of the core libraries. I have never regarded CME as a *service* to *my* java programs. Since ConcurrentModificationException can also occur inside a single thread, there are valid (recursive) (admittedly rare) calls involving ArrayList, not just Vector. On Wed, May 30, 2018 at 3:58 PM, Stuart Marks <stuart.marks@oracle.com> wrote:
I've collected my thoughts on this issue into a document I've placed here:
Hi John, On 5/19/2018 3:54 PM, John Rose wrote:
On May 19, 2018, at 9:42 AM, Martin Buchholz <martinrb@google.com> wrote:
I like thinking of ArrayList as just an optimized HashMap, Lua-style, and HashMap.replaceAll also does not increment modCount, supporting our "structural modification" position.
The thing that bothers me most about the status quo is the *inconsistency* - between root list and subList, between default implementation and concrete implementation, and between ArrayList and HashMap. And there seems to be no reason for users to expect any such inconsistency. FWIW my $0.02.
I'll take your $0.02 and raise you another $0.02. Commenting on your commentary rather than on the particular merits or demerits of whether replaceAll should throw CME, I think a more holistic approach should be used to evaluate CME and replaceAll and similar issues. The specification of CME is not sacrosanct. We can and do evolve the specification of types all the time, even ones that have been in the platform for many years. We certainly have constraints on that evolution to maintain compatibility (https://wiki.openjdk.java.net/display/csr/Main). However, given the first and last sentence of the main CME javadoc: "This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible. ... ConcurrentModificationException should be used only to detect bugs." having a CME thrown during replaceALL seems well within the general intended semantics of the CME exception type. Specifications are updated for a variety of reasons. Sometimes they just need refinement. Sometimes our understanding of the problem space changes. And sometimes the needs of the platform change and the existing types need to be adapted. The replaceAll methods under discussion have been coded to throw CME since they were introduced in JDK 8. To my knowledge, there have been no bugs filed complaining about this behavior. If as stewards of the platform we conclude it is useful for replaceAll to throw CME, I think we should not stop ourselves from doing so on the mistaken belief that our use of CME cannot be evolved. I do think it is reasonable to have consistent use of modCount and CME in the different replaceAll implementations. Cheers, -Joe
On Tue, May 22, 2018 at 3:41 PM, joe darcy <joe.darcy@oracle.com> wrote:
The specification of CME is not sacrosanct. We can and do evolve the specification of types all the time, even ones that have been in the platform for many years. We certainly have constraints on that evolution to maintain compatibility (https://wiki.openjdk.java.net/display/csr/Main). However, given the first and last sentence of the main CME javadoc:
"This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible. ... ConcurrentModificationException should be used only to detect bugs."
having a CME thrown during replaceALL seems well within the general intended semantics of the CME exception type.
Specifications of exception classes are (naturally) more general and vague than the specification of concrete classes and methods that throw them. The spec for ConcurrentModificationException does not mention the concept of "structural modification" but e.g. the spec for Vector states """if the vector is structurally modified at any time after the iterator is created"""
On 23/05/2018 4:40 PM, Martin Buchholz wrote:
On Tue, May 22, 2018 at 3:41 PM, joe darcy <joe.darcy@oracle.com> wrote:
The specification of CME is not sacrosanct. We can and do evolve the specification of types all the time, even ones that have been in the platform for many years. We certainly have constraints on that evolution to maintain compatibility (https://wiki.openjdk.java.net/display/csr/Main). However, given the first and last sentence of the main CME javadoc:
"This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible. ... ConcurrentModificationException should be used only to detect bugs."
having a CME thrown during replaceALL seems well within the general intended semantics of the CME exception type.
Specifications of exception classes are (naturally) more general and vague than the specification of concrete classes and methods that throw them. The spec for ConcurrentModificationException does not mention the concept of "structural modification" but e.g. the spec for Vector states """if the vector is structurally modified at any time after the iterator is created"""
Which is exactly the point I made in response to John's email and prior to Joe's. CME is not the definitive source of knowledge on when to throw CME. You can throw a CME any time it's general/vague description matches your usecase. You're not required to throw it just because someone else thinks a particular usecase fits its description! The key part of the CME spec with regard to intended use is "when such modification is not permissible". Granted "permissible" then becomes highly subjective. But to me this relates to a decision the designer - not user - of the data structure makes as to whether a modification will lead to an error condition or not, regardless of the semantic context of use for the data structure. The fact there has been "misuse" of modCount modifications since JDK 8 and no one has complained does not lead to a conclusion that therefore this throwing of CME is perfectly acceptable, it more likely means noone has run into it because they don't use their Collections that way anyway, and all this is a storm in a teacup from a practical development perspective. But the lack of consistency is the problem. I don't think I have any cycles left for this regardless of what Stuart comes back with. Cheers, David
participants (9)
-
Claes Redestad
-
David Holmes
-
Doug Lea
-
joe darcy
-
John Rose
-
Martin Buchholz
-
Paul Sandoz
-
Peter Levart
-
Stuart Marks