Collections.addAll: remove outdated performance advice and add a note about atomicity
Hello! I suggest a patch for java.util.Collections#addAll JavaDoc: --- Collections.java 2018-01-31 09:39:31.599107500 +0700 +++ Collections.java.patched 2018-01-31 09:51:11.929059600 +0700 @@ -5406,4 +5406,8 @@ * The behavior of this convenience method is identical to that of - * {@code c.addAll(Arrays.asList(elements))}, but this method is likely - * to run significantly faster under most implementations. + * {@code c.addAll(Arrays.asList(elements))} except possible + * difference in intermediate state visibility for concurrent or + * synchronized collections. Calling this method does not guarantee + * that the intermediate state (some of elements are added) is invisible, + * even if the collection itself provides such guarantee for its + * {@link Collection#addAll(Collection)} method. * First, currently it says that Collections#addAll is likely to run significantly faster. However it's only marginally faster for collections which delegate their addAll method to standard AbstractCollection#addAll implementation. Also it could be much slower for collections which have optimized addAll (like ArrayList, CopyOnWriteArrayList, ConcurrentLinkedDeque, etc.). I don't know a single example of collection where Collections#addAll is actually significantly faster. Also it says that the behavior is identical, while it's not. If, e.g. c is a collection returned from synchronizedCollection, then intermediate state of c.addAll(Arrays.asList(elements)) would not be visible under synchronized(c) in another thread. On the other hand, replacing such call with Collections.addAll(c, elements) (to make it "significantly faster") will lift this guarantee: now you can see partially added array. What do you think? Should I file an issue? With best regards, Tagir Valeev.
I tried to tackle this here: http://openjdk.markmail.org/thread/eet2zd6ig3pfpv5g and it's still on my TODO list but not likely to get to top spot soon. On Tue, Jan 30, 2018 at 7:00 PM, Tagir Valeev <amaembo@gmail.com> wrote:
Hello!
I suggest a patch for java.util.Collections#addAll JavaDoc:
--- Collections.java 2018-01-31 09:39:31.599107500 +0700 +++ Collections.java.patched 2018-01-31 09:51:11.929059600 +0700 @@ -5406,4 +5406,8 @@ * The behavior of this convenience method is identical to that of - * {@code c.addAll(Arrays.asList(elements))}, but this method is likely - * to run significantly faster under most implementations. + * {@code c.addAll(Arrays.asList(elements))} except possible + * difference in intermediate state visibility for concurrent or + * synchronized collections. Calling this method does not guarantee + * that the intermediate state (some of elements are added) is invisible, + * even if the collection itself provides such guarantee for its + * {@link Collection#addAll(Collection)} method. *
First, currently it says that Collections#addAll is likely to run significantly faster. However it's only marginally faster for collections which delegate their addAll method to standard AbstractCollection#addAll implementation. Also it could be much slower for collections which have optimized addAll (like ArrayList, CopyOnWriteArrayList, ConcurrentLinkedDeque, etc.). I don't know a single example of collection where Collections#addAll is actually significantly faster. Also it says that the behavior is identical, while it's not. If, e.g. c is a collection returned from synchronizedCollection, then intermediate state of c.addAll(Arrays.asList(elements)) would not be visible under synchronized(c) in another thread. On the other hand, replacing such call with Collections.addAll(c, elements) (to make it "significantly faster") will lift this guarantee: now you can see partially added array.
What do you think? Should I file an issue?
With best regards, Tagir Valeev.
Links to existing material in OpenJDK: http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050326.ht... https://bugs.openjdk.java.net/browse/JDK-8193031 I agree with the removal of the performance advice. We should also remove "identical"; my suggested replacement was "as if". We can even add some hedging about whether the operation is atomic for certain collections (i.e., the synchronized ones). However, I'm not sure if there is anything useful to say about atomicity of bulk updates. They're only atomic for the synchronized collections, which are largely disused. It's pointless to talk about atomicity for non-concurrent collections, and the operations aren't atomic for things like CopyOnWriteArrayList and a Set projection of ConcurrentHashMap. So I'm not sure discussion of atomicity is useful or warranted here. (Also, adding a collection of elements one at a time to a CopyOnWriteArrayList: *shudder*.) I think the most useful thing is to define a new array-reading default method. Each implementation can then override it to use the best technique for that implementation. (I had previously called this "addEach" but I'm flexible on naming.) Adding a new default method is kind of far afield from where this started. If the spec is bothersome, perhaps we could consider a spec cleanup change separately from implementation changes and definition of a new default method. s'marks On 1/30/18 7:07 PM, Martin Buchholz wrote:
I tried to tackle this here: http://openjdk.markmail.org/thread/eet2zd6ig3pfpv5g and it's still on my TODO list but not likely to get to top spot soon.
On Tue, Jan 30, 2018 at 7:00 PM, Tagir Valeev <amaembo@gmail.com> wrote:
Hello!
I suggest a patch for java.util.Collections#addAll JavaDoc:
--- Collections.java 2018-01-31 09:39:31.599107500 +0700 +++ Collections.java.patched 2018-01-31 09:51:11.929059600 +0700 @@ -5406,4 +5406,8 @@ * The behavior of this convenience method is identical to that of - * {@code c.addAll(Arrays.asList(elements))}, but this method is likely - * to run significantly faster under most implementations. + * {@code c.addAll(Arrays.asList(elements))} except possible + * difference in intermediate state visibility for concurrent or + * synchronized collections. Calling this method does not guarantee + * that the intermediate state (some of elements are added) is invisible, + * even if the collection itself provides such guarantee for its + * {@link Collection#addAll(Collection)} method. *
First, currently it says that Collections#addAll is likely to run significantly faster. However it's only marginally faster for collections which delegate their addAll method to standard AbstractCollection#addAll implementation. Also it could be much slower for collections which have optimized addAll (like ArrayList, CopyOnWriteArrayList, ConcurrentLinkedDeque, etc.). I don't know a single example of collection where Collections#addAll is actually significantly faster. Also it says that the behavior is identical, while it's not. If, e.g. c is a collection returned from synchronizedCollection, then intermediate state of c.addAll(Arrays.asList(elements)) would not be visible under synchronized(c) in another thread. On the other hand, replacing such call with Collections.addAll(c, elements) (to make it "significantly faster") will lift this guarantee: now you can see partially added array.
What do you think? Should I file an issue?
With best regards, Tagir Valeev.
Hello! Thank you for pointers to the previous discussions. Yes, I would suggest to consider a spec cleanup separately, because adding new Collection default method would certainly take much longer time to discuss. I agree that we can exclude the statement about atomicity if it causes doubts. Removal of performance advice and "identical" word would suffice. At least this will not make people drawing wrong conclusions anymore. Thanks, Tagir. On Fri, Feb 2, 2018 at 7:55 AM, Stuart Marks <stuart.marks@oracle.com> wrote:
Links to existing material in OpenJDK:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050326.ht...
https://bugs.openjdk.java.net/browse/JDK-8193031
I agree with the removal of the performance advice. We should also remove "identical"; my suggested replacement was "as if".
We can even add some hedging about whether the operation is atomic for certain collections (i.e., the synchronized ones). However, I'm not sure if there is anything useful to say about atomicity of bulk updates. They're only atomic for the synchronized collections, which are largely disused. It's pointless to talk about atomicity for non-concurrent collections, and the operations aren't atomic for things like CopyOnWriteArrayList and a Set projection of ConcurrentHashMap. So I'm not sure discussion of atomicity is useful or warranted here.
(Also, adding a collection of elements one at a time to a CopyOnWriteArrayList: *shudder*.)
I think the most useful thing is to define a new array-reading default method. Each implementation can then override it to use the best technique for that implementation. (I had previously called this "addEach" but I'm flexible on naming.)
Adding a new default method is kind of far afield from where this started. If the spec is bothersome, perhaps we could consider a spec cleanup change separately from implementation changes and definition of a new default method.
s'marks
On 1/30/18 7:07 PM, Martin Buchholz wrote:
I tried to tackle this here: http://openjdk.markmail.org/thread/eet2zd6ig3pfpv5g and it's still on my TODO list but not likely to get to top spot soon.
On Tue, Jan 30, 2018 at 7:00 PM, Tagir Valeev <amaembo@gmail.com> wrote:
Hello!
I suggest a patch for java.util.Collections#addAll JavaDoc:
--- Collections.java 2018-01-31 09:39:31.599107500 +0700 +++ Collections.java.patched 2018-01-31 09:51:11.929059600 +0700 @@ -5406,4 +5406,8 @@ * The behavior of this convenience method is identical to that of - * {@code c.addAll(Arrays.asList(elements))}, but this method is likely - * to run significantly faster under most implementations. + * {@code c.addAll(Arrays.asList(elements))} except possible + * difference in intermediate state visibility for concurrent or + * synchronized collections. Calling this method does not guarantee + * that the intermediate state (some of elements are added) is invisible, + * even if the collection itself provides such guarantee for its + * {@link Collection#addAll(Collection)} method. *
First, currently it says that Collections#addAll is likely to run significantly faster. However it's only marginally faster for collections which delegate their addAll method to standard AbstractCollection#addAll implementation. Also it could be much slower for collections which have optimized addAll (like ArrayList, CopyOnWriteArrayList, ConcurrentLinkedDeque, etc.). I don't know a single example of collection where Collections#addAll is actually significantly faster. Also it says that the behavior is identical, while it's not. If, e.g. c is a collection returned from synchronizedCollection, then intermediate state of c.addAll(Arrays.asList(elements)) would not be visible under synchronized(c) in another thread. On the other hand, replacing such call with Collections.addAll(c, elements) (to make it "significantly faster") will lift this guarantee: now you can see partially added array.
What do you think? Should I file an issue?
With best regards, Tagir Valeev.
participants (3)
-
Martin Buchholz
-
Stuart Marks
-
Tagir Valeev