RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently
Patrick Zhang OS
patrick at os.amperecomputing.com
Sat May 4 07:21:43 UTC 2019
Hi Stuart,
Thank you very much for the detailed explanation and examples, which solved most of my previous questions and confusion. Really appreciate that. I agree with the opinion about "effort vs benefit".
Re-thought my original tests concerning CMEs that passed with jdk8u but failed with jdk11 (should be 9+, but I only checked LTS builds), I was struggling between "fixing" the "problems" in jdk11 and "making jdk8 fail similarly". However so far it looks these tests might be over strict, or "verifying CMEs" itself might be an invalid (or unreliable) operation, perhaps I should drop all of them. Lastly, a suggestion would be: adding more comments for this in case anyone else would revisit it with similar confusions, e.g. HashMap.clear.
Regards
Patrick
-----Original Message-----
From: Stuart Marks <stuart.marks at oracle.com>
Sent: Thursday, May 2, 2019 8:39 AM
To: Patrick Zhang OS <patrick at os.amperecomputing.com>
Cc: core-libs-dev <core-libs-dev at openjdk.java.net>
Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently
>> ...merely to serve as a discussion point about the policy for
>> throwing ConcurrentModificationException?
> Yes, for the time being, I want to see and welcome more ideas on this.
> It seems to me that the policy for throwing CME here is not a unified
> one, mostly based on experience and testing. Clear, compute, and
> computeIfAbsent are more special as I described.
OK. For reference, here are some of the words from the ConcurrentModificationException specification: [1]
> This exception may be thrown by methods that have detected concurrent
> modification of an object when such modification is not permissible.
>
> For example, it is not generally permissible for one thread to modify
> a Collection while another thread is iterating over it. In general,
> the results of the iteration are undefined under these circumstances.
> Some Iterator implementations (including those of all the general
> purpose collection implementations provided by the JRE) may choose to
> throw this exception if this behavior is detected. Iterators that do
> this are known as fail-fast iterators, as they fail quickly and
> cleanly, rather that risking arbitrary, non-deterministic behavior at an undetermined time in the future.
>
> Note that this exception does not always indicate that an object has
> been concurrently modified by a different thread. If a single thread
> issues a sequence of method invocations that violates the contract of
> an object, the object may throw this exception. For example, if a
> thread modifies a collection directly while it is iterating over the
> collection with a fail-fast iterator, the iterator will throw this exception.
>
> Note that fail-fast behavior cannot be guaranteed as it is, generally
> speaking, impossible to make any hard guarantees in the presence of
> unsynchronized concurrent modification. Fail-fast operations throw
> ConcurrentModificationException on a best-effort basis. Therefore, it
> would be wrong to write a program that depended on this exception for
> its
> correctness: ConcurrentModificationException should be used only to
> detect bugs.
[1]
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ConcurrentModificationException.html
Similar words are repeated in several different locations around the
specification, such as in the ArrayList and HashMap class specifications.
I'm not entirely sure what your concerns are with
ConcurrentModificationException (and the "fail-fast" concurrent modification
policy), but let me discuss a few points.
1. throwing of CME is not guaranteed - "best effort"
Unlike most Java specifications, the specification around CME is fairly
indefinite. The wording is hedged -- "This exception may be thrown...." This
implies that CME might or might not be thrown, even in cases where one might
expect it to be.
It also says that CME is thrown on a "best effort" basis. This doesn't mean that
the library makes the maximum possible effort to throw CME in every possible
situation. Maybe "best effort" is somewhat misleading. Perhaps "reasonable"
effort is more descriptive.
For example, ArrayList keeps a modCount field and increments and checks it
occasionally. No synchronization is done. If the ArrayList is modified by
another thread, the update to modCount might not be visible to all threads,
which might result in data corruption instead of a CME.
One way to "fix" this would be to make access to modCount synchronized (or to
make it volatile, or to make it an AtomicInteger or something) to improve the
reliability of detecting concurrent modifications from other threads. This would
add complexity to the code and also slow down common operations. Making this
extra effort doesn't seem to be worthwhile.
2. throwing CME sometimes done even when not absolutely necessary
Another point is that the detection and throwing of a CME is an approximation of
when concurrent modification would have any impact. In some cases CME will be
thrown even when one wouldn't think it strictly necessary.
For example, consider a loop in the middle of iterating an ArrayList.
ArrayList's iterator simply keeps an index to represent its current position. If
an element is added to or removed from the front of the list, this would result
in the iteration skipping or repeating elements. Thus, throwing CME seems
warranted in this case.
Now consider an iteration in the middle of an ArrayList, and an addition or
removal is made to the *end* of the list. This doesn't affect the current
iteration; yet CME is thrown anyway. Why?
To avoid throwing CME in this case (but to throw it in the previous case) the
ArrayList and its Iterators would have to keep track of more information about
what changes were made and would have to do more checking at each iteration
step. This could increase code complexity considerably. Again, this seems like
it isn't worthwhile. Keeping a simple counter (modCount) and checking it at each
iteration step is quite cheap, although it arguably does throw CME unnecessarily
in this case.
Some of the cases you're talking about seem to fall into this category. A CME is
thrown from compute() if an operation is merely attempted, even if the actual
operation performed would have no ill effect.
3. state-dependent behavior
I discussed this in a previous message. My personal design style is to try to
avoid this, although the library isn't wholly consistent in this regard. The
discussion regarding CME and the compute() and similar methods is also related
to state-dependent behavior.
4. edge cases
There are a number of edge cases that aren't treated wholly consistently across
the libraries. Again with ArrayList, consider the following:
List<Integer> list = new ArrayList<>(List.of(0, 1, 2))
[0, 1, 2]
var it = list.iterator()
it.hasNext()
true
it.next()
0
it.hasNext()
true
it.next()
1
list.remove(0)
0
it.hasNext()
false
Arguably, hasNext() should throw CME. If this were in a for-loop, the concurrent
modification would be missed and the loop would terminate normally. Many
iterators check for concurrent modification in their next() method but not in
hasNext(). Perhaps this should be fixed, but it might break code that is
apparently behaving well today, so we've left it unchanged.
There is also the case of JDK-8114832, where a failed attempt at modification
will still cause a CME. This is more state-dependent behavior: should modCount
be incremented when a modification is *attempted* or when modification
*actually* occurs? (Martin says, "attempted murder is still a crime!")
This bug is still open, though Martin and I agree that no change should be made
here. It's questionable to me whether we want to go through the old collections
and update things to be more consistent. The effort is high, the benefit is
fairly low, in my estimation, and there is a risk of breaking existing code. So
we live with the inconsistencies.
*******
I'm kind of rambling here. Is this the kind of discussion you're interested in?
Do you have any specific questions?
s'marks
More information about the core-libs-dev
mailing list