RFR 8005704: Update ConcurrentHashMap to v8

Mike Duigou mike.duigou at oracle.com
Thu May 30 18:18:04 UTC 2013


On May 29 2013, at 06:11, Doug Lea wrote:

> On 05/28/13 15:07, Mike Duigou wrote:
>> Hi Chris & Doug;
> 
>> - I don't see the advantage to exposing the ConcurrentHashMap.KeySetView type
>> particularly for newKeySet(). Why not return Set<K>? The additional methods
>> don't seem to offer much that's desirable for the newKeySet() case.
> 
> Since we don't have a ConcurrentSet interface, people are reduced to
> instanceof checks to see if they have a concurrent one. But without
> exposing the class, they couldn't even do this, so complained. This
> will arise more frequently with j.u.streams.Collectors.

I'd rather introduce a vacuous ConcurrentSet than expose ConcurrentHashMap.KeySetView. Any reason not to add it?
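
For reference, the instanceof check Doug describes would look roughly like the sketch below. It assumes newKeySet() returns ConcurrentHashMap.KeySetView as proposed in this change; the class name and the isConcurrentKeySet helper are made up for illustration and are not part of the webrev.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetViewCheck {
    // Hypothetical helper: true only for sets that are ConcurrentHashMap key views.
    // This is the kind of check callers fall back on without a ConcurrentSet interface.
    static boolean isConcurrentKeySet(Set<?> set) {
        return set instanceof ConcurrentHashMap.KeySetView;
    }

    public static void main(String[] args) {
        Set<String> concurrent = ConcurrentHashMap.newKeySet();
        Set<String> plain = new HashSet<>();

        System.out.println(isConcurrentKeySet(concurrent));
        System.out.println(isConcurrentKeySet(plain));
    }
}
```

Note that the check only recognizes this one implementation. Other concurrent sets (ConcurrentSkipListSet, for example) would not pass it, which is part of why a marker interface seems cleaner to me.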

>> - I am reluctant to deprecate contains(Object) here unless we deprecate it in
>> Hashtable as well. I recognize that this has been a source of errors
>> (https://issues.apache.org/bugzilla/show_bug.cgi?id=48755 for one example).
>> Is it time to deprecate it there as well?
> 
> Sure, but why bother. Anyone still using Hashtable is not going to care
> if they get more deprecation warnings :-)

Fair enough. I will create a bug for deprecating Hashtable.contains as well but am fine with marking it deprecated here.
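
For anyone following along who has not hit this, here is a minimal sketch of the confusion. contains(Object) is the legacy Hashtable-style method and tests values, not keys. The class name and the sample entries are invented for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ContainsPitfall {
    // Builds a small map with illustrative entries.
    static ConcurrentHashMap<String, String> sample() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("host", "localhost");
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = sample();

        // Legacy method: behaves like containsValue, so a key lookup fails.
        System.out.println(map.contains("host"));
        System.out.println(map.contains("localhost"));

        // What the caller almost always meant.
        System.out.println(map.containsKey("host"));
    }
}
```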

>> - I think there could be more complete description of the
>> parallelismThreshold and interaction with common pool. i.e. does "maximal
>> parallelism" mean one thread per element or "size() /
>> getCommonPoolParallelism()". Some advice for choosing in-between values would
>> be good unless "1" is the best advice for cases where you just don't know. It
>> would be a shame to see people shooting themselves in the foot with this.
>> 
> 
> Changed to:
> 
> * <p>These bulk operations accept a {@code parallelismThreshold}
> * argument. Methods proceed sequentially if the current map size is
> * estimated to be less than the given threshold. Using a value of
> * {@code Long.MAX_VALUE} suppresses all parallelism.  Using a value
> * of {@code 1} results in maximal parallelism by partitioning into
> * enough subtasks to utilize all processors. Normally, you would
> * initially choose one of these extreme values, and then measure
> * performance of using in-between values that trade off overhead
> * versus throughput. Parallel forms use the {@link
> * ForkJoinPool#commonPool()}.
> *

The only change I would make is:

Using a value of {@code 1} results in maximal parallelism by partitioning into
enough subtasks to fully utilize the {@link ForkJoinPool#commonPool()}.
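
To illustrate the two extreme values the javadoc mentions, here is a rough sketch using reduceValues. The class name, helper methods, and map contents are invented for the example, and it makes no claim about which setting is faster; that has to be measured.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ThresholdSketch {
    // Fills a map with i -> i*i for i in 1..n (illustrative data only).
    static ConcurrentHashMap<Integer, Long> squares(int n) {
        ConcurrentHashMap<Integer, Long> map = new ConcurrentHashMap<>();
        for (int i = 1; i <= n; i++) {
            map.put(i, (long) i * i);
        }
        return map;
    }

    // Sums all values; the threshold decides whether the map may split the work.
    static Long sum(ConcurrentHashMap<Integer, Long> map, long parallelismThreshold) {
        return map.reduceValues(parallelismThreshold, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Long> map = squares(1000);

        // Long.MAX_VALUE: never parallel, runs in the calling thread.
        Long sequential = sum(map, Long.MAX_VALUE);

        // 1: may partition into subtasks run on ForkJoinPool.commonPool().
        Long parallel = sum(map, 1L);

        System.out.println(sequential.equals(parallel));
    }
}
```

Both calls must produce the same result. Only the scheduling differs, which is why the threshold can be tuned after the fact without touching the logic.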

> 
> I'd rather not explicitly mention here any particular values for people
> to try, since any number we might put is likely to be misused.

Yes, that would almost certainly be misused.

> Sometime, we should write up a little how-to document about
> tuning parallelism separately.
> 
> -Doug
> 
> 



