Bag/MultiSet [was Re: Loose-ends wrapup]

Sat May 11 03:42:53 PDT 2013

ConcurrentHashBag discussion from lambda-spec list.

Just to add to Kevin's comments. I would consider a "Bag" to be an
interface defined as per Guava or Commons Collections:
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Multiset.html
http://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections/Bag.html

Both have almost identical definitions of the methods in a Bag. In
particular, I'd expect there to be extra methods:
- add(E element, int nCopies)
- remove(E element, int nCopies)
- count(E element)
- uniqueSet()/elementSet()
(I'm less convinced by Guava's setCount)
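
To make the list above concrete, here is a rough sketch of what such an
interface might look like. The Bag and HashBag names are hypothetical,
the interface omits the Collection supertype for brevity, and the
implementation is a plain non-thread-safe HashMap wrapper. It is not
the Guava or Commons Collections code, only an illustration of the
method set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical interface: method names follow the list above, not any JDK API.
interface Bag<E> {
    void add(E element, int nCopies);
    boolean remove(E element, int nCopies);
    int count(E element);
    Set<E> uniqueSet();
}

// Minimal, non-thread-safe implementation for illustration only.
class HashBag<E> implements Bag<E> {
    private final Map<E, Integer> counts = new HashMap<>();

    @Override
    public void add(E element, int nCopies) {
        if (nCopies < 0) throw new IllegalArgumentException("nCopies < 0");
        if (nCopies > 0) counts.merge(element, nCopies, Integer::sum);
    }

    // Removes up to nCopies occurrences; returns true if anything was removed.
    @Override
    public boolean remove(E element, int nCopies) {
        if (nCopies < 0) throw new IllegalArgumentException("nCopies < 0");
        Integer current = counts.get(element);
        if (current == null || nCopies == 0) return false;
        if (nCopies >= current) counts.remove(element);
        else counts.put(element, current - nCopies);
        return true;
    }

    @Override
    public int count(E element) {
        return counts.getOrDefault(element, 0);
    }

    // Read-only view of the distinct elements currently in the bag.
    @Override
    public Set<E> uniqueSet() {
        return Collections.unmodifiableSet(counts.keySet());
    }
}
```

A class that only implements Collection offers none of these
count-oriented operations, which is the gap I am pointing at.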

So, while the proposal to add ConcurrentHashBag simply implementing
Collection is interesting, it doesn't meet developer expectations set
over many years for what a Bag is and is used for. Renaming it to
ConcurrentHashBuffer wouldn't really help IMO, as "buffer" has other
meanings; ConcurrentHashCollection or ConcurrentCollection would be
more acceptable names.

One option would be to add the class to the JDK but hidden and only
accessible via a public factory method for JDK8. That way it could be
used for the Stream use case, but doesn't give false impressions for
other use cases.
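
As a rough sketch of the hidden-class idea, something along these lines
would do. The ConcurrentCollections holder, the newConcurrentCollection
factory and the HiddenCollection class are all hypothetical names, and
the body simply delegates to ConcurrentLinkedQueue as a stand-in. It
says nothing about how the proposed class is actually implemented.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical holder class; the name and the factory method are illustrative only.
final class ConcurrentCollections {
    private ConcurrentCollections() {}

    // Callers see only the Collection interface, never the implementation type.
    static <E> Collection<E> newConcurrentCollection() {
        return new HiddenCollection<>();
    }

    // Stand-in implementation: delegates to ConcurrentLinkedQueue purely for this sketch.
    private static final class HiddenCollection<E> extends AbstractCollection<E> {
        private final ConcurrentLinkedQueue<E> delegate = new ConcurrentLinkedQueue<>();

        @Override public boolean add(E e) { return delegate.add(e); }
        @Override public Iterator<E> iterator() { return delegate.iterator(); }
        @Override public int size() { return delegate.size(); }
    }
}
```

Since the implementation type is private, no public class name
containing "Bag" is exposed, and the class can be replaced later
without breaking callers.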

(I do think adding a Bag interface to the JDK would be a good thing,
but it would require more infrastructure classes like
synchronized/unmodifiable wrappers, which would be a struggle to agree
on in the JDK 8 timescale.)

On null-handling, perhaps it could have an internal mode/flag as to
whether it does or does not allow nulls.

Stephen

On 10 May 2013 21:00, Doug Lea <dl at cs.oswego.edu> wrote:
> On 05/10/13 15:32, Kevin Bourrillion wrote:
>>
>> On Fri, May 10, 2013 at 11:09 AM, Doug Lea <dl at cs.oswego.edu> wrote:
>>
>>     It only supports the methods defined in the Collection API.
>>
>>
>> Oh. I believe that severely curtails its usefulness, but could try to back
>> that up with stats from Google's codebase if necessary.
>>
>
> Are you referring to the usages for which we decided to
> recommend (pasting from javadoc...)
>
>
>  * <p>A ConcurrentHashMap can be used as scalable frequency map (a
>  * form of histogram or multiset) by using {@link
>  * java.util.concurrent.atomic.LongAdder} values and initializing via
>  * {@link #computeIfAbsent computeIfAbsent}. For example, to add a count
>  * to a {@code ConcurrentHashMap<String,LongAdder> freqs}, you can use
>  * {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();}
>  *
>
> -Doug
>
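
For reference, the frequency-map idiom described in the quoted javadoc
can be sketched as below. The FrequencyMap wrapper class and its method
names are hypothetical and only there to make the snippet
self-contained. Note that computeIfAbsent takes the key as its first
argument and the mapping function as its second.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical wrapper around the ConcurrentHashMap<String, LongAdder> idiom.
class FrequencyMap {
    private final ConcurrentHashMap<String, LongAdder> freqs = new ConcurrentHashMap<>();

    // Creates the counter on first sight of the key, then increments it.
    void record(String key) {
        freqs.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    // Returns the current count, or 0 if the key has never been recorded.
    long count(String key) {
        LongAdder adder = freqs.get(key);
        return adder == null ? 0L : adder.sum();
    }
}
```

This covers counting, but it is still a Map of counters rather than a
Collection, so it does not offer the Bag-style methods listed earlier.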