Numeric (and accumulators)
Doug Lea
dl at cs.oswego.edu
Sun Sep 16 06:52:13 PDT 2012
On 09/15/12 09:23, Doug Lea wrote:
>
> But there's a much better move here -- take the opportunity to provide
> standardized forms of structured accumulators. ...
>
> All of these might live in package java.util.functions?
>
After fleshing out a bit, I'm now thinking that a variant of
this would be better in j.u.c as a refactoring of
jsr166e.LongAdder etc. The availability of non-thread-safe forms
as well would then just be an opportunistic byproduct.
One reason for integrating into j.u.c is that keyed versions of
these forms would need to generalize what is now LongAdderTable
(http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/LongAdderTable.html)
Reminder: LongAdderTable supports creation of frequency maps,
histograms etc in parallel in a much, much better way (less overhead,
less GC, more scalability) than multimap-like constructions.
This is another application of reduce-while-building designs;
a parallel mutative analog of flatMap. (Note that the design
advantage holds primarily for parallel/concurrent applications.
It matters much less in sequential code where locality and
multiple pass set-up/tear-down have less impact.)
The current LongAdderTable exploits CHM's thread-safe
lambda-accepting methods that make most methods short
and simple. The two most commonly used methods are:
public void add(K key, long x) {
map.computeIfAbsent(key, createLongAdderFunc).add(x);
}
public long sum(K key) {
LongAdder a = map.get(key);
return a == null ? 0L : a.sum();
}
These could easily be generalized for arbitrary accumulators,
as in:
public void update(K key, long x) {
map.computeIfAbsent(key, accCreationFunc).update(x);
}
public long accumulation(K key) {
LongAccumulator a = map.get(key);
return a == null ? basisValue : a.get();
}
But the prospects for further generalizing to cover all
forms of {int, long, double, T} accumulators are not so
nice: Four boringly similar classes, each of which share
the same problem as the original of not itself being
a Map, but just a thin veneer to simplify some
common usages. The ones other than LongAdderTable don't get
a very high score on the "does it pull its weight" JDK
inclusion scale.
So what would it take to avoid defining these classes at all?
If all the common usage constructions could be written as
expression-y 1-liners on CHM itself, then these would be
more appropriate as javadoc examples than classes.
There are only a few such methods, so it seems worth a try.
One place to start is to finally add a (reduction-argument-style)
default-returning version of get to CHM:
<T> getValueOrDefault(K key, T valueIfAbsent);
This could be used to save a line or two here and there
when using a straight CHM<Key, XAccumulator>. For
example, accumulation via:
f = map.getValueOrDefault(k, EMPTY_ACCUMULATOR).get();
Even nicer might be:
<U> mapValueFor(K key, Fun<? super V, ? extends U> f, U ifAbsent);
Unfortunately, this wouldn't help a lot in the main target use cases
of collecting simple counts etc -- these usages go beyond what
we can do to evade boxing and incorporate numerics. And
it doesn't help at all with method reset() or getThenReset().
But considering that update() and get() are by far the
most common operations, maybe we can get by with adding
some javadoc (to CHM) along the lines of:
/**
* Example: Maps from keys to Accumulators can be used to create Histograms
* (frequency maps) and related tables.
*
* To increment a count, installing if not already present:
* map.computeIfAbsent(key, () -> LongAccumulator.optimisticSum()).update(1L);
*
* To determine the sum for a key, it is convenient to create a
* singleton static EMPTY_ACCUMULATOR, enabling
* long sum = map.getValueOrDefault(key, EMPTY_ACCUMULATOR).get();
*/
Any thoughts on this?
-Doug
More information about the lambda-libs-spec-observers
mailing list