Dot Product Thoughts

Paul Sandoz paul.sandoz at
Fri Apr 19 00:17:28 PDT 2013

On Apr 19, 2013, at 1:09 AM, Richard Warburton <richard.warburton at> wrote:

> Hi,
> Implementing a dot product between a pair of vectors brought up a few
> observations about the library:
> 1. There's a Double::sum, but no Double::multiply, etc.  I appreciate you
> have to stop somewhere, but is sum the place to stop?  Might be worth
> adding other basic arithmetic operations.
> 2. There appears to be a zip function now, but with no overloads for
> primitive streams.  I managed to guess at as the location, so
> one data point, but good news on that front.
> 3. If I have a double[], is there a cleaner way of making a DoubleStream
> than adding every element to the builder?  Perhaps I'm missing something?
> Is the expectation that people who care about unboxed primitives are using
> specialised collections libraries, eg GNU Trove, and these libraries will
> provide specialised .stream() methods returning DoubleStream, et al. ?
> 4. A performance comparison with a trivial imperative example resulted in a
> ~16x slowdown moving to lambdas.  I'm willing to take some performance hit
> for the nicer code, but 16x is a lot higher than I would have expected or
> hoped for.  I'll try to have a look at some more 'real world' examples in
> future, but even then being this much slower on mathematical problems will
> cause some people trouble.
> Imperative code (x and y are arrays)
> final int N = x.length;
> double sum = 0;
> for (int i = 0; i < N; i++) {
>    sum += x[i] * y[i];
> }
> return sum;
> Java 8 Lambdas (x and y are streams):
>, y, (left, right) -> left * right)
>       .reduce(0.0, Double::sum);

Note that zip, as currently written, is not particularly optimized. It extracts the iterator from each stream and wraps them both in a new iterator.
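
To make the layering concrete, here is a minimal sketch of an iterator-wrapping zip. It is not the library's actual implementation: the class name IteratorZipSketch and the helpers zip and dot are hypothetical, and it assumes the stream API as released in Java 8 (Arrays.stream, StreamSupport, Spliterators.spliteratorUnknownSize).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch, not the JDK's zip.
final class IteratorZipSketch {

    // Pull an iterator out of each stream and wrap both in a new iterator.
    static <A, B, R> Stream<R> zip(Stream<A> a, Stream<B> b, BiFunction<A, B, R> f) {
        Iterator<A> ia = a.iterator();
        Iterator<B> ib = b.iterator();
        Iterator<R> zipped = new Iterator<R>() {
            @Override
            public boolean hasNext() {
                // Stop as soon as either source is exhausted.
                return ia.hasNext() && ib.hasNext();
            }

            @Override
            public R next() {
                return f.apply(ia.next(), ib.next());
            }
        };
        // The wrapping iterator is turned back into a sequential, ordered stream.
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(zipped, Spliterator.ORDERED), false);
    }

    // Dot product on boxed streams, in the shape of the quoted lambda version.
    static double dot(double[] x, double[] y) {
        return zip(Arrays.stream(x).boxed(), Arrays.stream(y).boxed(),
                   (left, right) -> left * right)
                .reduce(0.0, Double::sum);
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // prints 32.0
    }
}
```

Each element passes through a boxed Double, two iterator calls and a lambda call before it reaches the reduction, which is the sort of overhead that would show up in a tight numeric loop.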

We can certainly remove some layering by using the spliterators directly instead.
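
One way this could look, as a sketch only: the zipped source is itself a spliterator that advances the two underlying spliterators in lock step, so no intermediate iterators are created. SpliteratorZipSketch, zip and dot are hypothetical names, and the code assumes the Java 8 release API (Spliterators.AbstractSpliterator, StreamSupport.stream).

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch, not the JDK's zip.
final class SpliteratorZipSketch {

    static <A, B, R> Stream<R> zip(Stream<A> a, Stream<B> b, BiFunction<A, B, R> f) {
        Spliterator<A> sa = a.spliterator();
        Spliterator<B> sb = b.spliterator();
        // The zipped stream is as long as the shorter source.
        long size = Math.min(sa.estimateSize(), sb.estimateSize());
        // Keep SIZED and ORDERED only when both sources report them.
        int characteristics = sa.characteristics() & sb.characteristics()
                & (Spliterator.SIZED | Spliterator.ORDERED);

        Spliterator<R> zipped = new Spliterators.AbstractSpliterator<R>(size, characteristics) {
            private boolean paired;

            @Override
            public boolean tryAdvance(Consumer<? super R> action) {
                paired = false;
                // Advance the left source; inside its callback advance the right
                // source and hand the combined element to the downstream action.
                boolean leftAdvanced = sa.tryAdvance(left ->
                        paired = sb.tryAdvance(right -> action.accept(f.apply(left, right))));
                return leftAdvanced && paired;
            }
        };
        return StreamSupport.stream(zipped, false);
    }

    static double dot(double[] x, double[] y) {
        return zip(Arrays.stream(x).boxed(), Arrays.stream(y).boxed(),
                   (left, right) -> left * right)
                .reduce(0.0, Double::sum);
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // prints 32.0
    }
}
```

This sketch still boxes every element and stays sequential. A primitive variant would need a zip overload for DoubleStream, which is the gap raised in point 2 of the quoted message.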

Parallel-wise this is trickier to optimize: order needs to be preserved for both sources when zipping, and the splitting of the two is unlikely to stay in sync in terms of sizes. That is why parallelism is currently "extracted" from the wrapping iterator instead.
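
For array-backed data, the syncing problem can be sidestepped by streaming over indices rather than zipping two streams: a single index range is split, so both arrays are always partitioned at the same positions. The following is a sketch of that alternative, not a change to zip itself. IndexedDotSketch and dot are hypothetical names, and it assumes IntStream.range from the Java 8 release API.

```java
import java.util.stream.IntStream;

// Hypothetical sketch of an index-driven dot product.
final class IndexedDotSketch {

    static double dot(double[] x, double[] y) {
        int n = Math.min(x.length, y.length);
        // Only the index range is split, so x and y stay aligned in every chunk.
        return IntStream.range(0, n)
                .parallel()
                .mapToDouble(i -> x[i] * y[i])
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // prints 32.0
    }
}
```

No boxing is involved here. With parallel execution the order of floating-point additions is not fixed, so results for general inputs may differ from the sequential loop in the last bits.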

So there is more work to do here :-)


More information about the lambda-dev mailing list