Numerical Stream code
Howard Lovatt
howard.lovatt at gmail.com
Wed Feb 13 22:34:23 PST 2013
Hi,
I have been trying out lambdas on:
openjdk version "1.8.0-ea"
OpenJDK Runtime Environment (build
1.8.0-ea-lambda-nightly-h3307-20130211-b77-b00)
OpenJDK 64-Bit Server VM (build 25.0-b15, mixed mode)
The aim is to see whether scientific-style numerical code can use Streams.
I wrote a synthetic benchmark that applies a kernel repeatedly over time and
space to solve a diffusion equation in 1D, e.g. heat diffusing into a metal
rod from either end. The core of the code is:
private enum Styles implements Style {
  CLike {
    @Override public double run() {
      uM1[0] = uT0; // t = 0
      for (int xi = 1; xi < numXs - 1; xi++) { uM1[xi] = u0X; }
      uM1[numXs - 1] = uT1;
      for (int ti = 1; ti < numTs; ti++, uTemp = uM1, uM1 = u0, u0 = uTemp) { // t > 0
        u0[0] = uT0; // x = 0
        for (int xi = 1; xi < numXs - 1; xi++) { // 0 < x < 1
          u0[xi] = explicitFDM.u00(uM1[xi - 1], uM1[xi], uM1[xi + 1]);
        }
        u0[numXs - 1] = uT1; // x = 1
      }
      double sum = 0; // Calculate average of last us
      for (final double u : uM1) { sum += u; }
      return sum / numXs;
    }
  },
  SerialStream {
    @Override public double run() {
      Arrays.indices(uM1).forEach(this::t0);
      for (int ti = 1; ti < numTs; ti++, uTemp = uM1, uM1 = u0, u0 = uTemp) { // t > 0
        Arrays.indices(uM1).forEach(this::tg0);
      }
      return Arrays.stream(uM1).average().getAsDouble(); // Really slow!
    }
  },
  ParallelStream {
    @Override public double run() {
      Arrays.indices(uM1).parallel().forEach(this::t0);
      for (int ti = 1; ti < numTs; ti++, uTemp = uM1, uM1 = u0, u0 = uTemp) { // t > 0
        Arrays.indices(uM1).parallel().forEach(this::tg0);
      }
      return Arrays.stream(uM1).parallel().average().getAsDouble(); // Really really slow!!
    }
  };

  double[] u0 = new double[numXs];
  double[] uM1 = new double[numXs];
  double[] uTemp = null;

  void t0(final int xi) {
    if (xi == 0) { uM1[0] = uT0; }
    else if (xi == numXs - 1) { uM1[numXs - 1] = uT1; }
    else { uM1[xi] = u0X; }
  }

  void tg0(final int xi) {
    if (xi == 0) { u0[0] = uT0; }
    else if (xi == numXs - 1) { u0[numXs - 1] = uT1; }
    else { u0[xi] = explicitFDM.u00(uM1[xi - 1], uM1[xi], uM1[xi + 1]); }
  }
}
And when run it produces:
CLike: time = 2351 ms, result = 99.99581170383331
SerialStream: time = 20532 ms, result = 99.99581170383331
ParallelStream: time = 131317 ms, result = 99.99581170383331
The slowness is a pity because the coding comes out quite well!
I wasn't particularly expecting the Stream implementation to be fast,
because it is a work in progress after all. However, a slowdown of roughly
9 times for the serial case and over 55 times for the parallel case seems
excessive. I therefore suspect that I am doing something wrong.
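For anyone who wants to try the comparison, here is a minimal self-contained sketch of the same three styles. It is not the benchmark itself: it uses IntStream.range from java.util.stream in place of Arrays.indices, and because explicitFDM is not shown above, the kernel u00, the constant R, and every parameter value (grid size, step count, boundary and initial values) are assumptions chosen only so the code runs.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class DiffusionSketch {
    // All values below are hypothetical; they are not the settings of the benchmark.
    static final int NUM_XS = 101;     // grid points in x
    static final int NUM_TS = 1_000;   // time steps
    static final double U_T0 = 100.0;  // boundary value at x = 0
    static final double U_T1 = 100.0;  // boundary value at x = 1
    static final double U_0X = 0.0;    // initial interior value
    static final double R = 0.25;      // assumed alpha * dt / dx^2 (<= 0.5 keeps the scheme stable)

    // Assumed stand-in for explicitFDM.u00: one forward-time, centred-space update.
    static double u00(double left, double centre, double right) {
        return centre + R * (left - 2.0 * centre + right);
    }

    // State at t = 0: fixed boundaries, uniform interior.
    static double[] initial() {
        double[] u = new double[NUM_XS];
        Arrays.fill(u, U_0X);
        u[0] = U_T0;
        u[NUM_XS - 1] = U_T1;
        return u;
    }

    // Plain-loop version, mirroring the CLike style.
    static double runLoop() {
        double[] prev = initial();
        double[] next = new double[NUM_XS];
        for (int ti = 1; ti < NUM_TS; ti++) {
            next[0] = U_T0;
            for (int xi = 1; xi < NUM_XS - 1; xi++) {
                next[xi] = u00(prev[xi - 1], prev[xi], prev[xi + 1]);
            }
            next[NUM_XS - 1] = U_T1;
            double[] tmp = prev; prev = next; next = tmp; // swap buffers
        }
        double sum = 0;
        for (double u : prev) { sum += u; }
        return sum / NUM_XS;
    }

    // Stream version over interior indices; each index writes only its own slot.
    static double runStream(boolean parallel) {
        double[] prev = initial();
        double[] next = new double[NUM_XS];
        for (int ti = 1; ti < NUM_TS; ti++) {
            final double[] p = prev, n = next; // effectively final copies for the lambda
            n[0] = U_T0;
            n[NUM_XS - 1] = U_T1;
            IntStream interior = IntStream.range(1, NUM_XS - 1);
            if (parallel) { interior = interior.parallel(); }
            interior.forEach(xi -> n[xi] = u00(p[xi - 1], p[xi], p[xi + 1]));
            prev = n; next = p; // swap buffers
        }
        return Arrays.stream(prev).average().getAsDouble();
    }

    public static void main(String[] args) {
        System.out.println("loop:     " + runLoop());
        System.out.println("serial:   " + runStream(false));
        System.out.println("parallel: " + runStream(true));
    }
}
```

With a grid this small the parallel version has very little work per time step, so I would not read much into its timings; the sketch is only meant to show that the three styles compute the same average to within rounding.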
Can anyone enlighten me?
Thanks,
-- Howard.