Alternative Spliterator for SpinedBuffer
Peter Levart
peter.levart at gmail.com
Fri Apr 12 08:10:53 PDT 2013
On 04/11/2013 09:53 PM, brian.goetz at oracle.com wrote:
> Changeset: 5e7f8797b9c1
> Author: briangoetz
> Date: 2013-04-11 15:53 -0400
> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/5e7f8797b9c1
>
> Spec pass on StreamBuilder
>
> ! src/share/classes/java/util/stream/StreamBuilder.java
> ! src/share/classes/java/util/stream/Streams.java
>
>
Hi,
I played a little with StreamBuilder and it works quite optimally using
private SpinedBuffer under the hood. The only sub-optimality is when it
starts small (min. 16 elements) and then grows big and after that it is
used in a parallel computation where leaf task maximum size is more than
16 elements. For example, this code:
public class SpinedBufferTest {
static int dump(Spliterator<Integer> spliterator, int level, int
iterateSize) {
System.out.print(" ".substring(0, level));
System.out.print(spliterator.estimateSize());
Spliterator<Integer> spl2;
if (spliterator.estimateSize() > iterateSize && (spl2 =
spliterator.trySplit()) != null) {
System.out.println(": split " + spl2.estimateSize() + " + "
+ spliterator.estimateSize());
return dump(spl2, level + 1, iterateSize) +
dump(spliterator, level + 1, iterateSize);
}
else {
Integer[] minMax = new Integer[2];
spliterator.forEachRemaining(x -> {
minMax[0] = minMax[0] == null ? x : Math.min(minMax[0], x);
minMax[1] = minMax[1] == null ? x : Math.max(minMax[1], x);
});
System.out.println(": (" + minMax[0] + "..." + minMax[1] +
")");
return 1;
}
}
public static void main(String[] args) {
SpinedBuffer<Integer> buf = new SpinedBuffer<>();
for (int i = 0; i < 16384; i++) {
buf.accept(i);
}
int leafs = dump(buf.spliterator(), 0, 1024);
System.out.println("\nTotal leafs: " + leafs);
}
}
Prints the splits that would occur for 16k elements when max. leaf task
size is 1k:
16384: split 16 + 16368
16: (0...15)
16368: split 16 + 16352
16: (16...31)
16352: split 32 + 16320
32: (32...63)
16320: split 64 + 16256
64: (64...127)
16256: split 128 + 16128
128: (128...255)
16128: split 256 + 15872
256: (256...511)
15872: split 512 + 15360
512: (512...1023)
15360: split 1024 + 14336
1024: (1024...2047)
14336: split 2048 + 12288
2048: split 1024 + 1024
1024: (2048...3071)
1024: (3072...4095)
12288: split 4096 + 8192
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (4096...5119)
1024: (5120...6143)
2048: split 1024 + 1024
1024: (6144...7167)
1024: (7168...8191)
8192: split 4096 + 4096
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (8192...9215)
1024: (9216...10239)
2048: split 1024 + 1024
1024: (10240...11263)
1024: (11264...12287)
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (12288...13311)
1024: (13312...14335)
2048: split 1024 + 1024
1024: (14336...15359)
1024: (15360...16383)
Total leafs: 22
7 leaf tasks are half the max. size or less.
Now SpinedBuffer has chunks with sizes increasing exponentially using
power of 2: 16, 16, 32, 64, 128, 256, 512, ... Each chunk has capacity
equal to the sum of capacities of all previous chunks. So nearly optimal
split (modulo last non-full chunk) would be to always split so that the
last chunk is cut-off and the returned Spliterator contains all the
chunks up to the last and this Spliterator only contains the last chunk.
Here's my attempt at modifying the spliterator to act like that (I only
did it for the generic Object variant in 1st try, but I can prepare the
primitive variants too, if desired):
http://dl.dropboxusercontent.com/u/101777488/jdk8-lambda/SpinedBufferAltSpliterator/webrev.01/index.html
With this modification, the above code prints:
16384: split 8192 + 8192
8192: split 4096 + 4096
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (0...1023)
1024: (1024...2047)
2048: split 1024 + 1024
1024: (2048...3071)
1024: (3072...4095)
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (4096...5119)
1024: (5120...6143)
2048: split 1024 + 1024
1024: (6144...7167)
1024: (7168...8191)
8192: split 4096 + 4096
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (8192...9215)
1024: (9216...10239)
2048: split 1024 + 1024
1024: (10240...11263)
1024: (11264...12287)
4096: split 2048 + 2048
2048: split 1024 + 1024
1024: (12288...13311)
1024: (13312...14335)
2048: split 1024 + 1024
1024: (14336...15359)
1024: (15360...16383)
Total leafs: 16
Regards, Peter
More information about the lambda-dev
mailing list