Alternative Spliterator for SpinedBuffer

Brian Goetz brian.goetz at oracle.com
Fri Apr 12 10:10:48 PDT 2013


Thanks for taking a look at this code!  Your approach does look like it 
would yield a better decomposition -- and this is a really good time to 
be looking at this, as the APIs have mostly settled.

I agree that there are improvements to be made here.  Have your 
performance measurements as improved runtime as well as reduced task 
count?

On 4/12/2013 11:10 AM, Peter Levart wrote:
> On 04/11/2013 09:53 PM, brian.goetz at oracle.com wrote:
>> Changeset: 5e7f8797b9c1
>> Author:    briangoetz
>> Date:      2013-04-11 15:53 -0400
>> URL:       http://hg.openjdk.java.net/lambda/lambda/jdk/rev/5e7f8797b9c1
>>
>> Spec pass on StreamBuilder
>>
>> ! src/share/classes/java/util/stream/StreamBuilder.java
>> ! src/share/classes/java/util/stream/Streams.java
>>
>>
>
> Hi,
>
> I played a little with StreamBuilder and it works quite optimally using
> private SpinedBuffer under the hood. The only sub-optimality is when it
> starts small (min. 16 elements) and then grows big and after that it is
> used in a parallel computation where leaf task maximum size is more than
> 16 elements. For example, this code:
>
>
> public class SpinedBufferTest {
>
>      static int dump(Spliterator<Integer> spliterator, int level, int
> iterateSize) {
> System.out.print(" ".substring(0, level));
>          System.out.print(spliterator.estimateSize());
>          Spliterator<Integer> spl2;
>          if (spliterator.estimateSize() > iterateSize && (spl2 =
> spliterator.trySplit()) != null) {
>              System.out.println(": split " + spl2.estimateSize() + " + "
> + spliterator.estimateSize());
>              return dump(spl2, level + 1, iterateSize) +
>                     dump(spliterator, level + 1, iterateSize);
>          }
>          else {
>              Integer[] minMax = new Integer[2];
>              spliterator.forEachRemaining(x -> {
>                  minMax[0] = minMax[0] == null ? x : Math.min(minMax[0],
> x);
>                  minMax[1] = minMax[1] == null ? x : Math.max(minMax[1],
> x);
>              });
>              System.out.println(": (" + minMax[0] + "..." + minMax[1] +
> ")");
>              return 1;
>          }
>      }
>
>      public static void main(String[] args) {
>          SpinedBuffer<Integer> buf = new SpinedBuffer<>();
>          for (int i = 0; i < 16384; i++) {
>              buf.accept(i);
>          }
>          int leafs = dump(buf.spliterator(), 0, 1024);
>          System.out.println("\nTotal leafs: " + leafs);
>      }
> }
>
>
> Prints the splits that would occur for 16k elements when max. leaf task
> size is 1k:
>
>
> 16384: split 16 + 16368
>   16: (0...15)
>   16368: split 16 + 16352
>    16: (16...31)
>    16352: split 32 + 16320
>     32: (32...63)
>     16320: split 64 + 16256
>      64: (64...127)
>      16256: split 128 + 16128
>       128: (128...255)
>       16128: split 256 + 15872
>        256: (256...511)
>        15872: split 512 + 15360
>         512: (512...1023)
>         15360: split 1024 + 14336
>          1024: (1024...2047)
>          14336: split 2048 + 12288
>           2048: split 1024 + 1024
>            1024: (2048...3071)
>            1024: (3072...4095)
>           12288: split 4096 + 8192
>            4096: split 2048 + 2048
>             2048: split 1024 + 1024
>              1024: (4096...5119)
>              1024: (5120...6143)
>             2048: split 1024 + 1024
>              1024: (6144...7167)
>              1024: (7168...8191)
>            8192: split 4096 + 4096
>             4096: split 2048 + 2048
>              2048: split 1024 + 1024
>               1024: (8192...9215)
>               1024: (9216...10239)
>              2048: split 1024 + 1024
>               1024: (10240...11263)
>               1024: (11264...12287)
>             4096: split 2048 + 2048
>              2048: split 1024 + 1024
>               1024: (12288...13311)
>               1024: (13312...14335)
>              2048: split 1024 + 1024
>               1024: (14336...15359)
>               1024: (15360...16383)
>
> Total leafs: 22
>
>
> 7 leaf tasks are half the max. size or less.
>
> Now SpinedBuffer has chunks with sizes increasing exponentially using
> power of 2:  16, 16, 32, 64, 128, 256, 512, ... Each chunk has capacity
> equal to the sum of capacities of all previous chunks. So nearly optimal
> split (modulo last non-full chunk) would be to always split so that the
> last chunk is cut-off and the returned Spliterator contains all the
> chunks up to the last and this Spliterator only contains the last chunk.
>
> Here's my attempt at modifying the spliterator to act like that (I only
> did it for the generic Object variant in 1st try, but I can prepare the
> primitive variants too, if desired):
>
> http://dl.dropboxusercontent.com/u/101777488/jdk8-lambda/SpinedBufferAltSpliterator/webrev.01/index.html
>
>
>
> With this modification, the above code prints:
>
> 16384: split 8192 + 8192
>   8192: split 4096 + 4096
>    4096: split 2048 + 2048
>     2048: split 1024 + 1024
>      1024: (0...1023)
>      1024: (1024...2047)
>     2048: split 1024 + 1024
>      1024: (2048...3071)
>      1024: (3072...4095)
>    4096: split 2048 + 2048
>     2048: split 1024 + 1024
>      1024: (4096...5119)
>      1024: (5120...6143)
>     2048: split 1024 + 1024
>      1024: (6144...7167)
>      1024: (7168...8191)
>   8192: split 4096 + 4096
>    4096: split 2048 + 2048
>     2048: split 1024 + 1024
>      1024: (8192...9215)
>      1024: (9216...10239)
>     2048: split 1024 + 1024
>      1024: (10240...11263)
>      1024: (11264...12287)
>    4096: split 2048 + 2048
>     2048: split 1024 + 1024
>      1024: (12288...13311)
>      1024: (13312...14335)
>     2048: split 1024 + 1024
>      1024: (14336...15359)
>      1024: (15360...16383)
>
> Total leafs: 16
>
>
> Regards, Peter
>


More information about the lambda-dev mailing list