Regex Point Lambdafication Patch

Brian Goetz brian.goetz at oracle.com
Mon Mar 11 09:22:33 PDT 2013


Even simpler:

Spliterator<String> s
     = Spliterators.spliteratorUnknownSize(new CharSeqSpliterator(),
                                           ORDERED);

Just write the iterator and leave the spliterating to the library.
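
To make that concrete, here is a minimal sketch of the iterator-plus-wrapper approach. It assumes the stream API as it was eventually released in Java 8 (StreamSupport.stream and Spliterators.spliteratorUnknownSize), which may differ from the lambda builds current at the time of this thread. The names RegexSplitSketch, SplitIterator and splitAsStream are hypothetical, chosen only for illustration. It is also deliberately simplified: trailing empty strings are kept, whereas Pattern.split drops them.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class RegexSplitSketch {

    /** Hypothetical iterator yielding the chunks of input between successive matches. */
    static final class SplitIterator implements Iterator<String> {
        private final CharSequence input;
        private final Matcher matcher;
        private int current = 0;       // index where the next chunk starts
        private String lookahead;      // chunk computed by hasNext(), or null
        private boolean done = false;  // true once the final chunk has been produced

        SplitIterator(Pattern pattern, CharSequence input) {
            this.input = input;
            this.matcher = pattern.matcher(input);
        }

        @Override
        public boolean hasNext() {
            // Idempotent: the matcher only advances when no chunk is buffered.
            if (lookahead != null) return true;
            if (done) return false;
            if (matcher.find()) {
                lookahead = input.subSequence(current, matcher.start()).toString();
                current = matcher.end();
            } else {
                // No more delimiters: emit the remainder (possibly empty) and stop.
                lookahead = input.subSequence(current, input.length()).toString();
                done = true;
            }
            return true;
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            String chunk = lookahead;
            lookahead = null;
            return chunk;
        }
    }

    /** Wraps the iterator; the library-supplied spliterator provides trySplit. */
    static Stream<String> splitAsStream(Pattern pattern, CharSequence input) {
        Spliterator<String> s = Spliterators.spliteratorUnknownSize(
                new SplitIterator(pattern, input),
                Spliterator.ORDERED | Spliterator.NONNULL);
        return StreamSupport.stream(s, false);
    }
}
```

The source is still read sequentially, but the wrapping spliterator can hand off batches of elements when asked to split, so calling parallel() on the resulting stream lets expensive downstream operations run in parallel.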

On 3/11/2013 12:19 PM, Ben Evans wrote:
> On Sun, Mar 10, 2013 at 6:26 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>> Because you're not overriding trySplit, this spliterator will not allow
>> streams to be parallelized at all, even if the downstream operations could
>> benefit from it, as in:
>>
>>    pattern.splitAsStream(bigString)
>>           .parallel()
>>           .map(expensiveTransform)
>>           .forEach(...);
>>
>> Even though the above pipeline will be limited by the sequential regex
>> splitting at its source, if the downstream operations are expensive enough,
>> they could still benefit from parallelization.  But the spliterator, as
>> written, won't permit that -- it is strictly sequential.
>
> OK, makes sense.
>
>> You can fix this easily by creating an Iterator<String> and wrapping that
>> with Spliterators.spliteratorUnknownSize(iterator, characteristics) instead
>> of writing a sequential-only spliterator.
>
> This part I'm not sure I fully understand.
>
> Did you mean an implementation something like this:
>
>     private static class CharSequenceSpliterator
>             implements Spliterator<String> {
>         private CharSequence input;
>         private final HelperIterator it;
>         private int current = 0;
>
>         CharSequenceSpliterator(CharSequence in, Matcher m) {
>             input = in;
>             it = new HelperIterator(m);
>         }
>
>         private class HelperIterator implements Iterator<String> {
>             private final Matcher curMatcher;
>
>             HelperIterator(Matcher m) {
>                 curMatcher = m;
>             }
>
>             public String next() {
>                 String nextChunk = input.subSequence(current, curMatcher.start()).toString();
>                 current = curMatcher.end();
>                 return nextChunk;
>             }
>
>             public boolean hasNext() {
>                 return curMatcher.find();
>             }
>         }
>
>         public boolean tryAdvance(Consumer<? super String> action) {
>             if (it.hasNext()) {
>                 action.accept(it.next());
>                 // Match the behaviour of Pattern::split
>                 if (current == input.length()) return false;
>                 return true;
>             }
>
>             action.accept(input.subSequence(current, input.length()).toString());
>             return false;
>         }
>
>         public Spliterator<String> trySplit() {
>             return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
>         }
>
>         public int characteristics() {
>             return Spliterator.ORDERED;
>         }
>     }
>
> Note that as currently written, the HelperIterator in the above needs
> to be able to see current in the enclosing class.
>
> Thanks,
>
> Ben
>

