Regex Point Lambdafication Patch

Remi Forax forax at univ-mlv.fr
Mon Mar 11 09:42:12 PDT 2013


On 03/11/2013 05:22 PM, Brian Goetz wrote:
> Even simpler:
>
> Spliterator<String> s
>       = Spliterators.spliteratorUnknownSize(new CharSeqSpliterator(),
>                                             ORDERED);
>
> Just write the iterator and leave the spliterating to the library.

And without the typo:

Spliterator<String> s
      = Spliterators.spliteratorUnknownSize(new CharSeqIterator(),
                                            ^^^^^^^^^^
                                            ORDERED);
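
For anyone following along, here is a rough end-to-end sketch of that approach. It is not the code from the patch: the class name SplitSketch, the CharSeqIterator constructor, and the split helper are made up for illustration, and it is written against the java.util.stream API as released in Java 8 (StreamSupport.stream). It also keeps trailing empty strings, which Pattern.split would drop, so treat it as a starting point only.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class SplitSketch {
    // Hypothetical iterator over the chunks between matches (illustration only).
    static final class CharSeqIterator implements Iterator<String> {
        private final CharSequence input;
        private final Matcher matcher;
        private int current = 0;       // start index of the next chunk
        private boolean done = false;  // true once the tail has been returned

        CharSeqIterator(CharSequence input, Pattern pattern) {
            this.input = input;
            this.matcher = pattern.matcher(input);
        }

        @Override
        public boolean hasNext() {
            // No side effects here, so repeated calls are safe.
            return !done;
        }

        @Override
        public String next() {
            if (done) throw new NoSuchElementException();
            String chunk;
            if (matcher.find()) {
                // Text between the previous match and this one.
                chunk = input.subSequence(current, matcher.start()).toString();
                current = matcher.end();
            } else {
                // No more matches: return the remaining tail exactly once.
                chunk = input.subSequence(current, input.length()).toString();
                done = true;
            }
            return chunk;
        }
    }

    // Wraps the iterator in a library spliterator and runs a small pipeline on it.
    static List<String> split(Pattern p, CharSequence input, boolean parallel) {
        Spliterator<String> s = Spliterators.spliteratorUnknownSize(
                new CharSeqIterator(input, p), Spliterator.ORDERED);
        return StreamSupport.stream(s, parallel)
                .map(String::toUpperCase)   // stand-in for an expensive transform
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(split(Pattern.compile(","), "a,b,,c", true));
    }
}
```

The iterator stays strictly sequential; any splitting for the parallel case is done by the spliterator that the library builds around it.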

BTW, all iterators taken as parameters in Spliterators should be
Iterator<? extends T> and not Iterator<T>.
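
To make the difference concrete, here is a small sketch. The two drain methods are hypothetical stand-ins for a library method taking an iterator; they are not the actual Spliterators signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class WildcardSketch {
    // Hypothetical method declared with an invariant Iterator<T> parameter.
    static <T> List<T> drainInvariant(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    // The same method declared with Iterator<? extends T>.
    static <T> List<T> drainCovariant(Iterator<? extends T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        Iterator<String> strings = Arrays.asList("a", "b").iterator();

        // Does not compile: an Iterator<String> is not an Iterator<CharSequence>.
        // List<CharSequence> bad = WildcardSketch.<CharSequence>drainInvariant(strings);

        // Compiles: an Iterator<String> is an Iterator<? extends CharSequence>.
        List<CharSequence> ok = WildcardSketch.<CharSequence>drainCovariant(strings);
        System.out.println(ok);
    }
}
```

Since an iterator only produces values of type T, accepting the wildcard form costs the callee nothing and saves callers from copying or casting.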

Rémi

>
> On 3/11/2013 12:19 PM, Ben Evans wrote:
>> On Sun, Mar 10, 2013 at 6:26 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>> Because you're not overriding trySplit, this spliterator will not allow
>>> streams to be parallelized at all, even if the downstream operations could
>>> benefit from it, as in:
>>>
>>>     pattern.splitAsStream(bigString)
>>>            .parallel()
>>>            .map(expensiveTransform)
>>>            .forEach(...);
>>>
>>> Even though the above pipeline will be limited by the sequential regex
>>> splitting at its source, if the downstream operations are expensive enough,
>>> they could still benefit from parallelization.  But the spliterator, as
>>> written, won't permit that -- it is strictly sequential.
>> OK, makes sense.
>>
>>> You can fix this easily by creating an Iterator<String> and wrapping that
>>> with Spliterators.spliteratorUnknownSize(iterator, characteristics) instead
>>> of writing a sequential-only spliterator.
>> This part I'm not sure I fully understand.
>>
>> Did you mean an implementation something like this:
>>
>>     private static class CharSequenceSpliterator implements Spliterator<String> {
>>         private CharSequence input;
>>         private final HelperIterator it;
>>         private int current = 0;
>>
>>         CharSequenceSpliterator(CharSequence in, Matcher m) {
>>             input = in;
>>             it = new HelperIterator(m);
>>         }
>>
>>         private class HelperIterator implements Iterator<String> {
>>             private final Matcher curMatcher;
>>
>>             HelperIterator(Matcher m) {
>>                 curMatcher = m;
>>             }
>>
>>             public String next() {
>>                 String nextChunk = input.subSequence(current, curMatcher.start()).toString();
>>                 current = curMatcher.end();
>>                 return nextChunk;
>>             }
>>
>>             public boolean hasNext() {
>>                 return curMatcher.find();
>>             }
>>         }
>>
>>         public boolean tryAdvance(Consumer<? super String> action) {
>>             if (it.hasNext()) {
>>                 action.accept(it.next());
>>                 // Match the behaviour of Pattern::split
>>                 if (current == input.length()) return false;
>>                 return true;
>>             }
>>
>>             action.accept(input.subSequence(current, input.length()).toString());
>>             return false;
>>         }
>>
>>         public Spliterator<String> trySplit() {
>>             return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
>>         }
>>
>>         public int characteristics() {
>>             return Spliterator.ORDERED;
>>         }
>>     }
>>
>> Note that as currently written, the HelperIterator in the above needs
>> to be able to see current in the enclosing class.
>>
>> Thanks,
>>
>> Ben
>>


