Regex Point Lambdafication Patch
Brian Goetz
brian.goetz at oracle.com
Mon Mar 11 09:22:33 PDT 2013
Even simpler:
    Spliterator<String> s
        = Spliterators.spliteratorUnknownSize(new CharSeqSpliterator(),
                                              ORDERED);
Just write the iterator and leave the spliterating to the library.
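
A minimal sketch of that approach, under stated assumptions: SplitIterator, SplitIteratorSketch and the local splitAsStream helper are hypothetical names used only for illustration, not the code from the patch. Unlike Pattern.split, this sketch does not trim trailing empty strings and makes no special provision for zero-length matches. The iterator buffers one element so that hasNext() can be called repeatedly without advancing the Matcher.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class SplitIteratorSketch {

    /** Hypothetical iterator over the pieces of the input between regex matches. */
    static final class SplitIterator implements Iterator<String> {
        private final CharSequence input;
        private final Matcher matcher;
        private int current = 0;       // index where the next piece starts
        private String buffered;       // piece found by hasNext(), not yet returned
        private boolean done = false;  // true once the final piece has been buffered

        SplitIterator(Pattern pattern, CharSequence input) {
            this.input = input;
            this.matcher = pattern.matcher(input);
        }

        @Override
        public boolean hasNext() {
            if (buffered != null) return true;   // idempotent: no second find()
            if (done) return false;
            if (matcher.find()) {
                buffered = input.subSequence(current, matcher.start()).toString();
                current = matcher.end();
            } else {
                // No more delimiters: the remainder is the last piece.
                buffered = input.subSequence(current, input.length()).toString();
                done = true;
            }
            return true;
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            String piece = buffered;
            buffered = null;
            return piece;
        }
    }

    /** Wraps the iterator; the library-provided spliterator handles trySplit. */
    static Stream<String> splitAsStream(Pattern pattern, CharSequence input) {
        Spliterator<String> s = Spliterators.spliteratorUnknownSize(
                new SplitIterator(pattern, input),
                Spliterator.ORDERED | Spliterator.NONNULL);
        return StreamSupport.stream(s, false);
    }

    public static void main(String[] args) {
        List<String> result = splitAsStream(Pattern.compile(","), "a,b,,c")
                .parallel()                  // downstream stages may now run in parallel
                .map(String::toUpperCase)    // stand-in for an expensive transform
                .collect(Collectors.toList());
        System.out.println(result);
    }
}
```

The spliterator returned by Spliterators.spliteratorUnknownSize splits by copying batches of elements out of the iterator, so the regex matching stays sequential while the downstream stages can be spread across threads.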
On 3/11/2013 12:19 PM, Ben Evans wrote:
> On Sun, Mar 10, 2013 at 6:26 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>> Because you're not overriding trySplit, this spliterator will not allow
>> streams to be parallelized at all, even if the downstream operations could
>> benefit from it, as in:
>>
>>     pattern.splitAsStream(bigString)
>>            .parallel()
>>            .map(expensiveTransform)
>>            .forEach(...);
>>
>> Even though the above pipeline will be limited by the sequential regex
>> splitting at its source, if the downstream operations are expensive enough,
>> they could still benefit from parallelization. But the spliterator, as
>> written, won't permit that -- it is strictly sequential.
>
> OK, makes sense.
>
>> You can fix this easily by creating an Iterator<String> and wrapping that
>> with Spliterators.spliteratorUnknownSize(iterator, characteristics) instead
>> of writing a sequential-only spliterator.
>
> This part I'm not sure I fully understand.
>
> Did you mean an implementation something like this:
>
> private static class CharSequenceSpliterator implements
>         Spliterator<String> {
>     private CharSequence input;
>     private final HelperIterator it;
>     private int current = 0;
>
>     CharSequenceSpliterator(CharSequence in, Matcher m) {
>         input = in;
>         it = new HelperIterator(m);
>     }
>
>     private class HelperIterator implements Iterator<String> {
>         private final Matcher curMatcher;
>
>         HelperIterator(Matcher m) {
>             curMatcher = m;
>         }
>
>         public String next() {
>             String nextChunk = input.subSequence(current, curMatcher.start()).toString();
>             current = curMatcher.end();
>             return nextChunk;
>         }
>
>         public boolean hasNext() {
>             return curMatcher.find();
>         }
>     }
>
>     public boolean tryAdvance(Consumer<? super String> action) {
>         if (it.hasNext()) {
>             action.accept(it.next());
>             // Match the behaviour of Pattern::split
>             if (current == input.length()) return false;
>             return true;
>         }
>
>         action.accept(input.subSequence(current, input.length()).toString());
>         return false;
>     }
>
>     public Spliterator<String> trySplit() {
>         return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
>     }
>
>     public int characteristics() {
>         return Spliterator.ORDERED;
>     }
> }
>
> Note that as currently written, the HelperIterator in the above needs
> to be able to see current in the enclosing class.
>
> Thanks,
>
> Ben
>