Regex Point Lambdafication Patch
Ben Evans
benjamin.john.evans at gmail.com
Mon Mar 11 09:19:21 PDT 2013
On Sun, Mar 10, 2013 at 6:26 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
> Because you're not overriding trySplit, this spliterator will not allow
> streams to be parallelized at all, even if the downstream operations could
> benefit from such, such as in:
>
> pattern.splitAsStream(bigString)
> .parallel()
> .map(expensiveTransform)
> .forEach(...);
>
> Even though the above pipeline will be limited by the sequential regex
> splitting at its source, if the downstream operations are expensive enough,
> they could still benefit from parallelization. But the spliterator, as
> written, won't permit that -- it is strictly sequential.
OK, makes sense.
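For concreteness, here is a minimal runnable sketch of the kind of pipeline quoted above. It assumes Pattern.splitAsStream(CharSequence) as it exists in Java 8. The class name ParallelSplitSketch and the string-reversing expensiveTransform are hypothetical stand-ins, not part of the patch.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class ParallelSplitSketch {
    // Hypothetical stand-in for the "expensiveTransform" in the quoted
    // pipeline; a real one would do enough work per element to make
    // parallel execution worthwhile.
    static String expensiveTransform(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // The source is split sequentially by the regex, but the map stage
    // may run on several threads if the source spliterator can split.
    static List<String> transformAll(Pattern pattern, CharSequence bigString) {
        return pattern.splitAsStream(bigString)
                .parallel()
                .map(ParallelSplitSketch::expensiveTransform)
                .collect(Collectors.toList()); // keeps encounter order
    }
}
```

Because the stream is ordered, the collected list keeps the order of the chunks in the input regardless of how the work is distributed.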
> You can fix this easily by creating an Iterator<String> and wrapping that
> with Spliterators.spliteratorUnknownSize(iterator, characteristics) instead
> of writing a sequential-only iterator.
This part I'm not sure I fully understand.
Did you mean an implementation something like this:
private static class CharSequenceSpliterator implements
        Spliterator<String> {
    private CharSequence input;
    private final HelperIterator it;
    private int current = 0;

    CharSequenceSpliterator(CharSequence in, Matcher m) {
        input = in;
        it = new HelperIterator(m);
    }

    private class HelperIterator implements Iterator<String> {
        private final Matcher curMatcher;

        HelperIterator(Matcher m) {
            curMatcher = m;
        }

        public String next() {
            String nextChunk = input.subSequence(current, curMatcher.start()).toString();
            current = curMatcher.end();
            return nextChunk;
        }

        public boolean hasNext() {
            return curMatcher.find();
        }
    }

    public boolean tryAdvance(Consumer<? super String> action) {
        if (it.hasNext()) {
            action.accept(it.next());
            // Match the behaviour of Pattern::split
            if (current == input.length()) return false;
            return true;
        }
        action.accept(input.subSequence(current, input.length()).toString());
        return false;
    }

    public Spliterator<String> trySplit() {
        return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
    }

    public int characteristics() {
        return Spliterator.ORDERED;
    }
}
Note that as currently written, the HelperIterator in the above needs
to be able to see current in the enclosing class.
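For comparison, one possible reading of the quoted suggestion is to drop the custom Spliterator entirely and put all of the state in the Iterator. The sketch below is only that, a sketch under assumptions: the class name SplitIteratorSketch and its static splitAsStream helper are hypothetical, it uses StreamSupport.stream as found in Java 8, and it keeps trailing empty strings rather than matching Pattern::split exactly.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class SplitIteratorSketch {
    // Hypothetical helper, not the code from the patch under discussion.
    static Stream<String> splitAsStream(Pattern pattern, CharSequence input) {
        Iterator<String> it = new Iterator<String>() {
            private final Matcher matcher = pattern.matcher(input);
            private int current = 0;      // start index of the next chunk
            private String nextChunk;     // buffered element, null if none
            private boolean done = false; // true once the tail was produced

            @Override
            public boolean hasNext() {
                if (nextChunk != null) return true; // safe to call repeatedly
                if (done) return false;
                if (matcher.find()) {
                    nextChunk = input.subSequence(current, matcher.start()).toString();
                    current = matcher.end();
                } else {
                    // No more delimiters: emit the remaining tail once.
                    // Simplification: trailing empty strings are kept.
                    nextChunk = input.subSequence(current, input.length()).toString();
                    done = true;
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                String result = nextChunk;
                nextChunk = null;
                return result;
            }
        };
        // The wrapper returned by spliteratorUnknownSize supplies trySplit,
        // so downstream stages can run in parallel.
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(
                        it, Spliterator.ORDERED | Spliterator.NONNULL),
                false);
    }
}
```

With this shape the iterator owns both the matcher and the current index, so nothing outside it needs to see current, and hasNext() no longer advances the matcher as a side effect each time it is called.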
Thanks,
Ben