Pattern.splitAsStream/asPredicate

Paul Sandoz paul.sandoz at oracle.com
Fri Apr 26 04:00:27 PDT 2013


On Apr 24, 2013, at 7:13 PM, Brian Goetz <brian.goetz at Oracle.COM> wrote:

> There definitely could be more to this.  For example, a common usage pattern for matching is:
> 
>  while (more) { 
>      // get the next match
>      // get the stuff between the last match and the start of this match
>      // do something with that
>      // do something with the current match
>  }
> 
> So while getting the matches is good, getting at the stuff between the matches is also sometimes useful.  Is there an easy way to do that, such as providing a Stream<Match>?  
> 

It's awkward with the current types. A Matcher of a Pattern is mutable, and a MatchResult (which would need to be cloned via Matcher.toMatchResult) only provides access to the match itself, not to the text around it.

The prefix characters before each match need to be tracked independently, as do the characters remaining after the last match. So we would require a stream of, say, (String prefix, MatchResult r) tuples, where r is null, or an empty match, for the last tuple in the stream.
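
A minimal sketch of such a (prefix, match) stream, under some assumptions: the PrefixedMatch holder class and the prefixedMatches helper are hypothetical names, not part of java.util.regex. The sketch collects the tuples eagerly into a list for brevity, and it uses null for the match in the last tuple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class PrefixedMatches {

    /** Hypothetical tuple: the text before a match, plus a snapshot of the match. */
    public static final class PrefixedMatch {
        public final String prefix;
        public final MatchResult match; // null for the last tuple (trailing text only)

        PrefixedMatch(String prefix, MatchResult match) {
            this.prefix = prefix;
            this.match = match;
        }
    }

    /** Hypothetical helper: pairs each match with the text since the previous match. */
    public static Stream<PrefixedMatch> prefixedMatches(Pattern p, CharSequence input) {
        List<PrefixedMatch> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        int last = 0; // end index of the previous match
        while (m.find()) {
            String prefix = input.subSequence(last, m.start()).toString();
            // toMatchResult() takes a snapshot, since the Matcher itself is mutable
            out.add(new PrefixedMatch(prefix, m.toMatchResult()));
            last = m.end();
        }
        // The characters after the last match travel in a final tuple with no match.
        out.add(new PrefixedMatch(input.subSequence(last, input.length()).toString(), null));
        return out.stream();
    }
}
```

The last tuple is always present, even when its prefix is empty, so a consumer can rely on seeing the trailing text exactly once.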

We can add methods to Matcher that behave the same way as the existing String-accepting methods, but compute the replacement from each match:

    public String replaceAll(Function<MatchResult, String> f)

    public String replaceFirst(Function<MatchResult, String> f)
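
A rough sketch of how those two methods might behave, written as static helpers so it stands alone. The class name FunctionalReplace is hypothetical, and treating the function's result as literal text (via Matcher.quoteReplacement) is an assumption of this sketch, not something settled above.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FunctionalReplace {

    /** Replaces every match with the string computed from that match. */
    public static String replaceAll(Pattern p, CharSequence input,
                                    Function<MatchResult, String> f) {
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Snapshot the match so f never sees the mutable Matcher.
            String replacement = f.apply(m.toMatchResult());
            // Assumption: the result is literal text, so '$' and '\' are quoted.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy the characters after the last match
        return sb.toString();
    }

    /** Replaces only the first match, if there is one. */
    public static String replaceFirst(Pattern p, CharSequence input,
                                      Function<MatchResult, String> f) {
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            String replacement = f.apply(m.toMatchResult());
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

For example, FunctionalReplace.replaceAll(Pattern.compile("\\d+"), "a1b22c", r -> "<" + r.group() + ">") wraps each run of digits in angle brackets.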
 

> There's an easy way for streams like this to be never-parallel -- create them from a Spliterator whose trySplit always returns null.  Then, even parallel execution will always be serial.  I don't think there's a need for an abstraction for that -- just build off a non-splittable iterator.  
> 
> But, there may also be some parallelism to extract, if the post-processing on a match is high-Q.  Then you might still be able to overcome the sequentiality of generating matches if the per-match post processing is high enough.  
> 

Right, I think it would be incorrect to make any predictions about Q.
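
For reference, the never-parallel approach from the quoted text could be sketched roughly as below. The names SerialMatchStream and MatchSpliterator are hypothetical, and the reported characteristics (ORDERED and NONNULL) and the unknown size estimate are assumptions of this sketch. The point is only that trySplit always returns null, so the source is never divided even when a parallel stream is requested.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class SerialMatchStream {

    /** Hypothetical spliterator over successive matches that refuses to split. */
    static final class MatchSpliterator implements Spliterator<MatchResult> {
        private final Matcher m;

        MatchSpliterator(Matcher m) {
            this.m = m;
        }

        @Override
        public boolean tryAdvance(Consumer<? super MatchResult> action) {
            if (!m.find()) {
                return false; // no further matches
            }
            action.accept(m.toMatchResult()); // immutable snapshot of the current match
            return true;
        }

        @Override
        public Spliterator<MatchResult> trySplit() {
            return null; // never split: traversal of the source stays sequential
        }

        @Override
        public long estimateSize() {
            return Long.MAX_VALUE; // number of matches is unknown up front
        }

        @Override
        public int characteristics() {
            return ORDERED | NONNULL;
        }
    }

    /** Builds a stream of matches; the parallel flag does not make the source splittable. */
    public static Stream<MatchResult> matches(Pattern p, CharSequence input, boolean parallel) {
        return StreamSupport.stream(new MatchSpliterator(p.matcher(input)), parallel);
    }
}
```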

Paul.


More information about the lambda-libs-spec-experts mailing list