Regex Point Lambdafication Patch

Brian Goetz brian.goetz at oracle.com
Sun Mar 10 11:26:14 PDT 2013


You should report only ORDERED as your spliterator characteristics. 
SIZED would suggest that you know a priori how many matches there will 
be; IMMUTABLE would suggest that the set of matches can't change for a 
given input, which would be true if the input were a String but is not 
true for an arbitrary CharSequence.
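
To illustrate the IMMUTABLE point, here is a minimal sketch (the class 
and method names are hypothetical, not part of the patch) showing that 
the pieces produced for the same CharSequence object can change when 
that object is a mutable StringBuilder:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class MutableInputDemo {
    // Returns the number of pieces before and after the input is mutated.
    static int[] piecesBeforeAndAfter() {
        Pattern comma = Pattern.compile(",");
        StringBuilder input = new StringBuilder("a,b"); // a mutable CharSequence

        int before = comma.split(input).length; // "a", "b"
        input.append(",c");                     // same object, new contents
        int after = comma.split(input).length;  // "a", "b", "c"

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = piecesBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // prints "2 -> 3"
    }
}
```

A String input could never behave this way, which is why IMMUTABLE 
would only be safe to report if the API accepted String alone.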

Because you're not overriding trySplit, this spliterator will not allow 
streams to be parallelized at all, even when the downstream operations 
could benefit from it, as in:

   pattern.splitAsStream(bigString)
          .parallel()
          .map(expensiveTransform)
          .forEach(...);

Even though the above pipeline will be limited by the sequential regex 
splitting at its source, if the downstream operations are expensive 
enough, they could still benefit from parallelization.  But the 
spliterator, as written, won't permit that -- it is strictly sequential.

You can fix this easily by creating an Iterator<String> and wrapping 
that with Spliterators.spliteratorUnknownSize(iterator, characteristics) 
instead of writing a sequential-only spliterator.
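
A minimal sketch of that approach follows. It is not the patch's code: 
the class SplitStreams and its splitAsStream helper are hypothetical 
names, and the splitting logic is deliberately simplified (it does not 
reproduce every Pattern.split rule, such as dropping trailing empty 
strings). The point is only the wrapping step, which reports ORDERED 
and nothing else:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, for illustration only.
final class SplitStreams {
    static Stream<String> splitAsStream(Pattern pattern, CharSequence input) {
        Iterator<String> pieces = new Iterator<String>() {
            private final Matcher matcher = pattern.matcher(input);
            private int start = 0;        // where the next piece begins
            private boolean done = false; // true once the tail piece is produced
            private String next;          // piece found by hasNext(), if any

            @Override
            public boolean hasNext() {
                if (next != null) return true;
                if (done) return false;
                if (matcher.find()) {
                    // Piece between the previous separator and this one.
                    next = input.subSequence(start, matcher.start()).toString();
                    start = matcher.end();
                } else {
                    // No more separators: emit the remaining tail once.
                    next = input.subSequence(start, input.length()).toString();
                    done = true;
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                String result = next;
                next = null;
                return result;
            }
        };
        // ORDERED only: the size is unknown and the input may be mutable.
        Spliterator<String> spliterator =
                Spliterators.spliteratorUnknownSize(pieces, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, false);
    }
}
```

Because the spliterator returned by spliteratorUnknownSize implements 
trySplit by handing off batches of elements, a caller can then write 
SplitStreams.splitAsStream(pattern, input).parallel().map(...) and have 
the downstream stages run in parallel, while the regex matching itself 
stays sequential.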

On 3/10/2013 8:41 AM, Ben Evans wrote:
> Hi,
>
> I've completed a first cut of the point lambdafication for regexps patch.
>
> This is my first time running webrev - so please let me know if I've
> done this right - I'm supposed to host the webrev for the patch on a
> webserver somewhere?
>
> Assuming this is the case, I've put it up at:
>
> http://www.java7developer.com/webrev-kittylyst.001/
>
> What should I do now?
>
> Thanks,
>
> Ben
>

