Stateful lambdas (was: Simplifying sequential() / parallel())

Brian Goetz brian.goetz at oracle.com
Thu Mar 21 14:13:06 PDT 2013


Here's what I've currently got in the package info for non-interference 
and statefulness:


  * <h2><a name="NonInterference">Non-interference</a></h2>
  *
  * The {@code java.util.stream} package enables you to execute possibly-parallel
  * bulk-data operations over a variety of data sources, including even
  * non-thread-safe collections such as {@code ArrayList}.  This is possible only
  * if we can prevent <em>interference</em> with the data source during the
  * execution of a stream pipeline.  (Execution begins when the terminal operation
  * is invoked, and ends when the terminal operation completes.)  For most data
  * sources, preventing interference means ensuring that the data source is
  * <em>not modified at all</em> during the execution of the stream pipeline.
  * (Some data sources, such as concurrent collections, are specifically designed
  * to handle concurrent modification, in which case their {@code Spliterator}
  * will report the {@code CONCURRENT} characteristic.)
  *
  * <p>Accordingly, lambdas passed to stream methods should never modify the
  * stream's data source.  A lambda (or other object implementing the appropriate
  * functional interface) is said to <em>interfere</em> with the data source if it
  * modifies, or causes to be modified, the stream's data source.  The need for
  * non-interference applies to all pipelines, not just parallel ones.  Unless the
  * stream source is concurrent, modifying a stream's data source during
  * execution of a stream pipeline can cause exceptions, incorrect answers, or
  * nonconformant results.
  *
  * <p>Further, results may be nondeterministic or incorrect if the lambdas passed
  * to stream operations are <em>stateful</em>.  A stateful lambda (or other
  * object implementing the appropriate functional interface) is one whose result
  * depends on any state which might change during the execution of the stream
  * pipeline.  An example of a stateful lambda is:
  * <pre>
  *     Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
  *     stream.map(e -> { if (seen.add(e)) return 0; else return e; })...
  * </pre>
  * Stream pipelines with stateful lambdas may produce nondeterministic or
  * incorrect results.
  *
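
To make the non-interference rule above concrete, here is a minimal sketch (not part of the package info). The class and method names are made up for illustration. It assumes an OpenJDK-style `ArrayList`, where modifying the list from inside the pipeline typically surfaces as a `ConcurrentModificationException`; that check is best-effort, so the exception should not be relied on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class NonInterferenceSketch {

    // Interfering: the lambda adds to the very list the stream is traversing.
    // Returns true if the pipeline failed with ConcurrentModificationException.
    // Fail-fast detection is best-effort, so this outcome is an assumption
    // about the ArrayList implementation, not a guarantee.
    static boolean interferingPipelineFails() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            source.stream().forEach(s -> source.add(s + "!"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Non-interfering: the pipeline only reads the source and collects its
    // results elsewhere; the source is modified after the terminal operation
    // has completed.
    static List<String> nonInterferingPipeline() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> additions = source.stream()
                .map(s -> s + "!")
                .collect(Collectors.toList());
        source.addAll(additions);
        return source;
    }

    public static void main(String[] args) {
        System.out.println("interfering pipeline failed: " + interferingPipelineFails());
        System.out.println("non-interfering result: " + nonInterferingPipeline());
    }
}
```

The second method does the same work as the first was attempting, but keeps the modification outside the window between invoking the terminal operation and its completion.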


And @param doc for map arguments is:

      * @param mapper a <a href="package-summary.html#NonInterference">non-interfering,
      *               stateless</a> function to be applied to each element
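
As a rough illustration of what "stateless" buys you for a `map` argument, the sketch below runs the stateful mapper from the package-info example next to a stateless one. The class and method names are hypothetical, and the doubling mapper is just a stand-in for any function that depends only on its argument. With the stateful mapper, which occurrence of a value is treated as "first seen" depends on processing order, so in a parallel run the positions of the zeros can vary between runs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class StatefulMapperSketch {

    // Stateful: the result for an element depends on which elements have
    // already been added to 'seen', i.e. on processing order.
    static List<Integer> statefulMap(List<Integer> input, boolean parallel) {
        Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
        return (parallel ? input.parallelStream() : input.stream())
                .map(e -> seen.add(e) ? 0 : e)
                .collect(Collectors.toList());
    }

    // Stateless: the result depends only on the argument, so sequential and
    // parallel runs produce the same list in encounter order.
    static List<Integer> statelessMap(List<Integer> input) {
        return input.parallelStream()
                .map(e -> e * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 1, 2);
        // The position of the zeros in the parallel run may differ from run to run.
        System.out.println("stateful, sequential: " + statefulMap(input, false));
        System.out.println("stateful, parallel:   " + statefulMap(input, true));
        System.out.println("stateless, parallel:  " + statelessMap(input));
    }
}
```

Because the set is synchronized, exactly one occurrence of each distinct value is mapped to 0 in any run; what is not pinned down is which one, and that is the nondeterminism the doc warns about.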



On 3/21/2013 4:34 PM, Tim Peierls wrote:
> On Thu, Mar 21, 2013 at 4:21 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>             I'm pretty comfortable just saying "lambdas passed to stream
>             ops should not be stateful, if they are, this may result in
>             wrong or non-deterministic outcomes."
>
>
>     Does anyone feel otherwise?
>
>
> Not me. I'd even be comfortable with something stronger: "Passing
> stateful lambdas to stream ops will give you a rash."
>
> --tim
>

