Characterizing stream operation
Brian Goetz
brian.goetz at oracle.com
Fri Feb 15 12:04:14 PST 2013
We've divided stream operations as follows:
Intermediate operations. Always lazy. Always produce another stream.
Stateful operations. A kind of intermediate operation. Currently
always transform a stream to the same stream type (e.g., Stream<T> to
Stream<T>), though this could conceivably change (we haven't found any
that need to, though). Must provide their own parallel implementation.
Parallel pipelines containing stateful operations are implicitly "sliced"
into segments at stateful operation boundaries, and executed one segment
at a time.
Terminal operations. The only thing that kicks off stream computation.
Produce a non-stream result (a value or side effects).
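To make the three categories concrete, here is a minimal sketch. It assumes the java.util.stream API as it later shipped in Java 8, which may differ from the draft under discussion; the class name LazinessDemo and the use of peek() for tracing are illustrative choices only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
class LazinessDemo {
    // Records the order in which things happen while a pipeline is built and run.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Stream.of("b", "a", "c")
            .peek(s -> log.add("peek " + s)) // intermediate: lazy, nothing runs yet
            .sorted();                       // stateful intermediate: must see all input
        log.add("pipeline built");
        // Terminal operation: only now do elements flow through the pipeline.
        List<String> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

On a sequential stream the log starts with "pipeline built" and only then shows the peek entries, followed by "result [a, b, c]": neither intermediate operation does any work until the terminal collect() is invoked.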
For each of these, once you perform an operation on a stream
(intermediate or terminal), the stream is *consumed* and no more
operations can be performed on that stream. (Not entirely true, as the
TCK team will almost certainly point out to us eventually; there are
some ops that are no-ops and probably will succeed unless we add
consumed checks.)
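A small sketch of the "consumed" rule follows. It again assumes the Java 8 java.util.stream API as shipped; ConsumedDemo is a hypothetical name. The specification only says an implementation may throw IllegalStateException on reuse, so the behavior shown is what the JDK reference implementation is expected to do, not a guarantee for every implementation.

```java
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
class ConsumedDemo {
    // Returns true if a second operation on an already-used stream is rejected.
    static boolean secondUseFails() {
        Stream<String> source = Stream.of("a", "b");
        // First operation: 'source' is now linked to a downstream stage.
        Stream<String> upper = source.map(String::toUpperCase);
        try {
            // Second operation on the same stream object.
            source.filter(x -> true);
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            upper.close(); // 'upper' itself is still usable; close it since we are done
        }
    }

    public static void main(String[] args) {
        System.out.println("second use rejected: " + secondUseFails());
    }
}
```

Note that the check fires even though map() is lazy and no element has been processed yet; "consumed" is about the stream object, not about the data.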
These names are fine from the perspective of the implementation; when
implementing an operation, you will be implementing one of these three
types, and conveniently there is a base type for each to subclass.
From the user perspective, though, they may not be as helpful as some
alternative taxonomies, such as:
- lazy operation -- what we now call intermediate operation
- stateful lazy operation -- what we now call stateful
- consuming operation -- what we now call terminal
These are good in that they keep a key characteristic -- when the
computation happens -- in full view. However, they also create less
clean boundaries. For example, iterator() is a consuming operation from
the perspective of the stream, but from the perspective of the user, may
be thought of as lazy.
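The iterator() case can be sketched as below, assuming the Java 8 java.util.stream API as shipped (IteratorDemo is a hypothetical name). From the stream's point of view iterator() is terminal, yet the mapping function does not run until the caller actually pulls an element.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
class IteratorDemo {
    // Records when the mapping function runs relative to obtaining the iterator.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Iterator<Integer> it = Stream.of(1, 2, 3)
            .map(n -> { log.add("map " + n); return n * 10; })
            .iterator(); // consumes the stream, but processes no elements yet
        log.add("iterator obtained");
        // Pulling an element is what drives the upstream computation.
        log.add("next " + it.next());
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first log entry is "iterator obtained"; the "map 1" entry appears only once next() is called. How many elements an implementation processes ahead of each next() call is not something this sketch relies on.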
Thoughts on how to adjust this naming to be more intuitive to users?