Creating streams from ranges, functions and suppliers

Brian Goetz brian.goetz at oracle.com
Thu Mar 21 07:19:37 PDT 2013


Some follow-up questions:
  - Do any of these seem NOT to carry their weight?
  - Are there any forms that are obviously missing?  Specifically, 
indexed forms (or is range().map() good enough?)
  - If we have an ints() or longs(), should these be ranges or truly 
infinite?  Or both?


On 3/21/2013 6:38 AM, Paul Sandoz wrote:
> Hi,
>
> In the lambda repo there are a bunch of ways to create streams from non-collection sources:
>
> - Primitive ranges
> - Infinite streams whose elements are the result of repeatedly applying a function
> - Infinite streams whose elements are generated from a supplier
>
> The infinite streams are intended to be used in conjunction with limit(), substream() or short-circuiting terminal operations.
>
> Thoughts?
>
> Paul.
>
> --
>
> Ranges:
>
>      public static IntStream intRange(int start, int end) {
>      public static IntStream intRange(int start, int end, int step) {
>
>      public static LongStream longRange(long start, final long end) {
>      public static LongStream longRange(long start, final long end, final long step) {
>
>      public static DoubleStream doubleRange(double start, double end) {
>      public static DoubleStream doubleRange(double start, double end, double step) {
>
>
> A stream created from doubleRange will have a maximum element count of Long.MAX_VALUE. The implementation of doubleRange is equivalent to:
>
>      long size = (long) Math.ceil((end - start) / step);
>      DoubleStream ds = Streams.longStream(0, size).doubles().map(i -> start + step * i);
>
> By providing a method we can ensure developers do the right thing in terms of splitting (which requires consistent values on traversal), check for numerical edge cases (e.g. elements that are not all distinct), and get good performance.
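A minimal sketch of the index-based computation described above, assuming the API names that later shipped in java.util.stream (LongStream.range, mapToDouble) rather than the names in the lambda repo at the time. The class name and the doubleRange helper are hypothetical, and the sketch assumes a non-zero step.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.LongStream;

public class DoubleRangeSketch {
    // Hypothetical helper mirroring the proposed doubleRange(start, end, step).
    // Assumes step is non-zero; an empty stream results when the range is empty.
    public static DoubleStream doubleRange(double start, double end, double step) {
        // Number of elements in [start, end) when stepping by step.
        long size = Math.max(0L, (long) Math.ceil((end - start) / step));
        // Each element is derived from its index, so any split of the
        // underlying long range yields the same values on traversal.
        return LongStream.range(0, size).mapToDouble(i -> start + step * i);
    }

    public static void main(String[] args) {
        // prints [0.0, 0.25, 0.5, 0.75]
        System.out.println(Arrays.toString(doubleRange(0.0, 1.0, 0.25).toArray()));
    }
}
```

Computing each value from the index, instead of repeatedly adding step, avoids accumulating rounding error and keeps the source splittable.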
>
>
> There are no ints() or longs() methods but these could be useful when used in conjunction with limit() or substream(). Implementation-wise they are trivial:
>
>    intRange(0, Integer.MAX_VALUE)
>
> I think they could be a useful idiom (see below on the generate method).
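As a rough illustration of the ints() idiom, here is a sketch that assumes the IntStream.range method from the final java.util.stream API in place of intRange. The ints() helper and the class name are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class IntsSketch {
    // Hypothetical ints(): the non-negative ints in ascending order,
    // expressed as a bounded range rather than a truly infinite stream.
    public static IntStream ints() {
        return IntStream.range(0, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // limit() keeps only the first few elements of the very large range.
        // prints [0, 1, 2, 3, 4]
        System.out.println(Arrays.toString(ints().limit(5).toArray()));
    }
}
```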
>
> doubles() may also be useful but perhaps should be restricted to a max size of 2^53 + 1 to ensure integer values are precisely represented.
>
> --
>
> Infinite streams whose elements are the result of repeatedly applying a function:
>
>      public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f) {
>      public static IntStream iterateInt(final int seed, final IntUnaryOperator f) {
>      public static LongStream iterateLong(final long seed, final LongUnaryOperator f) {
>      public static DoubleStream iterateDouble(final double seed, final DoubleUnaryOperator f) {
>
>
> Essentially create streams of (seed, f(seed), f(f(seed)), ...). This is a common functional idiom.
>
> The stream that is created has an encounter order and permits limited parallelism since an iterator is used to repeatedly apply the function to the previous result.
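For illustration, a small sketch of the (seed, f(seed), f(f(seed)), ...) idiom, assuming the Stream.iterate signature from the final java.util.stream API. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IterateSketch {
    // Hypothetical helper: the first n powers of two, starting from 1.
    public static List<Integer> powersOfTwo(int n) {
        // Each element is f applied to the previous one; limit() bounds
        // the otherwise infinite stream.
        return Stream.iterate(1, x -> x * 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [1, 2, 4, 8, 16]
        System.out.println(powersOfTwo(5));
    }
}
```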
>
> --
>
> Infinite streams whose elements are generated from a supplier:
>
>      public static<T> Stream<T> generate(Supplier<T> s) {
>      public static IntStream generateInt(IntSupplier s) {
>      public static LongStream generateLong(LongSupplier s) {
>      public static DoubleStream generateDouble(DoubleSupplier s) {
>
> The use in java.util.Random is a good example:
>
>      public IntStream ints() {
>          return Streams.generateInt(this::nextInt);
>      }
>
> Currently the generators are iterator-based and thus permit only limited parallelism, and there are guarantees that the nth element encountered corresponds to the nth call of Supplier.get().
>
> We could change the implementation to be equivalent to:
>
>    longs().map(e -> s.get())
>
> The stream would not have an encounter order, and would enable balanced parallel computation (and therefore better resource utilization for reduction operations).  For parallel streams the supplier would be invoked concurrently.
>
> The stream would no longer be known to be infinite, but would be "not known to be finite" from the perspective of the caller e.g. calling forEach may take more time and do more work than one expects unless one limits the number of elements to a known finite size.
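A sketch of what a range-backed generate might look like, assuming LongStream.range and mapToObj from the final java.util.stream API. The generate(s, size) helper, its explicit size parameter, and the class name are hypothetical and only stand in for longs().map(e -> s.get()).

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class UnorderedGenerateSketch {
    // Hypothetical range-backed generate: the index is ignored and only the
    // supplier's result is used, so there is no meaningful encounter order.
    public static <T> Stream<T> generate(Supplier<T> s, long size) {
        return LongStream.range(0, size).unordered().mapToObj(i -> s.get());
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        // In a parallel stream the supplier may be invoked concurrently,
        // so it has to be thread-safe.
        Supplier<Integer> s = counter::getAndIncrement;
        Set<Integer> seen = generate(s, 1000)
                .parallel()
                .collect(Collectors.toSet());
        // 1000 distinct values, produced in no particular order
        System.out.println(seen.size());
    }
}
```

Because the source is a range, it splits evenly across threads, which is the balanced parallelism the proposal is after. The price is that the nth element no longer corresponds to the nth call of the supplier.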
>
> The existing functionality could be achieved using iterate:
>
>    iterate(s.get(), i -> s.get());
>
> So I think we should change generate as described.
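The iterate-based way of keeping the existing ordered behavior can be sketched as follows, assuming Stream.iterate from the final java.util.stream API. The generateOrdered helper and the class name are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OrderedGenerateSketch {
    // Hypothetical ordered generate built on iterate: the previous element
    // is ignored and the supplier is simply called again.
    // Note that the seed call to s.get() happens eagerly, when the stream
    // is constructed rather than when it is traversed.
    public static <T> Stream<T> generateOrdered(Supplier<T> s) {
        return Stream.iterate(s.get(), previous -> s.get());
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        Supplier<Integer> s = counter::getAndIncrement;
        List<Integer> firstFive = generateOrdered(s)
                .limit(5)
                .collect(Collectors.toList());
        // prints [0, 1, 2, 3, 4]
        System.out.println(firstFive);
    }
}
```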
>


More information about the lambda-libs-spec-experts mailing list