StreamOpFlag.* and OutOfMemoryError when going parall
Brian Goetz
brian.goetz at oracle.com
Wed Dec 5 11:07:51 PST 2012
It is quite likely that "limit" is just not a sharp enough tool here.
The intent of skip+limit was to handle things like pagination -- "give
me results 20-30".
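
As a rough illustration of that pagination use case, here is a minimal sketch. It assumes the java.util.stream API as it later shipped in JDK 8 (Collection.stream(), skip, limit, Collectors.toList), not the b67 prototype discussed in this thread. The class name PaginationSketch and the helper page() are hypothetical names chosen for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; assumes the released JDK 8 stream API.
public class PaginationSketch {

    // Returns the elements at zero-based positions [from, to) of the source list.
    static <T> List<T> page(List<T> source, int from, int to) {
        return source.stream()
                .skip(from)          // drop everything before the page
                .limit(to - from)    // keep at most one page of elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A sized, ordered source: the case skip+limit was designed for.
        List<Integer> results = IntStream.range(0, 100).boxed().collect(Collectors.toList());
        System.out.println(page(results, 20, 30));
    }
}
```

Because the source here is a sized, ordered collection, skip and limit only have to cut a known range, which is a much easier job than truncating an unbounded parallel stream.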
We're working on something that should handle cancelation more
generally, to support cases like "find the best answer you can in 5m";
that might be a better choice. Stay tuned.
As to the timing, we make no promises :)
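
For the sequential question quoted below (the first n primes from an unbounded source), one possible sketch follows. It assumes Stream.iterate from the released JDK 8 API, which postdates the b67 build used in this thread, so it is an illustration rather than an approach this message endorses. The class name FirstPrimesSketch and the helper firstPrimes() are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; assumes the released JDK 8 stream API.
public class FirstPrimesSketch {

    // Simple trial division; adequate for a sketch, not tuned for speed.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long i = 2; i * i <= n; i++) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Collects the first 'count' primes from an unbounded sequential stream.
    static List<Long> firstPrimes(int count) {
        return Stream.iterate(1L, n -> n + 1)   // infinite, ordered, sequential source
                .filter(FirstPrimesSketch::isPrime)
                .limit(count)                   // short-circuits once 'count' primes are found
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPrimes(10));
    }
}
```

Run sequentially, limit stops pulling from the source as soon as enough elements have passed the filter, so the infinite source is never materialized. The sketch deliberately stays sequential and makes no claim about parallel behavior.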
On 12/5/2012 1:54 PM, Christian Mallwitz wrote:
> Very well :-)
>
> Leaving the parallel aspect aside, if I just wanted to find the first
> n prime numbers based on a possibly infinite stream of numbers (I
> don't want to use a Collection with a known size), is there another
> 'officially' approved way to convert an Iterator to a Stream (instead
> of using Streams.stream())?
>
> Regarding a fully lazy, parallel limit() and its future: does 'future'
> mean pre- or post-JDK8?
>
> Cheers!
> Christian
>
> On Wed, Dec 5, 2012 at 6:34 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> There's a good reason you are struggling with it -- this API is not intended
>> for "casual" builders of streams. The intention is that 99.9% of users will
>> never write code to call the Streams.stream() method directly; they will use
>> arrays, collections, generators, or other packaged providers of streams.
>>
>> A safe default is 0. You can OR together the IS_ version of the flags if
>> you know something about the nature of your stream. You probably want
>> IS_ORDERED if you want to represent a stream for which the order of elements
>> means something (e.g., a range generator, a List, an array, etc.)
>>
>> The reason it is failing with OOME is that the parallel implementation of
>> limit() computes the entire results rather than operating fully lazily.
>> While it would be preferable to operate more lazily, a lazy parallel
>> implementation of operations like limit is not trivial. We hope to improve
>> this in the future.
>>
>> On 12/5/2012 1:18 PM, Christian Mallwitz wrote:
>>>
>>> Hi,
>>>
>>> Using lambda-8-b67-linux-i586-03_dec_2012 I'm trying to compute n (not
>>> necessarily the first n) prime numbers:
>>>
>>> import java.util.*;
>>> import java.util.function.*;
>>> import java.util.stream.*;
>>>
>>> public class LambdaExample3 {
>>>
>>>     public static boolean isPrime(long n) {
>>>         if (n <= 1) { return false; }
>>>         if (n == 2) { return true; }
>>>         for (int i = 2; i <= Math.sqrt(n) + 1; i++) {
>>>             if (n % i == 0) return false;
>>>         }
>>>         return true;
>>>     }
>>>
>>>     public static void main(String[] args) {
>>>
>>>         Stream<Long> stream =
>>>             Streams.parallel(Streams.spliteratorUnknownSize(new Iterator<Long>() {
>>>                     private long n = 0;
>>>                     @Override public boolean hasNext() { return true; }
>>>                     @Override public Long next() { return ++n; }
>>>                 }),
>>>                 // fails with OutOfMemoryError
>>>                 // StreamOpFlag.toStreamFlags(StreamOpFlag.NOT_SIZED, StreamOpFlag.INITIAL_OPS_VALUE)
>>>                 // fails with OutOfMemoryError
>>>                 // StreamOpFlag.NOT_SIZED
>>>                 // works, but no speed-up to non-parallel version
>>>                 StreamOpFlag.INITIAL_OPS_VALUE
>>>             );
>>>
>>>         stream
>>>             .filter(LambdaExample3::isPrime)
>>>             .limit(300000)
>>>             .forEach(l -> { /*System.out.println(l);*/ });
>>>     }
>>> }
>>>
>>> I'm struggling with the StreamOpFlag parameter. What should I pick?
>>> INITIAL_OPS_VALUE seems to work but isn't running anything in parallel
>>> (at least it is not faster than the serial version). NOT_SIZED isn't
>>> working but failing miserably with an OutOfMemoryError. IS_PARALLEL
>>> is not needed because I already use parallel() - is specifying
>>> IS_PARALLEL and using Streams.stream() supposed to go parallel as
>>> well?
>>>
>>> Is the OutOfMemoryError caused by a bug? The OutOfMemoryError is
>>> reported from a fork/join pool thread so at least it is going
>>> parallel? Am I missing something?
>>>
>>> Thanks
>>> Christian
>>>
>>
>