Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams

forax at univ-mlv.fr
Mon Mar 18 23:38:24 UTC 2019


----- Original Message -----
> From: "Stuart Marks" <stuart.marks at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Sent: Saturday, March 16, 2019 02:04:05
> Subject: Re: Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams

>> I'm talking about the perf difference between stream.forEach and for(var
>> element: stream), forEachRemaining may be slower because for the VM the ideal
>> case is to see the creation of the Stream and the call to the terminal
>> operation inside the same inlining horizon so the creation of the Stream itself
>> can be elided.
>> 
>> A bit of history: there have been several prototypes of how to implement the
>> stream API before the current one. One of them (I think it was the first one)
>> was based on iterators and iterators of iterators, one for each step of the
>> Stream. The perf of that implementation was good until there were too many
>> intermediate operations on the Stream, and at that point perf was really bad.
>> That's because the VM has two ways to find the type of something in generic
>> code: it can build a profile by remembering which class was used for a method
>> call, or it can propagate the type of an argument to the type of the
>> corresponding parameter. Because an iterator stores the element to return in a
>> field, you lose the latter way to optimize, and the former only works if you
>> have no more than 2 different classes in the profile.
>> So while Stream.iterator() may be optimized, it's not that simple.
> 
> Yes, I remember this prototype. Sure, iterating from stream.iterator() will
> likely be slower than stream.forEach(), because of (current) limitations in JIT
> compilation. This may be important for performance-critical applications. So if
> you have such an application, you should be aware of possible performance issues
> using such an iterator(), measure, and recode if necessary.
> 
> Is this an argument not to allow Stream in a for-loop? I don't think so. There's
> a (fairly narrow) set of use cases where it's really necessary, and in most
> cases performance isn't an issue. After all, people use things like
> List<Integer> which is known to be terrible for large, performance-critical
> applications. But most apps are small and aren't performance critical, and for
> those, it's just fine.

I suppose that if you do a presentation at Devoxx on that subject, and if IDEs recommend using Stream.forEach instead of the enhanced for loop when possible, then it will be fine.
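
For concreteness, here is a minimal sketch of the two shapes being compared. The enhanced for loop over a Stream is not legal today, so the sketch assumes the usual workaround of adapting stream::iterator to an Iterable. The class and method names (StreamLoopSketch, viaForEach, viaForLoop) are made up for illustration, and nothing here measures performance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StreamLoopSketch {
    // Terminal operation: the stream pipeline drives the traversal itself.
    static List<String> viaForEach(Stream<String> stream) {
        List<String> out = new ArrayList<>();
        stream.forEach(out::add);
        return out;
    }

    // Enhanced for loop over a one-shot Iterable view of the stream.
    // The Iterable can only be traversed once, because a stream can
    // only be consumed once.
    static List<String> viaForLoop(Stream<String> stream) {
        List<String> out = new ArrayList<>();
        Iterable<String> once = stream::iterator;
        for (String s : once) {
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(viaForEach(Stream.of("a", "b", "c")));
        System.out.println(viaForLoop(Stream.of("a", "b", "c")));
    }
}
```

Both methods produce the same list; the difference discussed above is only in how well the JIT can optimize each shape.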

> 
>>>> This proposal has the side effect of making Stream more different from its
>>>> primitive counterparts IntStream, LongStream and DoubleStream, which may be
>>>> problematic because we are trying to introduce reified generics as part of
>>>> Valhalla (there is a recent mail from Brian about not adding methods to
>>>> OptionalInt for the same reason).
>>>
>>> Well, yes, I think that it means that Stream evolves somewhat independently of
>>> Int/Long/DoubleStream, but I don't see that this imposes an impediment on
>>> generic specialization in Valhalla. In that world, Stream<int> should (mostly)
>>> just work. It may also be possible in a specialized world to add the specific
>>> things from IntStream (such as sum() and max()) to Stream<int>.
>> 
>> We may want more here, like having Stream<int> be a subtype of IntStream so
>> there is only one implementation for IntStream and Stream<int>.
>> Thus adding a method that makes IntStream and Stream<Object> different just
>> makes this kind of retrofitting more unlikely.
> 
> I think the argument about specialization runs the other way, which is not to
> add stuff to IntStream.
> 
> Adding IterableOnce to Stream shouldn't really affect anything with respect to
> generic specialization. The type is already Stream<T>. The Iterable<T> methods
> that are inherited (iterator, spliterator, forEach) all match existing methods
> on Stream, at least structurally. So I don't see that this would cause a
> problem.
> 
> (Hm, I note that there is a slight semantic disagreement between
> Iterable::forEach and Stream::forEach. Stream::forEach allows parallelism, which
> isn't mentioned in Iterable::forEach. Somebody could conceivably call
> Iterable::forEach with a consumer that's not thread-safe, and if a parallel
> stream gets passed in, it would break that consumer. This strikes me as an edge
> case to be filed off, rather than a fatal problem, though.)

My fear is more that in the future we will want a single code base for all kinds of Stream, but Stream<Object> will have to implement Iterable<Object> while Stream<int> will not. This is not something you can do with the current generics. We may be able to do it with reified generics, but some languages that already have reified generics, like Swift, are not able to do it.
So by giving Stream<T> a different set of supertypes than Stream<int>, you are forcing the future reified generics implementation to handle this case, because we will never introduce an implementation of reified generics that doesn't support the classes of java.util.


> 
> 
>>>> And, the real issue is how to deal with checked exceptions inside the Stream
>>>> API, i would prefer to fix that issue instead of trying to find a way to
>>>> workaround it.
>>>
>>> Well I'd like to have a solution for checked exceptions as well, but there
>>> doesn't appear to be one on the horizon. I mean, there are some ideas floating
>>> around, but nobody is working on them as far as I know.
>> 
>> as far as I know, there are two of them:
>> - one is to get rid of checked exceptions; even Kotlin, which touts itself as a
>> language that is safer than Java, doesn't have checked exceptions. Basically,
>> Java is the only language that runs on the JVM and has checked exceptions.
>> - the other is to automatically wrap checked exceptions into a corresponding
>> unchecked exception by letting the compiler generate the code that users
>> currently write when the checked exception appears in some context,
>>    for example with the keyword autowrap:
>>    - you have the autowrap block (syntactically like a synchronized block)
>>        autowrap {
>>          // IOException is transformed to UncheckedIOException
>>          // by calling IOException.wrap()
>>          return Files.newInputStream(path);
>>        }
>>    - you can use autowrap on a method declaration
>>       InputStream foo(Path path) autowrap {
>>         // IOException is transformed to UncheckedIOException
>>         // by calling IOException.wrap()
>>         return Files.newInputStream(path);
>>       }
>>    - you can use autowrap with a functional interface
>>       void runBlock(autowrap Consumer<String> consumer) { ... }
>>       ...
>>       runBlock(s -> {
>>         // IOException is transformed to UncheckedIOException
>>         // by calling IOException.wrap()
>>         Files.newInputStream(path);
>>       });
> 
> I can think of several other approaches but I don't want to discuss them here.

The thing is that we have already established that we only want the enhanced for loop in a small number of use cases. If we get rid of checked exceptions one way or another, we may end up with a really small number of use cases for the enhanced for loop over a stream.
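
To make the checked exception use case concrete, here is a small sketch of what users write by hand today, next to the loop form. Everything in it is illustrative: load() is a hypothetical operation that throws IOException, the class and method names are invented, and the loop relies on the stream::iterator adapter because the enhanced for loop over a Stream does not exist yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CheckedExceptionSketch {
    // Hypothetical operation that can fail with a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "content of " + name;
    }

    // Stream form: the lambda cannot throw IOException, so it is
    // wrapped by hand into UncheckedIOException.
    static List<String> loadAllWithStream(List<String> names) {
        return names.stream()
            .map(name -> {
                try {
                    return load(name);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            })
            .collect(Collectors.toList());
    }

    // Loop form: the checked exception simply propagates to the caller.
    static List<String> loadAllWithLoop(Stream<String> names) throws IOException {
        List<String> out = new ArrayList<>();
        Iterable<String> once = names::iterator;
        for (String name : once) {
            out.add(load(name));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadAllWithStream(List.of("a", "b")));
        System.out.println(loadAllWithLoop(Stream.of("a", "b")));
    }
}
```

If the wrapping in loadAllWithStream were generated by the compiler, as in the autowrap idea, the remaining reason to prefer the loop form here would mostly go away.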

> 
>>> But checked exceptions aren't the only reason to prefer iteration in some cases;
>>> loops offer more flexible control flow (break/continue) and easier handling of
>>> side effects. The Streams+IterableOnce feature benefits these cases as well as
>>> exception handling.
>> 
>> the break/continue equivalents on Stream are
>> skip/limit/findFirst/takeWhile/dropWhile, i.e. the slicing and
>> short-circuiting operations.
> 
> Right: many, but not all loops with "break" can be rewritten to use streams with
> a short-circuit terminal operation. But sometimes it's difficult, or you have to
> contort the stream in a particular way in order to get the result you want. For
> cases like those, sometimes it's just easier to write a loop.

fair enough.
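
As an illustration of the easy case, here is a sketch of a loop with break and its stream rewrite using takeWhile (Java 9+). The class and method names are invented for the example, and it only covers the simple situation where the break condition depends on the current element alone.

```java
import java.util.List;

class BreakVsTakeWhileSketch {
    // Loop form: stop at the first negative value.
    static int sumUntilNegativeLoop(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            if (v < 0) {
                break;
            }
            sum += v;
        }
        return sum;
    }

    // Stream form: takeWhile cuts the stream at the first negative value.
    static int sumUntilNegativeStream(List<Integer> values) {
        return values.stream()
            .takeWhile(v -> v >= 0)
            .mapToInt(Integer::intValue)
            .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, -1, 5);
        System.out.println(sumUntilNegativeLoop(values));
        System.out.println(sumUntilNegativeStream(values));
    }
}
```

When the exit condition depends on state accumulated across iterations, the stream rewrite gets contorted, which is the case Stuart is describing.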

So to summarize, I'm not against making the enhanced for loop work with streams
- if Brian thinks it's OK for Stream and IntStream to have different supertypes
- if people are OK with the enhanced for loop possibly being slower than a call to stream.forEach
- if the semantics of Iterable are fixed to allow the iterator to be used only once, instead of introducing IterableOnce.

> 
> s'marks

regards,
Rémi

