Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams

Thu Mar 14 08:07:22 UTC 2019

----- Original Message -----
> From: "Stuart Marks" <stuart.marks at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Sent: Tuesday, March 12, 2019 22:45:12
> Subject: Re: Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams

> Hi Remi,
> 
>> Stream.iterator() can be really, really slow: it uses pull semantics while
>> the rest of the Stream pushes values. When designing it, the lambda EG saw it
>> as an "escape hatch" for interoperating with legacy code that requires an
>> Iterator, and nothing more.
> 
> If Stream.iterator() is slow, then perhaps it needs to be optimized. Tagir had
> some ideas for how to do that. Of course, I don't know if that's exactly the
> right way; some additional investigation should be done. But poor performance
> relative to Spliterator.tryAdvance() or forEachRemaining() shouldn't be an
> argument against doing this. People repeatedly bump into the gap in the
> programming model between streams and the enhanced-for loop, and it's time to
> fill it in.

I'm talking about the perf difference between stream.forEach and for(var element: stream). forEachRemaining may be slower, because for the VM the ideal case is to see the creation of the Stream and the call to the terminal operation inside the same inlining horizon, so that the creation of the Stream itself can be elided.
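
To make the two forms being compared concrete, here is a minimal sketch in today's Java. The class and method names are made up for illustration, and the cast of stream::iterator to Iterable only stands in for the proposed for(var element: stream) syntax, which is not part of the language.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; the names are illustrative only.
final class ForEachVsIterator {

    // Push style: the stream drives the traversal and calls the lambda.
    static List<String> viaForEach(List<String> source) {
        List<String> out = new ArrayList<>();
        source.stream().map(String::toUpperCase).forEach(out::add);
        return out;
    }

    // Pull style: the loop drives the traversal through Stream.iterator().
    static List<String> viaIterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Stream<String> stream = source.stream().map(String::toUpperCase);
        // The cast turns the method reference into a one-shot Iterable.
        for (String element : (Iterable<String>) stream::iterator) {
            out.add(element);
        }
        return out;
    }
}
```

Both methods produce the same list; the difference discussed here is only in how well the VM can optimize each shape.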

A bit of history: there have been several prototypes of the stream API before the current one. One of them (I think it was the first) was based on iterators and iterators of iterators, one for each step of the Stream. The perf of that implementation was good until there were too many intermediate operations on the Stream, at which point it became really bad. That's because the VM has two ways to find the type of something in generic code: it can build a profile by remembering which classes were seen at a method call, or it can propagate the type of an argument to the type of the corresponding parameter. Because an iterator stores the element to return in a field, you lose the latter way to optimize, and the former only works if you have no more than 2 different classes in the profile.
So while Stream.iterator() may be optimized, it's not that simple.
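
As a rough illustration of the field-versus-argument point, here is a sketch of a single filter stage written both ways. This is not the code of the old prototype nor of the JDK; PullVsPush, FilterIterator and filterSink are hypothetical names chosen for this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch; not the actual prototype or JDK implementation.
final class PullVsPush {

    // Pull: the element is parked in a field between hasNext() and next(),
    // so its type cannot be propagated as an argument through the chain.
    static final class FilterIterator<T> implements Iterator<T> {
        private final Iterator<T> upstream;
        private final Predicate<? super T> predicate;
        private T buffered;
        private boolean hasBuffered;

        FilterIterator(Iterator<T> upstream, Predicate<? super T> predicate) {
            this.upstream = upstream;
            this.predicate = predicate;
        }

        @Override
        public boolean hasNext() {
            while (!hasBuffered && upstream.hasNext()) {
                T candidate = upstream.next();
                if (predicate.test(candidate)) {
                    buffered = candidate;
                    hasBuffered = true;
                }
            }
            return hasBuffered;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            hasBuffered = false;
            T result = buffered;
            buffered = null;
            return result;
        }
    }

    // Push: the element travels as a method argument from stage to stage.
    static <T> Consumer<T> filterSink(Predicate<? super T> predicate,
                                      Consumer<? super T> downstream) {
        return element -> {
            if (predicate.test(element)) {
                downstream.accept(element);
            }
        };
    }
}
```

Stacking several FilterIterator-like stages gives the "iterators of iterators" shape, where every stage hands its element over through a field.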

> 
>> This proposal has the side effect of making Stream more different from its
>> primitive counterparts IntStream, LongStream and DoubleStream, which may be
>> problematic because we are trying to introduce reified generics as part of
>> Valhalla (there is a recent mail from Brian about not adding methods to
>> OptionalInt for the same reason).
> 
> Well, yes, I think that it means that Stream evolves somewhat independently of
> Int/Long/DoubleStream, but I don't see that this imposes an impediment on
> generic specialization in Valhalla. In that world, Stream<int> should (mostly)
> just work. It may also be possible in a specialized world to add the specific
> things from IntStream (such as sum() and max()) to Stream<int>.

We may want more here, like having Stream<int> be a subtype of IntStream so that there is only one implementation for IntStream and Stream<int>.
Thus adding a method that makes IntStream and Stream<Object> different just makes this kind of retrofitting less likely.
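
For context, this is how the two hierarchies already differ today; a small sketch with a hypothetical class name, showing that sum() exists on IntStream while a Stream<Integer> has to be converted first. It says nothing about how Valhalla would actually unify them.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class illustrating today's API, not a Valhalla design.
final class IntStreamGap {

    // IntStream has sum() directly.
    static int sumPrimitive() {
        return IntStream.rangeClosed(1, 4).sum();
    }

    // Stream<Integer> has no sum(); it must go through mapToInt first.
    static int sumBoxed() {
        return Stream.of(1, 2, 3, 4).mapToInt(Integer::intValue).sum();
    }
}
```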

> 
>> And the real issue is how to deal with checked exceptions inside the Stream
>> API; I would prefer to fix that issue instead of trying to find a way to
>> work around it.
> 
> Well I'd like to have a solution for checked exceptions as well, but there
> doesn't appear to be one on the horizon. I mean, there are some ideas floating
> around, but nobody is working on them as far as I know.

As far as I know, there are two of them:
- one is to get rid of checked exceptions. Even Kotlin, which touts itself as a language that is safer than Java, doesn't have checked exceptions; basically, Java is the only language that runs on the JVM and has checked exceptions.
- the other is to automatically wrap checked exceptions into a corresponding unchecked exception, by letting the compiler generate the code that users currently write when a checked exception appears in some context.
  For example, with a keyword autowrap,
  - you have the autowrap block (syntactically like a synchronized block)
      autowrap {
        return Files.newInputStream(path);   // IOException is transformed to UncheckedIOException by calling IOException.wrap()
      }
  - you can use autowrap on a method declaration
     InputStream foo(Path path) autowrap {
       return Files.newInputStream(path);   // IOException is transformed to UncheckedIOException by calling IOException.wrap()
     }
  - you can use autowrap with a functional interface
     void runBlock(autowrap Runnable block) { ... }
     ...
     runBlock(() -> {
       Files.newInputStream(path);         // IOException is transformed to UncheckedIOException by calling IOException.wrap()
     });
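
For comparison, this is roughly the code that users write by hand today and that the compiler would generate under such a scheme. It is a sketch in current Java: ManualWrap and open are hypothetical names, and the hypothetical IOException.wrap() is replaced here by the existing UncheckedIOException constructor.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper showing the manual wrapping idiom in today's Java.
final class ManualWrap {

    // Opens the file and rethrows any IOException as UncheckedIOException.
    static InputStream open(Path path) {
        try {
            return Files.newInputStream(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```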

> 
> But checked exceptions aren't the only reason to prefer iteration in some cases;
> loops offer more flexible control flow (break/continue) and easier handling of
> side effects. The Streams+IterableOnce feature benefits these cases as well as
> exception handling.

The equivalents of break/continue on a Stream are skip/limit/findFirst/takeWhile/dropWhile, i.e. the short-circuiting operations (intermediate or terminal) and their skipping counterparts.
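
A small sketch of that correspondence, with a hypothetical class name: the same traversal is written once as a loop with break/continue and once as a stream. Note that filter, used here in the role of continue, is an addition for this example and is not in the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
final class BreakContinue {

    // Loop version: stop at the first zero, skip negative values.
    static List<Integer> withLoop(List<Integer> values) {
        List<Integer> out = new ArrayList<>();
        for (int v : values) {
            if (v == 0) {
                break;
            }
            if (v < 0) {
                continue;
            }
            out.add(v);
        }
        return out;
    }

    // Stream version of the same traversal.
    static List<Integer> withStream(List<Integer> values) {
        return values.stream()
                .takeWhile(v -> v != 0)   // plays the role of break
                .filter(v -> v >= 0)      // plays the role of continue
                .collect(Collectors.toList());
    }
}
```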

> 
> s'marks

Rémi