Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams
Stuart Marks
stuart.marks at oracle.com
Sat Mar 16 01:04:05 UTC 2019
> I'm talking about the perf difference between stream.forEach and for(var element: stream). forEachRemaining may be slower because, for the VM, the ideal case is to see the creation of the Stream and the call to the terminal operation inside the same inlining horizon, so that the creation of the Stream itself can be elided.
>
> A bit of history: there have been several prototypes of how to implement the Stream API before the current one. One of them (I think it's the first one) was based on iterators and iterators of iterators, one for each step of the Stream. The perf of that implementation was good until there were too many intermediate operations on the Stream, at which point perf was really bad. That's because the VM has two ways to find the type of something in generic code: it can build a profile by remembering what class was used for a method call, or it can propagate the type of an argument to the type of the corresponding parameter. Because an iterator stores the element to return in a field, you lose the latter way to optimize, and the former only works if you have no more than 2 different classes in the profile.
> So while Stream.iterator() may be optimized, it's not that simple.
Yes, I remember this prototype. Sure, iterating from stream.iterator() will
likely be slower than stream.forEach(), because of (current) limitations in JIT
compilation. This may be important for performance-critical applications. So if
you have such an application, you should be aware of possible performance issues
using such an iterator(), measure, and recode if necessary.
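
To make the two shapes being compared concrete, here is a minimal sketch. The class and method names are made up for illustration, and since Stream is not Iterable today, the loop uses a cast of stream::iterator to stand in for the proposed for-loop support. It shows only the two code shapes and makes no claim about their relative speed; that has to be measured.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamLoopSketch {
    // Internal iteration: the pipeline pushes elements into the consumer.
    static long sumWithForEach(Stream<Integer> stream) {
        long[] total = {0};
        stream.forEach(x -> total[0] += x);
        return total[0];
    }

    // External iteration: the loop pulls elements through stream.iterator().
    // The cast adapts the one-shot stream to Iterable for the for statement.
    static long sumWithLoop(Stream<Integer> stream) {
        long total = 0;
        for (int x : (Iterable<Integer>) stream::iterator) {
            total += x;
        }
        return total;
    }

    // Hypothetical sample pipeline: the squares of 1..n, boxed.
    static Stream<Integer> squares(int n) {
        return IntStream.rangeClosed(1, n).map(i -> i * i).boxed();
    }

    public static void main(String[] args) {
        System.out.println(sumWithForEach(squares(1000)));
        System.out.println(sumWithLoop(squares(1000)));
    }
}
```

Both methods compute the same result; only the direction of control differs.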
Is this an argument not to allow Stream in a for-loop? I don't think so. There's
a (fairly narrow) set of use cases where it's really necessary, and in most
cases performance isn't an issue. After all, people use things like
List<Integer> which is known to be terrible for large, performance-critical
applications. But most apps are small and aren't performance critical, and for
those, it's just fine.
>>> This proposal has the side effect of making Stream more different from its
>>> primitive counterparts IntStream, LongStream and DoubleStream, which may be
>>> problematic because we are trying to introduce reified generics as part of
>>> Valhalla (there is a recent mail from Brian about not adding methods to
>>> OptionalInt for the same reason).
>>
>> Well, yes, I think that it means that Stream evolves somewhat independently of
>> Int/Long/DoubleStream, but I don't see that this imposes an impediment on
>> generic specialization in Valhalla. In that world, Stream<int> should (mostly)
>> just work. It may also be possible in a specialized world to add the specific
>> things from IntStream (such as sum() and max()) to Stream<int>.
>
> We may want more here, like having Stream<int> be a subtype of IntStream so there is only one implementation for IntStream and Stream<int>.
> Thus adding a method that makes IntStream and Stream<Object> different just makes this kind of retrofitting more unlikely.
I think the argument about specialization runs the other way, which is not to
add stuff to IntStream.
Adding IterableOnce to Stream shouldn't really affect anything with respect to
generic specialization. The type is already Stream<T>. The Iterable<T> methods
that are inherited (iterator, spliterator, forEach) all match existing methods
on Stream, at least structurally. So I don't see that this would cause a problem.
(Hm, I note that there is a slight semantic disagreement between
Iterable::forEach and Stream::forEach. Stream::forEach allows parallelism, which
isn't mentioned in Iterable::forEach. Somebody could conceivably call
Iterable::forEach with a consumer that's not thread-safe, and if a parallel
stream gets passed in, it would break that consumer. This strikes me as an edge
case to be filed off, rather than a fatal problem, though.)
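
A rough sketch of that edge case, under stated assumptions: asIterable is a hypothetical adapter standing in for "Stream implements IterableOnce" (it is not an existing JDK method), and threadsUsed is a made-up caller written against Iterable. The consumer here is deliberately thread-safe so the sketch itself is well-behaved; it only records which threads ran it.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.Spliterator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelForEachSketch {
    // Hypothetical adapter: each Iterable method delegates to the
    // same-named method that already exists on Stream.
    static <T> Iterable<T> asIterable(Stream<T> stream) {
        return new Iterable<T>() {
            @Override public Iterator<T> iterator() { return stream.iterator(); }
            @Override public Spliterator<T> spliterator() { return stream.spliterator(); }
            @Override public void forEach(Consumer<? super T> action) { stream.forEach(action); }
        };
    }

    // Code written against Iterable. It records the names of the threads
    // on which the consumer was invoked.
    static Set<String> threadsUsed(Iterable<Integer> source) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        source.forEach(x -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        // Sequential stream: the consumer runs on the calling thread only.
        System.out.println(threadsUsed(asIterable(IntStream.range(0, 10_000).boxed())));
        // Parallel stream: the consumer may run on several threads.
        System.out.println(threadsUsed(asIterable(IntStream.range(0, 10_000).boxed().parallel())));
    }
}
```

With a parallel stream the consumer may be invoked from more than one thread, so a caller that passed something like a plain ArrayList::add here could corrupt its list.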
>>> And, the real issue is how to deal with checked exceptions inside the Stream
>>> API. I would prefer to fix that issue instead of trying to find a way to
>>> work around it.
>>
>> Well I'd like to have a solution for checked exceptions as well, but there
>> doesn't appear to be one on the horizon. I mean, there are some ideas floating
>> around, but nobody is working on them as far as I know.
>
> As far as I know, there are two of them:
> - one is to get rid of checked exceptions; even Kotlin, which touts itself as a language that is safer than Java, doesn't have checked exceptions. Basically, Java is the only language that runs on the JVM and has checked exceptions.
> - the other is to automatically wrap checked exceptions into a corresponding unchecked exception, by letting the compiler generate the code that users currently write when a checked exception appears in some context,
>   for example with the keyword autowrap:
>   - you have the autowrap block (syntactically like a synchronized block)
>       autowrap {
>         return Files.newInputStream(path); // IOException is transformed to UncheckedIOException by calling IOException.wrap()
>       }
>   - you can use autowrap on a method declaration
>       InputStream foo(Path path) autowrap {
>         return Files.newInputStream(path); // IOException is transformed to UncheckedIOException by calling IOException.wrap()
>       }
>   - you can use autowrap with a functional interface
>       void runBlock(autowrap Runnable block) { ... }
>       ...
>       runBlock(() -> {
>         Files.newInputStream(path); // IOException is transformed to UncheckedIOException by calling IOException.wrap()
>       });
I can think of several other approaches but I don't want to discuss them here.
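
For reference, "the code that users currently write" looks roughly like the following sketch. parseRecord is a made-up method that declares IOException, and the class name is likewise hypothetical. The stream version has to wrap the exception by hand inside the lambda, whereas the loop version simply lets it propagate.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CheckedExceptionSketch {
    // Hypothetical operation that declares a checked exception.
    static int parseRecord(String s) throws IOException {
        if (s.isEmpty()) {
            throw new IOException("empty record");
        }
        return s.length();
    }

    // Stream form: Function.apply declares no checked exceptions, so the
    // lambda must catch IOException and rethrow it unchecked.
    static List<Integer> withStream(List<String> records) {
        return records.stream()
                .map(r -> {
                    try {
                        return parseRecord(r);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .collect(Collectors.toList());
    }

    // Loop form: the checked exception propagates via the throws clause.
    static List<Integer> withLoop(List<String> records) throws IOException {
        List<Integer> out = new ArrayList<>();
        for (String r : records) {
            out.add(parseRecord(r));
        }
        return out;
    }
}
```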
>> But checked exceptions aren't the only reason to prefer iteration in some cases;
>> loops offer more flexible control flow (break/continue) and easier handling of
>> side effects. The Streams+IterableOnce feature benefits these cases as well as
>> exception handling.
>
> The break/continue equivalents on Stream are skip/limit/findFirst/takeWhile/dropWhile, i.e. any short-circuit terminal operations.
Right: many, but not all loops with "break" can be rewritten to use streams with
a short-circuit terminal operation. But sometimes it's difficult, or you have to
contort the stream in a particular way in order to get the result you want. For
cases like those, sometimes it's just easier to write a loop.
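
As one illustration of such a case (a sketch only; the method and its budget rule are invented for this example), consider taking words until a running total would exceed a budget. In a loop this is a continue, a break, and a local variable. As a stream it would need something like takeWhile with a stateful predicate. The cast of words::iterator again stands in for the proposed for-loop support.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class BreakLoopSketch {
    // Collects non-empty words, in order, while their combined length
    // stays within the budget. Stops at the first word that would not fit.
    static List<String> takeWithinBudget(Stream<String> words, int budget) {
        List<String> taken = new ArrayList<>();
        int used = 0;
        for (String w : (Iterable<String>) words::iterator) {
            if (w.isEmpty()) {
                continue;               // skip blanks, keep going
            }
            if (used + w.length() > budget) {
                break;                  // stop at the first word that overflows
            }
            used += w.length();
            taken.add(w);
        }
        return taken;
    }

    public static void main(String[] args) {
        System.out.println(takeWithinBudget(Stream.of("ab", "", "cde", "fghij", "k"), 6));
    }
}
```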
s'marks