RFR: JDK-8277095 : Empty streams create too many objects

Philippe Marschall duke at openjdk.java.net
Mon Nov 15 18:11:41 UTC 2021


On Fri, 5 Nov 2021 12:53:46 GMT, kabutz <duke at openjdk.java.net> wrote:

> This is a draft proposal for how we could improve stream performance for the case where the streams are empty. Empty collections are commonplace. If we iterate over them with an Iterator, we would have to create one small Iterator object (which could often be eliminated) and if it is empty we are done. However, with Streams we first have to build up the entire pipeline, until we realize that there is no work to do. With this example, we change Collection#stream() to first check if the collection is empty, and if it is, we simply return an EmptyStream. We also have EmptyIntStream, EmptyLongStream and EmptyDoubleStream. We have taken great care for these to have the same characteristics and behaviour as the streams returned by Stream.empty(), IntStream.empty(), etc.
> 
> Some of the JDK tests fail with this, due to ClassCastExceptions (our EmptyStream is not an AbstractPipeline) and AssertionError, since we can call some methods repeatedly on the stream without it failing. On the plus side, creating a complex stream on an empty stream gives us upwards of 50x increase in performance due to a much smaller object allocation rate. This PR includes the code for the change, unit tests and also a JMH benchmark to demonstrate the improvement.

I have a similar project at [empty-streams](https://github.com/marschall/empty-streams). A couple of notes:

1. I found that the streams need to be stateful. I needed the following state:
   1. closed
   2. ordered
   3. parallel
   4. sorted
   5. closeHandler
   6. comparator (on EmptyStream)
   A shared instance cannot be used because of `#close`.
2. I have a `PrimitiveIterator` that short-circuits `#next` and `#forEachRemaining` as well.
3. I made many methods just return `this` after checking whether the stream has already been operated on or closed:
    1. `#filter`, `#map`, `#flatMap`, `#mapMulti`, `#distinct`, `#peek`, `#limit`, `#skip`, `#dropWhile`, `#takeWhile`.
    2. These do a state change as well: `#sequential`, `#parallel`, `#unordered`, `#sorted`, `#onClose`.
4. I made my `EmptyBaseStream` implement `BaseStream` and made `EmptyIntStream`, `EmptyLongStream` and `EmptyDoubleStream` extend this class, since `IntStream`, `LongStream` and `DoubleStream` all extend `BaseStream`. This allowed me to move the following methods up in the hierarchy: `#isParallel`, `#onClose`, `#sequential`, `#parallel`, `#unordered`.
5. Is there any reason why you make `#parallel` "bail out" to `StreamSupport` rather than just doing a state change?
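
To make points 1 and 3 more concrete, here is a minimal sketch of the idea. It is not the code from either project: `EmptyStreamSketch` is a hypothetical, heavily reduced class that does not implement the full `Stream` interface, and the `consumed` flag and the `checkUsable` and `comparator()` helpers are my own assumptions for illustration. It only shows why per-instance state is needed and how intermediate operations can return `this`.

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.function.Predicate;

/** Hypothetical, reduced stand-in for an empty stream; illustration only. */
final class EmptyStreamSketch<T> implements AutoCloseable {
    private boolean closed;
    private boolean consumed;                 // assumption: set by terminal operations
    private boolean parallel;
    private boolean sorted;
    private Comparator<? super T> comparator; // null means natural ordering
    private Runnable closeHandler;            // null means no handler registered

    private void checkUsable() {
        if (closed || consumed) {
            throw new IllegalStateException("stream has already been operated upon or closed");
        }
    }

    // No elements to filter, so no new pipeline stage is required.
    EmptyStreamSketch<T> filter(Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate);
        checkUsable();
        return this;
    }

    // The following operations only record a state change and return this.
    EmptyStreamSketch<T> parallel() {
        checkUsable();
        parallel = true;
        return this;
    }

    EmptyStreamSketch<T> sequential() {
        checkUsable();
        parallel = false;
        return this;
    }

    EmptyStreamSketch<T> sorted(Comparator<? super T> comparator) {
        checkUsable();
        this.sorted = true;
        this.comparator = comparator;
        return this;
    }

    EmptyStreamSketch<T> onClose(Runnable handler) {
        Objects.requireNonNull(handler);
        checkUsable();
        Runnable previous = closeHandler;
        // Run handlers in registration order; exception handling is omitted here.
        closeHandler = (previous == null) ? handler : () -> { previous.run(); handler.run(); };
        return this;
    }

    boolean isParallel() { return parallel; }
    boolean isSorted() { return sorted; }
    Comparator<? super T> comparator() { return comparator; }

    // Terminal operation: the answer is known without traversing anything.
    long count() {
        checkUsable();
        consumed = true;
        return 0L;
    }

    // Close handlers run at most once, which is why one shared instance cannot work.
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        if (closeHandler != null) {
            closeHandler.run();
        }
    }
}
```

With this shape, `new EmptyStreamSketch<String>().filter(s -> true).parallel().count()` allocates a single object, while `#close` and `#onClose` still behave per instance.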

-------------

PR: https://git.openjdk.java.net/jdk/pull/6275

