RFR: 8280915: Better parallelization for AbstractSpliterator and IteratorSpliterator when size is unknown [v4]
Tagir F. Valeev
tvaleev at openjdk.java.net
Thu Feb 10 03:22:42 UTC 2022
> See the bug description for details.
>
> I propose a simple solution. Let's allow ArraySpliterator to be non-SIZED and report an artificial estimateSize(), much bigger than the real one. This will allow AbstractSpliterator and IteratorSpliterator to produce a prefix whose estimated size is comparable to Long.MAX_VALUE (say, starting with Long.MAX_VALUE/2), and this will enable further splitting of the prefix. This change will drastically improve parallel streaming for affected streams of size <= 1024 and significantly improve it for streams of size 1025..20000. The cost is higher-grained splitting for huge streams of unknown size. This might add a minor overhead for such scenarios, which, I believe, is completely tolerable.
>
> No public API changes are necessary, and sequential processing should not be affected, except for an extra field in ArraySpliterator, which increases its footprint by 8 bytes.
>
> I added a simple test using an artificial collector to ensure that at least two non-empty parts are created when parallelizing a Stream.iterate source. More testing ideas are welcome.
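
The "artificial collector" idea described above could be sketched roughly as follows. This is not the test from the pull request: the class name SplitPartsDemo and the helpers partSizes(), partSizesOfIterate() and nonEmpty() are hypothetical names chosen for illustration, and the sketch assumes Java 9 or later for the three-argument Stream.iterate. Each container the stream creates records how many elements it received, so the result shows how many non-empty parts the parallel pipeline produced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class SplitPartsDemo {
    // Hypothetical helper (not from the patch): collects a list of part sizes.
    // Every container created by the supplier starts as a single part of size 0.
    public static <T> Collector<T, List<Integer>, List<Integer>> partSizes() {
        return Collector.of(
            () -> {
                List<Integer> parts = new ArrayList<>();
                parts.add(0);
                return parts;
            },
            // Count the element in the most recent part of this container.
            (parts, t) -> parts.set(parts.size() - 1, parts.get(parts.size() - 1) + 1),
            // Merging keeps the parts separate instead of adding them up.
            (left, right) -> {
                left.addAll(right);
                return left;
            });
    }

    // A finite Stream.iterate source: its size is unknown to the spliterator.
    public static List<Integer> partSizesOfIterate(int n) {
        return Stream.iterate(0, i -> i < n, i -> i + 1)
                     .parallel()
                     .collect(partSizes());
    }

    // Number of parts that actually received at least one element.
    public static long nonEmpty(List<Integer> sizes) {
        return sizes.stream().filter(s -> s > 0).count();
    }

    public static void main(String[] args) {
        List<Integer> sizes = partSizesOfIterate(100);
        System.out.println("part sizes: " + sizes);
        System.out.println("non-empty parts: " + nonEmpty(sizes));
    }
}
```

The number of non-empty parts printed depends on the JDK build and the common pool parallelism, so no expected output is given here. The check described in the quoted text corresponds to requiring nonEmpty(sizes) >= 2.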
Tagir F. Valeev has updated the pull request incrementally with one additional commit since the last revision:
Benchmark to demonstrate the patch usefulness
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/7279/files
- new: https://git.openjdk.java.net/jdk/pull/7279/files/726e73e3..fbe8a704
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7279&range=03
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7279&range=02-03
Stats: 80 lines in 1 file changed: 80 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/7279.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7279/head:pull/7279
PR: https://git.openjdk.java.net/jdk/pull/7279