8072784: Better spliterator implementation for BitSet.stream()

Mon Nov 21 13:59:03 UTC 2016

Hello!

Is it necessary to report SIZED characteristic and calculate the
cardinality in advance making two-pass traversal? This improves
bitSet.stream().toArray() performance (also bitSet.stream().count(),
but it's rare case in practice). However this is just waste of time
for any other stream operation. It's possible to remove SIZED
characteristic using "wordsInUse << ADDRESS_BITS_PER_WORD" as initial
size estimation (or better getFence() result). This would make
implementation simpler and a little bit faster for all scenarios
excluding toArray().

What do you think?

With best regards,
Tagir Valeev

PS> Hi,

PS> Please review this patch implementing a spliterator for BitSet:

PS>  
PS> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8072784-bitset-stream-spliterator/webrev/

PS> The spliterator is SIZED but not SUBSIZED, the bit set’s
PS> cardinality is used (same for the iterator) as the size of the
PS> root spliterator, and estimates from that are derived thereafter
PS> for splits. Splitting attempts to balance the small vs. large, dense vs. sparse cases.

PS> Most of the complexity is dealing with the edge case of there
PS> being a bit set at Integer.MAX_VALUE.

PS> Testing-wise i have leveraged the existing tests. It would be
PS> nice to consider placing the spliterator testing functionality
PS> into a separate library for reuse as
PS> SpliteratorTraversingAndSplittingTest is getting large. Likewise
PS> for more formally for streams, which is possible to reuse but a
PS> little clunky. I can log issues for those.

PS> Paul.