8072784: Better spliterator implementation for BitSet.stream()
Paul Sandoz
paul.sandoz at oracle.com
Mon Nov 21 16:36:47 UTC 2016
Hi Tagir,
In the original issue i was pondering using SIZED for smaller bit sets and non-SIZED for larger bit sets, since we stride over the longs themselves when calculating the size (should be intrinsic to count the set bits, at least on x86). Supporting both is fairly simple, but you are correct not reporting SIZED would simplify things further and it may not matter in practice.
Can you log an issue? I will follow up with some performance analysis.
Thanks,
Paul.
> On 21 Nov 2016, at 05:59, Tagir F. Valeev <amaembo at gmail.com> wrote:
>
> Hello!
>
> Is it necessary to report SIZED characteristic and calculate the
> cardinality in advance making two-pass traversal? This improves
> bitSet.stream().toArray() performance (also bitSet.stream().count(),
> but it's rare case in practice). However this is just waste of time
> for any other stream operation. It's possible to remove SIZED
> characteristic using "wordsInUse << ADDRESS_BITS_PER_WORD" as initial
> size estimation (or better getFence() result). This would make
> implementation simpler and a little bit faster for all scenarios
> excluding toArray().
>
> What do you think?
>
> With best regards,
> Tagir Valeev
>
>
> PS> Hi,
>
> PS> Please review this patch implementing a spliterator for BitSet:
>
> PS>
> PS> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8072784-bitset-stream-spliterator/webrev/
>
> PS> The spliterator is SIZED but not SUBSIZED, the bit set’s
> PS> cardinality is used (same for the iterator) as the size of the
> PS> root spliterator, and estimates from that are derived thereafter
> PS> for splits. Splitting attempts to balance the small vs. large, dense vs. sparse cases.
>
> PS> Most of the complexity is dealing with the edge case of there
> PS> being a bit set at Integer.MAX_VALUE.
>
>
> PS> Testing-wise i have leveraged the existing tests. It would be
> PS> nice to consider placing the spliterator testing functionality
> PS> into a separate library for reuse as
> PS> SpliteratorTraversingAndSplittingTest is getting large. Likewise
> PS> for more formally for streams, which is possible to reuse but a
> PS> little clunky. I can log issues for those.
>
> PS> Paul.
>
More information about the core-libs-dev
mailing list