RFR 8170159 Improve the performance of BitSet traversal

Martin Buchholz martinrb at google.com
Fri Jun 15 14:57:04 UTC 2018


Code looks correct to me, but as usual there are some things I would try to
do differently,

1292                         while (word != 0 && tz < 63) {


I'd try to make this loop simply
while (word != 0)

---

Probably neither of us is fond of the bug magnet special case for
Integer.MAX_VALUE.  Maybe just store fence as a long or as a wordindex
which can't overflow (while giving up on intra-word splitting?)

Here's a fun program demonstrating cardinality overflow:

import java.util.BitSet;

public class BitSetCardinality {
    public static void main(String[] args) throws Throwable {
        BitSet s = new BitSet();
        s.flip(0, Integer.MAX_VALUE);
        System.out.println(s.cardinality());
        s.flip(Integer.MAX_VALUE);
        System.out.println(s.cardinality());
    }
}


==> java -Xmx20g -esa -ea BitSetCardinality
2147483647
-2147483648


On Thu, Jun 14, 2018 at 2:35 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:

> Hi,
>
> Please review this enhancement to improve the performance of BitSet
> traversal when using a stream.
>
>   http://cr.openjdk.java.net/~psandoz/jdk/JDK-8170159-
> bitset-traverse/webrev/ <http://cr.openjdk.java.net/~
> psandoz/jdk/JDK-8170159-bitset-traverse/webrev/>
>
> The associated issue started out life referring to the BitSet’s
> spliterator reporting SIZED, and to report that the cardinality has to be
> computed. This has some cost but performance analysis (see issue for
> attached benchmark) has shown that the cost is small compared to the cost
> of traversal. It is recognized that the cost is higher for smaller BitSets
> but not unduly so. Thus it was concluded that it was reasonable for the
> spliterator to report SIZED.
>
> The issue was adjusted to focus on improving the performance of the
> BitSet’s spliterator forEachRemaining method. For large randomized BitSets
> a 1.6x speedup can be observed, and for smaller sizes a more modest speed
> up. The prior conclusion about reporting SIZED still holds.
>
> Thanks,
> Paul.


More information about the core-libs-dev mailing list