RFR: 8300487: Store cardinality as a field in BitSet [v5]
Uwe Schindler
uschindler at openjdk.org
Wed Jan 18 14:18:23 UTC 2023
On Wed, 18 Jan 2023 12:43:31 GMT, fabioromano1 <duke at openjdk.org> wrote:
>> The enanchment is useful for applications that make heavy use of BitSet objects as sets of integers, and therefore they need to make a lot of calls to cardinality() method, which actually require linear time in the number of words in use by the bit set.
>> This optimization reduces the cost of calling cardinality() to constant time, as it simply returns the value of the field, and it also try to make as little effort as possible to update the field, when needed.
>>
>> Moreover, it has been implemented a new method for testing wheter a bit set includes another bit set (i.e., the set of true bits of the parameter is a subset of the true bits of the instance).
>
> fabioromano1 has updated the pull request incrementally with one additional commit since the last revision:
>
> Added author and reverse the cicle order in includes(BitSet)
As one dealing with bitsets very often: In my opinion, adding this to bitset adds too much overhead on the hot methods like set/clear methods. Especially loops that populate a BitSet with values, I am not sure if the whole loop can be vectorized by Hotspot after this patch.
>From my experience with Apache Lucene it has been shown, that you use bitsets often to track existence/not existence in very hot loops. Calling cardinality() is done only after you have build the bitset and before you use it in a read-only way.
I know there are other usages patterns, but also code using BitSet as a compact `Set<Integer>` does not need cardinality information.
I also have a comment about the code: There is an assert-only method called `checkInvariants()` that is executed after all bulk operations. This one should also check the cardinality fits the current words. This is especially important after those crazy methods that intersect bitsets and have special handling of first/last word.
-------------
PR: https://git.openjdk.org/jdk/pull/11837
More information about the core-libs-dev
mailing list