RFR: 8305734: BitSet.get(int, int) always returns the empty BitSet when the Integer.MAX VALUE is set

Stuart Marks smarks at openjdk.org
Sat Apr 22 05:25:43 UTC 2023


On Sat, 22 Apr 2023 01:26:08 GMT, Chen Liang <liach at openjdk.org> wrote:

>> @AlanBateman I would be happy to fill out the CSR, but unfortunately I don't think I am able to currently as I'm not an author on OpenJDK and as such don't have a JBS account. (I reported the bug through the Oracle site, rather than through bugs.openjdk.org .)
>> Would you be able to help out with this?
>
>> I would be happy to fill out the CSR, but unfortunately I don't think I am able to currently as I'm not an author on OpenJDK and as such don't have a JBS account. (I reported the bug through the Oracle site, rather than through bugs.openjdk.org .) Would you be able to help out with this?
> 
> You can provide text for the Summary, Problem, Solution, and Specification parts of the CSR in this GitHub pull request, and I can help create a CSR for you with your provided contents. See https://wiki.openjdk.org/display/csr/Fields+of+a+CSR+Request
> 
>  - The Summary describes your CSR in a few concise sentences, like "Reject interactions with Integer.MAX_VALUE bit in BitSet"
>  - The Problem is what prompts the CSR, for example, it may be `BitSet.length()` may return a negative value when the `Integer.MAX_VALUE` bit is set on a non-sticky-sized bitset, created either by setting `Integer.MAX_VALUE` bit or passing in very huge buffer or arrays in `valueOf` factories.
>  - The Solution is what resolves the problem described in this CSR, for example, it may be that we explicitly reject interacting with `Integer.MAX_VALUE` bit; and for array/buffer-initialized bitsets, we ignore bits higher than or equal to `Integer.MAX_VALUE `(or throw an exception if such an instance is attempted to be created), so we always have a nonnegative int length().
>    - I personally recommend silently ignoring `Integer.MAX_VALUE` bit and truncating everything else beyond than throwing, since currently, all bits beyond are inaccessible due to rejection of negative indices
>  - The Specification is usually the Javadoc changes that describes these behavior changes. For example, you may edit the specification of `length()` to indicate it returns nonnegative; edit the specification of `valueOf()` to indicate the bits higher than or equal to Integer.MAX_VALUE are ignored (or throw an exception to prevent such creations); and edit the specification of other methods `set` `get` etc. to reject `Integer.MAX_VALUE` as an inclusive upper-bound by throwing an `IndexOutOfRangeException`.
>    - The specification changes is usually rendered as a git patch, so you can just commit your code in this pr and I can generate a patch as specification for you.
> 
> Also, you need to describe the compatibility risk: It should be "low", as existing users that access `Integer.MAX_VALUE` bit will fail in all non-sticky-sized modes. The changes to `valueOf` behavior (either silently ignore `Integer.MAX_VALUE` bit or throwing) should be mentioned as well.

@liach Thanks for offering to help with the CSR.

Before we proceed with the CSR though, we need to decide what to do. I see the initial choice as: either (1) improve the behavior incrementally, by fixing get() for most bits other than the MAX_VALUE bit, and leaving other current misbehaviors as they are; or (2) pursue a comprehensive fix.

Note that (1) does not involve a spec change and thus doesn't require a CSR.

(2) will most likely change the spec and require a CSR. However, if we elect to pursue a comprehensive solution, there will need to be a design discussion. There are a large variety of approaches here. One is to reject setting of the MAX_VALUE bit along the lines of what @liach described above. A variation would be to have the rejection or noisy (i.e., throwing an exception). Still another possibility is to support setting and getting of all non-negative bits, with special-case behavior (e.g., returning MIN_VALUE in certain cases), and possibly also adding long-returning or -accepting APIs that provide sensible behaviors for the full range of bits.

There are also other possibilities, such as investigating sparse bitsets. That would be a big project, and would most likely involve investigating and doing something with external libraries such as [Roaring Bitmaps](https://roaringbitmap.org/). While interesting, I'd say that's out of scope for what started out as a simple bugfix.

What do you think?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13388#issuecomment-1518515543


More information about the core-libs-dev mailing list