RFR: 8305734: BitSet.get(int, int) always returns the empty BitSet when the Integer.MAX_VALUE is set

Stuart Marks smarks at openjdk.org
Tue May 9 00:15:16 UTC 2023


On Fri, 7 Apr 2023 12:22:03 GMT, Andy-Tatman <duke at openjdk.org> wrote:

> See https://bugs.java.com/bugdatabase/view_bug?bug_id=8305734

Re disallowing bit Integer.MAX_VALUE: I have a feeling we can do better than this.

The first paragraph of the spec states: "The bits of a BitSet are indexed by nonnegative integers." That is, valid indexes are in the range 0 to Integer.MAX_VALUE, inclusive. There are clearly some issues when bits at or near MAX_VALUE are set, but certain operations work perfectly fine. For example, setting and getting individual bits works for the full range, and printing or streaming the BitSet works fine. Other APIs such as cardinality() and length() mostly work, except when they need to represent exactly 2^31; in those cases they return Integer.MIN_VALUE because of wraparound. However, they work if the return value is interpreted as an _unsigned_ int. A sufficiently careful program can work with these values effectively by using some rudimentary unsigned support methods such as Integer.toUnsignedLong().
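To illustrate the unsigned interpretation, here is a minimal sketch. The class name and the asUnsigned helper are made up for this example. It feeds in Integer.MIN_VALUE directly instead of building a BitSet with bit Integer.MAX_VALUE set, since that would allocate roughly 256 MB of backing storage.

```java
public class UnsignedLengthSketch {
    // Hypothetical helper: reinterpret an int returned by length() or
    // cardinality() as an unsigned value widened to long.
    static long asUnsigned(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        // Stand-in for the wrapped result of length() when bit
        // Integer.MAX_VALUE is set (assumed here, not computed from a BitSet).
        int wrapped = Integer.MIN_VALUE;
        System.out.println(asUnsigned(wrapped)); // 2147483648, i.e. 2^31

        // Ordinary nonnegative results pass through unchanged.
        System.out.println(asUnsigned(42)); // 42
    }
}
```

A caller that always routes length() and cardinality() through such a helper never sees a negative value.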

Some fixes in fairly recent times (JDK 9) were made to improve the behavior here. See [JDK-8076442](https://bugs.openjdk.org/browse/JDK-8076442).

The API methods that take ranges, such as clear(), flip(), get(), and set(), are deficient in that they can't operate on the full range of bits. That doesn't imply that the Integer.MAX_VALUE bit should be disallowed. We could accept their limitations; or we could allow special-case values for the range bounds, including Integer.MIN_VALUE (effectively treating them as unsigned values); or we could add long-based APIs for these and other methods that encounter problems with Integer.MAX_VALUE. This can quickly reach the point of diminishing returns, but it seems like there are some reasonable possibilities in this space.
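As a rough sketch of what the long-based option might look like, written as a static wrapper over the existing public API rather than as a proposed JDK method: the class name, the setRange name, and its signature are all hypothetical. An exclusive upper bound of 2^31 is handled by setting bit Integer.MAX_VALUE separately.

```java
import java.util.BitSet;

public class LongRangeSketch {
    // One past the largest valid bit index.
    static final long LIMIT = 1L << 31;

    // Hypothetical helper: sets the bits in [from, to), where to may be
    // as large as 2^31 so that bit Integer.MAX_VALUE can be included.
    static void setRange(BitSet bits, long from, long to) {
        if (from < 0 || to > LIMIT || from > to) {
            throw new IndexOutOfBoundsException("from=" + from + ", to=" + to);
        }
        if (from == to) {
            return; // empty range
        }
        if (to == LIMIT) {
            // The int-based set(from, to) cannot express to == 2^31, so
            // cover [from, MAX_VALUE) first and then the last bit alone.
            bits.set((int) from, Integer.MAX_VALUE);
            bits.set(Integer.MAX_VALUE);
        } else {
            bits.set((int) from, (int) to);
        }
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        setRange(bits, 3L, 8L);
        System.out.println(bits); // {3, 4, 5, 6, 7}
    }
}
```

Note that actually calling this with to equal to 2^31 makes the BitSet allocate storage for the full index range.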

> Problem:
> BitSet.length() returns a negative value when Integer.MAX_VALUE is set, such as by using the set(int) method or by passing large arrays to a BitSet constructor.
> This also causes the get(fromIndex, toIndex) method to always return the empty BitSet when passed with valid parameters, regardless of the value of fromIndex and toIndex.

This isn't correct. When length() returns Integer.MIN_VALUE, this doesn't _necessarily_ cause get(from, to) to always return an empty BitSet. Well, it does in the current implementation, but that's simply a bug that can be fixed. The implementation of the get() method can access the internal data of the object and do the right thing, regardless of what the length() method does. That change should probably be made, regardless of other spec changes we've been discussing.
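To make the point concrete, here is a sketch of a range get that never consults length(). It is not the JDK implementation, which would work on the internal word array directly. It is a hypothetical helper (RangeGetSketch.rangeGet) built only on the public nextSetBit() and set() methods.

```java
import java.util.BitSet;

public class RangeGetSketch {
    // Hypothetical helper, not the JDK implementation: copies bits
    // [from, to) of src into a new BitSet without calling src.length().
    static BitSet rangeGet(BitSet src, int from, int to) {
        if (from < 0 || to < 0 || from > to) {
            throw new IndexOutOfBoundsException("from=" + from + ", to=" + to);
        }
        BitSet result = new BitSet();
        // Inside the loop i < to <= Integer.MAX_VALUE, so i + 1 cannot overflow.
        for (int i = src.nextSetBit(from); i >= 0 && i < to; i = src.nextSetBit(i + 1)) {
            result.set(i - from);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(1);
        bits.set(5);
        bits.set(9);
        System.out.println(rangeGet(bits, 2, 10)); // {3, 7}
    }
}
```

Because the loop is driven by the set bits themselves, a wrapped length() has no effect on the result.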

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13388#issuecomment-1539218598

