RFR: 8213415: BitMap::word_index_round_up overflow problems
Kim Barrett
kim.barrett at oracle.com
Thu Nov 28 22:01:34 UTC 2019
> On Nov 28, 2019, at 7:42 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>
> Hi Kim,
>
> http://cr.openjdk.java.net/~kbarrett/8213415/open.03/src/hotspot/share/utilities/bitMap.hpp.udiff.html
>
> + // Limit max_size_in_bits so aligning up to a word never overflows.
> + static idx_t max_size_in_words() { return raw_to_words_align_down(~idx_t(0)); }
> + static idx_t max_size_in_bits() { return max_size_in_words() * BitsPerWord; }
>
> Could we have better comments? I first thought that max_size_in_words() means bitmap size, only the static specifier ticked me off.
I'm not sure I understand your issue. I modified the comment a bit to
address what I think you are saying, but I'm not sure I've covered it.
> So, if I understand this correctly, we need this since we use the same type for bit and word indices and since we have a n:1 relationship between those two the max. bit index is necessary smaller than <type>_MAX?
The point is to avoid overflow of the type used for bit indices when
aligning a value up to a multiple of the word size. This doesn't
really have anything to do with using the same types for bit indices
and word indices, though using different types might affect the
details of some of the calculations, and the range for the word type
would need to be suitably chosen to accomodate the bit range.
> + // Assumes relevant validity checking for bit has already been done.
> + static idx_t raw_to_words_align_up(idx_t bit) {
> + return raw_to_words_align_down(bit + (BitsPerWord - 1));
> + }
>
> Interestingly, this could break should we ever have different types for word- and bit indices. Should we ever want to templatize the underlying types, eg. to support small bitmaps where 8 or 16 bits would be enough to encode a bitmap index. Then it could make sense to have a larger type for word indices than for bit indices and we could overflow here.
I don't understand why one would need a larger type for word indices?
The range for word indices is necessarily smaller than the range for
bit indices.
> (Just idle musings.. we have talked about using different types for bit- and word-indexes before. I have worked on a patch for this, but it got snowed under by other work. See: http://cr.openjdk.java.net/~stuefe/webrevs/bitmap-improvements/bitmap-better-types. What do you think, does this still make sense?)
Making the types different obviously has some benefits for type
safety, if implemented in a way that actually provides a distinction.
Just using different typedefs for the same underlying type doesn't
accomplish that though. I think a strong type for words could be
hidden in the private implementation, and have thought about doing
something like that, but not for this change.
> +
> + // Verify size_in_bits does not exceed maximum size.
> + static void verify_size(idx_t size_in_bits) NOT_DEBUG_RETURN;
>
> "maximum size" is unclear. Can you make the comment more precise?
I modified the comments for the verify functions to be more explicit.
> Also, could this function be private?
I put the new verify functions near the pre-existing ones. There are a
lot of protected functions in this class that seem like they ought to
be private. I don't want to address that in this change.
> + // Verify bit is less than size.
> + void verify_index(idx_t bit) const NOT_DEBUG_RETURN;
> + // Verify bit is not greater than size.
> + void verify_limit(idx_t bit) const NOT_DEBUG_RETURN;
> + // Verify [beg,end) is a valid range, e.g. beg <= end <= size().
> + void verify_range(idx_t beg, idx_t end) const NOT_DEBUG_RETURN;
>
> I have a bit of a trouble understanding these variants and how they are used.
>
> I believe to understand that verify_limit() is to verify range sizes, since the valid range for a size type is always one larger than that for an index type, right? But it is used to test indices for validity, for example via to_words_align_down() -> word_addr(). Which would mean for a 64bit sized bitmap (1 word size) I can feed an invalid "64" as index to word_addr(), which verify_limit() would accept but would get me the invalid word index "1". I would have expected word_addr() to only return usable word adresses and to assert in this case.
verify_limit checks that the argument is a valid index or
one-past-the-last designator, e.g. it is a valid iteration limit.
"limit" was the preferred term from discussion with tschatzl.
There are uses of word_addr() to obtain a one-past-the-last pointer.
> I also feel that if I got it right the naming is the wrong way around. I would have expected "verify_size()" to refer to the bitmap object map size, and "verify_limit()" to the type limit (similar to limits.h)..
verify_size is about checking the validity of a value that will be
used as the size of a bitmap. Maybe the comment changes help?
New webrevs:
full: https://cr.openjdk.java.net/~kbarrett/8213415/open.04/
incr: https://cr.openjdk.java.net/~kbarrett/8213415/open.04.inc/
More information about the hotspot-dev
mailing list