RFR: 8155043: BitMap set operations assume clear bits beyond unaligned end
Kim Barrett
kim.barrett at oracle.com
Mon Aug 8 18:56:48 UTC 2016
> On Aug 5, 2016, at 9:27 AM, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
>
> Hi Kim,
>
> This looks mostly good to me. I have a few comments inlined:
Thanks for looking at this.
New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8155043/webrev.01/
incr: http://cr.openjdk.java.net/~kbarrett/8155043/webrev.01.inc/
Responses inline.
> On 2016-08-05 00:02, Kim Barrett wrote:
>> Please review this change to the BitMap set operations to avoid
>> depending on or modifying bits beyond the size of the BitMap when the
>> size is not word aligned. This change also adds some unit tests for
>> these operations.
>>
>> In addition:
>>
>> - Changed set_from to use Copy::disjoint_words. For other set
>> operations, use a consistent iteration idiom for all.
>>
>> - In the set_xxx_with_result operations, eliminated conditionals in
>> the accumulation of the result.
>
> I guess this was done for performance reasons? Have you measured and seen any gains with this? The generated code becomes significantly larger when using the or/xor combination in your patch.
It was supposed to be an optimization. Even if there is some register
usage restriction on x86[_64] that inhibits the obvious (though small)
improvement, the result should be no worse there, and should still be
better on more regular architectures.
Unfortunately, with the new change accumulation, gcc4.9 on x86_64
*completely* loses its way, and instead generates very strange code.
Yuck! The inner loop appears to be not obviously better or worse than
with the old change accumulation, but there's a huge amount of
additional setup and teardown code around that inner loop.
So backing that out. Thanks for noticing that.
>> - Removed unused BitMap::set_intersection_at_offset, rather than
>> writing tests for it.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8155043
>>
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8155043/webrev.00/
>
> Some minor comments/suggestions that you might want to consider:
>
> ==========
> http://cr.openjdk.java.net/~kbarrett/8155043/webrev.00/src/share/vm/utilities/bitMap.cpp.frames.html
>
> There are a few places where the tail calculations are slightly different compared to the calculations of the aligned part. Would it be good to use the same bitwise operations, just to get consistency between the two parts?
>
> For example, in BitMap::contains we have
>
> 408 bm_word_t value = dest_map[index];
> 409 if (value != (value | other_map[index])) return false;
>
> vs
>
> 414 return (rest == 0) || tail_of_set(~dest_map[limit] & other_map[limit], rest) == 0;
>
> We could replace 408-409 with 'if (~dest_map[limit] & other_map[limit]) == 0) return false’.
That one was a "mistake"; thanks for spotting it.
Also regularized is_full.
> ----------
> I find that the introduction of the 'bm_word_t orig' variable just makes the code extra noisy, and would have preferred the calculations without it.
I find the repeated use of "dest_map[index]" noisy and harder to read,
and prefer to keep the orig variable. One place I kept the repetition
was in the assignment, since making orig a reference, e.g.
bm_word_t& orig = ...; ... uses orig ...; orig = ...;
seemed very unlike Hotspot style.
> ----------
>
> The new code has a few functions named *_of_set. The terminology "set" is previously not used in the BitMap file, so I'm a bit reluctant to introduce it. Could we get rid of it by renaming the functions:
> tail_of_set -> tail_of_map
> merge_tail_of_set -> merge_tail_of_map
>
> or maybe:
> tail_of_set -> tail_bits
> merge_tail_of_set -> merge_tail_bits
I like tail_of_set => tail_of_map. Done.
> ==========
> http://cr.openjdk.java.net/~kbarrett/8155043/webrev.00/test/native/utilities/test_bitMap_setops.cpp.html
>
> The file tests more than the set operations, so maybe the _setops suffix should be dropped?
I think the only non-setop is set_from, and it's here because it has
the same considerations. Full coverage of BitMap requires far more
than what's here, and I don't think a single monolithic test file is
necessary, or even good in a case like this class that has a lot of
functionality. For example, I have a new test_bitMap_search.cpp
(testing iterate and get_next_one_offset & friends) in an mq patch
waiting for submission.
> ----------
> Some of the tests restore the previous bitmap value after all checks, but some don't. It would probably be good to make that more consistent.
I've added WithBit{Set,Clear} scoped helpers and use them throughout.
Also fixed some inconsistencies in use of ASSERT_xxx.
> ----------
> class BitMapMemory {
> ...
> private:
> idx_t _words;
> bm_word_t* _memory;
> };
>
> Most of the classes in the HotSpot have the members at the top of the class.
It seems I've just been continually surprised and mildly irritated by
a lot of our class layouts, without twigging to it being a consistent
"style". Now that you've mentioned it, I looked around and found
that, while there is the occasional counter-example, that order is by
far dominant.
I'm surprised because it is contrary to every C++ style guide / best
practice recommendation I've seen. [Variable declarations first seems
to be common Java style, but this is C++ code.] Irritated because I
agree with the rationale for those recommendations, e.g. usually one
is looking at a class definition as a client interested in its public
API, and having to scan / scroll down past private implementation
details is wasting time in that case. Oh, well, when in Rome...
> ----------
> static bm_word_t make_even_bits() {
> bm_word_t result = 1;
> while (true) {
> bm_word_t next = (result << 2) | 1;
> if (next == result) return result;
> result = next;
> }
> }
>
> I'd prefer if you moved down the return statement to a separate line and add braces.
Done.
More information about the hotspot-dev
mailing list