BitMap vs VectorSet?

Liu, Xin xxinliu at amazon.com
Mon Oct 31 17:02:19 UTC 2022


hi, John,

Thank you for your pointer. It makes a lot of sense.  I also prefer
BitMap rather than VectorSet. I check who is including 'vectset.hpp'.
They are all from C2 compiler except 'share/ci/bcEscapeAnalyzer.hpp'.

I manage to change VectorSet to BitMap in Unique_Node_List. It's not
drop-in substitution. The interfaces are different. VectorSet is
elastic. eg. VectorSet::test returns false when index is out of range.
It can automatically grow to the specified index in VectorSet::test_set.
 Storage-wise, VectorSet supports both Resource and Arena but
ResourceBitMap and ArenaBitMap are two independent classes in bitMap.hpp.

I will file a JBS to fill the gap first.

Here is what I have done. It doesn't create any trouble after
substitution. of course, we are going to test it thoroughly.

https://github.com/openjdk/jdk/compare/master...navyxliu:jdk:refactor_bitmap

thanks,
--lx


On 10/27/22 11:07 AM, John Rose wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> This is a good question, and a reasonable time to engage with it as well.
> 
> The code in src/hotspot/share/libadt is very old (~30 years) and has not been vigorously maintained, relative to changes in C++ practice.
> 
> BitMap is newer and its code is more modern and actively maintained.  (Thanks Kim B. et al.)  In my opinion, it deserves to be used instead of VectorSet.  Having two classes for the same set of tasks is technical debt.  It is the sort of thing that naturally happens on a large project when two groups settle on different solutions for a common problem.  And the sort of thing a large project has to clean up over time.
> 
> So, I support taking steps (reasonable, gradual, limited-impact steps) to retire the classes in libadt, replacing them with more modern ones, in src/hotspot/share/utilities.
> 
> As you note, Xin, this is somewhat more urgent as we increase our gtest coverage, since we don’t want to develop two sets of new tests for the same application area.
> 
> It’s also more urgent as hardware technology advances:  If we choose to work harder to vectorize bitset operations, we don’t want to do that work twice.
> 
> It’s also more urgent as software technology advances.  I’m reminded that there are many new algorithms for bit-twiddling (the name Daniel Lemire comes to mind, for one) and (again: get the pattern?) we don’t want to upgrade the algorithms twice.
> 
> People working on HotSpot routinely deal with extremely challenging software problems, specific to optimization, garbage collection, concurrency, platform support, performance, and more.  It’s hard to always, also, select well-tested, well-designed shared libraries to help with these problems; sometimes you just have to roll your own.  But while understandable, that tends to create technical debt.  That sometimes leads to failures (unscheduled payments on technical debt) due to lack of test coverage, simply because some work somewhere didn’t use common code but “rolled their own” algorithms.
> 
> So, yes, let’s double down on BitSet and retire VectorSet.
> 
> (And Dict, while we are at it: There are some really nice hash table classes developed recently by the runtime group; surely we are not too far from merging the job of Dict into that group of classes?)
> 
> — John
> 
> On 27 Oct 2022, at 0:07, Liu, Xin wrote:
> 
>> hi,
>>
>> Recently, I extended VectorSet(libadt/vectset.hpp) to support
>> intersection. Then I found that there is yet another "vector set" called
>> 'BitMap'. From my reading, the data structure of BitMap is same as
>> VectorSet. BitMap has already supported intersection/union/difference
>> and it even owns corresponding gtests.
>>
>> Given that, why does HotSpot still need to maintain both VectorSet and
>> BitMap? I grep VectorSet and there are still many references. Is that
>> too troublesome to replace VectorSet with BitMap or there are other
>> reasons?
>>
>> I modified VectorSet because Unique_Node_List(node.hpp) uses it. I
>> hesitate now. Should I invest a new gtest for the new code or I just use
>> BitMap instead?
>>
>> thanks,
>> --lx
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0xB9D934C61E047B0D.asc
Type: application/pgp-keys
Size: 3675 bytes
Desc: OpenPGP public key
URL: <https://mail.openjdk.org/pipermail/hotspot-runtime-dev/attachments/20221031/edd2f877/OpenPGP_0xB9D934C61E047B0D-0001.asc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 665 bytes
Desc: OpenPGP digital signature
URL: <https://mail.openjdk.org/pipermail/hotspot-runtime-dev/attachments/20221031/edd2f877/OpenPGP_signature-0001.sig>


More information about the hotspot-runtime-dev mailing list