RFR: 8345314: Add a red–black tree as a utility data structure
Johan Sjölen
jsjolen at openjdk.org
Mon Dec 2 15:46:17 UTC 2024
On Mon, 25 Nov 2024 13:11:52 GMT, Casper Norrbin <cnorrbin at openjdk.org> wrote:
> Hi everyone,
>
> This effort began as an exploration of replacing the current NMT treap with a red-black tree. Along the way, I discovered that others were also interested in having a general-purpose tree structure available within HotSpot.
>
> The red-black tree is designed to serve as a drop-in replacement for the existing NMT treap, keeping a nearly identical interface. However, I’ve also added a few additional requested features, such as an iterator.
>
> Testing builds off the treap tests, adding a few extra that inserts/removes and checks that the tree is correct. Testing uses the function `verify_self`, which iterates over the tree and checks att all red-black tree properties hold. Additionally, the tree has been tested in vmatree instead of the treap without any errors.
>
> For those who may want to revisit the fundamentals of red-black trees, [Wikipedia](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree) offers a great summary with tables covering the various balancing cases. Alternatively, your favorite data structure book could provide even more insight.
Initial review
src/hotspot/share/utilities/rbTree.hpp line 47:
> 45:
> 46: private:
> 47: enum Color { BLACK, RED };
`enum Color : uint8_t { BLACK, RED };`
Minimize to 1 byte
src/hotspot/share/utilities/rbTree.hpp line 60:
> 58: RBNode* _left;
> 59: RBNode* _right;
> 60: unsigned int _black_height; // MSB encodes Red/Black, 0 - red, 1 - black
Suggestion: Use explicitly sized type `uint32_t`
src/hotspot/share/utilities/rbTree.hpp line 61:
> 59: RBNode* _left;
> 60: RBNode* _right;
> 61: Color _color;
Move to the end, this avoids padding and might save us some memory.
src/hotspot/share/utilities/rbTree.hpp line 79:
> 77: #endif // ASSERT
> 78: _key = new_key;
> 79: }
Seems unused? Kill it, it's too dangerous to be left alive!
src/hotspot/share/utilities/rbTree.hpp line 89:
> 87: _black_height &= (-1U >> 1);
> 88: }
> 89: RBNode(const K& k, const V& v)
Space missing
src/hotspot/share/utilities/rbTree.hpp line 96:
> 94: return _parent != nullptr && _parent->_right == this;
> 95: }
> 96: bool is_left_child() {
Space
src/hotspot/share/utilities/rbTree.hpp line 317:
> 315: void fix_violations(RBNode* node) {
> 316: if(node->is_black()) { // node's value was updated
> 317: return; // Tree is already correct
Fix indentation of comment
src/hotspot/share/utilities/rbTree.hpp line 377:
> 375: }
> 376:
> 377: void remove_inner(RBNode* node) {
Seems like this function is made to remove nodes whose color is black and whose removal would cause an imbalance. Maybe we can have a better name for this function or documentation, and also a few asserts?
src/hotspot/share/utilities/rbTree.hpp line 381:
> 379: RBNode* parent = node->_parent;
> 380: while (parent != nullptr) {
> 381: RBNode* sibling = node->is_left_child() ? parent->_right : parent->_left;
I'm missing why this is guaranteed to not be null
src/hotspot/share/utilities/rbTree.hpp line 474:
> 472: RBNode* parent = node->_parent;
> 473: RBNode* left = node->_left;
> 474: RBNode* right = node->_right;
We don't get that much usage out of these. If you make them references, or `RBNode*&`, then you can stop accessing `node->_left` etc and use these directly.
src/hotspot/share/utilities/rbTree.hpp line 475:
> 473: RBNode* left = node->_left;
> 474: RBNode* right = node->_right;
> 475: if (left != nullptr) { // node has a left only-child
In each branch assert that the other child is missing. Document this assumption in a comment above the function (along with a rudimentary explanation). Also, add in the assertions regarding the nodes' color.
src/hotspot/share/utilities/rbTree.hpp line 482:
> 480: node->_left->_parent = node->_parent;
> 481: if (parent == nullptr) {
> 482: _root = node->_left;
`assert` that node is root.
src/hotspot/share/utilities/rbTree.hpp line 710:
> 708: }
> 709:
> 710: void insert(const K& k, const V& v) {
Why is `insert` needed on top of `upsert`?
src/hotspot/share/utilities/rbTree.hpp line 933:
> 931:
> 932: return true;
> 933: }
I think these are *great* and I want them in. They might be difficult for some reviewers to accept, because they allow you to break stuff like node count. I know that I asked you to implement these, but I think it's best to remove them for now and add them back in a future PR. I'm sorry!!!
-------------
PR Review: https://git.openjdk.org/jdk/pull/22360#pullrequestreview-2458414366
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1856601465
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865455956
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1856602354
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1856603838
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865456938
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865457281
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865616079
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865498056
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865527034
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865471302
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865465101
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1865469976
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1856612496
PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1856616952
More information about the hotspot-dev
mailing list