RFR: 8349211: Add support for intrusive trees to the utilities red-black tree [v7]

Johan Sjölen jsjolen at openjdk.org
Mon Feb 10 18:22:21 UTC 2025


On Mon, 10 Feb 2025 10:44:35 GMT, Casper Norrbin <cnorrbin at openjdk.org> wrote:

>> Hi everyone,
>> 
>> The recently integrated red-black tree can be made more flexible by adding support of intrusive trees. In an intrusive tree, the user has full control over node allocation and placement instead of having the tree manage it internally.
>> 
>> Two key changes enable this feature:
>> 1. Nodes can now be created outside of the tree's internal allocation mechanism, enabling users to allocate and prepare nodes before inserting them into the tree.
>> 2. Cursors have been added to simplify navigation and iteration over the tree. These cursors are when inserting and removing nodes in an intrusive tree, where the internal tree allocator is not used. Additionally, cursors enable iteration over the tree and provide a convenient way to access node values.
>> 
>> 
>> Many of the auxiliary tree functions have been updated to use these new features, resulting in simplified and cleaned-up code. More tests have also been added to cover both new and existing functionality.
>> 
>> An example of how you could use the intrusive tree is found below:
>> 
>> ```c++
>>     struct MyIntrusiveStructure {
>>       Node node; // The tree node is part of an external structure
>>       int data;
>> 
>>       MyIntrusiveStructure(int data, Node node) : node(node), data(data) {}
>>       Node* get_node() { return &node; }
>>       static MyIntrusiveStructure* cast_to_self(Node* node) { return (MyIntrusiveStructure*)node; }
>>     };
>> 
>>     Tree my_intrusive_tree;
>> 
>>     Cursor insert_cursor = my_intrusive_tree.cursor_find(0);
>>     Node insert_node = Node(0);
>> 
>>     // Custom allocation here is just malloc
>>     MyIntrusiveStructure* place = (MyIntrusiveStructure*)os::malloc(sizeof(MyIntrusiveStructure), mtTest);
>>     new (place) MyIntrusiveStructure(0, insert_node);
>> 
>>     my_intrusive_tree.insert_at_cursor(place->get_node(), insert_cursor);
>> 
>>     Cursor find_cursor = my_intrusive_tree.cursor_find(0);
>>     int found_data = MyIntrusiveStructure::cast_to_self(find_cursor.node())->data;
>> 
>> 
>> 
>> Please let me know any feedback or concerns!
>
> Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   empty base optimization reference

src/hotspot/share/utilities/rbTree.hpp line 150:

> 148:   // If a cursor is valid (valid() == true) it points somewhere in the tree.
> 149:   // If the cursor points to an existing node (found() == true), node() can be used to access that node,
> 150:   // Otherwise nullptr is returned, regardless if the node is valid or not.

Style: "nullptr" should be "null" here.

src/hotspot/share/utilities/rbTree.hpp line 166:

> 164:     bool found() const { return *_insert_location != nullptr; }
> 165:     RBNode* node() { return _insert_location == nullptr ? nullptr : *_insert_location; }
> 166:     RBNode* node() const { return _insert_location == nullptr ? nullptr : *_insert_location; }

Is there any case where I don't need to check the validity of the cursor? That is, do I ever want to use `node()` without first calling `valid()` or afterwards checking whether the returned value was null?

If the answer to that is: "No, there is no such case", then we shouldn't return null on `!valid()` node. We should instead add `assert(valid(), "must be");". If the answer is yes, then could you please tell me what that situation is :P?

src/hotspot/share/utilities/rbTree.hpp line 227:

> 225:   // Gets the cursor to the given node.
> 226:   Cursor get_cursor(const RBNode* node);
> 227:   const Cursor get_cursor(const RBNode* node) const;

How about `cursor_of`, or just `cursor`?

src/hotspot/share/utilities/rbTree.hpp line 229:

> 227:   const Cursor get_cursor(const RBNode* node) const;
> 228: 
> 229:   // Moves to the next valid node.

"Valid" is a strange choice of word here. I assume you mean "existing"? As in, not a null child.

src/hotspot/share/utilities/rbTree.hpp line 241:

> 239:   // Finds the cursor to the node associated with the given key.
> 240:   Cursor cursor_find(const K& key);
> 241:   const Cursor cursor_find(const K& key) const;

This is `get_cursor` but with a key value instead of RBNode pointer. I think it's good if this has the same name as `get_cursor`, so use overloading.

src/hotspot/share/utilities/rbTree.hpp line 244:

> 242: 
> 243:   // Inserts the given node at the cursor location
> 244:   // The cursor must not point to an existing node

Missing `.` to end sentences.

src/hotspot/share/utilities/rbTree.hpp line 257:

> 255:   // For all nodes with key < old_node, must also have key < new_node
> 256:   // For all nodes with key > old_node, must also have key > new_node
> 257:   void replace_at_cursor(RBNode* new_node, const Cursor& cursor);

`old_key`, `new_key`. Note, if I miss saying this in cpp file: We might want to run the verification code in an assert when this function is called. It's a pretty dangerous function that really requires that you know your stuff :-).

src/hotspot/share/utilities/rbTree.hpp line 268:

> 266:     const Cursor cursor = cursor_find(key);
> 267:     return cursor.found() ? &cursor.node()->_value : nullptr;
> 268:   }

Are these important to have?

src/hotspot/share/utilities/rbTree.hpp line 283:

> 281:   // Inserts a node with the given key into the tree,
> 282:   // does nothing if the key already exist.
> 283:   void upsert(const K& key) {

`upsert` is a bad name for function, as it is a portmanteau of "update" and "insert", indicating that this should change something if the key is found. What's the goal with this function? It should probably return the allocated node if it's to be useful.

src/hotspot/share/utilities/rbTree.inline.hpp line 192:

> 190: template <typename K, typename V, typename COMPARATOR, typename ALLOCATOR>
> 191: inline const typename RBTree<K, V, COMPARATOR, ALLOCATOR>::Cursor
> 192: RBTree<K, V, COMPARATOR, ALLOCATOR>::cursor_find(const K& key) const {

```c++
using Tree = RBTree<K, V, COMPARATOR, ALLOCATOR:
using Cur = typename RBTree<K, V, COMPARATOR, ALLOCATOR>::Cursor;


Though I'm pretty sure Thomas gave you a `typedef` for `Tree`, I don't think the `Cur` can be done with a `typedef`.

src/hotspot/share/utilities/rbTree.inline.hpp line 207:

> 205:     } else {
> 206:       insert_location = &((*insert_location)->_right);
> 207:     }

while (*insert_location != nullptr) {
  auto find_a_Good_Name = *insert_location; // And replace the usage sites
}

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949653015
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949657034
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949622460
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949623575
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949625408
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949628390
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949629951
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949634506
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949640519
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949645253
PR Review Comment: https://git.openjdk.org/jdk/pull/23416#discussion_r1949649069


More information about the hotspot-dev mailing list