RFR: 8349211: Add support for intrusive trees to the utilities red-black tree [v9]
Thomas Stuefe
stuefe at openjdk.org
Sun Feb 16 07:46:10 UTC 2025
On Wed, 12 Feb 2025 13:39:32 GMT, Casper Norrbin <cnorrbin at openjdk.org> wrote:
>> Hi everyone,
>>
>> The recently integrated red-black tree can be made more flexible by adding support of intrusive trees. In an intrusive tree, the user has full control over node allocation and placement instead of having the tree manage it internally.
>>
>> Two key changes enable this feature:
>> 1. Nodes can now be created outside of the tree's internal allocation mechanism, enabling users to allocate and prepare nodes before inserting them into the tree.
>> 2. Cursors have been added to simplify navigation and iteration over the tree. These cursors are when inserting and removing nodes in an intrusive tree, where the internal tree allocator is not used. Additionally, cursors enable iteration over the tree and provide a convenient way to access node values.
>>
>>
>> Many of the auxiliary tree functions have been updated to use these new features, resulting in simplified and cleaned-up code. More tests have also been added to cover both new and existing functionality.
>>
>> An example of how you could use the intrusive tree is found below:
>>
>> ```c++
>> struct MyIntrusiveStructure {
>> Node node; // The tree node is part of an external structure
>> int data;
>>
>> MyIntrusiveStructure(int data, Node node) : node(node), data(data) {}
>> Node* get_node() { return &node; }
>> static MyIntrusiveStructure* cast_to_self(Node* node) { return (MyIntrusiveStructure*)node; }
>> };
>>
>> Tree my_intrusive_tree;
>>
>> Cursor insert_cursor = my_intrusive_tree.cursor_find(0);
>> Node insert_node = Node(0);
>>
>> // Custom allocation here is just malloc
>> MyIntrusiveStructure* place = (MyIntrusiveStructure*)os::malloc(sizeof(MyIntrusiveStructure), mtTest);
>> new (place) MyIntrusiveStructure(0, insert_node);
>>
>> my_intrusive_tree.insert_at_cursor(place->get_node(), insert_cursor);
>>
>> Cursor find_cursor = my_intrusive_tree.cursor_find(0);
>> int found_data = MyIntrusiveStructure::cast_to_self(find_cursor.node())->data;
>>
>>
>>
>> Please let me know any feedback or concerns!
>
> Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision:
>
> renamed non-value upsert to insert
Hi @caspernorrbin,
Sorry for the delay. I tried to use the tree in a tiny small allocator (basically what I plan to do in Metaspace and in other places), and though it is now possible in principle I think we can simplify things and make them more user-friendly. Here my findings:
- It would be good to have RBNode simplified and defined outside of the tree. Possibly even in a different header. I can see RBNode being used in places where I don't know exactly what tree it goes into. It could even go into multiple trees at the same time or at different times (so the data structure would have either multiple RBNode inlined or a single one that gets repurposed for different trees).
- In that line, it would be good to have the key in RBNode being mutable. Having it const means I am forced to write constructors for containing structures. That is cumbersome.
- To me, it seems the code could be a lot simpler if you were to just use standard subclassing (AbstractRBTree->(NonIntrusiveRBTree|IntrusiveRBTree) etc. All these `std::is_same` are a bit much. There would also be no need for the `RBTreeNoopAllocator`. The vtable tax you'd pay won't matter much in reality since I don't foresee many cases where the specific tree type is not known.
- I found I had little need for cursors at all. Mostly, they just got in the way. Cursor exists to modify tree structure, but why would I ever do that manually? It is different with simple structures like linked lists, but here the tree balances itself, so it has the last say about its structure anyway. I would be perfectly happy with just the simple ability to add/remove nodes manually, use nodes to find nearby nodes (as in, nodes of nearby keys), iterate nodes with a functor etc.
- The few cases I needed a cursor it was because the API forced me to (e.g. when removing a node from the tree). With insertion, it got very weird. So I have an RBNode*, want to insert it into the tree, now I need an empty Cursor to do that? So, I create an empty cursor with that key, then use that with insert_at_cursor? Why?
- Let's say I already have a nearby node (result of closest_gt, for instance), but it does not satisfy me, so I add a new one. For that I need to call normal insert, so the search is done all over again (see my remark above above). It would be good if we could have an insertion with a node as an insertion hint.
- I found that I miss a closest_ge (greater or equal). Please add that. It's really needed for memory management trees.
- In closest_gt() etc, please extend the comments to say what the behavior is when the node is not found. I assume null is returned?
I'll continue the work later (have vacation next week).
-------------
PR Review: https://git.openjdk.org/jdk/pull/23416#pullrequestreview-2592252537
More information about the hotspot-dev
mailing list