RFR: 8349211: Add support for intrusive trees to the utilities red-black tree

Thu Feb 20 13:49:54 UTC 2025

On Tue, 4 Feb 2025 10:39:52 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Hi everyone,
>> 
>> The recently integrated red-black tree can be made more flexible by adding support of intrusive trees. In an intrusive tree, the user has full control over node allocation and placement instead of having the tree manage it internally.
>> 
>> Two key changes enable this feature:
>> 1. Nodes can now be created outside of the tree's internal allocation mechanism, enabling users to allocate and prepare nodes before inserting them into the tree.
>> 2. Cursors have been added to simplify navigation and iteration over the tree. These cursors are when inserting and removing nodes in an intrusive tree, where the internal tree allocator is not used. Additionally, cursors enable iteration over the tree and provide a convenient way to access node values.
>> 
>> 
>> Many of the auxiliary tree functions have been updated to use these new features, resulting in simplified and cleaned-up code. More tests have also been added to cover both new and existing functionality.
>> 
>> An example of how you could use the intrusive tree is found below:
>> 
>> ```c++
>>     struct MyIntrusiveStructure {
>>       Node node; // The tree node is part of an external structure
>>       int data;
>> 
>>       MyIntrusiveStructure(int data, Node node) : node(node), data(data) {}
>>       Node* get_node() { return &node; }
>>       static MyIntrusiveStructure* cast_to_self(Node* node) { return (MyIntrusiveStructure*)node; }
>>     };
>> 
>>     Tree my_intrusive_tree;
>> 
>>     Cursor insert_cursor = my_intrusive_tree.cursor_find(0);
>>     Node insert_node = Node(0);
>> 
>>     // Custom allocation here is just malloc
>>     MyIntrusiveStructure* place = (MyIntrusiveStructure*)os::malloc(sizeof(MyIntrusiveStructure), mtTest);
>>     new (place) MyIntrusiveStructure(0, insert_node);
>> 
>>     my_intrusive_tree.insert_at_cursor(place->get_node(), insert_cursor);
>> 
>>     Cursor find_cursor = my_intrusive_tree.cursor_find(0);
>>     int found_data = MyIntrusiveStructure::cast_to_self(find_cursor.node())->data;
>> 
>> 
>> 
>> Please let me know any feedback or concerns!
>
> @caspernorrbin could you massage this patch a bit to reduce the delta to the last version? That is a good idea in general (I usually do a manual minimize-delta sweep before I undraft a PR for review). A lot seems to be code movement at first glance.

Hi @tstuefe

I'll give some background before I discuss this implementation.

I think a lot of the design here has been taken from or is a consequence of our work in ZGC where we are rewriting parts of our internal memory management. The initial work used an early version of @caspernorrbin's non-intrusive RB-tree. We wanted to reduce the amount of work required under our allocation lock, so we had the idea of making the RB-tree intrusive and permit 'updates' where we can modify a node, and/or replace it as long as we do not break the normal strict node ordering invariant. So I built ZIntrusiveRBTree as an experiment and it worked out very well from a performance perspective. The success of us using an intrusive tree along with your wishes to also have an intrusive tree, made @jdksjolen ask us if we could use Casper's tree if it enabled support for a intrusive nodes. We would very much appreciate using a shared implementation.

So I stopped iterating on the `ZIntrusiveRBTree`. But here is the implementation:
https://github.com/xmas92/jdk/blob/757daf9bac98e7cb04121664d2e562f31e817faa/src/hotspot/share/gc/z/zIntrusiveRBTree.hpp
https://github.com/xmas92/jdk/blob/757daf9bac98e7cb04121664d2e562f31e817faa/src/hotspot/share/gc/z/zIntrusiveRBTree.inline.hpp

> It would be good to have RBNode simplified and defined outside of the tree. Possibly even in a different header. I can see RBNode being used in places where I don't know exactly what tree it goes into. It could even go into multiple trees at the same time or at different times (so the data structure would have either multiple RBNode inlined or a single one that gets repurposed for different trees).

I would prefer this as well, at least in the `ZIntrusiveRBTree` at some point I found it cumbersome having the node be an inner class of the tree, so I moved it out of the tree.

> In that line, it would be good to have the key in RBNode being mutable. Having it const means I am forced to write constructors for containing structures. That is cumbersome.

Our use case would need this, or rather in our use case the key would probably be a dummy. In 'ZIntrusiveRBTree' we materialize the `Key` from `Node*`. Essentially we are keeping tracked of memory regions, so the virtual key is a Memory Range. https://github.com/xmas92/jdk/blob/757daf9bac98e7cb04121664d2e562f31e817faa/src/hotspot/share/gc/z/zMappedCache.cpp#L42-L49

And our lookups uses the actual address (a smaller key) to find where to insert a value. So our compare is always Compare(Key, Node*) (the Compare(Node*, Node*) is only used for verification).
https://github.com/xmas92/jdk/blob/757daf9bac98e7cb04121664d2e562f31e817faa/src/hotspot/share/gc/z/zMappedCache.cpp#L95-L107

STL has a similar concept with `template< class K > iterator find( const K& x );` which allows for a different type to be used for the lookup. But the whole "the key is a function of the `Node*`" is a completely intrusive property, so it creates a discrepancy between the two trees types.

We can use this list as it is, as we can just use the address of the key, but it relies on that implementation hands out a reference to the actual key in the node and not a copy. It feels flaky (and wastes bytes). I think that in the intrusive case having the Comparator compare the key to a `Node*` is preferable. And just having the intrusive node just carry the tree structure data, and let the containing object worry about the key and the value. Or maybe there is some more elegant API I am missing. Could also use meta-programming to select the (Key, Node*) operator if it exist and use (Key, Key) otherwise. 

> I found I had little need for cursors at all. Mostly, they just got in the way. Cursor exists to modify tree structure, but why would I ever do that manually? It is different with simple structures like linked lists, but here the tree balances itself, so it has the last say about its structure anyway. I would be perfectly happy with just the simple ability to add/remove nodes manually, use nodes to find nearby nodes (as in, nodes of nearby keys), iterate nodes with a functor etc.
> The few cases I needed a cursor it was because the API forced me to (e.g. when removing a node from the tree). With insertion, it got very weird. So I have an RBNode*, want to insert it into the tree, now I need an empty Cursor to do that? So, I create an empty cursor with that key, then use that with insert_at_cursor? Why?
> Let's say I already have a nearby node (result of closest_gt, for instance), but it does not satisfy me, so I add a new one. For that I need to call normal insert, so the search is done all over again (see my remark above above). It would be good if we could have an insertion with a node as an insertion hint.

I was in the process of trying to use iterators instead of the cursors when I stopped working on the `ZIntrusiveRBTree`. It works right now for our use case. However there are currently may ways to do the same thing, you can use both Nodes, Cursors and Iterators to walk the tree. Both nodes and iterators track a location in the tree while the cursors have more contextual information, as they also track the insert location.

The cost of using insertion hints is that it requires at least on redundant compare, which should be irrelevant. And requires some extra handling of the Tree::end() / nullptr hint, the easiest way I can think of is to have the tree keeps track of the right most node so it can find the insert location fast. 

Iterator hints are safer to use externally, as they are just hints, the insertion will still respect the comparator function. While with the cursor, you can insert an invalid node somewhere.

As long as we do not care to about wether insert succeeds or not, I think the API where you consume an iterator hint and produce a new iterator as a result could end up useful. Similar to `std::set` / `std::map` API.

The main difference with `Node*`s/iterators over cursors is that one would use `gt` over `find` as `find`. Which I think is more natural in many places as it describes. 
```c++
  // Insert or merge
  auto it = tree.gt(key);
  if (should_merge(*it, node))
    tree.replace(it, merge_node(*it, node));
  else
    tree.insert(it, node);

  // Remove all nodes >=key
  for (auto it = tree.find(key); 
       it != tree.end();
       it = tree.remove(it));

The nice thing with iterators over `Node*`s is that they are more versatile, and we it allows us to remove traversing the tree through `Node*`'s public API, as such they are just some space for the tree to store its data. And I think we also can get a safer interface and type system. You use `Node*` when referring to something that may not be part of the tree and you use iterators when referring to something that is in the tree (sans iterator invalidation, or using the iterator on a different tree). I think I mentioned in #22360 that implementing lightweight iterators (effectively just a fancy `Node*`, in `ZIntrusiveRBTree` I also keep track of the tree to allow for stronger asserts and bidirectionally) allows for a lot of the APIs and the implementation to be implemented in terms of the iterators.

I think regardless of what is used we should have API overloads that do not require using a hint or cursor. `Tree::remove(Node*)` and `Tree::insert(Node*)` (or `Tree::insert(Key, Node*)`).

Just my 2 cents.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23416#issuecomment-2671552936