RFR: 8345314: Add a red–black tree as a utility data structure [v12]

Casper Norrbin cnorrbin at openjdk.org
Fri Jan 17 16:55:40 UTC 2025


On Thu, 16 Jan 2025 17:15:34 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   clarified comment
>
> src/hotspot/share/utilities/rbTree.inline.hpp line 92:
> 
>> 90: #ifdef ASSERT
>> 91: template <typename K, typename V, typename COMPARATOR, typename ALLOCATOR>
>> 92: inline bool RBTree<K, V, COMPARATOR, ALLOCATOR>::RBNode::is_correct(
> 
> Some proposals for better/more useful/more expressive verification:
> 
> 1) I would use asserts inside the `Node::is_correct` function. I would not bother returning a bool but assert right away. This is used during tree verification only, and ideally, we want to crash hard and fast right there to have aa point to quickly debug. Arguably, I would rename the function to something like `verify()` (I'll use that name from now on)
> 
> 2) I would then also not use combined asserts but assert each condition separately.
> 
> 3) Please use size_t (or uintx, which I prefer but I know others don't) for the number of nodes/number of blacks in the arguments of this function. 
> 
> 4) I would not pass in limits to be checked by the node. Instead would let Node::verify() increase stats.
> 
> 5) I would also return longest/shortest path to any leaf, and black-nodes-til-leaf, like this (see below point(6)):
> 
> 
> void verify(
>   size_t& num_nodes,          // total nodes 
>   size_t& blacks_until_leaf,  // returns the number of black nodes from here to its leafs
>   size_t& longest_leaf_path,  // length of longest to-leaf path from here
>   size_t& shortest_leaf_path, // length of shortest to-leaf path from here
>   size_t current_depth
>   ) const;
> 
> you also can create a Stat struct and pass a reference to that if you dislike so many arguments.
> 
> 6) In Node::verify(), I would explicitly check the following red-black tree rules: 
>   - If Node is red, it shall have no red childs; (I see you do this already)
>   - If a Node has exactly one child, the child must be red (consequently, the node must be black)
>   - After calling `Node::verify() `for left and right child recursively, the blacks-until-leaf for right node and the one for the left node should be the same (which is also why we return just one value for `blacks_until_leaf`)
>   - the longest leaf path and the shortest leaf path must never be farther apart than factor 2 (so, `shortest_leaf_path <= longest_leaf_path <= shortest_leaf_path * 2 `). That asserts that we are balanced as far as RB trees go.
> 
> 7) In debug builds, please give the Node class a marker bit: `DEBUG_ONLY(bool _visited)`. Keep inside of the Tree class a `DEBUG_ONLY("bool _expected_visited")` bit. On Node creation/addition, set node->_visited to tree->_expected_vi...

Thank you for the many suggestions! I have reworked the verification based on this, including everything in the list.

We assert quicker and separately, values and limits are sent up instead of down to be validated, and I added a lot more checks. The visited bit and gtest was a very elegant solution (especially for the future intrusive nodes)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22360#discussion_r1920475051


More information about the hotspot-dev mailing list