RFR: 8305638: Refactor Template Assertion Predicate Bool creation and Predicate code in Split If and Loop Unswitching
Emanuel Peter
epeter at openjdk.org
Fri Dec 22 17:28:57 UTC 2023
On Fri, 22 Dec 2023 17:14:36 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> I left quite a few comments below, because I think the algorithm is difficult to understand.
>> I like the idea of splitting it into smaller parts.
>>
>> If I understand the basic algorithm, we do this:
>> DFS, where we traverse the inputs recursively, using the `DFSNodeStack`, which uses the filter `TemplateAssertionPredicateBool::could_be_part`.
>>
>> If you see a init/stride `Opaque1` node, then you obviously clone it.
>>
>> If you come back from an input to a output (use to def), then you may get back a cloned input node.
>> If the input node is not a clone, then we have to do nothing.
>> If the input node is a clone, we also have to clone the output node. Clone it if it is not already a clone. And then update the current input slot, so we do not point to the pre-cloned input, but to the cloned input.
>
> So we need to do work whenever we walk back down a input->output edge. We can traverse this edge, by taking the post-order traversal, and then waling from the post-order visited node to its output node.
>
>
> Node* current;
> while (_stack.is_not_empty()) {
> current = _stack.top();
> if (current->is_Opaque1()) {
> Node* transformed_node = transform_opaque_loop_node(current, transform_opaque_loop_nodes);
> _stack.replace_top(transformed_node);
> traverse_edge_back_to_output_and_pop();
> } else if (!_stack.push_next_unvisited_input()) {
> traverse_edge_back_to_output_and_pop();
> } // else: we just pushed an new input, go and visit it first
> }
>
> traverse_edge_back_to_output_and_pop:
> Node* maybe_cloned_input = _stack.pop();
> if (_stack.is_empty()) {
> return; // we just visited the root node, and are now done. Maybe verify we have the root here?
> }
> Node* output = _stack.top();
> int i = _stack.top_index();
> if (!is_clone(output)) {
> output = output->clone();
> _stack.replace_top(output);
> }
> output->set_req(i, maybe_cloned_input);
An even easier algorithm would be to have an additional `old_new` data-structure that knows the clones for the old nodes.
Then you simply post-order traverse, and whenever you post-visit a node (current, all its inputs were already traversed, and if need be cloned), you check all the inputs, and see if any of them were cloned. If so, clone current also, and add it to `old_new`. That would mean that you do not have to hack the DFS iterator, and could use it in read-only mode. And it would be a much simpler algorithm. But the whole thing comes as the cost of this extra `old_new` data-structure.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16877#discussion_r1435232649
More information about the hotspot-compiler-dev
mailing list