Integrated: 8283775: better dump: VM support for graph querying in debugger with BFS traversal and node filtering

Mon Jun 13 11:49:01 UTC 2022

On Fri, 29 Apr 2022 13:04:55 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> **What this gives you for the debugger**
> - BFS traversal (inputs / outputs)
> - node filtering by category
> - shortest path between nodes
> - all paths between nodes
> - readability in terminal: alignment, sorting by node idx, distance to start, and colors (optional)
> - and more
> 
> **Some usecases**
> - more readable `dump`
> - follow only nodes of some categories (only control, only data, etc)
> - find which control nodes depend on data node (visit data nodes, include control in boundary)
> - how two nodes relate (shortest / all paths, following input/output nodes, or both)
> - find loops (control / memory / data: call all paths with node as start and target)
> 
> **Description**
> I implemented VM support for BFS traversal of IR nodes, where one can filter for input/output edges, and the node-type (control / data / memory). If one specifies the `target` node, we find a shortest path between `this` and `target`. With the `options` string one can easily select which node types to visit (`cdmxo`)  and which to include only in the boundary (`CDMXO`). To find all paths between two nodes, include the letter `A` in the options string.
> 
> `void Node::dump_bfs(const int max_distance, Node* target, char const* options)`
> 
> To get familiar with the many options, run this to get help:
> `find_node(0)->dump_bfs(0,0,"h")`
> 
> While a sufficient summary and description is in the comments above the function definition, I want to explain some useful use-cases here.
> 
> Please let me know if you would find this helpful, or if you have any feedback to improve it.
> Thanks, Emanuel
> 
> PS: I do plan to refactor the `dump` code in `node.cpp` to use my new infrastructure. I will also remove `Node::related` and `dump_related,` since it has not been properly extended and maintained. But that refactoring would risk messing with tools that depend on `dump`, which I would like to avoid for now, and do that in a second step.
> 
> **Better dump()**
> The function is similar to `dump()`, we can also follow input / output edges up to a certain distance. There are a few improvements/added features:
> 
> 1. Display the distance to the specified node. This way I can quickly assess how many nodes I have at a certain distance, and I can traverse edges from one distance to the next.
> 2. Choose if you want to traverse only input `+` or output `-` edges, or both `+-`.
> 3. Node filtering with types taken from `type->category()` help one limit traversal to control, data or memory flow. The categories are the same as in IGV.
> 4. Separate visit / boundary filters by node type: traverse graph visiting only some node types (eg. data). On the boundary, also display but do not traverse nodes allowed by boundary filter (eg. control). This can be useful to traverse outputs of a data node recursively, and see what control nodes depend on it. Use `dcmxo` for visit filter, and `DCMXO` for boundary filter.
> 5. Probably controversial: coloring of node types in terminal. By default it is off. May not work on all terminals. But I find it very useful to quickly separate memory, data and control - just as in IGV - but in my terminal! Highly recommend putting the `#` in the options string! To more easily trace chains of nodes, I highlight the node idx of all nodes that are displayed in their respective colors.
> 6. After matching, we create Mach nodes from the IR nodes. For most Mach nodes, there is an associated old node. Often an assert is triggered for Mach nodes, but the graph already breaks in the IR graph. It is thus very helpful to have the old node idx displayed. Use `@` in options string.
> 7. I also display the head node idx of the block a Mach node is scheduled for. This can be helpful if one gets bad dominance asserts, and needs to see why the blocks were scheduled in a bad way. Use `B` in options string.
> 8. Some people like the displayed nodes to be sorted by node idx. Simply add an `S` to the option string!
> 
> Example (BFS inputs):
> 
> (rr) p find_node(161)->dump_bfs(2,0,"dcmxo+")
> dist dump
> ---------------------------------------------
>    2  159  CmpI  === _ 137 40  [[ 160 ]]  !orig=[144] !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    2  147  IfTrue  === 161  [[ 166 ]] #1 !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    2  165  OuterStripMinedLoop  === 165 93 164  [[ 165 166 ]] 
>    1  160  Bool  === _ 159  [[ 161 ]] [lt] !orig=[145] !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    1  166  CountedLoop  === 166 165 147  [[ 166 161 102 103 ]] stride: 1  strip mined !orig=[157],[99] !jvms: StringLatin1::hashCode @ bci:16 (line 193)
>    0  161  CountedLoopEnd  === 166 160  [[ 162 147 ]] [lt] P=0.957374, C=19675.000000 !orig=[146] !jvms: StringLatin1::hashCode @ bci:13 (line 193)
> 
> 
> Example (BFS control inputs):
> 
> (rr) p find_node(163)->dump_bfs(5,0,"c+")
> dist dump
> ---------------------------------------------
>    5  147  IfTrue  === 161  [[ 166 ]] #1 !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    5  165  OuterStripMinedLoop  === 165 93 164  [[ 165 166 ]] 
>    4  166  CountedLoop  === 166 165 147  [[ 166 161 102 103 ]] stride: 1  strip mined !orig=[157],[99] !jvms: StringLatin1::hashCode @ bci:16 (line 193)
>    3  161  CountedLoopEnd  === 166 160  [[ 162 147 ]] [lt] P=0.957374, C=19675.000000 !orig=[146] !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    2  162  IfFalse  === 161  [[ 167 168 ]] #0 !orig=148 !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    1  167  SafePoint  === 162 1 7 1 1 168 1 136 37 40 137 1  [[ 163 ]]  SafePoint  !orig=138 !jvms: StringLatin1::hashCode @ bci:37 (line 193)
>    0  163  OuterStripMinedLoopEnd  === 167 22  [[ 164 148 ]] P=0.957374, C=19675.000000
> 
> We see the control flow of a strip mined loop.
> 
> 
> Experiment (BFS only data, but display all nodes on boundary)
> 
> (rr) p find_node(102)->dump_bfs(10,0,"dCDMOX-")
> dist dump
> ---------------------------------------------
>    0  102  Phi  === 166 22 136  [[ 133 132 ]]  #int !jvms: StringLatin1::hashCode @ bci:16 (line 193)
>    1  133  SubI  === _ 132 102  [[ 136 ]]  !jvms: StringLatin1::hashCode @ bci:25 (line 194)
>    1  132  LShiftI  === _ 102 131  [[ 133 ]]  !jvms: StringLatin1::hashCode @ bci:25 (line 194)
>    2  136  AddI  === _ 133 155  [[ 153 167 102 ]]  !jvms: StringLatin1::hashCode @ bci:32 (line 194)
>    3  153  Phi  === 53 136 22  [[ 154 ]]  #int !jvms: StringLatin1::hashCode @ bci:13 (line 193)
>    3  167  SafePoint  === 162 1 7 1 1 168 1 136 37 40 137 1  [[ 163 ]]  SafePoint  !orig=138 !jvms: StringLatin1::hashCode @ bci:37 (line 193)
>    4  154  Return  === 53 6 7 8 9 returns 153  [[ 0 ]] 
> 
> We see the dependent output nodes of the data-phi 102, we see that a SafePoint and the Return depend on it. Here colors are really helpful, as it makes it easy to separate the data-nodes (blue) from the boundary-nodes (other colors).
> 
> Example with Mach nodes:
> 
> (rr) p find_node(280)->dump_bfs(2,0,"cdmxo+ at B")
> dist [block  head  idom depth]   old dump
> ---------------------------------------------
>    2     B6   379   377     4   o118   38  sarI_rReg_CL  === _ 39 40  [[ 41 36 31 31 71 75 66 82 86 103 116 148 152 161 119 119 184 186 170 281 268 ]]  !jvms: String::length @ bci:9 (line 1487) ByteVector::putUTF8 @ bci:1 (line 285)
>    2    B52   441   277    23   o738  283  incI_rReg  === _ 285  [[ 284 285 281 ]] #1/0x00000001 !jvms: ByteVector::putUTF8 @ bci:131 (line 300)
>    2    B50   277   439    22   o756  282  IfTrue  === 273  [[ 441 ]] #1 !jvms: ByteVector::putUTF8 @ bci:100 (line 302)
>    1    B52   441   277    23   o737  281  compI_rReg  === _ 283 38  [[ 280 ]] 
>    1    B52   441   277    23      _  441  Region  === 441 282  [[ 441 280 290 ]] 
>    0    B52   441   277    23   o757  280  jmpLoopEnd  === 441 281  [[ 279 347 ]] P=0.500000, C=21462.000000 !jvms: ByteVector::putUTF8 @ bci:79 (line 300)
> 
> And the query on the old nodes:
> 
> (rr) p find_old_node(741)->dump_bfs(2,0,"cdmxo+#")
> dist dump
> ---------------------------------------------
>    2 o1871  AddI  === _ o79 o1872  [[ o739 o1948 o761 o1477 ]] 
>    2  o186  AddI  === _ o1756 o1714  [[ o1756 o739 o1055 ]] 
>    2  o178  If  === o1159 o177 o176  [[ o179 o180 ]] P=0.800503, C=7153.000000
>    1  o739  CmpI  === _ o186 o1871  [[ o740 o741 ]] 
>    1  o740  Bool  === _ o739  [[ o741 ]] [lt]
>    1  o179  IfTrue  === o178  [[ o741 ]] #1
>    0  o741  CountedLoopEnd  === o179 o740 o739  [[ o742 o190 ]] [lt] P=0.993611, C=7200.000000
> 
> 
> **Exploring loop body**
> When I find myself in a loop, I try to localize the loop head and end, and map out at least one path between them.
> `loop_end->print_bfs(20, loop_head, "c+")`
> This provides us with a shortest control path, given this path has a distance of at most 20.
> 
> Example (shortest path over control nodes):
> 
> (rr) p find_node(741)->dump_bfs(20,find_node(746),"c+")
> dist dump
> ---------------------------------------------
>    5  746  CountedLoop  === 746 745 190  [[ 746 148 141 ]] stride: 1  strip mined !orig=[735],[138] !jvms: StringLatin1::replace @ bci:22 (line 304)
>    4  148  RangeCheck  === 746 147  [[ 149 152 ]] P=0.999999, C=-1.000000 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    3  149  IfTrue  === 148  [[ 178 166 ]] #1 !orig=170 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    2  178  If  === 149 177  [[ 179 180 ]] P=0.800503, C=7153.000000 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    1  179  IfTrue  === 178  [[ 741 ]] #1 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    0  741  CountedLoopEnd  === 179 740  [[ 742 190 ]] [lt] P=0.993611, C=7200.000000 !orig=[189] !jvms: StringLatin1::replace @ bci:19 (line 303)
> 
> 
> Once we see this single path in the loop, we may want to see more of the body. For this, we can run an `all paths` query, with the additional character `A` in the options string. We see all nodes that lay on a path between the start and target node, with at most the specified path length.
> 
> Example (all paths between two nodes):
> 
> (rr) p find_node(741)->dump_bfs(8,find_node(746),"cdmxo+A")
> dist apd dump
> ---------------------------------------------
>    6   8  146  CmpU  === _ 141 79  [[ 147 ]]  !jvms: StringLatin1::replace @ bci:25 (line 304)
>    5   8  166  LoadB  === 149 7 164  [[ 176 747 ]]  @byte[int:>=0]:exact+any *, idx=5; #byte !jvms: StringLatin1::replace @ bci:25 (line 304)
>    5   8  147  Bool  === _ 146  [[ 148 ]] [lt] !jvms: StringLatin1::replace @ bci:25 (line 304)
>    5   5  746  CountedLoop  === 746 745 190  [[ 746 148 141 ]] stride: 1  strip mined !orig=[735],[138] !jvms: StringLatin1::replace @ bci:22 (line 304)
>    4   5  141  Phi  === 746 36 186  [[ 185 186 162 146 154 154 747 ]]  #int:0..max-1:www #tripcount !orig=[161] !jvms: StringLatin1::replace @ bci:22 (line 304)
>    4   8  176  CmpI  === _ 166 169  [[ 177 ]]  !jvms: StringLatin1::replace @ bci:28 (line 304)
>    4   5  148  RangeCheck  === 746 147  [[ 149 152 ]] P=0.999999, C=-1.000000 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    3   5  186  AddI  === _ 141 51  [[ 185 739 141 ]]  !orig=[738],... !jvms: StringLatin1::replace @ bci:13 (line 303)
>    3   8  177  Bool  === _ 176  [[ 178 ]] [ne] !jvms: StringLatin1::replace @ bci:28 (line 304)
>    3   5  149  IfTrue  === 148  [[ 178 166 ]] #1 !orig=170 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    2   5  739  CmpI  === _ 186 79  [[ 740 ]]  !orig=[187] !jvms: StringLatin1::replace @ bci:19 (line 303)
>    2   5  178  If  === 149 177  [[ 179 180 ]] P=0.800503, C=7153.000000 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    1   5  740  Bool  === _ 739  [[ 741 ]] [lt] !orig=[188] !jvms: StringLatin1::replace @ bci:19 (line 303)
>    1   5  179  IfTrue  === 178  [[ 741 ]] #1 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    0   5  741  CountedLoopEnd  === 179 740  [[ 742 190 ]] [lt] P=0.993611, C=7200.000000 !orig=[189] !jvms: StringLatin1::replace @ bci:19 (line 303)
> 
> We see there are multiple paths. We can quickly see that there are paths with length 5 (`apd = 5`): the control flow, but also the data flow for the loop-back condition. We also see some paths with length 8, which feed into `178 If` and `148 Rangecheck`. Node that the distance `d` is the distance to the start node `741  CountedLoopEnd`. The all paths distance `apd` computes the sum of the shortest path from the current node to the start plus the shortest path to the target node. Thus, we can easily compute the distance to the target node with `apd - d`.
> 
> An alternative to detect loops quickly, is running an all paths query from a node to itself:
> 
> Example (loop detection with all paths):
> 
> (rr) p find_node(741)->dump_bfs(7,find_node(741),"c+A")
> dist apd dump
> ---------------------------------------------
>    6   7  190  IfTrue  === 741  [[ 746 ]] #1 !jvms: StringLatin1::replace @ bci:19 (line 303)
>    5   7  746  CountedLoop  === 746 745 190  [[ 746 148 141 ]] stride: 1  strip mined !orig=[735],[138] !jvms: StringLatin1::replace @ bci:22 (line 304)
>    4   7  148  RangeCheck  === 746 147  [[ 149 152 ]] P=0.999999, C=-1.000000 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    3   7  149  IfTrue  === 148  [[ 178 166 ]] #1 !orig=170 !jvms: StringLatin1::replace @ bci:25 (line 304)
>    2   7  178  If  === 149 177  [[ 179 180 ]] P=0.800503, C=7153.000000 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    1   7  179  IfTrue  === 178  [[ 741 ]] #1 !jvms: StringLatin1::replace @ bci:28 (line 304)
>    0   0  741  CountedLoopEnd  === 179 740  [[ 742 190 ]] [lt] P=0.993611, C=7200.000000 !orig=[189] !jvms: StringLatin1::replace @ bci:19 (line 303)
> 
> We get the loop control, plus the loop-back `190 IfTrue`.
> 
> Example (loop detection with all paths for phi):
> 
> (rr) p find_node(141)->dump_bfs(4,find_node(141),"cdmxo+A")
> dist apd dump
> ---------------------------------------------
>    1   2  186  AddI  === _ 141 51  [[ 185 739 141 ]]  !orig=[738],... !jvms: StringLatin1::replace @ bci:13 (line 303)
>    0   0  141  Phi  === 746 36 186  [[ 185 186 162 146 154 154 747 ]]  #int:0..max-1:www #tripcount !orig=[161] !jvms: StringLatin1::replace @ bci:22 (line 304)
> 
> 
> **Color examples**
> Colors are especially useful to see chains between nodes (options character `#`).
> The input and output node idx are also colored if the node is displayed somewhere in the list. This should help you find chains of nodes.
> Tip: it can be worth it to configure the colors of your terminal to be more appealing.
> 
> Example (find control dependency of data node):
> ![image](https://user-images.githubusercontent.com/32593061/171135935-259d1e15-91d2-4c54-b924-8f5d4b20d338.png)
> We see data nodes in blue, and find a `SafePoint` in red and the `Return` in yellow.
> 
> Example (find memory dependency of data node):
> ![image](https://user-images.githubusercontent.com/32593061/171138929-d464bd1b-a807-4b9e-b4cc-ec32735cb024.png)
> 
> Example (loop detection):
> ![image](https://user-images.githubusercontent.com/32593061/171134459-27ddaa7f-756b-4807-8a98-44ae0632ab5c.png)
> We find the control and some data loop paths.

This pull request has now been integrated.

Changeset: 33ed0365
Author:    Emanuel Peter <epeter at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/33ed0365c3ed182a9d063e1701fe69bfb72dfa2e
Stats:     746 lines in 4 files changed: 714 ins; 0 del; 32 mod

8283775: better dump: VM support for graph querying in debugger with BFS traversal and node filtering

Reviewed-by: kvn, chagedorn, thartmann, rcastanedalo

-------------

PR: https://git.openjdk.org/jdk/pull/8468