RFR: 8325651: C2 SuperWord: refactor the dependency graph [v3]

Christian Hagedorn chagedorn at openjdk.org
Thu Mar 7 15:57:02 UTC 2024


On Thu, 7 Mar 2024 15:41:20 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Subtask of https://github.com/openjdk/jdk/pull/16620.
>> 
>> **Goal**
>> 
>> - Make the dependency graph, make it a module of `VLoopAnalyzer` -> `VLoopDependencyGraph`.
>> - Refactoring: replace linked-list edges with a compact array for each node.
>> - No behavioral change to vectorization.
>> 
>> **Benchmark**
>> 
>> I have a large loop body (extra large with hand-unrolling and ` -XX:LoopUnrollLimit=1000`).
>> All stores are connected to all previous loads, effectively creating an `O(n^2)` size graph,
>> ensuring that we spend a lot of time on the dependency graph compared to other components.
>> 
>> Measured on `linux-x64` and turbo disabled.
>> 
>> Measuring Compile time difference:
>> `/oracle-work/jdk-fork2/build/linux-x64/jdk/bin/java -XX:CompileCommand=compileonly,TestGraph::test* -XX:+CITime -XX:+UnlockDiagnosticVMOptions -XX:RepeatCompilation=20 -XX:LoopUnrollLimit=1000 -Xbatch TestGraph.java`
>> 
>> TestGraph.java
>> 
>> public class TestGraph {
>>     static int RANGE = 100_000;
>> 
>>     public static void main(String[] args) {
>>         int[] a = new int[RANGE];
>>         int[] b = new int[RANGE];
>>         for (int i = 0; i < 10_000; i++) {
>>             test1(a, b, i % 100);
>>         }
>>     }
>> 
>>     static void test1(int[] a, int[] b, int offset) {
>>         for (int i = 0; i < RANGE/16-200; i++) {
>>             a[i * 16 +  0] = b[i * 16 +  0 + offset];
>>             a[i * 16 +  1] = b[i * 16 +  1 + offset];
>>             a[i * 16 +  2] = b[i * 16 +  2 + offset];
>>             a[i * 16 +  3] = b[i * 16 +  3 + offset];
>>             a[i * 16 +  4] = b[i * 16 +  4 + offset];
>>             a[i * 16 +  5] = b[i * 16 +  5 + offset];
>>             a[i * 16 +  6] = b[i * 16 +  6 + offset];
>>             a[i * 16 +  7] = b[i * 16 +  7 + offset];
>>             a[i * 16 +  8] = b[i * 16 +  8 + offset];
>>             a[i * 16 +  9] = b[i * 16 +  9 + offset];
>>             a[i * 16 + 10] = b[i * 16 + 10 + offset];
>>             a[i * 16 + 11] = b[i * 16 + 11 + offset];
>>             a[i * 16 + 12] = b[i * 16 + 12 + offset];
>>             a[i * 16 + 13] = b[i * 16 + 13 + offset];
>>             a[i * 16 + 14] = b[i * 16 + 14 + offset];
>>             a[i * 16 + 15] = b[i * 16 + 15 + offset];
>>         }
>>     }
>> }
>> 
>> 
>> 
>> Before:
>> 
>>     C2 Compile Time:       14.588 s
>> ...
>>          IdealLoop:            13.670 s
>>            AutoVectorize:      11.703 s```
>> 
>> After:
>> 
>>     C2 Compile Time:       14.468 s
>> ...
>>          IdealLoop:            13.595 s
>>      ...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   missing string Extra -> Memory change

That's a nice refactoring. I only have some small comments.

src/hotspot/share/opto/vectorization.hpp line 456:

> 454: class VLoopDependencyGraph : public StackObj {
> 455: private:
> 456:   class DependencyNode;

I'm not sure if we should declare classes in the middle of another class. Should we move the forward declaration to the top of the file as done in other places as well?

src/hotspot/share/opto/vectorization.hpp line 467:

> 465: 
> 466:   // Node depth in DAG: bb_idx -> depth
> 467:   GrowableArray<int> _depth;

Suggestion:

  GrowableArray<int> _depths;

src/hotspot/share/opto/vectorization.hpp line 469:

> 467:   GrowableArray<int> _depth;
> 468: 
> 469: protected:

Why is this protected?

src/hotspot/share/opto/vectorization.hpp line 545:

> 543:     void next();
> 544:     bool done() const { return _current == nullptr; }
> 545:     Node* current() const { assert(!done(), "not done yet"); return _current; }

For two statements, I suggest to go with multiple lines:
Suggestion:

    Node* current() const { 
      assert(!done(), "not done yet"); 
      return _current;
    }

-------------

Marked as reviewed by chagedorn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17812#pullrequestreview-1922553044
PR Review Comment: https://git.openjdk.org/jdk/pull/17812#discussion_r1516224804
PR Review Comment: https://git.openjdk.org/jdk/pull/17812#discussion_r1516230789
PR Review Comment: https://git.openjdk.org/jdk/pull/17812#discussion_r1516231365
PR Review Comment: https://git.openjdk.org/jdk/pull/17812#discussion_r1516412346


More information about the hotspot-compiler-dev mailing list