Integrated: 8325651: C2 SuperWord: refactor the dependency graph

Emanuel Peter epeter at openjdk.org
Mon Mar 11 07:15:06 UTC 2024


On Mon, 12 Feb 2024 16:24:30 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> Subtask of https://github.com/openjdk/jdk/pull/16620.
> 
> **Goal**
> 
> - Make the dependency graph, make it a module of `VLoopAnalyzer` -> `VLoopDependencyGraph`.
> - Refactoring: replace linked-list edges with a compact array for each node.
> - No behavioral change to vectorization.
> 
> **Benchmark**
> 
> I have a large loop body (extra large with hand-unrolling and ` -XX:LoopUnrollLimit=1000`).
> All stores are connected to all previous loads, effectively creating an `O(n^2)` size graph,
> ensuring that we spend a lot of time on the dependency graph compared to other components.
> 
> Measured on `linux-x64` and turbo disabled.
> 
> Measuring Compile time difference:
> `/oracle-work/jdk-fork2/build/linux-x64/jdk/bin/java -XX:CompileCommand=compileonly,TestGraph::test* -XX:+CITime -XX:+UnlockDiagnosticVMOptions -XX:RepeatCompilation=20 -XX:LoopUnrollLimit=1000 -Xbatch TestGraph.java`
> 
> TestGraph.java
> 
> public class TestGraph {
>     static int RANGE = 100_000;
> 
>     public static void main(String[] args) {
>         int[] a = new int[RANGE];
>         int[] b = new int[RANGE];
>         for (int i = 0; i < 10_000; i++) {
>             test1(a, b, i % 100);
>         }
>     }
> 
>     static void test1(int[] a, int[] b, int offset) {
>         for (int i = 0; i < RANGE/16-200; i++) {
>             a[i * 16 +  0] = b[i * 16 +  0 + offset];
>             a[i * 16 +  1] = b[i * 16 +  1 + offset];
>             a[i * 16 +  2] = b[i * 16 +  2 + offset];
>             a[i * 16 +  3] = b[i * 16 +  3 + offset];
>             a[i * 16 +  4] = b[i * 16 +  4 + offset];
>             a[i * 16 +  5] = b[i * 16 +  5 + offset];
>             a[i * 16 +  6] = b[i * 16 +  6 + offset];
>             a[i * 16 +  7] = b[i * 16 +  7 + offset];
>             a[i * 16 +  8] = b[i * 16 +  8 + offset];
>             a[i * 16 +  9] = b[i * 16 +  9 + offset];
>             a[i * 16 + 10] = b[i * 16 + 10 + offset];
>             a[i * 16 + 11] = b[i * 16 + 11 + offset];
>             a[i * 16 + 12] = b[i * 16 + 12 + offset];
>             a[i * 16 + 13] = b[i * 16 + 13 + offset];
>             a[i * 16 + 14] = b[i * 16 + 14 + offset];
>             a[i * 16 + 15] = b[i * 16 + 15 + offset];
>         }
>     }
> }
> 
> 
> 
> Before:
> 
>     C2 Compile Time:       14.588 s
> ...
>          IdealLoop:            13.670 s
>            AutoVectorize:      11.703 s```
> 
> After:
> 
>     C2 Compile Time:       14.468 s
> ...
>          IdealLoop:            13.595 s
>            AutoVectorize:      11.539 s
> 
> 
> Memory usage: `-XX:CompileCommand=memstat,TestGraph::test*,print`
> Before: `8_225_680 - 8_258_408` byt...

This pull request has now been integrated.

Changeset: ca5ca85d
Author:    Emanuel Peter <epeter at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/ca5ca85d2408abfcb8a37f16476dba13c3b474d0
Stats:     713 lines in 5 files changed: 285 ins; 404 del; 24 mod

8325651: C2 SuperWord: refactor the dependency graph

Reviewed-by: chagedorn, vlivanov

-------------

PR: https://git.openjdk.org/jdk/pull/17812


More information about the hotspot-compiler-dev mailing list