RFR: JDK-8277178: Reduce the priority of data dependent nodes when OptoScheduling enabled
SUN Guoyun
duke at openjdk.java.net
Tue Nov 16 09:16:59 UTC 2021
When doing GCM/LCM (global/local code motion), we should consider not only the height (latency) of a node but also whether there is a data dependency between nodes. When one node consumes the result of another and the producing node has a long latency, an independent node can be scheduled between the two to hide that latency. For example, take the following Java code:
<pre><code class="java">
public static final double fval = 2.00;
public static double[] A = new double[N];
public static int[] B = new int[N];

public static void testP() {
    for (int i = 0; i < N; i++) {
        A[i] += A[i] * fval;
        B[i] += B[i] + 2;
    }
}
</code></pre>
With `-XX:+OptoScheduling` on AArch64, the generated instruction sequence for the loop body is:
<pre><code class="shell">
190 B15: # out( B15 B16 ) <- in( B14 B15 ) Loop( B15-B15 inner main of N118 strip mined) Freq: 9.9999e+11
190 sxtw R13, R15 # i2l
194 + add R14, R17, R13, LShiftL #3 # ptr
198 ldrd V16, [R14, #16] # double
19c + fmuld V18, V16, V17
1a0 + faddd V16, V18, V16
1a4 strd V16, [R14, #16] # double
1a8 + add R13, R0, R13, LShiftL #2 # ptr
1ac + ldrw R1, [R13, #16] # int
1b0 + addw R14, R1, R1
1b4 + addw R1, R14, #2
1b8 + addw R15, R15, #1
1bc strw R1, [R13, #16] # int
1c0 + cmpw R15, R12
1c4 blt B15 // counted loop end P=1.000000 C=40960.000000
</code></pre>
A more efficient sequence, which interleaves the independent integer and floating-point chains, would be:
<pre><code class="shell">
190 B15: # out( B15 B16 ) <- in( B14 B15 ) Loop( B15-B15 inner main of N118 strip mined) Freq: 9.9999e+11
190 sxtw R13, R14 # i2l
194 add R15, R17, R13, LShiftL #3 # ptr
198 add R13, R0, R13, LShiftL #2 # ptr
19c ldrd V16, [R15, #16] # double
1a0 ldrw R2, [R13, #16] # int
1a4 fmuld V18, V16, V17
1a8 addw R1, R2, R2
1ac faddd V16, V18, V16
1b0 strd V16, [R15, #16] # double
1b4 addw R1, R1, #2
1b8 strw R1, [R13, #16] # int
1bc addw R14, R14, #1
1c0 cmpw R14, R12
1c4 blt B15 // counted loop end P=1.000000 C=40960.000000
</code></pre>
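As a rough illustration of why the second order helps (this is not part of the patch), the two sequences can be fed through a toy model of a single-issue, in-order pipeline. Everything here is an assumption made for the sketch: the class `ScheduleSketch`, the value names, and the latencies (4 cycles for loads and floating-point operations, 1 cycle for everything else) are hypothetical and do not describe any real AArch64 core. Only true data dependencies are modelled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ScheduleSketch {
    /** One instruction: the value it defines (null if none), its latency, the values it reads. */
    static final class Insn {
        final String def;
        final int latency;
        final String[] uses;
        Insn(String def, int latency, String... uses) {
            this.def = def;
            this.latency = latency;
            this.uses = uses;
        }
    }

    // Assumed latencies in cycles; purely illustrative.
    static final int LOAD = 4, FP = 4, ALU = 1, STORE = 1;

    // Order of the first listing: the double chain, then the int chain.
    static final List<Insn> ORIGINAL = Arrays.asList(
        new Insn("idx", ALU, "i"),            // sxtw
        new Insn("pA", ALU, "idx"),           // add (ptr into A)
        new Insn("d0", LOAD, "pA"),           // ldrd
        new Insn("d1", FP, "d0", "fval"),     // fmuld
        new Insn("d2", FP, "d1", "d0"),       // faddd
        new Insn(null, STORE, "d2", "pA"),    // strd
        new Insn("pB", ALU, "idx"),           // add (ptr into B)
        new Insn("w0", LOAD, "pB"),           // ldrw
        new Insn("w1", ALU, "w0"),            // addw
        new Insn("w2", ALU, "w1"),            // addw #2
        new Insn("iNext", ALU, "i"),          // addw #1
        new Insn(null, STORE, "w2", "pB"),    // strw
        new Insn("flags", ALU, "iNext"),      // cmpw
        new Insn(null, ALU, "flags"));        // blt

    // Order of the second listing: the two chains interleaved.
    static final List<Insn> INTERLEAVED = Arrays.asList(
        new Insn("idx", ALU, "i"),
        new Insn("pA", ALU, "idx"),
        new Insn("pB", ALU, "idx"),
        new Insn("d0", LOAD, "pA"),
        new Insn("w0", LOAD, "pB"),
        new Insn("d1", FP, "d0", "fval"),
        new Insn("w1", ALU, "w0"),
        new Insn("d2", FP, "d1", "d0"),
        new Insn(null, STORE, "d2", "pA"),
        new Insn("w2", ALU, "w1"),
        new Insn(null, STORE, "w2", "pB"),
        new Insn("iNext", ALU, "i"),
        new Insn("flags", ALU, "iNext"),
        new Insn(null, ALU, "flags"));

    /** Cycles until the last instruction completes, issuing at most one instruction per cycle, in order. */
    static int cycles(List<Insn> seq) {
        Map<String, Integer> ready = new HashMap<>(); // value name -> cycle at which it is available
        int issue = -1;
        int done = 0;
        for (Insn in : seq) {
            int t = issue + 1;                        // cannot issue before the previous instruction
            for (String u : in.uses) {
                t = Math.max(t, ready.getOrDefault(u, 0)); // stall until every operand is ready
            }
            issue = t;
            if (in.def != null) {
                ready.put(in.def, t + in.latency);
            }
            done = Math.max(done, t + in.latency);
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println("original:    " + cycles(ORIGINAL) + " cycles");
        System.out.println("interleaved: " + cycles(INTERLEAVED) + " cycles");
    }
}
```

Under these assumed latencies the model reports 26 cycles for the first order and 21 for the second. The saving comes from the integer load and adds filling the stall slots left by the `ldrd`/`fmuld`/`faddd` chain. Real hardware (out-of-order execution, multiple issue) will behave differently, so this only shows the direction of the effect.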
The same problem exists on the MIPS architecture. This patch addresses it; please help review.
Thanks
-------------
Commit messages:
- 8277178: Reduce the priority of data dependent nodes when OptoScheduling enabled
Changes: https://git.openjdk.java.net/jdk/pull/6407/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6407&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8277178
Stats: 41 lines in 2 files changed: 12 ins; 28 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/6407.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6407/head:pull/6407
PR: https://git.openjdk.java.net/jdk/pull/6407