RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v6]
kuaiwei
duke at openjdk.org
Fri Mar 21 10:09:12 UTC 2025
> In this patch, I extend the merge stores optimization to merge adjacent loads. Tier1 tests pass on my machine.
>
> Benchmark results for MergeLoadBench.java
> on an AMD EPYC 9T24 96-Core Processor:
>
> |name | -MergeLoads | +MergeLoads |delta|
> |---|---|---|---|
> |MergeLoadBench.getCharB |4352.150 |4407.435 | 55.29 |
> |MergeLoadBench.getCharBU |4075.320 |4084.663 | 9.34 |
> |MergeLoadBench.getCharBV |3221.302 |3221.528 | 0.23 |
> |MergeLoadBench.getCharC |2235.433 |2238.796 | 3.36 |
> |MergeLoadBench.getCharL |4363.244 |4372.281 | 9.04 |
> |MergeLoadBench.getCharLU |4072.550 |4075.744 | 3.19 |
> |MergeLoadBench.getCharLV |2227.825 |2231.612 | 3.79 |
> |MergeLoadBench.getIntB |11199.935 |6869.030 | -4330.90 |
> |MergeLoadBench.getIntBU |6853.862 |2763.923 | -4089.94 |
> |MergeLoadBench.getIntBV |306.953 |309.911 | 2.96 |
> |MergeLoadBench.getIntL |10426.843 |6523.716 | -3903.13 |
> |MergeLoadBench.getIntLU |6740.847 |2602.701 | -4138.15 |
> |MergeLoadBench.getIntLV |2233.151 |2231.745 | -1.41 |
> |MergeLoadBench.getIntRB |11335.756 |8980.619 | -2355.14 |
> |MergeLoadBench.getIntRBU |7439.873 |3190.208 | -4249.66 |
> |MergeLoadBench.getIntRL |16323.040 |7786.842 | -8536.20 |
> |MergeLoadBench.getIntRLU |7457.745 |3364.140 | -4093.61 |
> |MergeLoadBench.getIntRU |2512.621 |2511.668 | -0.95 |
> |MergeLoadBench.getIntU |2501.064 |2500.629 | -0.43 |
> |MergeLoadBench.getLongB |21175.442 |21103.660 | -71.78 |
> |MergeLoadBench.getLongBU |14042.046 |2512.784 | -11529.26 |
> |MergeLoadBench.getLongBV |606.448 |606.171 | -0.28 |
> |MergeLoadBench.getLongL |23142.178 |23217.785 | 75.61 |
> |MergeLoadBench.getLongLU |14112.972 |2237.659 | -11875.31 |
> |MergeLoadBench.getLongLV |2230.416 |2231.224 | 0.81 |
> |MergeLoadBench.getLongRB |21152.558 |21140.583 | -11.98 |
> |MergeLoadBench.getLongRBU |14031.178 |2520.317 | -11510.86 |
> |MergeLoadBench.getLongRL |23248.506 |23136.410 | -112.10 |
> |MergeLoadBench.getLongRLU |14125.032 |2240.481 | -11884.55 |
> |MergeLoadBench.getLongRU |3071.881 |3066.606 | -5.27 |
> |Merg...
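
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of code the description above is about: several adjacent byte loads combined with shifts and ORs, next to a single wider load that returns the same value. The class and method names are hypothetical and are not taken from MergeLoadBench.java, and the sketch does not claim that C2 merges this exact shape; it only shows that the two forms are equivalent at the Java level. The wide load uses the standard `MethodHandles.byteArrayViewVarHandle` API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration class, not part of the patch or its benchmark.
public class MergeLoadSketch {
    // A 4-byte little-endian view over byte[]; stands in for one wide load.
    private static final VarHandle INT_LE =
        MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Four adjacent byte loads, assembled in little-endian order.
    public static int getIntBytewise(byte[] a, int off) {
        return (a[off] & 0xff)
             | ((a[off + 1] & 0xff) << 8)
             | ((a[off + 2] & 0xff) << 16)
             | ((a[off + 3] & 0xff) << 24);
    }

    // The same value read with a single 4-byte access.
    public static int getIntWide(byte[] a, int off) {
        return (int) INT_LE.get(a, off);
    }

    public static void main(String[] args) {
        byte[] data = {0x78, 0x56, 0x34, 0x12, (byte) 0xff};
        // Both forms should agree at every valid offset, aligned or not.
        for (int off = 0; off + 4 <= data.length; off++) {
            int slow = getIntBytewise(data, off);
            int fast = getIntWide(data, off);
            System.out.println(Integer.toHexString(slow) + " == "
                + Integer.toHexString(fast) + " : " + (slow == fast));
        }
    }
}
```

The byte-wise form performs four loads plus shift and OR work per int, while the wide form performs one load, which is the cost difference the optimization aims to remove.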
kuaiwei has updated the pull request incrementally with two additional commits since the last revision:
- Enable StressIGVN and riscv platform
- Change tests as review comments
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/24023/files
- new: https://git.openjdk.org/jdk/pull/24023/files/1eba9308..ed5590a9
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=24023&range=05
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=24023&range=04-05
Stats: 728 lines in 3 files changed: 434 ins; 118 del; 176 mod
Patch: https://git.openjdk.org/jdk/pull/24023.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24023/head:pull/24023
PR: https://git.openjdk.org/jdk/pull/24023
More information about the hotspot-compiler-dev
mailing list