RFR: 8365926: RISC-V: Performance regression in renaissance (chi-square) [v5]
Robbin Ehn
rehn at openjdk.org
Wed Sep 10 18:46:52 UTC 2025
On Thu, 4 Sep 2025 13:32:34 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
>> Hey, please consider!
>>
>> A bunch of info in JBS entry, please read that also.
>>
>> I narrowed this issue down to the old jal optimization, making direct calls when in reach.
>> This patch restores them and removes this regression.
>>
>> In essence we turn "jalr ra,0(t1)" into a "jal ra,<dest>" if reachable, and restore the jalr if a new destination is not reachable.
>>
>> Please test on your hardware!
>>
>>
>> Chi Square (100 runs each, 10 fastest iterations of each run, P550)
>> JDK-23 (last version with trampoline calls)
>> Mean: 3189.5827
>> Standard Deviation: 284.6478
>>
>> JDK-25
>> Mean: 3424.8905
>> Standard Deviation: 222.2208
>>
>> Patch:
>> Mean: 3144.8535
>> Standard Deviation: 229.2577
>>
>>
>> No issues found in t1, running t2 also. Stress tested on vf2, bpi-f3, p550.
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
>
> - Merge branch 'master' into 8365926
> - Review comments
> - Review comments
> - Merge branch 'master' into 8365926
> - Spelling
> - Merge branch 'master' into 8365926
> - draft jal<->jalr
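
For anyone following along, here is a minimal sketch of the reachability test behind the jal<->jalr switch described in the quoted text. It is not the HotSpot code; the function and constant names are hypothetical. It relies only on the RISC-V encoding rule that jal carries a signed 21-bit, 2-byte-aligned offset relative to the address of the jal itself, so it covers roughly +/-1 MiB.

```python
# Illustrative only: names are hypothetical, not taken from the patch.
# RISC-V jal encodes a signed 21-bit offset in multiples of 2 bytes,
# relative to the address of the jal instruction itself.
JAL_MIN_OFFSET = -(1 << 20)      # -1 MiB
JAL_MAX_OFFSET = (1 << 20) - 2   # +1 MiB - 2 bytes


def jal_reachable(call_site: int, dest: int) -> bool:
    """Return True if a jal placed at call_site can jump directly to dest."""
    offset = dest - call_site
    return offset % 2 == 0 and JAL_MIN_OFFSET <= offset <= JAL_MAX_OFFSET


def pick_call_form(call_site: int, dest: int) -> str:
    """Choose the direct form when in reach, else keep the indirect call."""
    if jal_reachable(call_site, dest):
        return "jal ra,<dest>"
    return "jalr ra,0(t1)"


if __name__ == "__main__":
    site = 0x4000_0000
    print(pick_call_form(site, site + 0x800))       # nearby destination
    print(pick_call_form(site, site + (4 << 20)))   # 4 MiB away
```

When a call site is re-targeted, the same check would be run against the new destination, and the site falls back to the jalr form if the new destination is out of reach.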
Hamlin had some questions offline, so I gathered this data for him:
Benchmark Results:
Base: JDK24* +UseTrampoline
JAL OPT: JDK24* +UseTrampoline + JAL OPT
+-----------------+--------------+--------------+----------------+----------------+--------------+------------------+-------------+----------------+------------------+--------------------+
| Benchmark | Mean (Base) | SD (Base) | Fastest (Base) | Mean (JAL OPT) | SD (JAL OPT) | Fastest (JAL OPT)| Diff Mean | Diff Fastest | Mean Diff Ratio | Fastest Diff Ratio |
+-----------------+--------------+--------------+----------------+----------------+--------------+------------------+-------------+----------------+------------------+--------------------+
| future-genetic | 8317.8449 | 925.0775 | 7824.59 | 8421.137 | 1870.3916 | 7955.19 | 103.2922 | 130.6 | 1.012418145 | 1.01669097 |
| akka-uct | 54775.8037 | 5220.7361 | 49614.46 | 54149.9939 | 4730.3662 | 48736.7 | -625.8097 | -877.76 | 0.9885750686 | 0.9823083835 |
| movie-lens | 44859.3268 | 107.8713 | 38160.64 | 43043.6965 | 7932.6525 | 36807.2 | -1815.6295 | -1353.44 | 0.9595261529 | 0.9645330896 |
| scala-doku | 10792.4933 | 3004.9348 | 8970.34 | 10739.0164 | 2692.6155 | 9226.94 | -53.4766 | 256.59 | 0.9950450188 | 1.028605382 |
| chi-square | 4740.1812 | 3552.9489 | 2579.09 | 4749.0893 | 3484.3178 | 2498.04 | 8.9081 | -81.05 | 1.001879274 | 0.968574187 |
| fj-kmeans | 18597.656 | 2481.4036 | 17994.43 | 18588.154 | 4458.6089 | 18019.15 | -9.5018 | 24.72 | 0.9994890862 | 1.001373758 |
| db-shootout | 26529.8048 | 3163.9087 | 21270.43 | 25101.5681 | 2483.0698 | 21419.11 | -1428.2367 | 148.67 | 0.9461648244 | 1.006989986 |
| finagle-http | 20646.1713 | 1635.9154 | 14898.97 | 20250.4966 | 1046.1738 | 14735.66 | -395.6747 | -163.31 | 0.9808354443 | 0.9890388396 |
| reactors | 52051.8872 | 2023.7865 | 49188.65 | 51625.9497 | 2150.598 | 48874.49 | -425.9376 | -314.16 | 0.9918170594 | 0.9936131608 |
| dec-tree | 7532.9295 | 756.8107 | 4076.4 | 7441.0578 | 750.30926 | 4089.08 | -91.8717 | 12.68 | 0.9878039878 | 1.003110588 |
| naive-bayes | 38973.8684 | 16828.5555 | 31479.37 | 38484.4577 | 16640.458 | 31576.24 | -489.4106 | 96.87 | 0.9874425937 | 1.003077253 |
| als | 20116.2896 | 42.9005 | 14593.64 | 19553.929 | 947.1711 | 14599.15 | -562.3509 | 5.52 | 0.9720449855 | 1.000377562 |
| par-mnemonics | 17564.7499 | 744.1041 | 16654.08 | 17239.074 | 1100.0016 | 15942.67 | -325.676 | -711.41 | 0.9814585518 | 0.9572831402 |
| scala-kmeans | 1201.4918 | 180.6982 | 845 | 1173.5701 | 205.5769 | 791.32 | -27.9217 | -53.68 | 0.9767608069 | 0.9364733728 |
| philosophers | 4780.9081 | 417.8337 | 3656.22 | 4828.5436 | 1372.1029 | 3926.02 | 47.6356 | 269.8 | 1.009963714 | 1.073792058 |
| log-regression | 7403.8792 | 8743.3328 | 3675.79 | 7275.2818 | 715.8207 | 3578.2 | -128.5983 | -97.6 | 0.98263097 | 0.9734506052 |
| gauss-mix | 35128.1145 | 8364.2843 | 27585.27 | 33996.7118 | 7896.5377 | 26810.99 | -1131.4027 | -774.27 | 0.9677921028 | 0.9719313967 |
| mnemonics | 21426.0608 | 537.9065 | 20202.69 | 20956.9427 | 610.3026 | 19568.55 | -469.1181 | -634.14 | 0.9781052568 | 0.9686111107 |
| dotty | 16674.7994 | 13824.23 | 7731.45 | 16098.8288 | 13498.268 | 7484.09 | -575.9706 | -247.36 | 0.965458619 | 0.9680060015 |
| finagle-chirper | 20949.0206 | 10776.0049 | 15527.08 | 20286.9623 | 10038.7242 | 15212.05 | -662.0582 | -315.03 | 0.9683966944 | 0.9797109308 |
+-----------------+--------------+--------------+----------------+----------------+--------------+------------------+-------------+----------------+------------------+--------------------+
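
For reading the derived columns, here is a small sketch of how they appear to be computed. The formulas are inferred from the table values rather than stated anywhere: Diff = JAL OPT - Base and Diff Ratio = JAL OPT / Base. Assuming the figures are iteration times (lower is better), a ratio below 1 means the JAL OPT build was faster. The function name is hypothetical.

```python
def derived_columns(mean_base: float, fastest_base: float,
                    mean_opt: float, fastest_opt: float) -> dict:
    """Recompute the four derived columns from the four measured ones.

    Assumed formulas (inferred from the table, not stated in the post):
      diff  = JAL OPT - Base
      ratio = JAL OPT / Base
    """
    return {
        "diff_mean": mean_opt - mean_base,
        "diff_fastest": fastest_opt - fastest_base,
        "mean_ratio": mean_opt / mean_base,
        "fastest_ratio": fastest_opt / fastest_base,
    }


if __name__ == "__main__":
    # chi-square row from the table above
    row = derived_columns(mean_base=4740.1812, fastest_base=2579.09,
                          mean_opt=4749.0893, fastest_opt=2498.04)
    for name, value in row.items():
        print(f"{name}: {value:.9f}")
```

The published differences seem to be computed from unrounded inputs, so recomputing from the rounded cells can differ in the last digit.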
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26944#issuecomment-3276121311