RFR: 8296602: RISC-V: improve performance of copy_memory stub [v4]

Vladimir Kempik vkempik at openjdk.org
Thu Nov 17 08:26:18 UTC 2022


> Please review this change to improve the performance of the copy_memory stub on RISC-V.
> 
> This change has three parts:
> 1) use copy32 where possible, doing 4 ld and 4 st per loop iteration
> 2) don't emit the precopy code if is_aligned is true, since it is never executed in that case
> 3) at the end of loop8 and loop32, remove the data dependency between the two addi opcodes so that they can be scheduled simultaneously
> 
> testing: org.openjdk.bench.vm.compiler.ArrayCopyObject, hotspot_compiler_arraycopy, hotspot:tier1, hotspot:tier2 - all ok
> hotspot:tier2 is on the way.
> 
> Benchmark results, using
> org.openjdk.bench.vm.compiler.ArrayCopyObject.conjoint_micro:
> 
> on thead rvb-ice c910:
> 
> Before ( copy8 only )
> Benchmark                       (size)   Mode  Cnt     Score     Error   Units
> ArrayCopyObject.conjoint_micro      31  thrpt   25  6653.095 ± 251.565  ops/ms
> ArrayCopyObject.conjoint_micro      63  thrpt   25  4933.970 ±  77.559  ops/ms
> ArrayCopyObject.conjoint_micro     127  thrpt   25  3627.454 ±  34.589  ops/ms
> ArrayCopyObject.conjoint_micro    2047  thrpt   25   368.249 ±   0.453  ops/ms
> ArrayCopyObject.conjoint_micro    4095  thrpt   25   187.776 ±   0.306  ops/ms
> ArrayCopyObject.conjoint_micro    8191  thrpt   25    94.477 ±   0.340  ops/ms
> 
> after ( with copy32 )
> ArrayCopyObject.conjoint_micro      31  thrpt   25  7620.546 ±  69.756  ops/ms
> ArrayCopyObject.conjoint_micro      63  thrpt   25  6677.978 ±  33.112  ops/ms
> ArrayCopyObject.conjoint_micro     127  thrpt   25  5206.973 ±  22.612  ops/ms
> ArrayCopyObject.conjoint_micro    2047  thrpt   25   653.655 ±  31.494  ops/ms
> ArrayCopyObject.conjoint_micro    4095  thrpt   25   352.905 ±   7.390  ops/ms
> ArrayCopyObject.conjoint_micro    8191  thrpt   25   165.127 ±   0.832  ops/ms
> 
> after ( copy32 with dead code elimination and independent addis )
> ArrayCopyObject.conjoint_micro      31  thrpt   25  7576.346 ±  94.487  ops/ms
> ArrayCopyObject.conjoint_micro      63  thrpt   25  6475.730 ± 252.590  ops/ms
> ArrayCopyObject.conjoint_micro     127  thrpt   25  5221.764 ±  20.415  ops/ms
> ArrayCopyObject.conjoint_micro    2047  thrpt   25   691.847 ±   1.102  ops/ms
> ArrayCopyObject.conjoint_micro    4095  thrpt   25   360.269 ±   1.091  ops/ms
> ArrayCopyObject.conjoint_micro    8191  thrpt   25   179.733 ±   3.012  ops/ms
> 
> on hifive unmatched:
> 
> before:
> Benchmark                       (size)   Mode  Cnt     Score     Error   Units
> ArrayCopyObject.conjoint_micro      31  thrpt   25  5391.575 ± 152.984  ops/ms
> ArrayCopyObject.conjoint_micro      63  thrpt   25  3700.946 ±  43.175  ops/ms
> ArrayCopyObject.conjoint_micro     127  thrpt   25  2316.160 ±  24.734  ops/ms
> ArrayCopyObject.conjoint_micro    2047  thrpt   25   188.616 ±   0.151  ops/ms
> ArrayCopyObject.conjoint_micro    4095  thrpt   25    95.323 ±   0.053  ops/ms
> ArrayCopyObject.conjoint_micro    8191  thrpt   25    46.935 ±   0.041  ops/ms
> 
> after:
> Benchmark                       (size)   Mode  Cnt     Score     Error   Units
> ArrayCopyObject.conjoint_micro      31  thrpt   25  6136.169 ± 330.409  ops/ms
> ArrayCopyObject.conjoint_micro      63  thrpt   25  4924.020 ±  78.529  ops/ms
> ArrayCopyObject.conjoint_micro     127  thrpt   25  3732.561 ±  89.606  ops/ms
> ArrayCopyObject.conjoint_micro    2047  thrpt   25   431.103 ±   0.505  ops/ms
> ArrayCopyObject.conjoint_micro    4095  thrpt   25   221.543 ±   0.363  ops/ms
> ArrayCopyObject.conjoint_micro    8191  thrpt   25   100.586 ±   0.197  ops/ms
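
For readers unfamiliar with the stub, the copy32 idea from the quoted
description (4 loads and 4 stores per loop iteration, with a copy8 loop for
the remainder) can be sketched in plain C roughly as follows. This is an
illustration only, not the HotSpot stub: the function name
copy_words_unrolled is hypothetical, the real code is emitted through the
RISC-V macro assembler, and the sketch only handles a forward copy of
non-overlapping 8-byte words. The two pointer increments at the end of each
loop are written so that neither depends on the other, which is the property
the third part of the change aims for; how the original addi dependency
looked is not shown in this message, and in C the final instruction schedule
is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: copy `count` 8-byte words from src to dst.
 * Assumes the two regions do not overlap (forward copy only). */
static void copy_words_unrolled(uint64_t *dst, const uint64_t *src, size_t count)
{
    assert(count == 0 || (dst != NULL && src != NULL));

    /* "copy32"-style loop: 32 bytes (4 loads, then 4 stores) per iteration. */
    while (count >= 4) {
        uint64_t t0 = src[0];
        uint64_t t1 = src[1];
        uint64_t t2 = src[2];
        uint64_t t3 = src[3];
        dst[0] = t0;
        dst[1] = t1;
        dst[2] = t2;
        dst[3] = t3;
        /* Each pointer update depends only on its own previous value. */
        src += 4;
        dst += 4;
        count -= 4;
    }

    /* "copy8"-style loop: one 8-byte word per iteration for the remainder. */
    while (count > 0) {
        uint64_t t = *src;
        *dst = t;
        src += 1;
        dst += 1;
        count -= 1;
    }
}
```

With count = 11, for example, the first loop runs twice (8 words) and the
second loop copies the remaining 3 words.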

Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:

  Update formatting

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/11058/files
  - new: https://git.openjdk.org/jdk/pull/11058/files/8d2a5a25..cc91f7b6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=11058&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11058&range=02-03

  Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod
  Patch: https://git.openjdk.org/jdk/pull/11058.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11058/head:pull/11058

PR: https://git.openjdk.org/jdk/pull/11058


More information about the hotspot-compiler-dev mailing list