RFR: 8318562: Computational test more than 2x slower when AVX instructions are used

Sandhya Viswanathan sviswanathan at openjdk.org
Fri Nov 17 00:14:42 UTC 2023


This PR fixes the perf regression seen on AVX for floating point conversions.

In AVX, the cvt instructions take three operands: cvtxx dst, src1, src2, where src2 is the value being converted. The dst register receives the converted value in its lower bits and its upper bits (up to bit 128) from src1.
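
To make the merge behavior concrete, here is a minimal Java model of a three-operand scalar int-to-double conversion. It is only a sketch: the class Xmm, the method cvtsi2sd and the class name CvtMergeModel are hypothetical names introduced for illustration, and a 128-bit register is modeled as two 64-bit halves.

```java
public class CvtMergeModel {
    /** Hypothetical model of a 128-bit XMM register as two 64-bit halves. */
    public static final class Xmm {
        public final long lo;
        public final long hi;

        public Xmm(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }
    }

    /**
     * Models "cvtxx dst, src1, src2" for int -> double:
     * the low 64 bits of the result hold the converted src2,
     * the upper 64 bits are copied from src1.
     */
    public static Xmm cvtsi2sd(Xmm src1, int src2) {
        long converted = Double.doubleToRawLongBits((double) src2);
        return new Xmm(converted, src1.hi);
    }

    public static void main(String[] args) {
        // dst used as src1: whatever was in the upper half is carried over,
        // so the result depends on the previous contents of dst.
        Xmm stale = new Xmm(0x1111L, 0xDEADBEEFL);
        Xmm merged = cvtsi2sd(stale, 7);
        System.out.println(Double.longBitsToDouble(merged.lo));
        System.out.println(Long.toHexString(merged.hi));

        // After "xor dst, dst" the register is all zeros, so the upper half
        // of the result no longer depends on earlier contents.
        Xmm zeroed = new Xmm(0L, 0L);
        Xmm clean = cvtsi2sd(zeroed, 7);
        System.out.println(Long.toHexString(clean.hi));
    }
}
```

The model shows only the data flow: the result always reads src1, so when dst doubles as src1 the conversion has an input dependency on the old contents of dst.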

The C2 JIT uses the cvtxx dst, dst, src2 form, so the instruction also reads dst as a source. The problem here was due to the uninitialized upper bits of the dst XMM register.
Doing an xor dst, dst before the conversion instruction fixes the perf regression.
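
For context, the kind of Java code affected is a hot loop that performs a scalar conversion on every iteration. The sketch below is a hypothetical illustration and not the ComputePI benchmark from the PR; the class and method names are assumptions. On x86_64 with AVX enabled, the (double) cast of an int would typically be compiled by C2 to a scalar conversion instruction of the form described above.

```java
public class PiIntDblSketch {
    /** Leibniz series: pi = 4 * sum over k >= 0 of (-1)^k / (2k + 1). */
    public static double computePi(int terms) {
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 0; k < terms; k++) {
            // int -> double conversion executed on every iteration
            sum += sign / (double) (2 * k + 1);
            sign = -sign;
        }
        return 4.0 * sum;
    }

    public static void main(String[] args) {
        System.out.println(computePi(1_000_000));
    }
}
```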

Perf before the patch on UseAVX=3 platform:
Benchmark                      Mode  Cnt     Score    Error  Units
ComputePI.compute_pi_dbl_flt   avgt    5   471.875 ±  0.400  ns/op
ComputePI.compute_pi_flt_dbl   avgt    5  1877.174 ±  0.557  ns/op
ComputePI.compute_pi_int_dbl   avgt    5   655.222 ± 28.082  ns/op
ComputePI.compute_pi_int_flt   avgt    5   737.178 ±  0.077  ns/op
ComputePI.compute_pi_long_dbl  avgt    5   767.364 ±  0.027  ns/op
ComputePI.compute_pi_long_flt  avgt    5   587.854 ± 10.068  ns/op

Perf after the patch on UseAVX=3 platform:
Benchmark                      Mode  Cnt    Score   Error  Units
ComputePI.compute_pi_dbl_flt   avgt    5  468.328 ± 0.141  ns/op
ComputePI.compute_pi_flt_dbl   avgt    5  435.430 ± 0.259  ns/op
ComputePI.compute_pi_int_dbl   avgt    5  424.088 ± 0.050  ns/op
ComputePI.compute_pi_int_flt   avgt    5  417.345 ± 0.207  ns/op
ComputePI.compute_pi_long_dbl  avgt    5  425.751 ± 0.006  ns/op
ComputePI.compute_pi_long_flt  avgt    5  430.199 ± 0.736  ns/op

-------------

Commit messages:
 - fix 32bit build problem
 - Fix for AVX cvt performance

Changes: https://git.openjdk.org/jdk/pull/16701/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16701&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8318562
  Stats: 247 lines in 4 files changed: 245 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/16701.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16701/head:pull/16701

PR: https://git.openjdk.org/jdk/pull/16701


More information about the hotspot-dev mailing list