RFR: 8285973: x86_64: Improve fp comparison and cmove for eq/ne [v2]

Quan Anh Mai duke at openjdk.java.net
Wed May 18 14:59:49 UTC 2022


> Hi,
> 
> This patch optimises the matching rules for floating-point comparison with respect to eq/ne on x86-64.
> 
> 1, When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF` is always set, so we don't need `cmpOpUCF2` for the eq/ne cases, which improves the sequence of `If (CmpF x x) (Bool ne)` from
> 
>     ucomiss xmm0, xmm0
>     jp      label
>     jne     label
> 
> into
> 
>     ucomiss xmm0, xmm0
>     jp      label
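
For reference, the `isNaN` pattern that produces `If (CmpF x x) (Bool ne)` is a self-comparison at the Java level. The following is a minimal sketch; the class and method names are hypothetical and are not taken from the patch.

```java
public class NaNCheckSketch {
    // x != x holds only for NaN, because NaN compares unordered with
    // every value, itself included. With identical inputs, ucomiss always
    // sets ZF, so the parity flag alone distinguishes the NaN case.
    static boolean isNaN(float x) {
        return x != x;
    }

    static boolean isNaN(double x) {
        return x != x;
    }

    public static void main(String[] args) {
        System.out.println(isNaN(Float.NaN)); // true
        System.out.println(isNaN(1.0f));      // false
        System.out.println(isNaN(Double.NaN)); // true
    }
}
```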
> 
> 2, The move rules for `cmpOpUCF2` are missing, which causes patterns such as `x == y ? 1 : 0` to fall back to `cmpOpU`, which has a really high cost of fixing the flags, such as
> 
>         xorl    ecx, ecx
>         ucomiss xmm0, xmm1
>         jnp     done
>         pushf
>         andq    [rsp], 0xffffff2b
>         popf
>     done:
>         movl    eax, 1
>         cmovel  eax, ecx
> 
> The patch changes this sequence into
> 
>     xorl    ecx, ecx
>     ucomiss xmm0, xmm1
>     movl    eax, 1
>     cmovpl  eax, ecx
>     cmovnel eax, ecx
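
To see why two conditional moves are enough, the new sequence can be modelled in Java. This is only an illustrative sketch: the class and method names are hypothetical, and the flags are modelled by hand from the documented `ucomiss` behaviour (an unordered result sets `ZF` and `PF`, an equal result sets only `ZF`).

```java
public class UcomissCmovSketch {
    // Models `x == y ? 1 : 0` as compiled to the ucomiss + two-cmov sequence.
    static int equalAsInt(float a, float b) {
        // Flags after `ucomiss a, b` (CF is not needed for eq/ne).
        boolean unordered = Float.isNaN(a) || Float.isNaN(b);
        boolean zf = unordered || a == b;
        boolean pf = unordered;

        int ecx = 0;            // xorl    ecx, ecx
        int eax = 1;            // movl    eax, 1
        if (pf) eax = ecx;      // cmovpl  eax, ecx  (unordered: not equal)
        if (!zf) eax = ecx;     // cmovnel eax, ecx  (ordered and different)
        return eax;
    }

    public static void main(String[] args) {
        System.out.println(equalAsInt(1.0f, 1.0f));           // 1
        System.out.println(equalAsInt(1.0f, 2.0f));           // 0
        System.out.println(equalAsInt(Float.NaN, Float.NaN)); // 0
    }
}
```

The result stays 1 only when the comparison is ordered and equal, which matches Java's `==` on floats without any need to rewrite the flags register.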
> 
> 3, The patch also optimises the `isInfinite` pattern: it uses `Math.abs` to remove one comparison and compares the result with `MAX_VALUE`, since `>` is cheaper than `==` for floating-point types.
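
A minimal sketch of the `isInfinite` shape described above, assuming it is written as a single `>` comparison on the absolute value. The class name is hypothetical and the code is not copied from the patch.

```java
public class IsInfiniteSketch {
    // One comparison instead of two equality checks against +/- infinity.
    // Math.abs(NaN) is NaN, and NaN > MAX_VALUE is false, so NaN is
    // correctly reported as not infinite.
    static boolean isInfinite(float x) {
        return Math.abs(x) > Float.MAX_VALUE;
    }

    static boolean isInfinite(double x) {
        return Math.abs(x) > Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(isInfinite(Float.NEGATIVE_INFINITY)); // true
        System.out.println(isInfinite(Float.MAX_VALUE));         // false
        System.out.println(isInfinite(Double.NaN));              // false
    }
}
```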
> 
> The benchmark results are as follows:
> 
>     Before:
>     Benchmark                      Mode  Cnt     Score     Error  Units
>     FPComparison.equalDouble       avgt    5  2876.242 ±  58.875  ns/op
>     FPComparison.equalFloat        avgt    5  3062.430 ±  31.371  ns/op
>     FPComparison.isFiniteDouble    avgt    5   475.749 ±  19.027  ns/op
>     FPComparison.isFiniteFloat     avgt    5   506.525 ±  14.417  ns/op
>     FPComparison.isInfiniteDouble  avgt    5  1232.800 ±  31.677  ns/op
>     FPComparison.isInfiniteFloat   avgt    5  1234.708 ±  70.239  ns/op
>     FPComparison.isNanDouble       avgt    5  2255.847 ±   7.238  ns/op
>     FPComparison.isNanFloat        avgt    5  2567.044 ±  36.078  ns/op
> 
>     After:
>     Benchmark                      Mode  Cnt     Score     Error  Units
>     FPComparison.equalDouble       avgt    5   594.636 ±   8.922  ns/op
>     FPComparison.equalFloat        avgt    5   663.849 ±   3.656  ns/op
>     FPComparison.isFiniteDouble    avgt    5   518.309 ± 107.352  ns/op
>     FPComparison.isFiniteFloat     avgt    5   515.576 ±  14.669  ns/op
>     FPComparison.isInfiniteDouble  avgt    5   621.185 ±  11.935  ns/op
>     FPComparison.isInfiniteFloat   avgt    5   623.566 ±  15.206  ns/op
>     FPComparison.isNanDouble       avgt    5   400.124 ±   0.762  ns/op
>     FPComparison.isNanFloat        avgt    5   546.486 ±   1.509  ns/op
> 
> Thank you very much.

Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:

 - incidental ws
 - add tests
 - Merge branch 'master' into fpcompare
 - fix tests
 - test
 - improve infinity
 - remove expensive rules
 - improve fp comparison

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8525/files
  - new: https://git.openjdk.java.net/jdk/pull/8525/files/b64e04b5..ba93dcf2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8525&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8525&range=00-01

  Stats: 210103 lines in 2627 files changed: 159508 ins; 36691 del; 13904 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8525.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8525/head:pull/8525

PR: https://git.openjdk.java.net/jdk/pull/8525

