Integrated: 8285973: x86_64: Improve fp comparison and cmove for eq/ne
Quan Anh Mai
duke at openjdk.java.net
Tue May 24 00:18:42 UTC 2022
On Wed, 4 May 2022 01:59:17 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:
> Hi,
>
> This patch optimises the matching rules for floating-point comparison with respects to eq/ne on x86-64
>
> 1, When the inputs of a comparison is the same (i.e `isNaN` patterns), `ZF` is always set, so we don't need `cmpOpUCF2` for the eq/ne cases, which improves the sequence of `If (CmpF x x) (Bool ne)` from
>
> ucomiss xmm0, xmm0
> jp label
> jne label
>
> into
>
> ucomiss xmm0, xmm0
> jp label
>
> 2, The move rules for `cmpOpUCF2` is missing, which makes patterns such as `x == y ? 1 : 0` to fall back to `cmpOpU`, which have a really high cost of fixing the flags, such as
>
> xorl ecx, ecx
> ucomiss xmm0, xmm1
> jnp done
> pushf
> andq [rsp], 0xffffff2b
> popf
> done:
> movl eax, 1
> cmovel eax, ecx
>
> The patch changes this sequence into
>
> xorl ecx, ecx
> ucomiss xmm0, xmm1
> movl eax, 1
> cmovpl eax, ecx
> cmovnel eax, ecx
>
> 3, The patch also changes the pattern of `isInfinite` to be more optimised by using `Math.abs` to reduce 1 comparison and compares the result with `MAX_VALUE` since `>` is more optimised than `==` for floating-point types.
>
> The benchmark results are as follow:
>
> Before After
> Benchmark Mode Cnt Score Error Score Error Unit Ratio
> FPComparison.equalDouble avgt 5 2876.242 ± 58.875 594.636 ± 8.922 ns/op 4.84
> FPComparison.equalFloat avgt 5 3062.430 ± 31.371 663.849 ± 3.656 ns/op 4.61
> FPComparison.isFiniteDouble avgt 5 475.749 ± 19.027 518.309 ± 107.352 ns/op 0.92
> FPComparison.isFiniteFloat avgt 5 506.525 ± 14.417 515.576 ± 14.669 ns/op 0.98
> FPComparison.isInfiniteDouble avgt 5 1232.800 ± 31.677 621.185 ± 11.935 ns/op 1.98
> FPComparison.isInfiniteFloat avgt 5 1234.708 ± 70.239 623.566 ± 15.206 ns/op 1.98
> FPComparison.isNanDouble avgt 5 2255.847 ± 7.238 400.124 ± 0.762 ns/op 5.64
> FPComparison.isNanFloat avgt 5 2567.044 ± 36.078 546.486 ± 1.509 ns/op 4.70
>
> Thank you very much.
This pull request has now been integrated.
Changeset: c1db70d8
Author: Quan Anh Mai <anhmdq99 at gmail.com>
Committer: Sandhya Viswanathan <sviswanathan at openjdk.org>
URL: https://git.openjdk.java.net/jdk/commit/c1db70d827f7ac81aa6c6646e2431f672c71c8dc
Stats: 704 lines in 4 files changed: 620 ins; 70 del; 14 mod
8285973: x86_64: Improve fp comparison and cmove for eq/ne
Reviewed-by: kvn, sviswanathan
-------------
PR: https://git.openjdk.java.net/jdk/pull/8525
More information about the core-libs-dev
mailing list