RFR: 8282541: AArch64: Auto-vectorize Math.round API [v3]
Andrew Haley
aph at openjdk.java.net
Tue Apr 19 16:39:14 UTC 2022
> Before, Apple M1:
>
> +---------------------------------------+------------+-------+----------+--------+
> | Benchmark                             | (TESTSIZE) | Mode  |    Score | Units  |
> +---------------------------------------+------------+-------+----------+--------+
> | FpRoundingBenchmark.test_round_double |       1024 | thrpt | 1612.391 | ops/ms |
> | FpRoundingBenchmark.test_round_double |       2048 | thrpt |  804.291 | ops/ms |
> | FpRoundingBenchmark.test_round_float  |       1024 | thrpt | 1558.202 | ops/ms |
> | FpRoundingBenchmark.test_round_float  |       2048 | thrpt |  775.730 | ops/ms |
> +---------------------------------------+------------+-------+----------+--------+
>
> After:
>
> +---------------------------------------+------------+-------+----------+--------+
> | Benchmark                             | (TESTSIZE) | Mode  |    Score | Units  |
> +---------------------------------------+------------+-------+----------+--------+
> | FpRoundingBenchmark.test_round_double |       1024 | thrpt | 2720.153 | ops/ms |
> | FpRoundingBenchmark.test_round_double |       2048 | thrpt | 1371.750 | ops/ms |
> | FpRoundingBenchmark.test_round_float  |       1024 | thrpt | 5940.263 | ops/ms |
> | FpRoundingBenchmark.test_round_float  |       2048 | thrpt | 3036.201 | ops/ms |
> +---------------------------------------+------------+-------+----------+--------+
>
> About the algorithm:
>
> `Math.round()` is tricky. Its specification corresponds to no standard
> rounding mode: it "returns the closest long to the argument, with ties
> rounding to positive infinity." For positive inputs this is the same
> as IEEE-754's `convertToIntegerTiesToAway` operation, which rounds
> away from zero, but there's no equivalent for negative inputs.
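To make the difference concrete, here is a small Java sketch comparing `Math.round()` with a ties-away-from-zero rounding. The class name and the `tiesAway` helper are made up for illustration; the helper goes through `BigDecimal` with `RoundingMode.HALF_UP` purely to get an exact ties-away result, and has nothing to do with how HotSpot implements either operation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class RoundTiesDemo {
    // Hypothetical helper: round to nearest, ties away from zero.
    // new BigDecimal(double) is exact, and HALF_UP breaks ties away from zero.
    static long tiesAway(double x) {
        return new BigDecimal(x).setScale(0, RoundingMode.HALF_UP).longValueExact();
    }

    public static void main(String[] args) {
        double[] inputs = {2.5, 0.5, -0.5, -2.5};
        for (double x : inputs) {
            // The two agree for positive ties and differ for negative ties.
            System.out.println(x + ": Math.round=" + Math.round(x)
                    + ", tiesAway=" + tiesAway(x));
        }
    }
}
```

For 2.5 both give 3; for -2.5, `Math.round()` gives -2 while ties-away gives -3.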
>
> `Math.round()` used simply to add 0.5 and convert to integer by taking
> the floor of the result, but that wasn't right because it suffers from
> double rounding. This breaks several cases, in particular because
>
> `0.4999999... (+) 0.5 == 1.0`
>
> (Here, `(+)` indicates an addition followed by roundTiesToEven.)
>
> There is no corresponding problem with `-0.4999999...` or `-0.9999999...`
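The double-rounding case can be reproduced in plain Java. This is only a sketch: `naiveRound` is a hypothetical stand-in for the old add-0.5-then-floor approach, and `Math.nextDown(0.5)` is used to obtain the largest `double` below 0.5.

```java
final class DoubleRoundingDemo {
    // Hypothetical model of the old approach: add 0.5, then take the floor.
    static long naiveRound(double x) {
        return (long) Math.floor(x + 0.5);
    }

    public static void main(String[] args) {
        double x = Math.nextDown(0.5);          // 0.49999999999999994
        // The exact sum lies halfway between two doubles, so the addition rounds up to 1.0.
        System.out.println(x + 0.5 == 1.0);     // true
        System.out.println(naiveRound(x));      // 1, which is wrong
        System.out.println(Math.round(x));      // 0

        // The negative counterparts are unaffected: the additions are exact.
        double y = -Math.nextDown(0.5);
        double z = -Math.nextDown(1.0);
        System.out.println(naiveRound(y) == Math.round(y));
        System.out.println(naiveRound(z) == Math.round(z));
    }
}
```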
>
> Also, in the 32-bit `float` case,
>
> `8388609 (+) 0.5 == 8388610`
>
> because 8388609 (0x1.000002p+23) as a 32-bit `float` has no fraction
> bits, so adding 0.5, followed by roundTiesToEven, rounds upwards. This
> problem manifests for every odd integer within the binade from
> 0x1.000002p+23 to 0x1.fffffep+23, whether positive or negative. There
> is a corresponding problem for the `double` range.
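The same effect is easy to see from Java. In this sketch the `naiveRound` overloads are hypothetical models of add-0.5-then-floor; the `double` constant `0x1.0000000000001p52` (2^52 + 1) is my own choice of an odd integer in the corresponding `double` binade.

```java
final class OddIntegerDemo {
    // Hypothetical models of the add-0.5-then-floor approach.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);   // float addition, roundTiesToEven
    }

    static long naiveRound(double x) {
        return (long) Math.floor(x + 0.5);   // double addition, roundTiesToEven
    }

    public static void main(String[] args) {
        float f = 8388609f;                   // 0x1.000002p+23, an odd integer
        System.out.println(f + 0.5f == 8388610f);  // true: the tie goes to the even neighbour
        System.out.println(naiveRound(f));         // 8388610
        System.out.println(Math.round(f));         // 8388609

        System.out.println(naiveRound(-f));        // -8388608
        System.out.println(Math.round(-f));        // -8388609

        double d = 0x1.0000000000001p52;      // 2^52 + 1
        System.out.println(naiveRound(d) - Math.round(d));  // 1
    }
}
```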
>
> The patch for JDK-8279508 handles this by flipping the floating-point
> rounding mode to roundTowardNegative, adding 0.5, and taking the
> floor. While this does work on AArch64, the performance is
> tragic. AArch64 implementations seem to wait for all instructions in
> flight to retire, change the rounding mode, and do the operation. This
> effectively serializes the entire thread.
>
> This patch takes a different approach. Firstly, we can observe that we
> can use the `frinta` instruction for the entire positive range. The
> negative range is a bit trickier, but we can observe that any x,
> abs(x) >= 0x1.000000p+23, has no fractional bits so it must be an
> integer. For convenience, we can convert that range with the `frinta`
> instruction as well.
>
> All that remains are x < 0, abs(x) < 0x1.000000p+23. Adding 0.5
> followed by roundTiesToEven doesn't lead to a problem because for
> x < 0 && abs(x) >= 0.5, adding 0.5 only reduces the magnitude of x;
> for all x < 0 && abs(x) < 0.5, adding 0.5 followed by roundTiesToEven
> and then taking the floor returns 0.
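The case split above can be modelled in scalar Java for the `float` case. This is only a sketch of the reasoning, not the code in the patch: the class and method names are hypothetical, `roundTiesAway` merely imitates what `frinta` computes by doing the addition in `double` (where it cannot produce a wrong result for `float` inputs), and the final `(int)` cast stands in for the conversion to integer.

```java
final class RoundSketch {
    // Hypothetical stand-in for frinta: round to integral, ties away from zero.
    static float roundTiesAway(float x) {
        double d = x;
        return (float) Math.copySign(Math.floor(Math.abs(d) + 0.5), d);
    }

    // Scalar model of the case split described above.
    static int roundSketch(float x) {
        float r;
        if (x >= 0.0f || Math.abs(x) >= 0x1.0p23f) {
            // Positive range, or magnitude so large that x is already an integer.
            r = roundTiesAway(x);
        } else {
            // x < 0 and abs(x) < 2^23 (NaN also lands here): add 0.5, then floor.
            r = (float) Math.floor(x + 0.5f);
        }
        return (int) r;   // saturating conversion; NaN becomes 0
    }

    public static void main(String[] args) {
        float[] inputs = {2.5f, -2.5f, -0.5f, 0.49999997f, 8388609f, -8388609f, Float.NaN};
        for (float x : inputs) {
            System.out.println(x + ": sketch=" + roundSketch(x)
                    + ", Math.round=" + Math.round(x));
        }
    }
}
```

Comparing `roundSketch` against `Math.round(float)` over a sample of bit patterns is a cheap way to sanity-check the argument.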
Andrew Haley has updated the pull request incrementally with two additional commits since the last revision:
- Put all SVE-mode ops in aarch64_sve.ad.
- Move size estimate for Op_RoundXX to matcher.
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/8204/files
- new: https://git.openjdk.java.net/jdk/pull/8204/files/1a43443a..c365b3ff
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8204&range=02
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8204&range=01-02
Stats: 144 lines in 10 files changed: 52 ins; 55 del; 37 mod
Patch: https://git.openjdk.java.net/jdk/pull/8204.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/8204/head:pull/8204
PR: https://git.openjdk.java.net/jdk/pull/8204