RFR: 8282541: AArch64: Auto-vectorize Math.round API [v6]

Andrew Haley aph at openjdk.java.net
Wed Apr 20 17:31:08 UTC 2022


> Before, Apple M1:
> 
> +-----------------------------------------+---------------------------------+
> |Benchmark                                | (TESTSIZE) Mode     Score  Units|
> +-----------------------------------------+---------------------------------+
> |FpRoundingBenchmark.test_round_double    |   1024  thrpt    1612.391 ops/ms|
> |FpRoundingBenchmark.test_round_double    |   2048  thrpt     804.291 ops/ms|
> |FpRoundingBenchmark.test_round_float     |   1024  thrpt    1558.202 ops/ms|
> |FpRoundingBenchmark.test_round_float     |   2048  thrpt     775.730 ops/ms|
> +-----------------------------------------+---------------------------------+
> 
> After:
> 
> +-----------------------------------------+----------------------------------+
> |Benchmark                                | (TESTSIZE) Mode      Score  Units|
> +-----------------------------------------+----------------------------------+
> |FpRoundingBenchmark.test_round_double    |    1024  thrpt   2720.153  ops/ms|
> |FpRoundingBenchmark.test_round_double    |    2048  thrpt   1371.750  ops/ms|
> |FpRoundingBenchmark.test_round_float     |    1024  thrpt   5940.263  ops/ms|
> |FpRoundingBenchmark.test_round_float     |    2048  thrpt   3036.201  ops/ms|
> +-----------------------------------------+----------------------------------+
> 
> About the algorithm:
> 
> `Math.round()` is tricky. Its specification corresponds to no standard
> rounding mode: it "returns the closest long to the argument, with ties
> rounding to positive infinity." For positive inputs this is the same
> as IEEE-754's `convertToIntegerTiesToAway` operation, which rounds
> away from zero, but there's no equivalent for negative inputs.
> 
> `Math.round()` used simply to add 0.5 and convert to integer by taking
> the floor of the result, but that wasn't right because it suffers from
> double rounding. This breaks several cases, in particular because
> 
>  `0.4999999... (+) 0.5 == 1.0`
>  
>  (Here, `(+)` indicates an addition followed by roundTiesToEven.)
>  
> There is no corresponding problem with `-0.4999999...` or `-0.9999999...`.
>  
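A minimal sketch of the double-rounding failure described above, assuming the old behaviour was equivalent to `floor(x + 0.5)`. The class and the `naiveRound` helper are hypothetical names used only for illustration; `Math.nextDown(0.5)` stands in for `0.4999999...`.

```java
// Illustrates why floor(x + 0.5) is not a correct Math.round().
class DoubleRoundingDemo {
    // Hypothetical helper: the naive "add 0.5, take the floor" formula.
    static long naiveRound(double x) {
        return (long) Math.floor(x + 0.5);
    }

    public static void main(String[] args) {
        double x = Math.nextDown(0.5);      // largest double below 0.5

        // x + 0.5 is a tie between 1.0 and the double just below it;
        // roundTiesToEven picks 1.0, so the naive formula is off by one.
        System.out.println("naive:      " + naiveRound(x));
        System.out.println("Math.round: " + Math.round(x));

        // The negative counterparts are not affected.
        System.out.println("naive(-x):  " + naiveRound(-x));
        System.out.println("naive(nextUp(-1.0)): " + naiveRound(Math.nextUp(-1.0)));
    }
}
```
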
> Also, in the 32-bit `float` case,
>  
>   `8388609 (+) 0.5 == 8388610`
>   
> because 8388609 (0x1.000002p+23) as a 32-bit `float` has no fraction
> bits, so adding 0.5, followed by roundTiesToEven, rounds upwards. This
> problem manifests for every odd integer within the binade from
> 0x1.000002p+23 to 0x1.fffffep+23, whether positive or negative. There
> is a corresponding problem for the `double` range.
> 
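The `float` case can be reproduced the same way. This is a sketch under the assumption that the naive formula is `floor(x + 0.5f)` computed in `float` arithmetic; the class and `naiveRound` are hypothetical names.

```java
// Illustrates the odd-integer failure in the binade starting at 2^23.
class FloatTieDemo {
    // Hypothetical helper: naive rounding with a float addition,
    // which is rounded with roundTiesToEven before the floor is taken.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        float odd = 8388609f;    // 0x1.000002p+23, an odd integer
        float even = 8388610f;   // the next float, an even integer

        // 8388609.5 is not representable; the tie goes to the even neighbour.
        System.out.println("naive(odd):   " + naiveRound(odd));
        System.out.println("round(odd):   " + Math.round(odd));

        // Even integers in this binade survive the addition unchanged.
        System.out.println("naive(even):  " + naiveRound(even));

        // The same tie-breaking affects the negative odd integer.
        System.out.println("naive(-odd):  " + naiveRound(-odd));
        System.out.println("round(-odd):  " + Math.round(-odd));
    }
}
```
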
> The patch for JDK-8279508 handles this by flipping the floating-point
> rounding mode to roundTowardNegative, adding 0.5, and taking the
> floor. While this does work on AArch64, the performance is
> tragic. AArch64 implementations seem to wait for all instructions in
> flight to retire, change the rounding mode, and do the operation. This
> effectively serializes the entire thread.
> 
> This patch takes a different approach. Firstly, we can observe that we
> can use the `frinta` instruction for the entire positive range. The
> negative range is a bit trickier, but we can observe that any x with
> abs(x) >= 0x1.000000p+23 (in the `float` case) has no fractional bits,
> so it must be an integer. For convenience, we can convert that range
> with the `frinta` instruction as well.
> 
> All that remains are x < 0, abs(x) < 0x1.000000p+23. Adding 0.5
> followed by roundTiesToEven doesn't lead to a problem because for
> x < 0 && abs(x) >= 0.5, adding 0.5 only reduces the magnitude of x,
> so the sum is exact; for all x < 0 && abs(x) < 0.5, adding 0.5
> followed by roundTiesToEven and taking the floor returns 0.
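
A scalar sketch of the case split described above, for `float`. This is not the vectorized AArch64 code from the patch: `frintaSketch` is a hypothetical Java stand-in for the `frinta` instruction (round to integral, ties away from zero), and the class and method names are made up for illustration.

```java
// Scalar model of the rounding strategy: frinta for x >= 0 and for
// |x| >= 2^23, floor(x + 0.5) for the remaining negative values.
class RoundSketch {
    // Hypothetical emulation of frinta: round to integral, ties away from zero.
    static float frintaSketch(float x) {
        float a = Math.abs(x);
        float f = (float) Math.floor(a);      // exact: a float widened to double
        // a - f is the exact fractional part (NaN for infinities, so no bump)
        float r = (a - f >= 0.5f) ? f + 1.0f : f;
        return Math.copySign(r, x);
    }

    static int round(float x) {
        float r;
        if (x >= 0.0f || Math.abs(x) >= 0x1.0p23f) {
            // Positive range, or a negative value that is already an integer.
            r = frintaSketch(x);
        } else {
            // Negative with a possible fraction: the float addition is exact
            // for |x| >= 0.5 and stays within [0, 0.5] otherwise.
            r = (float) Math.floor(x + 0.5f);
        }
        return (int) r;   // saturating conversion; NaN becomes 0
    }

    public static void main(String[] args) {
        float[] samples = { 0.49999997f, 0.5f, 2.5f, -0.5f, -1.5f, -2.5f,
                            8388609f, -8388609f, Float.NaN };
        for (float s : samples) {
            System.out.println(s + " -> " + round(s)
                    + " (Math.round: " + Math.round(s) + ")");
        }
    }
}
```

Comparing `RoundSketch.round` against `Math.round` over a sweep of `float` bit patterns is a cheap way to gain confidence in the case analysis.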

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Untabify

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8204/files
  - new: https://git.openjdk.java.net/jdk/pull/8204/files/3632d35c..ebe17660

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8204&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8204&range=04-05

  Stats: 1037 lines in 1 file changed: 0 ins; 0 del; 1037 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8204.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8204/head:pull/8204

PR: https://git.openjdk.java.net/jdk/pull/8204

