RFR: 8282541: AArch64: Auto-vectorize Math.round API [v6]

Andrew Dinn adinn at openjdk.java.net
Tue Apr 26 08:52:47 UTC 2022


On Wed, 20 Apr 2022 17:31:08 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Before, Apple M1:
>> 
>> +-----------------------------------------+---------------------------------+
>> |Benchmark                                | (TESTSIZE) Mode     Score  Units|
>> +-----------------------------------------+---------------------------------+
>> |FpRoundingBenchmark.test_round_double    |   1024  thrpt    1612.391 ops/ms|
>> |FpRoundingBenchmark.test_round_double    |   2048  thrpt     804.291 ops/ms|
>> |FpRoundingBenchmark.test_round_float     |   1024  thrpt    1558.202 ops/ms|
>> |FpRoundingBenchmark.test_round_float     |   2048  thrpt     775.730 ops/ms|
>> +-----------------------------------------+---------------------------------+
>> 
>> After:
>> 
>> +-----------------------------------------+----------------------------------+
>> |Benchmark                                | (TESTSIZE) Mode      Score  Units|
>> +-----------------------------------------+----------------------------------+
>> |FpRoundingBenchmark.test_round_double    |    1024  thrpt   2720.153  ops/ms|
>> |FpRoundingBenchmark.test_round_double    |    2048  thrpt   1371.750  ops/ms|
>> |FpRoundingBenchmark.test_round_float     |    1024  thrpt   5940.263  ops/ms|
>> |FpRoundingBenchmark.test_round_float     |    2048  thrpt   3036.201  ops/ms|
>> +-----------------------------------------+----------------------------------+
>> 
>> About the algorithm:
>> 
>> `Math.round()` is tricky. Its specification corresponds to no standard
>> rounding mode: it "returns the closest long to the argument, with ties
>> rounding to positive infinity." For positive inputs this is the same
>> as IEEE-754's `convertToIntegerTiesToAway` operation, which rounds
>> away from zero, but there's no equivalent for negative inputs.
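To make the difference concrete, here is a minimal sketch comparing `Math.round()` with a ties-away rounding. The class name `RoundTies` and the helper `tiesToAway` are hypothetical, and the helper only emulates the IEEE-754 operation through `BigDecimal`; it is not how the JDK or the hardware computes it.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundTies {
    // Hypothetical helper: emulates round-to-nearest, ties away from zero.
    // Assumes x is finite (new BigDecimal(double) throws on NaN/infinity).
    static long tiesToAway(double x) {
        return new BigDecimal(x).setScale(0, RoundingMode.HALF_UP).longValueExact();
    }

    public static void main(String[] args) {
        double[] samples = {2.5, 0.5, -0.5, -2.5};
        for (double x : samples) {
            // Math.round sends ties toward positive infinity,
            // so the two columns differ only for negative ties.
            System.out.println(x + ": Math.round=" + Math.round(x)
                    + " tiesToAway=" + tiesToAway(x));
        }
    }
}
```

The two agree for 2.5 and 0.5 but differ for -0.5 and -2.5, which is the negative-input gap described above.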
>> 
>> `Math.round()` used simply to add 0.5 and convert to integer by taking
>> the floor of the result, but that wasn't right because it suffers from
>> double rounding. This breaks several cases, in particular because
>> 
>>  `0.4999999... (+) 0.5 == 1.0`
>>  
>>  (Here, `(+)` indicates an addition followed by roundTiesToEven.)
>>  
>> There is no corresponding problem with `-0.4999999...` or `-0.9999999...`
>>  
>> Also, in the 32-bit `float` case,
>>  
>>   `8388609 (+) 0.5 == 8388610`
>>   
>> because 8388609 (0x1.000002p+23) as a 32-bit float has no fraction
>> bits, so adding 0.5, followed by roundTiesToEven, rounds upwards. This
>> problem manifests for every odd integer within the binade from
>> 0x1.000002p+23 to 0x1.fffffep+23, whether positive or negative. There
>> is a corresponding problem for the `double` range.
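Both failures can be reproduced in plain Java. The sketch below assumes the old approach amounts to "add 0.5 in `float` arithmetic, then floor"; `DoubleRoundingDemo` and `naiveRound` are hypothetical names for illustration, not JDK code.

```java
public class DoubleRoundingDemo {
    // Hypothetical reconstruction of the old approach: add 0.5, then floor.
    // The float addition itself rounds (roundTiesToEven) before the floor.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        float justBelowHalf = Math.nextDown(0.5f); // largest float below 0.5
        float oddInteger = 8388609f;               // 2^23 + 1, no fraction bits

        // The sum is a tie between two floats and rounds up to 1.0.
        System.out.println(naiveRound(justBelowHalf) + " vs " + Math.round(justBelowHalf));
        // 8388609.5 is not representable; the tie rounds to the even 8388610.
        System.out.println(naiveRound(oddInteger) + " vs " + Math.round(oddInteger));
        // The same happens for the negated odd integer.
        System.out.println(naiveRound(-oddInteger) + " vs " + Math.round(-oddInteger));
    }
}
```

In each case the first number is the naive result and the second is what `Math.round()` returns.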
>> 
>> The patch for JDK-8279508 handles this by flipping the floating-point
>> rounding mode to roundTowardNegative, adding 0.5, and taking the
>> floor. While this does work on AArch64, the performance is
>> tragic. AArch64 implementations seem to wait for all instructions in
>> flight to retire, change the rounding mode, and do the operation. This
>> effectively serializes the entire thread.
>> 
>> This patch takes a different approach. Firstly, we can observe that we
>> can use the `frinta` instruction for the entire positive range. The
>> negative range is a bit trickier, but we can observe that any x with
>> abs(x) >= 0x1.000000p+23 has no fraction bits, so it must be an
>> integer. For convenience, we can convert that range with the `frinta`
>> instruction as well.
>> 
>> All that remains is x < 0, abs(x) < 0x1.000000p+23. Adding 0.5
>> followed by roundTiesToEven doesn't lead to a problem because for
>> x < 0 && abs(x) >= 0.5, adding 0.5 only reduces the magnitude of x;
>> for all x < 0 && abs(x) < 0.5, adding 0.5 followed by roundTiesToEven
>> and then taking the floor returns 0.
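The case split can be sketched in scalar Java for the `float` case. This is only an illustration of the reasoning, not the code in the patch, which emits vector instructions: `RoundSketch`, `roundTiesAway` and `roundSketch` are hypothetical names, and `roundTiesAway` is a software stand-in for what `frinta` does (round to an integral value, ties away from zero).

```java
public class RoundSketch {
    // Hypothetical stand-in for frinta: round to integral, ties away from zero.
    static float roundTiesAway(float x) {
        float ax = Math.abs(x);
        float t = (float) Math.floor(ax);
        // For ax < 2^23 the subtraction is exact; above that the fraction is 0.
        float r = (ax - t >= 0.5f) ? t + 1.0f : t;
        return Math.copySign(r, x);
    }

    static int roundSketch(float x) {
        // Non-negative inputs, and negative inputs that are already integers
        // (magnitude at least 2^23), take the ties-away path.
        if (x >= 0.0f || Math.abs(x) >= 0x1.0p23f) {
            return (int) roundTiesAway(x);
        }
        // Remaining inputs: x < 0 with magnitude below 2^23 (NaN also lands
        // here and converts to 0). Add 0.5, then floor.
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        float[] samples = {2.5f, -2.5f, -0.5f, Math.nextDown(0.5f),
                           8388609f, -8388609f, -0.75f, Float.NaN};
        for (float x : samples) {
            System.out.println(x + ": sketch=" + roundSketch(x)
                    + " Math.round=" + Math.round(x));
        }
    }
}
```

Comparing against `Math.round()` on the problem inputs mentioned earlier is a quick sanity check; a few samples are of course not a proof.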
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Untabify

This looks fine to me.

-------------

Marked as reviewed by adinn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8204

