RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10]
Scott Gibbons
sgibbons at openjdk.org
Mon Jun 12 14:33:04 UTC 2023
On Mon, 5 Jun 2023 23:48:21 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:
>> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem.
>>
>> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191).
>>
>> Old:
>> gcc-12.2.1-4.fc36.x86_64
>> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix
>> JVM version: 21-internal
>> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68
>> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67
>> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70
>> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69
>> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44
>> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40
>> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40
>> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41
>> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> New:
>> JVM version: 21-internal (float)
>> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42
>> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27
>> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17
>> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17
>> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix tests; need vlbwdq for vpbroadcastq
Thanks, @TobiHartmann
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1587464645
More information about the hotspot-dev
mailing list