Integrated: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512

Scott Gibbons sgibbons at openjdk.org
Mon Jun 12 15:10:05 UTC 2023


On Tue, 30 May 2023 17:07:01 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:

> Add an intrinsic for x86 AVX2 and AVX512 fmod.  This fixes a performance regression and accelerates the floating-point remainder operation (fmod / frem).  It also covers the double-precision case (dmod / drem).
> 
> Performance has improved by ~4x on average, as measured by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191).
> 
> Old:
> gcc-12.2.1-4.fc36.x86_64
> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix
> JVM version: 21-internal
> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68
> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67
> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70
> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69
> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44
> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40
> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40
> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40
> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41
> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40
> New:
> JVM version: 21-internal  (float)
> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42
> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27
> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17
> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16
> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17
> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17
> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17
> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16
> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16
> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17
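
For readers unfamiliar with the operation being accelerated: Java's `%` operator on `float` and `double` operands compiles to the `frem` and `drem` bytecodes mentioned above. The sketch below only illustrates the semantics of that operation (a truncating remainder whose result takes the sign of the dividend, unlike `Math.IEEEremainder`). The class and method names are hypothetical and are not taken from the changeset, and whether the intrinsic is actually used for this code depends on the CPU and the JVM build.

```java
// Illustrative only: RemainderDemo, frem and drem are hypothetical names,
// not code from the changeset.
public class RemainderDemo {

    // `%` on floats compiles to the frem bytecode.
    static float frem(float x, float y) {
        return x % y;
    }

    // `%` on doubles compiles to the drem bytecode.
    static double drem(double x, double y) {
        return x % y;
    }

    public static void main(String[] args) {
        // The quotient is truncated toward zero, so the result
        // keeps the sign of the dividend.
        System.out.println(drem(5.5, 2.0));    // 1.5
        System.out.println(drem(-5.5, 2.0));   // -1.5
        System.out.println(frem(7.5f, 2.0f));  // 1.5

        // Math.IEEEremainder rounds the quotient to nearest instead,
        // so it can give a different answer for the same operands.
        System.out.println(Math.IEEEremainder(5.5, 2.0)); // -0.5

        // A zero divisor yields NaN rather than throwing.
        System.out.println(Double.isNaN(drem(1.0, 0.0))); // true
    }
}
```

The three benchmark cases in the logs above are not reproduced here; this sketch shows only what the operation computes, not how the measurements were taken.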

This pull request has now been integrated.

Changeset: 5d5ae352
Author:    Scott Gibbons <sgibbons at openjdk.org>
Committer: Sandhya Viswanathan <sviswanathan at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/5d5ae35288989fcfabdea013b6e3cdafe359f0df
Stats:     902 lines in 12 files changed: 901 ins; 0 del; 1 mod

8308966: Add intrinsic for float/double modulo for x86 AVX2 and AVX512

Co-authored-by: Marius Cornea <marius.cornea at intel.com>
Reviewed-by: jbhateja, sviswanathan

-------------

PR: https://git.openjdk.org/jdk/pull/14224

