RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6]

Mon Jun 5 17:06:16 UTC 2023

On Sun, 4 Jun 2023 17:31:06 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 121:
>> 
>>> 119:     //     // |x|, |y|
>>> 120:     //     a = DP_AND(x, DP_CONST(7fffffffffffffff));
>>> 121:     __ movq(xmm0, xmm0);
>> 
>> Redundant move.
>
> I do not believe these are redundant, as the upper quadword of the register is cleared as a side-effect of the vmovq.  I do not believe the icx compiler would insert random redundant vmovq instructions at this optimization level.

Subsequent uses of xmm0 operate on the full 128-bit vector, and the value eventually feeds into a DIVSD instruction, which operates only on the first 64 bits of data. Because the move clears the upper 64 bits, it may be issued to an execution port and consume 1 cycle.
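
To make the lane argument concrete, here is a minimal sketch in plain C++ that models an XMM register as two 64-bit lanes. The `Xmm` struct and the `movq`, `divsd`, and helper functions are hypothetical names for illustration only; they are not HotSpot assembler calls. The sketch models only the architectural results of the register-to-register MOVQ (low quadword copied, upper quadword zeroed) and of DIVSD (low lanes divided, upper quadword of the destination preserved). It skips the intermediate 128-bit operations in the stub and says nothing about port usage or latency.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model of a 128-bit XMM register as two 64-bit lanes.
struct Xmm {
    std::uint64_t lo;  // bits 63:0
    std::uint64_t hi;  // bits 127:64
};

// Reinterpret a 64-bit pattern as a double and back, without aliasing issues.
static double bits_to_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

static std::uint64_t double_to_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// Models MOVQ xmm, xmm: copy the low quadword, zero the upper quadword.
static Xmm movq(Xmm src) {
    return Xmm{src.lo, 0};
}

// Models DIVSD dst, src: divide the low doubles; the upper quadword of
// the destination passes through unchanged.
static Xmm divsd(Xmm dst, Xmm src) {
    double q = bits_to_double(dst.lo) / bits_to_double(src.lo);
    return Xmm{double_to_bits(q), dst.hi};
}

// Low-lane quotient with or without the preceding self-move on x.
static double low_quotient(Xmm x, Xmm y, bool with_movq) {
    Xmm a = with_movq ? movq(x) : x;
    return bits_to_double(divsd(a, y).lo);
}
```

Under this model, `low_quotient(x, y, true)` and `low_quotient(x, y, false)` return the same value for any contents of the upper lanes, because the scalar divide reads only the low quadword. Whether the upper lanes matter to the other instructions between the move and the divide is the open question in this thread, and the model does not settle it.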

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1218352040