RFR: 8323116: [REDO] Computational test more than 2x slower when AVX instructions are used [v4]

Jatin Bhateja jbhateja at openjdk.org
Fri Apr 5 03:46:12 UTC 2024


On Fri, 5 Apr 2024 00:09:00 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Similar to #18089, the purpose of this change is to remove the slowdown due to false dependency. For example, using the current `(dst, dst, src)` encoding in the case of `VCVTSD2SS xmm1, xmm2, xmm3/m64`, the instruction converts one double precision floating-point value in xmm3/m64 to one single precision floating-point value and **merge with high bits in xmm2**. This merge with high bits of xmm2 causes a false dependency as xmm1 and xmm2 are the same in `(dst, dst, src)` encoding.
>> 
>> We are removing the false dependency by (1) removing the m64 source from the VCVTSD2SS instruction encoding in the .ad file and (2) loading the `m64` source into `src` before calling `VCVTSD2SS`, explicitly zeroing out the high bits of `src` using `vmovsd src, m64`, and then calling `VCVTSD2SS dst, src, src`. Thus `dst[0:63]` now gets the result of the convert operation from `src[0:63]`, and since `src[64:127]` is already zeroed out, it is put in `dst[64:127]` without a false dependency.
>> 
>> Thanks,
>> Vamsi
>
> Thank you for explaining.

This is a downcast from a double-precision to a single-precision value, so only the lower 32 bits of the destination hold the actual result of the conversion; the upper bits 127:32 are copied from the non-destructive source operand (SRC1) of the VEX-encoded instruction.

VCVTSD2SS (VEX.128 Encoded Version)
DEST[31:0] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC2[63:0]);
DEST[127:32] := SRC1[127:32]
DEST[MAXVL-1:128] := 0

The user is only interested in the lower 32 bits of the destination, and passing the source as the NDS (non-destructive source) operand prevents the false dependency on AVX targets: instruction dispatch will no longer be held up by the false dependency, and the instruction will be issued to the out-of-order (OOO) backend the moment the source is ready.
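
As a rough illustration of the VEX.128 pseudocode above, here is a minimal Python sketch that models a 128-bit XMM register as a plain integer. The helper names (`vmovsd_load`, `vcvtsd2ss`, `f64_bits_to_f32_bits`) are hypothetical and exist only for this example; it models the data flow only, not timing or dependency tracking, and it does not model overflow to infinity or MXCSR rounding modes.

```python
import struct

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def f64_bits_to_f32_bits(bits64: int) -> int:
    """Reinterpret 64 bits as a double and return the bits of the nearest float.

    Assumption: the value fits in single precision (struct raises otherwise).
    """
    (value,) = struct.unpack("<d", struct.pack("<Q", bits64))
    return struct.unpack("<I", struct.pack("<f", value))[0]


def vmovsd_load(m64_bits: int) -> int:
    """Model of `vmovsd xmm, m64`: low 64 bits loaded, bits 127:64 zeroed."""
    return m64_bits & MASK64


def vcvtsd2ss(src1: int, src2: int) -> int:
    """Model of the VEX.128 form: DEST[31:0] = convert(SRC2[63:0]),
    DEST[127:32] = SRC1[127:32]."""
    low = f64_bits_to_f32_bits(src2 & MASK64)
    upper = src1 & (MASK128 & ~MASK32)
    return upper | low


# 1.5 as a double is 0x3FF8000000000000; as a float it is 0x3FC00000.
src = vmovsd_load(0x3FF8000000000000)

# (dst, dst, src) form: the result depends on the old contents of dst.
old_dst = 0xDEADBEEF_CAFEBABE_01234567_89ABCDEF
merged = vcvtsd2ss(old_dst, src)

# (dst, src, src) form: the result depends only on src.
independent = vcvtsd2ss(src, src)
```

In both forms the low 32 bits are the same converted value; only the `(dst, dst, src)` form needs the previous contents of `dst` as an input, which is what creates the false dependency. Note that in the `(dst, src, src)` form bits 63:32 of the result are leftover bits of the source double, which does not matter because only the low 32 bits are consumed.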

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18503#discussion_r1552785288

