RFR: 8323116: [REDO] Computational test more than 2x slower when AVX instructions are used [v4]
Quan Anh Mai
qamai at openjdk.org
Fri Apr 5 20:18:10 UTC 2024
On Fri, 5 Apr 2024 15:55:01 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> @jatin-bhateja I get it but IMO it shouldn't be the responsibility of the assembler to do that, the assembler should emit machine code in a manner that respects what is being written.
>
>> This is a downcast from double precision to single precision value, thus only lower 32 bits of destination hold the actual results for conversion, upper 127:32 bits are copied from non destructive source operand for vex encoded instruction.
>
> Please see the updated description incorporating the correction dst[63:0] -> dst[31:0] for `cvtss2sd`.
@vamsi-parasa
> This change modifies the defined behaviour of cvtss2sd. Without AVX, it would retain bits 64-127 of dst, while with AVX those bits would be copied from src. I would suggest separating the matching rules instead.
Please address this. FYI, in similar cases we created separate methods in the `MacroAssembler`, such as `movflt` or `movdbl`. Feel free to disagree, but I think the assembler should not behave differently from the corresponding assembly instruction.
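
To make the concern concrete, here is a toy C++ model of the two encodings. It is only a sketch and not HotSpot code: the `Xmm` struct and the functions `sse_cvtss2sd` and `vex_cvtss2sd` are hypothetical names that model a 128-bit register as two 64-bit halves. It assumes the documented semantics that the legacy form leaves bits 127:64 of dst untouched, while the VEX form takes them from its first source operand.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical toy model of a 128-bit XMM register (not a HotSpot type).
struct Xmm {
    uint64_t lo;  // bits 63:0
    uint64_t hi;  // bits 127:64
};

// Convert the float held in bits 31:0 of `lo` to the bit pattern of a double.
static uint64_t float_bits_to_double_bits(uint64_t lo) {
    uint32_t fbits = static_cast<uint32_t>(lo);
    float f;
    std::memcpy(&f, &fbits, sizeof f);
    double d = static_cast<double>(f);
    uint64_t dbits;
    std::memcpy(&dbits, &d, sizeof dbits);
    return dbits;
}

// Models the legacy SSE form `cvtss2sd dst, src`:
// the low 64 bits receive the result, bits 127:64 of dst are kept.
Xmm sse_cvtss2sd(Xmm dst, Xmm src) {
    dst.lo = float_bits_to_double_bits(src.lo);
    return dst;
}

// Models the VEX form `vcvtss2sd dst, src1, src2`:
// the low 64 bits come from converting src2, bits 127:64 come from src1.
Xmm vex_cvtss2sd(Xmm src1, Xmm src2) {
    return Xmm{float_bits_to_double_bits(src2.lo), src1.hi};
}
```

In this model, `vex_cvtss2sd(dst, src)` produces the same register contents as `sse_cvtss2sd(dst, src)`, whereas `vex_cvtss2sd(src, src)` produces the same low 64 bits but takes the upper half from src. That difference in the upper bits is the behavioural change I am referring to.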
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18503#discussion_r1554255271
More information about the hotspot-compiler-dev
mailing list