RFR: 8318562: Computational test more than 2x slower when AVX instructions are used
Quan Anh Mai
qamai at openjdk.org
Fri Nov 17 21:06:44 UTC 2023
On Fri, 17 Nov 2023 20:05:03 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> I confirmed that this change solved performance issue on machines I tested (old Broadwell and Cascade Lake CPUs).
>> I am submitting our regular testing for approval.
>
> Thanks a lot for the reviews @vnkozlov @jatin-bhateja @merykitty.
@sviswa7 You mean `cvt xmm1, xmm0, xmm0` is slower than `xorps xmm1, xmm1; cvt xmm1, xmm1, xmm0`, right? Since `cvt xmm0, xmm0, xmm0` has a self-dependency, its standalone throughput will be lower than that of `xorps xmm1, xmm1; cvt xmm1, xmm1, xmm0`, which does not.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16701#issuecomment-1817102967