RFR: 8323116: [REDO] Computational test more than 2x slower when AVX instructions are used [v5]
Srinivas Vamsi Parasa
duke at openjdk.org
Fri Apr 5 19:53:31 UTC 2024
On Fri, 5 Apr 2024 18:17:00 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> My new testing passed.
>> But I want to hear an answer to @merykitty's suggestion about using xmm15.
>
> @vnkozlov If I understand the proposal from @merykitty correctly, the suggestion is to reserve xmm15 as non-allocatable throughout. This sounds like a big overhead for cases where every xmm register is usable, say, in a Vector API kernel. Vamsi's microbenchmark runs clearly show that the gain from his optimization far exceeds any overhead of doing a pxor just before the converts.
> Okay. I will wait for the changes @sviswa7 suggested, to use vlRegD and vlRegF.
Please see the updated commit, which uses vlRegD and vlRegF.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18503#issuecomment-2040529692
More information about the hotspot-compiler-dev mailing list