RFR: 8323116: [REDO] Computational test more than 2x slower when AVX instructions are used [v5]
Vladimir Kozlov
kvn at openjdk.org
Fri Apr 5 18:49:10 UTC 2024
On Fri, 5 Apr 2024 18:17:00 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> My new testing passed.
>> But I want to hear an answer to @merykitty's suggestion about using xmm15.
>
> @vnkozlov If I understand the proposal from @merykitty correctly, the suggestion is to reserve xmm15 as non-allocatable throughout. This sounds like a big overhead for cases where every xmm register is usable, say, in a Vector API kernel. Vamsi's microbenchmark runs clearly show that the gain from his optimization is far greater than any overhead of doing a pxor just before the converts.
Okay. I will wait for the changes @sviswa7 suggested to use vlRegD and vlRegF.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18503#issuecomment-2040435432