RFR: 8277617: Adjust AVX3Threshold for copy/fill stubs [v4]
David Holmes
dholmes at openjdk.java.net
Thu Nov 25 05:14:04 UTC 2021
On Wed, 24 Nov 2021 16:55:32 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Currently 32-byte instructions are used for small array copy and clear.
>> This can be optimized by using 64-byte instructions.
>>
>> Please review.
>>
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:
>
> Override threshold only if flag is default
General change looks okay but I have a query below about startup overhead.
Also, what testing has been done for this aside from the benchmarking? AFAICS there is only a single test that currently sets AVX3Threshold to zero, so we have very little test coverage for that setting. With this change it will be zero all the time on some systems, and so will now exercise code paths that do not normally get executed.
Thanks,
David
src/hotspot/cpu/x86/vm_version_x86.cpp line 1893:
> 1891: return AVX3Threshold;
> 1892: }
> 1893: }
I am somewhat concerned about the overhead of evaluating this each time it is used. I realize these will only be startup costs while generating the stubs, not part of the stubs themselves, but it still may be a startup impact. Can you run a startup benchmark to see if there is any problem?
I was also thinking the more direct formulation would just be:
```
return (is_intel_family_core() && supports_serialize() && FLAG_IS_DEFAULT(AVX3Threshold)) ? 0 : AVX3Threshold;
```
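To make the behaviour of that single-expression form concrete, here is a minimal standalone sketch. It is not HotSpot code: the `CpuState` struct, its field names, and the function name `avx3_threshold` are hypothetical stand-ins for `is_intel_family_core()`, `supports_serialize()`, `FLAG_IS_DEFAULT(AVX3Threshold)` and the flag value, and 4096 is just an example threshold.

```cpp
#include <cassert>

// Hypothetical model of the state the real code consults (assumption,
// not the actual VM_Version interface).
struct CpuState {
    bool intel_family_core;     // stands in for is_intel_family_core()
    bool serialize_supported;   // stands in for supports_serialize()
    bool threshold_is_default;  // stands in for FLAG_IS_DEFAULT(AVX3Threshold)
    int  threshold_flag;        // stands in for the AVX3Threshold flag value
};

// Single-expression form: override the threshold with 0 only when the CPU
// qualifies AND the user has not set the flag explicitly.
int avx3_threshold(const CpuState& s) {
    return (s.intel_family_core && s.serialize_supported && s.threshold_is_default)
               ? 0
               : s.threshold_flag;
}

// Small self-check covering the four interesting cases.
void check_avx3_threshold() {
    assert(avx3_threshold(CpuState{true,  true,  true,  4096}) == 0);     // override applies
    assert(avx3_threshold(CpuState{true,  true,  false, 4096}) == 4096);  // user-set flag wins
    assert(avx3_threshold(CpuState{true,  false, true,  4096}) == 4096);  // no serialize support
    assert(avx3_threshold(CpuState{false, true,  true,  4096}) == 4096);  // not the targeted CPU family
}
```

The second case is the one the latest commit ("Override threshold only if flag is default") is about: an explicitly supplied value is returned unchanged even on a qualifying CPU.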
-------------
PR: https://git.openjdk.java.net/jdk/pull/6512