RFR: 8320206: Some intrinsics/stubs missing vzeroupper on x86_64
Vladimir Ivanov
vlivanov at openjdk.org
Fri Nov 17 03:06:30 UTC 2023
On Wed, 15 Nov 2023 21:28:46 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
> The following intrinsics/stubs are missing vzeroupper:
> adler32 (since JDK 17)
> count_positives (since JDK 9)
> chacha20 (since JDK 20)
> string indexOfChar (since JDK 9)
>
> Adding the missing vzeroupper to avoid AVX-SSE transition penalties.
Also, as @cl4es noticed recently, `vzeroupper` is unconditionally inserted before all `CallLeaf`/`CallLeafNoFP` nodes (but not before `CallLeafVector`), irrespective of which stub is being called. So, when it comes to C2, 2 `vzeroupper` instructions are issued per call: one before the call and one in the stub itself.
There's an attempt to detect when `vzeroupper` is needed (`generate_vzeroupper()` predicate), but it depends on 2 flags which are set for the whole compilation.
@sviswa7 How expensive is `vzeroupper` in practice? Does it make sense to introduce a finer-grained heuristic?
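To make the question concrete, here is a rough sketch of the difference between a compilation-wide predicate and a per-call-site one. It is not the actual HotSpot code: the structs, field names, and both functions are hypothetical stand-ins, and the two compilation-wide flags are only assumed to look roughly like this.

```cpp
// Hypothetical stand-in for per-compilation state (NOT the real HotSpot Compile class).
struct CompilationInfo {
  bool cpu_supports_vzeroupper; // assumed CPU feature check
  bool uses_wide_vectors;       // assumed flag: method uses vectors wider than 128 bits somewhere
  bool clear_upper_avx;         // assumed flag: upper halves may have been dirtied
};

// Hypothetical per-call-site information a finer-grained heuristic might track.
struct LeafCallSite {
  bool upper_avx_maybe_dirty;   // wide-vector code may reach this particular call
};

// Coarse variant: a single answer for the whole compilation, so every
// leaf call in the method gets the same treatment.
constexpr bool needs_vzeroupper_coarse(const CompilationInfo& c) {
  return c.cpu_supports_vzeroupper &&
         (c.uses_wide_vectors || c.clear_upper_avx);
}

// Finer-grained variant: additionally requires that the upper halves
// may actually be dirty at this specific call site.
constexpr bool needs_vzeroupper_at_call(const CompilationInfo& c,
                                        const LeafCallSite& call) {
  return needs_vzeroupper_coarse(c) && call.upper_avx_maybe_dirty;
}
```

With the coarse variant, a method that uses wide vectors in one loop pays for `vzeroupper` before every leaf call; the per-call variant would skip it at call sites that wide-vector code cannot reach. Whether tracking that is worth the complexity depends on how expensive the instruction really is.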
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16678#issuecomment-1815675210