RFR: 8320206: Some intrinsics/stubs missing vzeroupper on x86_64

Vladimir Ivanov vlivanov at openjdk.org
Fri Nov 17 03:06:30 UTC 2023


On Wed, 15 Nov 2023 21:28:46 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> The following intrinsics/stubs are missing vzeroupper:
> adler32 (since JDK 17)
> count_positives (since JDK 9)
> chacha20 (since JDK 20)
> string indexOfChar (since JDK 9)
> 
> Adding the missing vzeroupper to avoid AVX-SSE transition penalties.

Also, as @cl4es noticed recently, `vzeroupper` is unconditionally inserted before every `CallLeaf`/`CallLeafNoFP` (but not before `CallLeafVector`), irrespective of which stub is being called. So, for C2-compiled code, two `vzeroupper` instructions end up being issued per such call.

There's an attempt to detect when `vzeroupper` is needed (the `generate_vzeroupper()` predicate), but it depends on two flags that are set for the whole compilation, so it cannot distinguish one call site from another.
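
To make the granularity concern concrete, here is a minimal C++ sketch. It is not HotSpot code: `CompilationFlags`, `LeafCall`, both field names on each struct, and both functions are hypothetical names invented for illustration, and the two flags are stand-ins for whatever `generate_vzeroupper()` actually checks. The first function models a predicate that gives the same answer for every leaf call in a compilation; the second shows one possible per-call-site refinement, under the assumption that a stub which already issues its own `vzeroupper` does not need another one before the call.

```cpp
// Hypothetical model only; none of these names exist in HotSpot.
struct CompilationFlags {
    bool uses_wide_vectors;  // assumed flag, fixed for the whole compilation
    bool clear_upper_avx;    // assumed flag, fixed for the whole compilation
};

struct LeafCall {
    // Assumed per-stub property: the callee uses AVX and cleans up itself.
    bool callee_has_own_vzeroupper;
};

// Coarse predicate: depends only on compilation-wide state, so every
// leaf call in the same compilation gets the same answer.
bool needs_vzeroupper_coarse(const CompilationFlags& flags) {
    return flags.uses_wide_vectors || flags.clear_upper_avx;
}

// Finer-grained variant (an assumption, not a proposal from the thread):
// additionally skip the instruction when the callee handles it itself.
bool needs_vzeroupper_fine(const CompilationFlags& flags, const LeafCall& call) {
    return needs_vzeroupper_coarse(flags) && !call.callee_has_own_vzeroupper;
}
```

With the coarse predicate, a call to a stub that already ends in `vzeroupper` still gets one emitted before it whenever the compilation-wide flags say so; the per-call variant would drop that extra instruction.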

@sviswa7 How expensive is `vzeroupper` in practice? Does it make sense to introduce a finer-grained heuristic?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16678#issuecomment-1815675210


More information about the hotspot-compiler-dev mailing list