RFR: 8320206: Some intrinsics/stubs missing vzeroupper on x86_64

Sandhya Viswanathan sviswanathan at openjdk.org
Fri Nov 17 20:08:45 UTC 2023


On Fri, 17 Nov 2023 03:02:34 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> Also, as @cl4es noticed recently, `vzeroupper` is unconditionally inserted before all `CallLeaf`/`CallLeafNoFP` (but not for `CallLeafVector`) irrespective of what stub is being called. So, when it comes to C2, there are 2 `vzeroupper` instructions issued.
> 
> There's an attempt to detect when `vzeroupper` is needed (`generate_vzeroupper()` predicate), but it depends on 2 flags which are set for the whole compilation.
> 
> @sviswa7 How expensive is `vzeroupper` in practice? Does it make sense to introduce a finer-grained heuristic?

vzeroupper is not an expensive instruction, but any optimization is always helpful.
I would like to clarify that both max_vector_size() and clear_upper_avx() used in generate_vzeroupper() are set per compilation and not global. I think that is what you meant as well.
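
To illustrate the per-compilation point, here is a minimal sketch of how such a predicate could be shaped. It is not the HotSpot source: `CompilationState`, `needs_vzeroupper`, and the 16-byte threshold are assumptions for illustration only, standing in for the compilation object, `generate_vzeroupper()`, and whatever check the real code performs on `max_vector_size()`.

```cpp
// Hypothetical stand-in for per-compilation state; NOT HotSpot's Compile class.
struct CompilationState {
  int  max_vector_size;  // widest vector (in bytes) used by this compilation
  bool clear_upper_avx;  // this compilation requested clearing of upper halves
};

// Sketch of a per-compilation predicate (assumed shape): emit vzeroupper only
// when the CPU supports it and this particular compilation either used vectors
// wider than 16 bytes (assumed threshold) or explicitly asked for clearing.
constexpr bool needs_vzeroupper(bool cpu_supports_vzeroupper, CompilationState c) {
  return cpu_supports_vzeroupper &&
         (c.max_vector_size > 16 || c.clear_upper_avx);
}

// Two compilations in the same VM can reach different answers.
static_assert(!needs_vzeroupper(true, {16, false}), "128-bit only: skipped");
static_assert(needs_vzeroupper(true, {32, false}), "256-bit vectors: emitted");
```

Because both inputs live on the compilation rather than in VM-wide state, the decision is already made once per compiled method. A finer-grained heuristic would have to move it down to the individual call site or stub.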

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16678#issuecomment-1817020277


More information about the hotspot-compiler-dev mailing list