RFR: 8309660: C2: failed: XMM register should be 0-15 (UseKNLSetting and ConvF2HF)

Fri Jun 9 10:03:46 UTC 2023

On Thu, 8 Jun 2023 14:45:54 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> Context: `Float.floatToFloat16` -> `vcvtps2ph`.
> 
> **Problem**
> 
> vcvtps2ph
> pre=Assembler::VEX_SIMD_66
> opc=Assembler::VEX_OPCODE_0F_3A
> VEX.128.66.0F3A
> requires F16C
> 
> https://www.felixcloutier.com/x86/vcvtps2ph
> 
> So this is a non-AVX512 feature, and we should only use the registers `xmm0-15`.
> 
> There is also a AVX512 version, but it requires `AVX512VL and AVX512F`.
> 
> So on `x64`, we should only use registers `xmm0-15` if we do not have `AVX512VL`, and if we have it, then we can use `xmm0-31`.
> 
> **Suggested Solution**
> As @sviswa7 suggested, we should just use the `vlRegF` instead of `regF`, see discussion in comments.
> 
> **Testing**
> I simulated the patch on intel's `sde`. So now I'm confident that I don't generate code that uses `AVX512VL` registers (XMM16-31).
> 
> Running: tier1-6 + stress testing.

vcvtps2ph is correct. The problem is that the register from XMM16-XMM31 is being passed for KNL. The higher bank register is  and AVX512 feature. The fix would be:

--- a/src/hotspot/cpu/x86/x86.ad
+++ b/src/hotspot/cpu/x86/x86.ad
@@ -3638,7 +3638,7 @@ instruct sqrtD_reg(regD dst) %{
   ins_pipe(pipe_slow);
 %}

-instruct convF2HF_reg_reg(rRegI dst, regF src, regF tmp) %{
+instruct convF2HF_reg_reg(rRegI dst, vlRegF src, vlRegF tmp) %{
   effect(TEMP tmp);
   match(Set dst (ConvF2HF src));
   ins_cost(125);
@@ -3682,7 +3682,7 @@ instruct vconvF2HF_mem_reg(memory mem, vec src) %{
   ins_pipe( pipe_slow );
 %}

-instruct convHF2F_reg_reg(regF dst, rRegI src) %{
+instruct convHF2F_reg_reg(vlRegF dst, rRegI src) %{
   match(Set dst (ConvHF2F src));
   format %{ "vcvtph2ps $dst,$src" %}
   ins_encode %{

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14379#issuecomment-1583715351