Request for reviews (XL): 7116452: Add support for AVX instructions

Tue Dec 13 18:30:36 PST 2011

Per Tom's request I compared the generated code (before and after these changes) on 
a non-AVX machine to verify that it stays the same.

I found a few differences with the 64-bit VM because of this change:

 > As part of these changes REX.W prefix was removed from instructions
 > where it was not needed: MOVDQA, MOVDQU, PCMPESTRI, PSRLQ, PSRLDQ, PTEST.

and also because I removed the incorrect 0x66 prefix when I replaced the 
following code in Push_SrcXD:

       // fldd [rsp]
       emit_opcode(cbuf, 0x66);
       emit_opcode(cbuf, 0xDD);
       encode_RegMem(cbuf, 0x0, RSP_enc, 0x4, 0, 0, false);
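
For reference, here is a minimal sketch of what those three emits amount to at 
the byte level, assuming the usual x86 ModRM/SIB layout. The helper names 
(modrm, sib, encode_fld_m64_rsp) are hypothetical and are not HotSpot 
functions. The 0x66 byte is not part of the FLD m64fp (DD /0) encoding, so 
dropping it shortens the sequence by one byte.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only, not HotSpot code.
static uint8_t modrm(uint8_t mod, uint8_t reg, uint8_t rm) {
  assert(mod < 4 && reg < 8 && rm < 8);
  return static_cast<uint8_t>((mod << 6) | (reg << 3) | rm);
}

static uint8_t sib(uint8_t scale, uint8_t index, uint8_t base) {
  assert(scale < 4 && index < 8 && base < 8);
  return static_cast<uint8_t>((scale << 6) | (index << 3) | base);
}

// Encodes "fld qword ptr [rsp]" (DD /0), optionally preceded by the stray
// 0x66 byte that the old code emitted.
std::vector<uint8_t> encode_fld_m64_rsp(bool with_stray_66_prefix) {
  const uint8_t RSP_enc  = 0x4;  // base register number of RSP
  const uint8_t NO_INDEX = 0x4;  // SIB index value meaning "no index"
  std::vector<uint8_t> buf;
  if (with_stray_66_prefix) {
    buf.push_back(0x66);
  }
  buf.push_back(0xDD);                       // x87 opcode byte
  buf.push_back(modrm(0, 0, 4));             // mod=00 (no disp), /0, rm=100 (SIB follows)
  buf.push_back(sib(0, NO_INDEX, RSP_enc));  // [rsp], no index, scale 1
  return buf;
}
```

Calling encode_fld_m64_rsp(false) gives the bytes DD 04 24, and 
encode_fld_m64_rsp(true) gives 66 DD 04 24.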

The differences with the 32-bit VM are mostly due to switching from comiss to 
ucomiss instructions. Also, movapd is now used instead of movsd to move between 
XMM registers because I replaced the direct encoding with movdbl() and movflt().
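
To make those byte-level differences easier to spot in a disassembly diff, here 
is a rough sketch of the legacy SSE register-to-register encodings involved. 
The enum and the function name are hypothetical, and the sketch only handles 
xmm0..xmm7 (no REX or VEX prefix).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical illustration, not HotSpot code.
enum class SseOp { COMISS, UCOMISS, MOVSD, MOVAPD };

// Encodes "op xmm<dst>, xmm<src>" in the legacy (non-VEX) form.
std::vector<uint8_t> encode_xmm_reg_reg(SseOp op, int dst, int src) {
  assert(dst >= 0 && dst < 8 && src >= 0 && src < 8);  // no REX handling here
  std::vector<uint8_t> buf;
  uint8_t opcode = 0;
  switch (op) {
    case SseOp::COMISS:  opcode = 0x2F; break;                        // 0F 2F /r
    case SseOp::UCOMISS: opcode = 0x2E; break;                        // 0F 2E /r
    case SseOp::MOVSD:   buf.push_back(0xF2); opcode = 0x10; break;   // F2 0F 10 /r
    case SseOp::MOVAPD:  buf.push_back(0x66); opcode = 0x28; break;   // 66 0F 28 /r
  }
  buf.push_back(0x0F);
  buf.push_back(opcode);
  // ModRM with mod=11 (register direct), reg=dst, rm=src.
  buf.push_back(static_cast<uint8_t>(0xC0 | (dst << 3) | src));
  return buf;
}
```

For example, "ucomiss xmm0, xmm1" comes out as 0F 2E C1 where comiss would be 
0F 2F C1, and "movapd xmm1, xmm2" comes out as 66 0F 28 CA where movsd would be 
F2 0F 10 CA. Note that the register form of movsd only writes the low 64 bits 
of the destination, while movapd copies the whole 128-bit register.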

But I did screw up when I replaced enc_copy_wide() in x86_64.ad, since I did 
not notice that it does not generate an instruction if the source and 
destination registers are the same. I fixed it.
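
The behavior I missed can be sketched roughly as follows. This is only an 
illustration under my own assumptions: emit_copy_wide is a hypothetical 
stand-in, and the choice of the MOV r64, r/m64 form (REX.W 8B /r) is assumed 
for the example rather than copied from the .ad file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 64-bit register copy encoding.
// Emits "mov dst, src", or nothing at all when dst == src.
void emit_copy_wide(std::vector<uint8_t>& buf, int dst, int src) {
  assert(dst >= 0 && dst < 16 && src >= 0 && src < 16);
  if (dst == src) {
    return;  // the move would be a no-op, so no bytes are generated
  }
  uint8_t rex = 0x48;         // REX.W: 64-bit operand size
  if (dst >= 8) rex |= 0x04;  // REX.R extends ModRM.reg
  if (src >= 8) rex |= 0x01;  // REX.B extends ModRM.rm
  buf.push_back(rex);
  buf.push_back(0x8B);        // MOV r64, r/m64 (assumed opcode form)
  buf.push_back(static_cast<uint8_t>(0xC0 | ((dst & 7) << 3) | (src & 7)));
}
```

With an empty buffer, emit_copy_wide(buf, 3, 3) leaves it empty, while 
emit_copy_wide(buf, 0, 3) appends 48 8B C3 (mov rax, rbx). A replacement that 
always emits the move changes the generated code in the first case.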

thanks,
Vladimir

Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/7116452/webrev.01
> 
> 7116452: Add support for AVX instructions
> 
> Initial changes were submitted by Intel. I refactored them to simplify 
> prefix usage in instruction encoding (added simd_prefix methods), and VEX 
> encoding was fixed to generate the 2-byte prefix when possible. Changes in 
> .ad files were not complete (especially in the 32-bit .ad) and were not as 
> aggressive as I wanted. I changed more mach node encodings to use 
> macroassembler instructions. Added missing decoding parts in 
> Assembler::locate_operand() and NativeMovRegMem::instruction_start().
> 
> Note: no new AVX instructions were added in these changes, and no 
> 3-operand format was added to MacroAssembler. That will come in separate 
> changes. The destination operand is used as the second source in the 
> current implementation where applicable.
> 
> The float compare implementation in x86_32.ad was replaced with the 
> implementation from x86_64.ad. It uses fewer branches and does not 
> destroy the EAX register. Note: the ucomiss instruction produces the same 
> result as comiss since we mask numeric exceptions. Also, ucomiss could be a 
> little faster since it does not need to check the control word for QNaN values.
> 
> Vector instructions with a VEX prefix use unaligned loads for memory 
> operands, whereas with the old REX prefix they require 16-byte alignment. 
> Instruction versions with a memory operand were added for that, but they 
> should be used only with a VEX prefix; an assert was added. ANDPD and XORPD 
> with a memory operand were used before with 16-byte aligned memory (we 
> have special code to do it). I added an assert to check address alignment 
> for these instructions.
> 
> As part of these changes REX.W prefix was removed from instructions 
> where it was not needed: MOVDQA, MOVDQU, PCMPESTRI, PSRLQ, PSRLDQ, PTEST.
> 
> 
> Tested with UseAVX=1|0, UseSSE=4|2|1|0, CTW, VM regression tests, nsk.
> 
> Thanks,
> Vladimir
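
For readers not familiar with the 2-byte VEX form mentioned in the quoted 
message, here is a rough sketch of when it can be used. This is not the 
simd_prefix code from the webrev; VexFields and encode_vex_prefix are 
hypothetical names, and the field layout follows the VEX prefix description in 
the Intel manuals as I understand it. The short form (C5) is possible only when 
X, B and W are all zero and the opcode map is 0F; otherwise the 3-byte form 
(C4) is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of the VEX fields, all stored uninverted.
struct VexFields {
  bool r, x, b, w;  // register-extension bits and operand-size bit
  uint8_t map;      // 1 = 0F, 2 = 0F 38, 3 = 0F 3A
  uint8_t vvvv;     // extra source register number (0 when unused)
  bool l;           // false = 128-bit, true = 256-bit
  uint8_t pp;       // 0 = none, 1 = 66, 2 = F3, 3 = F2
};

std::vector<uint8_t> encode_vex_prefix(const VexFields& f) {
  assert(f.map >= 1 && f.map <= 3 && f.vvvv < 16 && f.pp < 4);
  // Shared low bits: inverted vvvv, then L, then pp.
  const uint8_t tail = static_cast<uint8_t>(
      ((~f.vvvv & 0xF) << 3) | (f.l ? 0x04 : 0x00) | f.pp);
  std::vector<uint8_t> buf;
  if (!f.x && !f.b && !f.w && f.map == 1) {
    // 2-byte form: C5, then inverted R followed by the shared bits.
    buf.push_back(0xC5);
    buf.push_back(static_cast<uint8_t>((f.r ? 0x00 : 0x80) | tail));
  } else {
    // 3-byte form: C4, then inverted R/X/B plus the map, then W plus the shared bits.
    buf.push_back(0xC4);
    buf.push_back(static_cast<uint8_t>((f.r ? 0x00 : 0x80) | (f.x ? 0x00 : 0x40) |
                                       (f.b ? 0x00 : 0x20) | f.map));
    buf.push_back(static_cast<uint8_t>((f.w ? 0x80 : 0x00) | tail));
  }
  return buf;
}
```

As an example, a 128-bit instruction in the 0F map with a 66 prefix and no 
extended registers (such as vmovdqa xmm0, xmm1) gets the prefix C5 F9, while 
the same instruction with an extended r/m register (xmm8) needs C4 C1 79.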