Request for reviews (XL): 7116452: Add support for AVX instructions
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Dec 13 18:30:36 PST 2011
Per Tom's request, I compared the generated code (before and after these changes) on
a non-AVX machine to verify that it stays the same.
I found a few differences with the 64-bit VM because of this change:
> As part of these changes the REX.W prefix was removed from instructions
> where it was not needed: MOVDQA, MOVDQU, PCMPESTRI, PSRLQ, PSRLDQ, PTEST.
and also because I removed the incorrect 0x66 prefix when I replaced the following code in
Push_SrcXD:
// fldd [rsp]
emit_opcode(cbuf, 0x66);
emit_opcode(cbuf, 0xDD);
encode_RegMem(cbuf, 0x0, RSP_enc, 0x4, 0, 0, false);
The differences with the 32-bit VM are mostly due to switching from comiss to ucomiss
instructions. Also, movapd is now used instead of movsd to move between XMM registers
because I replaced the direct encoding with movdbl() and movflt().
But I did screw up when I replaced enc_copy_wide() in x86_64.ad, since I did not
notice that it does not generate an instruction if the source and destination
registers are the same. I fixed it.
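
To illustrate the behavior that was missed, here is a minimal standalone sketch
of a 64-bit register-to-register move encoder that emits nothing when source
and destination are the same. It is not the HotSpot code: emit_mov64 and
CodeBuf are hypothetical names, and the choice of MOV r64, r/m64 (REX.W 8B /r)
as the instruction is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a code buffer; not the HotSpot CodeBuffer.
using CodeBuf = std::vector<uint8_t>;

// Hypothetical helper: emit MOV dst, src for 64-bit general registers
// (encoding REX.W 8B /r). Register numbers are 0..15 (0 = rax, 3 = rbx, ...).
// Emits nothing when dst == src, since such a move is a no-op.
void emit_mov64(CodeBuf& buf, uint8_t dst, uint8_t src) {
  assert(dst < 16 && src < 16);
  if (dst == src) {
    return;  // same register: no instruction is generated
  }
  uint8_t rex = 0x48;          // REX prefix with W = 1 (64-bit operand size)
  if (dst >= 8) rex |= 0x04;   // REX.R extends the ModRM reg field (dst)
  if (src >= 8) rex |= 0x01;   // REX.B extends the ModRM r/m field (src)
  buf.push_back(rex);
  buf.push_back(0x8B);         // opcode: MOV r64, r/m64
  // ModRM: mod = 11 (register direct), reg = dst, r/m = src
  buf.push_back(static_cast<uint8_t>(0xC0 | ((dst & 7) << 3) | (src & 7)));
}
```

Under these assumptions, emit_mov64(buf, 0, 3) appends 48 8B C3 (mov rax, rbx),
while emit_mov64(buf, 5, 5) leaves the buffer unchanged. A replacement that
always emits the move changes the generated code in exactly the second case.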
thanks,
Vladimir
Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/7116452/webrev.01
>
> 7116452: Add support for AVX instructions
>
> Initial changes were submitted by Intel. I refactored them to simplify
> prefix usage in instruction encoding (added simd_prefix methods), and the VEX
> encoding was fixed to generate a 2-byte prefix when possible. The changes in
> the .ad files were not complete (especially in the 32-bit .ad) and were not as
> aggressive as I wanted. I changed more mach node encodings to use
> macroassembler instructions. I also added the missing decoding parts in
> Assembler::locate_operand() and NativeMovRegMem::instruction_start().
>
> Note: no new AVX instructions were added in these changes, and no
> 3-operand format was added to MacroAssembler. That will come in later changes.
> The destination operand is used as the second source in the current
> implementation where applicable.
>
> The float compare implementation in x86_32.ad was replaced with the
> implementation from x86_64.ad. It uses fewer branches and does not
> destroy the EAX register. Note: the ucomiss instruction produces the same
> result as comiss since we mask numeric exceptions. Also, ucomiss could be a
> little faster since it does not need to check the control word for QNaN values.
>
> Vector instructions with a VEX prefix use unaligned loads for memory
> operands, whereas with the old REX prefix they require 16-byte alignment.
> Instruction versions with a memory operand were added for that, but they
> should be used only with a VEX prefix; an assert was added. ANDPD and XORPD
> with a memory operand were used before with 16-byte-aligned memory (we
> have special code to do it). I added an assert to check address alignment
> for these instructions.
>
> As part of these changes the REX.W prefix was removed from instructions
> where it was not needed: MOVDQA, MOVDQU, PCMPESTRI, PSRLQ, PSRLDQ, PTEST.
>
>
> Tested with UseAVX=1|0, UseSSE=4|2|1|0, CTW, VM regression tests, nsk.
>
> Thanks,
> Vladimir
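
The 2-byte VEX prefix selection mentioned in the quoted message can be
sketched as follows. This is a standalone illustration based on the
architectural VEX layout, not the code from the webrev: VexFields and
encode_vex_prefix are hypothetical names. The short C5 form is only available
when the X, B and W bits are clear and the opcode map is 0F; otherwise the
3-byte C4 form is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of the VEX prefix fields, stored un-inverted.
struct VexFields {
  bool r, x, b;  // register-extension bits (true = extended register, index >= 8)
  bool w;        // VEX.W
  uint8_t map;   // opcode map: 1 = 0F, 2 = 0F 38, 3 = 0F 3A
  uint8_t vvvv;  // extra source register index 0..15 (0 when unused)
  bool l;        // vector length: false = 128-bit, true = 256-bit
  uint8_t pp;    // implied SIMD prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2
};

// Hypothetical helper: return the prefix bytes, using the 2-byte form when possible.
std::vector<uint8_t> encode_vex_prefix(const VexFields& f) {
  assert(f.map >= 1 && f.map <= 3 && f.vvvv < 16 && f.pp < 4);
  // Low 7 bits shared by both forms: inverted vvvv, then L, then pp.
  uint8_t low = static_cast<uint8_t>(((~f.vvvv & 0xF) << 3) | (f.l ? 0x04 : 0) | f.pp);
  uint8_t not_r = f.r ? 0x00 : 0x80;  // R is stored inverted in bit 7
  if (!f.x && !f.b && !f.w && f.map == 1) {
    // 2-byte form: C5, then [~R | ~vvvv | L | pp]
    return { 0xC5, static_cast<uint8_t>(not_r | low) };
  }
  // 3-byte form: C4, then [~R | ~X | ~B | map], then [W | ~vvvv | L | pp]
  uint8_t byte1 = static_cast<uint8_t>(not_r | (f.x ? 0 : 0x40) | (f.b ? 0 : 0x20) | f.map);
  uint8_t byte2 = static_cast<uint8_t>((f.w ? 0x80 : 0) | low);
  return { 0xC4, byte1, byte2 };
}
```

For example, with vvvv = 1 and all extension bits clear, the sketch yields
C5 F0, the prefix of vaddps xmm0, xmm1, xmm2 (C5 F0 58 C2). Setting b = true,
as for an xmm8 r/m operand, forces the 3-byte form C4 C1 70. When the
destination doubles as the second source, as the message describes, vvvv would
simply carry the destination register index.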