[15] RFR (L) 8235825: C2: Merge AD instructions for Replicate nodes
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Dec 13 01:16:29 UTC 2019
Vladimir
replicateB
Can you fold it differently?
ReplB_reg_leg
predicate(!VM_Version::supports_avx512vlbw());
ins_encode %{
  uint vlen = vector_length(this);
  __ movdl($dst$$XMMRegister, $src$$Register);
  __ punpcklbw($dst$$XMMRegister, $dst$$XMMRegister);
  __ pshuflw($dst$$XMMRegister, $dst$$XMMRegister, 0x00);
  if (vlen > 8) {
    __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
    if (vlen > 16) {
      __ vinserti128_high($dst$$XMMRegister, $dst$$XMMRegister);
      if (vlen > 32) {
        assert(vlen == 64, "sanity");
        __ vinserti64x4($dst$$XMMRegister, $dst$$XMMRegister, $dst$$XMMRegister, 0x1);
      }
    }
  }
%}
The same applies to ReplB_imm_leg, for which I don't see a new implementation.
It should also simplify the code for AVX-512, which needs only one or two instructions.
Changes for the other types can be done the same way.
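
To make the folding concrete, here is a minimal standalone C++ sketch that models the register as 64 bytes and walks the same nested vlen checks. It is not HotSpot code: Reg, punpcklbw_self, pshuflw_zero, dup_low, replicate_byte and low_bytes_equal are hypothetical names for illustration only, and dup_low stands in for punpcklqdq, vinserti128_high and vinserti64x4 when source and destination are the same register. Clearing the whole model register in movdl is a simplification.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of a 512-bit vector register: 64 bytes, byte 0 is the lowest.
using Reg = std::array<std::uint8_t, 64>;

// movdl: the low 4 bytes take the GPR value; the model clears everything else.
void movdl(Reg& dst, std::uint32_t src) {
  dst.fill(0);
  for (int i = 0; i < 4; i++) {
    dst[i] = static_cast<std::uint8_t>(src >> (8 * i));
  }
}

// punpcklbw dst, dst: interleave the low 8 bytes with themselves.
void punpcklbw_self(Reg& r) {
  Reg t = r;
  for (int i = 0; i < 8; i++) {
    r[2 * i] = t[i];
    r[2 * i + 1] = t[i];
  }
}

// pshuflw dst, dst, 0x00: copy word 0 into the other three low words.
void pshuflw_zero(Reg& r) {
  for (int w = 1; w < 4; w++) {
    r[2 * w] = r[0];
    r[2 * w + 1] = r[1];
  }
}

// Copy the low n bytes into bytes [n, 2n). With dst == src this models
// punpcklqdq (n = 8), vinserti128_high (n = 16) and vinserti64x4 imm 1 (n = 32).
void dup_low(Reg& r, int n) {
  std::copy(r.begin(), r.begin() + n, r.begin() + n);
}

// One folded sequence for all vector lengths, mirroring the nested checks above.
Reg replicate_byte(std::uint32_t src, unsigned vlen) {
  Reg dst;
  movdl(dst, src);
  punpcklbw_self(dst);
  pshuflw_zero(dst);            // low 8 bytes now hold the replicated byte
  if (vlen > 8) {
    dup_low(dst, 8);            // 16 bytes
    if (vlen > 16) {
      dup_low(dst, 16);         // 32 bytes
      if (vlen > 32) {
        assert(vlen == 64);
        dup_low(dst, 32);       // 64 bytes
      }
    }
  }
  return dst;
}

// True if the low n bytes of r all equal b.
bool low_bytes_equal(const Reg& r, unsigned n, std::uint8_t b) {
  return std::all_of(r.begin(), r.begin() + n,
                     [b](std::uint8_t x) { return x == b; });
}
```

Each step doubles the number of replicated bytes, so one instruct body can serve vlen 4 through 64 instead of one AD instruction per length.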
Thanks,
Vladimir
On 12/12/19 3:19 AM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/all/
> https://bugs.openjdk.java.net/browse/JDK-8235825
>
> Merge AD instructions for the following vector nodes:
> - ReplicateB, ..., ReplicateD
>
> Individual patches:
>
> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/individual
>
> Testing: tier1-4, test run on different CPU flavors (KNL, CLX)
>
> Contributed-by: Jatin Bhateja <jatin.bhateja at intel.com>
> Reviewed-by: vlivanov, sviswanathan, ?
>
> Best regards,
> Vladimir Ivanov