VectorAPI Constant Shift for Byte and Short Intrinsic Support

Wed Jan 16 17:05:23 UTC 2019

Hi Vladimir,
I added a rule in the matcher to support this constraint. Please find the patch below:
http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantShift/webrev.02/

Thanks,
Shravya.

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] 
Sent: Tuesday, January 15, 2019 5:21 PM
To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>; panama-dev <panama-dev at openjdk.java.net>
Subject: Re: VectorAPI Constant Shift for Byte and Short Intrinsic Support

> Not adding 32B for AVX=1 was intentional. I updated the patch with your comments. Please find the webrev below.
> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantSh
> ift/webrev.01/

So, what's expected behavior for 32B variants with AVX=1 then?
If you deliberately keep 32B variants not supported for now, I'd expect to see that listed in Matcher::match_rule_supported_vector() as an implementation constraint.

The operations shouldn't be intrinsified (because there are no applicable rules in AD file), but Matcher::match_rule_supported_vector()
only limits support to UseSSE >= 4.

Am I missing something?

Best regards,
Vladimir Ivanov

> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Thursday, January 10, 2019 2:39 PM
> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>; panama-dev 
> <panama-dev at openjdk.java.net>
> Subject: Re: VectorAPI Constant Shift for Byte and Short Intrinsic 
> Support
> 
> 
>> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantS
>> h
>> ift/webrev.00/
> 
> Don't you need to update Javadoc as well?
> 
> Though Javadoc refers to shift operations, implementation does more than that ("a >> i" vs "a >> (i & 7)" / "a >> (i & 15)"):
> 
>       /**
>        * Arithmetically right shifts (or signed right shifts) this vector by the
>        * broadcast of an input scalar.
>        * <p>
>        * This is a vector binary operation where the primitive arithmetic right
>        * shift operation ({@code >>}) is applied to lane elements.
>        *
>        * @param s the input scalar; the number of the bits to right shift
>        * @return the result of arithmetically right shifting this vector by the
>        * broadcast of an input scalar
>        */
>       public abstract ShortVector aShiftR(int s);
> 
> vs
> 
>       @Override
>       @ForceInline
>       public Short256Vector aShiftR(int s) {
>           return VectorIntrinsics.broadcastInt(
>               VECTOR_OP_RSHIFT, Short256Vector.class, short.class, LENGTH,
>               this, s,
>               (v, i) -> v.uOp((__, a) -> (short) (a >> (i & 15))));
>       }
> 
> =================================================
> 
> I assume C2 backend support for Short*Vector is already there.
> 
> src/hotspot/cpu/x86/x86.ad:
> 
> +instruct vsll32B_avx(vecY dst, vecY src, vecS shift, vecY tmp, vecY
> tmp2, rRegL scratch) %{
> +  predicate(UseAVX > 1 && n->as_Vector()->length() == 32);
> 
> Is it deliberate there's no 32B support with UseAVX=1?
> 
> On related node, do you need to add *ShiftVB cases into 
> Matcher::match_rule_supported_vector()?
> 
> Best regards,
> Vladimir Ivanov
>