VectorAPI Constant Shift for Byte and Short Intrinsic Support

Wed Jan 16 17:59:48 UTC 2019

> I added a rule in the matcher to support this constraint. Please find the patch below:
> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantShift/webrev.02/

Looks good!

Best regards,
Vladimir Ivanov

> 
> Thanks,
> Shravya.
> 
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Tuesday, January 15, 2019 5:21 PM
> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>; panama-dev <panama-dev at openjdk.java.net>
> Subject: Re: VectorAPI Constant Shift for Byte and Short Intrinsic Support
> 
> 
>> Not adding 32B for AVX=1 was intentional. I updated the patch with your comments. Please find the webrev below.
>> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantSh
>> ift/webrev.01/
> 
> So, what's expected behavior for 32B variants with AVX=1 then?
> If you deliberately keep 32B variants not supported for now, I'd expect to see that listed in Matcher::match_rule_supported_vector() as an implementation constraint.
> 
> The operations shouldn't be intrinsified (because there are no applicable rules in AD file), but Matcher::match_rule_supported_vector()
> only limits support to UseSSE >= 4.
> 
> Am I missing something?
> 
> Best regards,
> Vladimir Ivanov
> 
>> -----Original Message-----
>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>> Sent: Thursday, January 10, 2019 2:39 PM
>> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>; panama-dev
>> <panama-dev at openjdk.java.net>
>> Subject: Re: VectorAPI Constant Shift for Byte and Short Intrinsic
>> Support
>>
>>
>>> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_ByteShortConstantS
>>> h
>>> ift/webrev.00/
>>
>> Don't you need to update Javadoc as well?
>>
>> Though Javadoc refers to shift operations, implementation does more than that ("a >> i" vs "a >> (i & 7)" / "a >> (i & 15)"):
>>
>>        /**
>>         * Arithmetically right shifts (or signed right shifts) this vector by the
>>         * broadcast of an input scalar.
>>         * <p>
>>         * This is a vector binary operation where the primitive arithmetic right
>>         * shift operation ({@code >>}) is applied to lane elements.
>>         *
>>         * @param s the input scalar; the number of the bits to right shift
>>         * @return the result of arithmetically right shifting this vector by the
>>         * broadcast of an input scalar
>>         */
>>        public abstract ShortVector aShiftR(int s);
>>
>> vs
>>
>>        @Override
>>        @ForceInline
>>        public Short256Vector aShiftR(int s) {
>>            return VectorIntrinsics.broadcastInt(
>>                VECTOR_OP_RSHIFT, Short256Vector.class, short.class, LENGTH,
>>                this, s,
>>                (v, i) -> v.uOp((__, a) -> (short) (a >> (i & 15))));
>>        }
>>
>> =================================================
>>
>> I assume C2 backend support for Short*Vector is already there.
>>
>> src/hotspot/cpu/x86/x86.ad:
>>
>> +instruct vsll32B_avx(vecY dst, vecY src, vecS shift, vecY tmp, vecY
>> tmp2, rRegL scratch) %{
>> +  predicate(UseAVX > 1 && n->as_Vector()->length() == 32);
>>
>> Is it deliberate there's no 32B support with UseAVX=1?
>>
>> On related node, do you need to add *ShiftVB cases into
>> Matcher::match_rule_supported_vector()?
>>
>> Best regards,
>> Vladimir Ivanov
>>