VectorAPI VectorInsert Intrinsic

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Thu Jun 14 13:04:17 UTC 2018



On 14/06/2018 02:00, Rukmannagari, Shravya wrote:
> Hi Vladimir,
> Yes, I was referring to this patch.

Thanks for testing & review! Pushed:
   http://hg.openjdk.java.net/panama/dev/rev/e00750ed585a

Best regards,
Vladimir Ivanov

> 
> Thanks,
> Shravya.
> 
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Wednesday, June 13, 2018 3:55 PM
> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>; panama-dev at openjdk.java.net
> Subject: Re: VectorAPI VectorInsert Intrinsic
> 
> Shravya,
> 
>> There is a fix for 512 code, the vmovdqu has to be changed to evmovdquq. Everything else looks good and thanks a lot once again for making these changes. I'll update the patch once your changes are checked in.
> 
> Are you talking about the following patch?
>     http://cr.openjdk.java.net/~vlivanov/panama/vector/insert/webrev.00/
> 
> Best regards,
> Vladimir Ivanov
>>
>> Thanks,
>> Shravya.
>>
>> -----Original Message-----
>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>> Sent: Friday, June 8, 2018 5:33 AM
>> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>;
>> panama-dev at openjdk.java.net
>> Subject: Re: VectorAPI VectorInsert Intrinsic
>>
>>
>>
>> On 08/06/2018 02:50, Rukmannagari, Shravya wrote:
>>> Hi Vladimir,
>>> Thanks for reviewing the patch. I updated the patch to reflect the changes. Regarding the variable indexed element, we currently haven't given a thought about it, but that seems to be the ultimate design goal for inserting, I can work on it in the next few days.
>>
>> I'd like to hear other opinions, but IMO it's quite important case to cover w.r.t. predictable performance goal.
>>
>> Regarding implementation considerations, I'd try to extract variable index variants into vector stubs and share them among all use sites.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> -----Original Message-----
>>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>>> Sent: Wednesday, June 6, 2018 12:52 AM
>>> To: Rukmannagari, Shravya <shravya.rukmannagari at intel.com>;
>>> panama-dev at openjdk.java.net
>>> Subject: Re: VectorAPI VectorInsert Intrinsic
>>>
>>>
>>> Additionally:
>>>
>>> +instruct rvinsert64B(vecZ dst, vecZ src, vecZ tmp, rRegI val, immI
>>> +idx) %{
>>> +  predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
>>>
>>> +instruct rvinsert32S(vecZ dst, vecZ src, vecZ tmp, rRegI val, immI
>>> +idx) %{
>>> +  predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_SHORT);
>>>
>>> +instruct rvinsert16I(vecZ dst, vecZ src, vecZ tmp, rRegI val, immI
>>> +idx) %{
>>> +  predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_INT);
>>>
>>> +instruct rvinsert8L(vecZ dst, vecZ src, vecZ tmp, rRegL val, immI
>>> +idx) %{
>>> +  predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
>>>
>>> +  instruct rvinsert16F(vecZ dst, vecZ src, vecZ tmp, regF val, immI idx) %{
>>> +    predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_FLOAT);
>>>
>>> +  instruct rvinsert8D(vecZ dst, vecZ src, vecZ tmp, regD val, rRegL
>>> tmp1, immI idx) %{
>>> +    predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_DOUBLE);
>>>
>>> Shouldn't UseAVX > 2 be used instead?
>>>
>>>
>>> Also, I suspect *_avx rules are never used.
>>>
>>> +instruct rvinsert16B(vecX dst, vecX src, rRegI val, immI idx) %{
>>> +  predicate(UseSSE > 3 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
>>>
>>> +instruct rvinsert16B_avx(vecX dst, vecX src, rRegI val, immI idx) %{
>>> +  predicate(UseAVX > 0 &&
>>> n->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
>>>
>>> Both predicates are true when UseAVX > 0 and the first matching one
>>> is used always. Don't you need "UseSSE > 3 && UseAVX == 0" in the first case?
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 06/06/2018 02:30, Vladimir Ivanov wrote:
>>>>
>>>>> http://cr.openjdk.java.net/~srukmannagar/VectorAPI_vectorInsert/web
>>>>> r
>>>>> ev.00/
>>>>>
>>>>
>>>> Overall, looks good!
>>>>
>>>> +instruct rvinsert32B(vecY dst, vecY src, vecY tmp, rRegI val, immI
>>>> +idx) %{
>>>> ...
>>>> +    if(id <= 15) {
>>>> ...
>>>> +      __ vpinsrb($tmp$$XMMRegister, $tmp$$XMMRegister,
>>>> +$val$$Register,
>>>> id - 16);
>>>> ...
>>>>
>>>> +  instruct rvinsert8D(vecZ dst, vecZ src, vecZ tmp, regD val, rRegL
>>>> tmp1, immI idx) %{
>>>> +      } else if(id <= 5) {
>>>> ...
>>>> +        __ vpinsrq($tmp$$XMMRegister, $tmp$$XMMRegister,
>>>> $tmp1$$Register, id - 6);
>>>>
>>>> "id - 16", "id - 6". Looks like copy-paste errors.
>>>>
>>>>
>>>>
>>>> +instruct rvinsert8B(vecD dst, vecD src, rRegI val, immI idx) %{
>>>> ...
>>>> +    int id = 0x7 & $idx$$constant;
>>>>
>>>> I'd prefer to see "immU3 idx" instead. idx should always be in-bound
>>>> and it's validated on JDK side before calling intrinsic.
>>>>
>>>>
>>>> On a more general topic: what are the plans for optimizing variable
>>>> indexed element accesses? It would be unfortunate if a non-constant
>>>> index usage caused vector boxing.
>>>>
>>>> I remember I played with dispatch tables [1] and it worked ok. Are
>>>> there better alternatives available?
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1]
>>>> http://hg.openjdk.java.net/panama/dev/file/dd9ee9434abe/test/jdk/pan
>>>> a
>>>> ma/snippets/VectorUtils.java#l300
>>>>


More information about the panama-dev mailing list