[Vector]fromArray/allTrue performance
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Nov 20 01:39:35 UTC 2018
Zhuo, thanks for the feedback!
Unfortunately, it seems the attachment was stripped by the mail server.
Do you mind resending it inline?
I'll let the Intel folks comment on the individual cases you refer to, but
overall, everything marked as "Implementation limitation" is a
work in progress and will be addressed at some later point.
Best regards,
Vladimir Ivanov
On 19/11/2018 02:41, 王卓(卓仁) wrote:
> Hello,
> I am Zhuoren from the Alibaba JVM team. Glad to take part in Project Panama.
> We have integrated the Vector API into Alibaba JDK, and another Alibaba team is now using the Vector API to optimize their applications.
> Here are some issues we found in our previous work that block further optimization; I am looking for solutions.
>
> 1. The performance of Long128Species::fromArray(long[] a, int ax, Mask<Long, Shapes.S128Bit> m) is poor. The attached Java file is a test for this API.
> I checked, and the performance issue is due to an intrinsic failure caused by this check:
> x86.ad:
>     case Op_VectorLoadMask:
>       if (UseSSE <= 3) { ret_value = false; }
>       else if (vlen == 1 || vlen == 2) { ret_value = false; } // Implementation limitation
>       else if (size_in_bits >= 256 && UseAVX < 2) { ret_value = false; } // Implementation limitation
>       break;
>
> I wonder if there will be a fix for this issue, because this API is very important to our optimizations. The fromArray.patch is the workaround we are using. Suggestions on how to improve this workaround are also welcome.
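
To make the quoted check concrete, here is a minimal sketch in plain Java that mirrors the three Op_VectorLoadMask conditions. The class name, method name and parameter names are hypothetical, and it assumes that vlen is the lane count and size_in_bits is the vector width in bits. Under that assumption, a 128-bit vector of long has 128 / 64 = 2 lanes, so it hits the "vlen == 2" limitation regardless of the SSE/AVX level.

```java
public class LoadMaskCheck {
    // Hypothetical model of the quoted x86.ad conditions; not HotSpot code.
    // Returns true when none of the three quoted conditions rejects the shape.
    static boolean loadMaskSupported(int useSSE, int useAVX, int vlen, int sizeInBits) {
        if (useSSE <= 3) return false;
        if (vlen == 1 || vlen == 2) return false;           // Implementation limitation
        if (sizeInBits >= 256 && useAVX < 2) return false;  // Implementation limitation
        return true;
    }

    public static void main(String[] args) {
        int longLanes128 = 128 / Long.SIZE; // 2 lanes, as in Long128Species
        int longLanes256 = 256 / Long.SIZE; // 4 lanes

        // 2 lanes: rejected even with UseSSE=4 and UseAVX=2
        System.out.println(loadMaskSupported(4, 2, longLanes128, 128)); // false
        // 4 lanes at 256 bits with UseAVX=2: passes all three conditions
        System.out.println(loadMaskSupported(4, 2, longLanes256, 256)); // true
    }
}
```

This only models the predicate as quoted above; the real matcher may apply further checks that are not shown in this message.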
>
> 2. anyTrue/allTrue on 512-bit vectors
> This is another intrinsic failure:
>     case Op_VectorTest:
>       if (UseAVX <= 0) { ret_value = false; }
>       else if (size_in_bits != 128 && size_in_bits != 256) { ret_value = false; } // Implementation limitation
>       break;
> This also blocks some of our 512-bit optimizations, and I have no workaround for it yet.
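
Since the quoted Op_VectorTest check accepts only 128-bit and 256-bit sizes, one direction that might be worth trying is to test a 512-bit mask as two 256-bit halves and combine the results: allTrue is the AND of the halves, anyTrue is the OR. The sketch below shows only that logical identity in plain Java, with the mask modeled as a boolean[] (8 lanes of 64 bits standing in for 512 bits). All names are hypothetical, no Vector API calls are used, and whether splitting actually helps performance in real 512-bit code is untested.

```java
public class MaskTestSplit {
    // Scalar reference: true if every lane in [from, to) is set.
    static boolean allTrue(boolean[] m, int from, int to) {
        for (int i = from; i < to; i++) {
            if (!m[i]) return false;
        }
        return true;
    }

    // Scalar reference: true if at least one lane in [from, to) is set.
    static boolean anyTrue(boolean[] m, int from, int to) {
        for (int i = from; i < to; i++) {
            if (m[i]) return true;
        }
        return false;
    }

    // Test the lower and upper halves separately, then combine.
    static boolean allTrueSplit(boolean[] m) {
        int half = m.length / 2;
        return allTrue(m, 0, half) && allTrue(m, half, m.length);
    }

    static boolean anyTrueSplit(boolean[] m) {
        int half = m.length / 2;
        return anyTrue(m, 0, half) || anyTrue(m, half, m.length);
    }

    public static void main(String[] args) {
        // 8 lanes model a 512-bit mask over 64-bit elements; each half is 4 lanes.
        boolean[] onlyUpperLaneSet = {false, false, false, false, false, false, false, true};
        System.out.println(anyTrueSplit(onlyUpperLaneSet)); // true
        System.out.println(allTrueSplit(onlyUpperLaneSet)); // false
    }
}
```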
>
> Please share your advice. Thanks!
>
> Regards,
> Zhuo
>