RFR: 8291600: [vectorapi] vector cast op check is not always needed for vector mask cast

Xiaohong Gong xgong at openjdk.org
Tue Aug 16 08:39:16 UTC 2022


On Tue, 16 Aug 2022 02:18:31 GMT, Quan Anh Mai <duke at openjdk.org> wrote:

>>> > Yes, you are right, the code would be mostly the same, which means we can reuse the existing match rules to additionally match `VectorMaskCast` for those cases. The other cases differ, in particular narrowing casts to subword types: AVX < 3 does not support truncating casts and only provides saturating cast instructions, so we need to truncate the upper bits ourselves. For example, a cast from `int` to `byte` is done as follows:
>>> > ```
>>> > vpand dst, src, [external address mask]
>>> > vpackusdw dst, dst
>>> > vpermq dst, dst, 0x08
>>> > vpackuswb dst, dst
>>> > ```
>>> > 
>>> > For a vector mask cast, we can drop the initial masking step and use the `vpackss` instructions instead, which removes the need to reference memory. Thanks.
>>> 
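
To illustrate the point quoted above, here is a minimal scalar sketch in Java of what happens to a single lane. It is not HotSpot code: the class and method names are hypothetical, the `vpermq` lane shuffle is not modelled, and it assumes the `vpackss` instructions referred to are `vpackssdw` and `vpacksswb`. It shows why an arbitrary `int` lane needs the `vpand` before the unsigned saturating packs, while a mask lane (all zeros or all ones) survives the signed saturating packs unchanged.

```java
public class SaturatingPackSketch {
    // One lane of a signed saturating int -> short narrow (modelled on vpackssdw).
    static short packSignedIntToShort(int v) {
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v));
    }

    // One lane of a signed saturating short -> byte narrow (modelled on vpacksswb).
    static byte packSignedShortToByte(short v) {
        return (byte) Math.max(Byte.MIN_VALUE, Math.min(Byte.MAX_VALUE, v));
    }

    // One lane of an unsigned saturating int -> 16-bit narrow (modelled on vpackusdw):
    // the signed input is clamped to [0, 0xFFFF].
    static int packUnsignedIntToU16(int v) {
        return Math.max(0, Math.min(0xFFFF, v));
    }

    // One lane of an unsigned saturating 16-bit -> 8-bit narrow (modelled on vpackuswb):
    // the signed 16-bit input is clamped to [0, 0xFF].
    static int packUnsignedShortToU8(short v) {
        return Math.max(0, Math.min(0xFF, v));
    }

    // Truncating int -> byte cast: mask first, so saturation never triggers.
    static byte truncatingCast(int v) {
        int masked = v & 0xFF;                      // the vpand step
        int word = packUnsignedIntToU16(masked);    // value is in [0, 255], unchanged
        return (byte) packUnsignedShortToU8((short) word);
    }

    // The same packs without the mask: large values saturate instead of truncating.
    static byte saturatingCastWithoutMask(int v) {
        int word = packUnsignedIntToU16(v);
        return (byte) packUnsignedShortToU8((short) word);
    }

    // Mask lane cast: the lane is 0 or -1, so signed saturation preserves it.
    static byte maskCast(int maskLane) {
        return packSignedShortToByte(packSignedIntToShort(maskLane));
    }

    public static void main(String[] args) {
        System.out.println("truncatingCast(300)            = " + truncatingCast(300));
        System.out.println("saturatingCastWithoutMask(300) = " + saturatingCastWithoutMask(300));
        System.out.println("maskCast(-1)                   = " + maskCast(-1));
        System.out.println("maskCast(0)                    = " + maskCast(0));
    }
}
```

Because a mask lane is only ever 0 or -1, both values already lie inside the signed range of every narrower type, so the signed packs never clamp them and no constant has to be loaded from memory.
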
>>> I see, thanks! So would you like to provide the missing x86 backend implementation for `VectorMaskCast`? If so, we can use `VectorMaskCast` for all cases to simplify the current code. Thanks a lot!
>> 
>> Maybe I can refactor the code in this patch and add the same backend rules as for `VectorCast`? Then you can create a follow-up patch to improve the x86 codegen if you like. WDYT?
>
> @XiaohongGong I have created a PR against your branch. It only contains changes in the x86 backend, to avoid conflicts with any changes you may have made. Could you have a look? Thanks
> 
> https://github.com/XiaohongGong/jdk/pull/2

Hi @merykitty, thanks for the x86 backend implementation of `VectorMaskCast`. I refactored the current code locally and tested it together with your backend patch, and found no issues so far. I do not expect any regression, but I will run more benchmarks with it. BTW, I'd like to push the new changes to GitHub after https://github.com/openjdk/jdk/pull/9346 is merged, because the aarch64 backend rules depend on it. Thanks!

-------------

PR: https://git.openjdk.org/jdk/pull/9737


More information about the hotspot-compiler-dev mailing list