[vectorIntrinsics+mask] RFR: 8273057: [vector] New VectorAPI "SelectiveStore"

Viswanathan, Sandhya sandhya.viswanathan at intel.com
Fri Sep 3 01:43:06 UTC 2021


Today we have the rearrange, slice and unslice methods to do cross-lane movements.
It looks to me that the best way is to extend this set and provide compress and expand as primitives. That seems more natural from the programmer's perspective, and it also makes good code gen easier.
Doing it as a special-purpose rearrange with mask could be confusing, because in the current API rearrange with mask has a different meaning. It is also not clear how the backend would then achieve the desired single-instruction code gen.
As John suggests, on architectures that don’t support compress/expand, the underlying compress/expand could be implemented in terms of mask -> partition shuffle -> rearrange.
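
The mask -> partition shuffle -> rearrange fallback can be sketched with a scalar model in plain Java. This is only an illustration of the lane semantics, not the Vector API implementation: the class and method names are hypothetical, plain arrays stand in for vectors, and zeroing the lanes past the selected ones is an assumption made here for definiteness.

```java
import java.util.Arrays;

// Scalar model only: int[] stands in for a vector, boolean[] for a mask.
// All names here are hypothetical and are not part of the Vector API.
final class CompressSketch {

    // Partitioning shuffle: indices of set lanes first (in lane order),
    // followed by indices of unset lanes. The result is a permutation.
    static int[] partitionShuffle(boolean[] mask) {
        int[] shuffle = new int[mask.length];
        int next = 0;
        for (int i = 0; i < mask.length; i++) if (mask[i]) shuffle[next++] = i;
        for (int i = 0; i < mask.length; i++) if (!mask[i]) shuffle[next++] = i;
        return shuffle;
    }

    // Gather-style rearrange: result[i] = v[shuffle[i]].
    static int[] rearrange(int[] v, int[] shuffle) {
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) r[i] = v[shuffle[i]];
        return r;
    }

    // compress = rearrange by the partitioning shuffle, then clear the tail.
    // Zeroing the tail is an assumption of this sketch.
    static int[] compress(int[] v, boolean[] mask) {
        int[] r = rearrange(v, partitionShuffle(mask));
        int count = 0;
        for (boolean b : mask) if (b) count++;
        Arrays.fill(r, count, r.length, 0);
        return r;
    }

    // expand = rearrange by the inverse permutation, then clear unset lanes.
    static int[] expand(int[] v, boolean[] mask) {
        int[] shuffle = partitionShuffle(mask);
        int[] inverse = new int[shuffle.length];
        for (int i = 0; i < shuffle.length; i++) inverse[shuffle[i]] = i;
        int[] r = rearrange(v, inverse);
        for (int i = 0; i < r.length; i++) if (!mask[i]) r[i] = 0;
        return r;
    }

    public static void main(String[] args) {
        int[] v = {10, 20, 30, 40};
        boolean[] m = {false, true, false, true};
        System.out.println(Arrays.toString(compress(v, m)));            // prints [20, 40, 0, 0]
        System.out.println(Arrays.toString(expand(compress(v, m), m))); // prints [0, 20, 0, 40]
    }
}
```

On hardware with a native compress/expand instruction, the whole of compress() or expand() would ideally map to that one instruction, which is the argument for exposing them as primitives.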

Best Regards,
Sandhya

-----Original Message-----
From: panama-dev <panama-dev-retn at openjdk.java.net> On Behalf Of John Rose
Sent: Thursday, September 02, 2021 11:36 AM
To: Paul Sandoz <paul.sandoz at oracle.com>
Cc: Ningsheng Jian <njian at openjdk.java.net>; panama-dev <panama-dev at openjdk.java.net>
Subject: Re: [vectorIntrinsics+mask] RFR: 8273057: [vector] New VectorAPI "SelectiveStore"

On Aug 31, 2021, at 9:44 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:

Yes, my suggestion is that a vector-to-vector compress might be a composition of mask -> partitioning shuffle -> rearrange, such that on supported architectures it reduces down to a single instruction. In combination with a store and prefix mask it may be possible to reduce further to a single instruction accepting the source vector, mask, and a memory location.
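
To make the "compress, then store under a prefix mask" combination concrete, here is a scalar sketch in plain Java. It is not Vector API code: the names selectiveStore and filterPositive and the chunk width of 4 are hypothetical, and arrays stand in for vectors and masks.

```java
import java.util.Arrays;

// Scalar model only; every name in this class is hypothetical.
final class SelectiveStoreSketch {

    // Models: compress the selected lanes, then store only the first
    // `count` lanes (a prefix mask) to dst at offset. Returns count.
    static int selectiveStore(int[] lanes, boolean[] mask, int[] dst, int offset) {
        // Step 1: vector-to-vector compress.
        int[] compressed = new int[lanes.length];
        int count = 0;
        for (int i = 0; i < lanes.length; i++) {
            if (mask[i]) compressed[count++] = lanes[i];
        }
        // Step 2: masked store with a prefix mask of `count` set lanes.
        for (int i = 0; i < count; i++) {
            dst[offset + i] = compressed[i];
        }
        return count;
    }

    // Typical use: filter an array chunk by chunk, advancing the output
    // position by the number of lanes each store actually wrote.
    static int[] filterPositive(int[] src) {
        final int LANES = 4; // assumed chunk width for illustration
        int[] dst = new int[src.length];
        int out = 0;
        for (int base = 0; base < src.length; base += LANES) {
            int n = Math.min(LANES, src.length - base);
            int[] lanes = Arrays.copyOfRange(src, base, base + n);
            boolean[] mask = new boolean[n];
            for (int i = 0; i < n; i++) mask[i] = lanes[i] > 0;
            out += selectiveStore(lanes, mask, dst, out);
        }
        return Arrays.copyOf(dst, out);
    }
}
```

The two steps inside selectiveStore are what a fused instruction taking the source vector, mask, and memory location would do in one go.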

As I argued in my previous message, it may be just as well to think of compress as its own primitive, even if under the covers it is implemented using shuffle.

I think it’s worth thinking more about anti-shuffle, what that would be like.  Mathematical permutations do not come in two kinds, but shuffles and anti-shuffles are distinct because only the former duplicate and only the latter collide.
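
One way to picture the distinction is a scalar sketch in plain Java, reading a shuffle as gather-style (each output lane names its source) and an anti-shuffle as scatter-style (each input lane names its destination). That reading, the method names, and the "last write wins, unwritten lanes are zero" collision rule are all assumptions of this sketch rather than anything settled in the thread.

```java
// Scalar model only; names and collision policy are hypothetical.
final class AntiShuffleSketch {

    // Shuffle (gather): result[i] = v[idx[i]].
    // A repeated index duplicates a source lane; nothing can collide.
    static int[] shuffle(int[] v, int[] idx) {
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) r[i] = v[idx[i]];
        return r;
    }

    // Anti-shuffle (scatter): result[idx[i]] = v[i].
    // A repeated index makes two source lanes collide on one destination;
    // here the later lane wins and unwritten lanes stay zero (assumption).
    static int[] antiShuffle(int[] v, int[] idx) {
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) r[idx[i]] = v[i];
        return r;
    }
}
```

With v = {1, 2, 3, 4} and idx = {0, 0, 1, 1}, shuffle gives {1, 1, 2, 2} (duplication) while antiShuffle gives {2, 4, 0, 0} (collision). When idx is a true permutation, the two are inverses of each other, which matches the remark that mathematical permutations do not come in two kinds.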



