[vectorIntrinsics+mask] RFR: 8264563: Add masked vector intrinsics for binary/store operations [v5]

Xiaohong Gong xgong at openjdk.java.net
Sun Apr 25 02:10:51 UTC 2021


On Thu, 22 Apr 2021 09:42:12 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>>> > > I am tempted to have a template method for both non-mask and mask. Thus specialized code (that before calling the intrinsic) may be somewhat duplicated before calling `binaryMaskOp`, using a constant for the op -> lambda function, and passing in null or the mask value.
>>> > 
>>> > 
>>> > Yeah, that's why I added the `lanewise0` and `lanewise0Template` methods.
>>> 
>>> I am suggesting to separate out templates for non-masked and masked, similar to the current pattern. That would result in some duplication for the special cases, but i think makes it easier to reason about in one place, rather than being somewhat spread out. Perhaps easier to explain as code. We should just iterate on the branch.
>> 
>> Thanks for your more details! I think I misunderstood your point before. I will think about your suggestion. Thanks so much!
>
>> > > > I am tempted to have a template method for both non-mask and mask. Thus specialized code (that before calling the intrinsic) may be somewhat duplicated before calling `binaryMaskOp`, using a constant for the op -> lambda function, and passing in null or the mask value.
>> > > 
>> > > 
>> > > Yeah, that's why I added the `lanewise0` and `lanewise0Template` methods.
>> > 
>> > 
>> > I am suggesting to separate out templates for non-masked and masked, similar to the current pattern. That would result in some duplication for the special cases, but i think makes it easier to reason about in one place, rather than being somewhat spread out. Perhaps easier to explain as code. We should just iterate on the branch.
>> 
>> Thanks for your more details! I think I misunderstood your point before. I will think about your suggestion. Thanks so much!
> 
> Hi @PaulSandoz , the latest commit changes something according to your suggestion. It totally separateS the special handles for the special cases for non-masked and masked version, although finally they still use one template `lanewise0Template`. Could you please take a look again? If it matches what you suggested above, do you think it need to separate out the template methods further? Thanks!

> @XiaohongGong See the following diff which more accurately depicts what i was suggesting:
> https://gist.github.com/PaulSandoz/d93fa2a78c3c7be5fdb2d714371639ce
> i.e. we follow the exiting template pattern more directly in both the non-mask and mask.
> Perhaps we find later on we can consolidate, but right now i find that pattern easier to follow and it is very clear on the null contract for no mask.

Thanks for your suggestion! I also tried this way before the current version. However, the main concern for me is the species checking for vector mask.  See the codes:

    @Override
    @ForceInline
    public final Int128Vector lanewise(Binary op, Vector<Integer> v, VectorMask<Integer> m) {
        return (Int128Vector) super.lanewiseTemplate(op, Int128Mask.class, v, (Int128Mask) m);  // specialize
    }

The override `lanewise` method must cast the mask value to `Int128Mask` before it calls the super class's `lanewiseTemplate` which does the species checking later. It seems more reasonable for me to do the species checking before doing the type casting. That means the species checking must be done in the `lanewise` method while not `lanewiseTemplate`. Add considering adding the species checking for each subclass's `lanewise` seems duplicate, finally I change the codes to the current style.

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/57


More information about the panama-dev mailing list