RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v2]

Jatin Bhateja jbhateja at openjdk.java.net
Thu Apr 28 06:30:42 UTC 2022


On Wed, 20 Apr 2022 02:44:39 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>>> The blend should be with the vector to be stored, so that masked lanes contain the elements to store and unmasked lanes contain the loaded elements; storing the loaded elements back leaves those values unchanged.
>> 
>> It may not work if the memory is beyond the legally accessible address space of the process; a corner case could be a page boundary. Thus re-composing an intermediate vector that only partially contains actual updates, while effectively performing a full vector write to the destination address, may not work in all scenarios.
>
> Thanks for the comment! So how about adding a check for the valid array range, as in the masked vector load?
> Code like:
> 
> public final
>     void intoArray(byte[] a, int offset,
>                    VectorMask<Byte> m) {
>         if (m.allTrue()) {
>             intoArray(a, offset);
>         } else {
>             ByteSpecies vsp = vspecies();
>             if (offset >= 0 && offset <= (a.length - vsp.length())) {     // a full range check
>                 intoArray0(a, offset, m, /* usePred */ false);                   // can be vectorized by load+blend_store
>             } else {
>                 checkMaskFromIndexSize(offset, vsp, m, 1, a.length);
>                 intoArray0(a, offset, m, /* usePred */ true);                    // can only be vectorized by the predicated store
>             }
>         }
>     }

Thanks, this looks OK, since the out-of-range case will not be intrinsified if the target does not support predicated vector stores.
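
To make the two paths concrete, here is a minimal scalar sketch of the dispatch proposed above. It is only an illustration under stated assumptions, not the JDK implementation: the class name, the fixed LANES value, the boolean[] mask, and the helper names blendStore and predicatedStore are all hypothetical, and plain arrays stand in for ByteVector and VectorMask<Byte>.

```java
// Hypothetical scalar model of the proposed masked-store dispatch.
// Not the JDK code: names, LANES, and the boolean[] mask are assumptions.
final class MaskedStoreSketch {
    static final int LANES = 8; // assumed vector length in lanes

    // Models load + blend + full-width store. It reads and writes all LANES
    // elements, so it is only safe when [offset, offset + LANES) is in bounds.
    static void blendStore(byte[] a, int offset, byte[] v, boolean[] m) {
        byte[] blended = new byte[LANES];
        for (int i = 0; i < LANES; i++) {
            byte loaded = a[offset + i];          // full-width load
            blended[i] = m[i] ? v[i] : loaded;    // blend under the mask
        }
        for (int i = 0; i < LANES; i++) {
            a[offset + i] = blended[i];           // full-width store
        }
    }

    // Models a predicated store: only lanes whose mask bit is set touch memory.
    static void predicatedStore(byte[] a, int offset, byte[] v, boolean[] m) {
        for (int i = 0; i < LANES; i++) {
            if (m[i]) {
                a[offset + i] = v[i]; // throws if a set lane is out of range
            }
        }
    }

    // Dispatch on the full range check; returns the name of the path taken.
    static String intoArray(byte[] a, int offset, byte[] v, boolean[] m) {
        if (offset >= 0 && offset <= a.length - LANES) {
            blendStore(a, offset, v, m);
            return "blend";
        } else {
            predicatedStore(a, offset, v, m);
            return "predicated";
        }
    }

    public static void main(String[] args) {
        byte[] a = new byte[10];
        byte[] v = {1, 2, 3, 4, 5, 6, 7, 8};
        boolean[] head = {true, false, true, false, false, false, false, false};
        boolean[] tail = {true, true, false, false, false, false, false, false};
        System.out.println(intoArray(a, 0, v, head)); // whole window in range
        System.out.println(intoArray(a, 8, v, tail)); // window crosses the array end
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

In this model the full-width path is never taken when the vector window extends past the end of the array, which is the property the range check is meant to guarantee.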

-------------

PR: https://git.openjdk.java.net/jdk/pull/8035

