RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature
Jatin Bhateja
jbhateja at openjdk.java.net
Mon Apr 11 09:08:41 UTC 2022
On Thu, 31 Mar 2022 03:53:15 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> Yeah, maybe I misunderstood what you mean. So maybe the masked store `(store(src, m))` could be implemented with:
>>
>> 1) v1 = load
>> 2) v2 = blend(v1, src, m)
>> 3) store(v2)
>>
>> Let's record this in JBS and fix it with a follow-up patch. Thanks!
>
> The optimization for masked store is recorded to: https://bugs.openjdk.java.net/browse/JDK-8284050
> The blend should be with the intended-to-store vector, so that the masked lanes contain the elements to be stored and the unmasked lanes contain the loaded elements, which are stored back and therefore leave those values unchanged.
This may not work if part of the full-width access falls outside the legally accessible address space of the process; a page boundary is one such corner case. Re-composing an intermediate vector that only partially contains actual updates, and then performing a full-vector write to the destination address, may therefore not work in all scenarios.
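
To make the concern concrete, here is a minimal scalar sketch in plain Java. It is not the Vector API and not the HotSpot implementation; the class name, the method names and the lane count of 4 are all hypothetical. An array bound stands in for the page boundary: the masked store only touches lanes whose mask bit is set, while the load/blend/store emulation touches every lane and therefore faults at the tail.

```java
import java.util.Arrays;

// Hypothetical scalar model; LANES and all names are assumptions for illustration.
class MaskedStoreSketch {
    static final int LANES = 4;

    // Reference semantics: only lanes with a set mask bit access memory.
    static void maskedStore(int[] dst, int offset, int[] src, boolean[] m) {
        for (int i = 0; i < LANES; i++) {
            if (m[i]) {
                dst[offset + i] = src[i];
            }
        }
    }

    // Emulation from the quoted steps: full load, blend, full store.
    static void blendStore(int[] dst, int offset, int[] src, boolean[] m) {
        int[] v1 = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            v1[i] = dst[offset + i];          // 1) full-width load, touches every lane
        }
        int[] v2 = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            v2[i] = m[i] ? src[i] : v1[i];    // 2) blend: src where masked, old value elsewhere
        }
        for (int i = 0; i < LANES; i++) {
            dst[offset + i] = v2[i];          // 3) full-width store, touches every lane
        }
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40};
        boolean[] tailMask = {true, true, false, false};

        int[] a = new int[6];
        maskedStore(a, 4, src, tailMask);     // only a[4] and a[5] are written
        System.out.println(Arrays.toString(a));

        int[] b = new int[6];
        try {
            blendStore(b, 4, src, tailMask);  // lanes 2 and 3 map to b[6] and b[7]
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("full-width access faulted: " + e.getMessage());
        }
    }
}
```

When the whole vector is in bounds the two methods produce the same result; they differ only when the unmasked lanes map to memory that must not be touched, which is the case described above.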
-------------
PR: https://git.openjdk.java.net/jdk/pull/8035