RFR: 8318650: Optimized subword gather for x86 targets. [v2]

Jatin Bhateja jbhateja at openjdk.org
Sun Oct 29 22:14:30 UTC 2023


On Fri, 27 Oct 2023 06:42:07 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> Sub-word gathers do not emit any predicated instructions, thus only VecMaskUseLoad is relevant in this context; however, AVX512 and SVE do have direct predicated gather instructions for 32/64-bit types.
>
> I see, thanks! `VecMaskUsePred` is added to check `match_rule_supported_vector_masked` for normal vector ops. That's because we may add an additional mask input for those vector ops. But `Load|StoreVectorScatterMasked` are different. They point to the masked operations no matter how they are implemented. So just `VecMaskUseLoad` is fine for both of these ops, in my view.

I think it's better to align the masked sub-word gather implementation with the non-sub-word ones, i.e. support intrinsification only for predicated targets. The respective backends may then choose either to emit a predicated loop, as this patch does, or to emit a predicated instruction directly if the target supports it.

With this we may see some performance penalty for masked sub-word gathers on non-predicated targets, since the original Java implementation will now become the fallback code, but the same penalty exists for non-sub-word gathers today.
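
For reference, the semantics of a masked sub-word (byte) gather can be sketched as a scalar loop in plain Java. This is only an illustration under my own assumptions: the class `MaskedGatherSketch` and the method `maskedGather` are hypothetical names, and the loop is neither the JDK's actual fallback implementation nor the code this patch emits. Lanes whose mask bit is unset are left as zero and their index is never dereferenced.

```java
import java.util.Arrays;

// Hypothetical illustration of masked byte-gather semantics; not JDK code.
final class MaskedGatherSketch {

    // For each lane i: result[i] = mask[i] ? src[base + indexMap[i]] : 0.
    // Masked-off lanes are skipped entirely, so their indices are not checked.
    static byte[] maskedGather(byte[] src, int base, int[] indexMap, boolean[] mask) {
        if (indexMap.length != mask.length) {
            throw new IllegalArgumentException("indexMap and mask must have the same lane count");
        }
        byte[] result = new byte[indexMap.length];
        for (int lane = 0; lane < indexMap.length; lane++) {
            if (mask[lane]) {
                // Array access throws ArrayIndexOutOfBoundsException on a bad index.
                result[lane] = src[base + indexMap[lane]];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] src = {10, 20, 30, 40, 50};
        int[] indexMap = {3, 0, 2, 1};
        boolean[] mask = {true, false, true, true};
        // Lane 1 is masked off and stays zero.
        System.out.println(Arrays.toString(maskedGather(src, 1, indexMap, mask)));
        // prints "[50, 0, 40, 30]"
    }
}
```

A backend on a predicated target is free to replace such a loop with a single predicated gather instruction where one exists for the element size, which is the choice described above.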

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1375522151


More information about the hotspot-compiler-dev mailing list