RFR: 8351623: VectorAPI: Add SVE implementation of subword gather load operation [v5]

Emanuel Peter epeter at openjdk.org
Tue Sep 9 07:32:39 UTC 2025


On Mon, 8 Sep 2025 02:57:55 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>>> That semantic is not quite what I would expect from `Concatenate`. Maybe we can call it something else? `VectorConcatenateAndNarrowNode`?
>> 
>> Yeah, `VectorConcatenateAndNarrowNode` would be a much better match. I just thought the name would be too long. I will change it in the next commit. Thanks for your suggestion!
>
>> Have you considered using `2x Cast + Concatenate` instead, and just matching that in the backend? I don't remember how to do the mere Concat, but it should be possible via the `unslice` or some other operation that concatenates two vectors.
> 
> Would using `2x Cast + Concatenate` make the IR and the match rule more complex? A mere concatenate would be something like `vector slice` in the Vector API. It concatenates two vectors into one, with an index denoting the merging position, and it requires that the two input vectors and the dst vector all have the same vector type. Hence, if we want to split this operation into cast and concatenate, the IR would be (assuming the original type of `v1/v2` is `4-int`, so the result type should be `8-short`):
> 1) Narrow two input vectors:
> `v1 = VectorCast(v1)  (4-short); v2 = VectorCast(v2) (4-short)`. 
> The vector length (number of lanes) is unchanged while the element size is halved. Hence the vector length in bytes is halved as well.
> 2) Resize `v1` and `v2` to double the vector length. The higher bits are cleared:
> `v1 = VectorReinterpret(v1) (8-short); v2 = VectorReinterpret(v2) (8-short)`.
> 3) Concatenate `v1` and `v2` like slice. The position is the middle of the vector length.
> `v = VectorSlice(v1, v2, 4)  (8-short)`.
> 
> If we want to merge these IR nodes in the backend, would the match rule be more complex? I will consider it.

I'm not saying I know that this alternative would be better. I'm just worried about having extra IR nodes, and then optimizations are more complex / just don't work because we don't handle all nodes.
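
For reference, the three-step decomposition quoted above can be modeled on plain Java arrays. This is only a sketch of the lane semantics as described in the quote, not C2 IR and not the Vector API: the class and helper names (`ConcatNarrowSketch`, `castToShort`, `widenWithZeros`, `concatLowHalves`) are hypothetical, the cast is assumed to be a truncating int-to-short narrowing, and step 3 models the intended result (low half of `v1` followed by low half of `v2`) rather than the exact lane order of any particular `slice` operation.

```java
import java.util.Arrays;

class ConcatNarrowSketch {
    // Step 1 (models VectorCast): narrow each int lane to short.
    // The lane count is unchanged, so the size in bytes is halved.
    static short[] castToShort(int[] v) {
        short[] r = new short[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = (short) v[i]; // truncating cast, assumed here
        }
        return r;
    }

    // Step 2 (models VectorReinterpret): double the lane count.
    // Arrays.copyOf pads with zeros, matching "the higher bits are cleared".
    static short[] widenWithZeros(short[] v) {
        return Arrays.copyOf(v, v.length * 2);
    }

    // Step 3 (models the concatenate): take the first `pos` lanes of `a`,
    // then fill the remaining lanes from the start of `b`.
    static short[] concatLowHalves(short[] a, short[] b, int pos) {
        short[] r = new short[a.length];
        System.arraycopy(a, 0, r, 0, pos);
        System.arraycopy(b, 0, r, pos, r.length - pos);
        return r;
    }

    // The fused operation: two 4-int inputs produce one 8-short result.
    static short[] concatenateAndNarrow(int[] v1, int[] v2) {
        short[] a = widenWithZeros(castToShort(v1)); // 4-short, then 8-short
        short[] b = widenWithZeros(castToShort(v2)); // 4-short, then 8-short
        return concatLowHalves(a, b, v1.length);     // merge at the middle
    }

    public static void main(String[] args) {
        int[] v1 = {1, 2, 3, 4};
        int[] v2 = {5, 6, 7, 0x12345}; // last lane does not fit in a short
        System.out.println(Arrays.toString(concatenateAndNarrow(v1, v2)));
    }
}
```

The model shows why the decomposed form needs five nodes (two casts, two reinterprets, one concatenate) where the fused node needs one, which is the trade-off between match-rule complexity and the number of distinct IR node types being discussed.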

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26236#discussion_r2332301985

