RFR: 8366702: C2 SuperWord: refactor VTransform vector nodes [v4]
Galder Zamarreño
galder at openjdk.org
Wed Sep 10 08:16:22 UTC 2025
On Mon, 8 Sep 2025 08:28:51 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> I'm working on cost-modeling, and am integrating some smaller changes from this proof-of-concept PR:
>> https://github.com/openjdk/jdk/pull/20964
>> [See plan overfiew.](https://bugs.openjdk.org/browse/JDK-8340093)
>>
>> This is a pure refactoring - no change in behaviour. I'm presenting it like this because it will make reviews easier.
>>
>> ---------------------------------
>>
>> I have to say: I'm very sorry for this refactoring. I took some decisions in https://github.com/openjdk/jdk/pull/19719 that I'm now partially undoing. I moved too much logic from `SuperWord::output` (now called `SuperWordVTransformBuilder::make_vector_vtnode_for_pack`) to the `VTransform...Node::apply`. https://github.com/openjdk/jdk/pull/19719 was a roughly 1.5k line change, and I took about a 0.3k misstep that I'm now correcting here ;)
>>
>> I had accidentially made the `VTransformGraph` too close to the `PackSet`, and not close enough to the future vectorized C2 Graph. And that makes some future changes hard.
>>
>> My vision:
>> - VLoop / VLoopAnalyzer look at the scalar loop and prepare it for SuperWord
>> - SuperWord creates the `PackSet`: some nodes are packed, all others are scalar.
>> - `SuperWordVTransformBuilder` converts the `PackSet` into the `VTransformGraph`
>> - The `VTransformGraph` very closely represents the C2 vectorized loop after vectorization
>> - It does not need to know which `nodes` it packs, it rather just needs to know how to generate the new vector nodes
>> - That means it is straight-forward to compute cost
>> - And it also makes optimizations on that graph easier
>> - And the `apply` methods are simpler too
>>
>> ----------------------------------
>>
>> So therefore, the main goal was to make the `VTransform...Node::apply` calls simpler again. And move the logic back to `SuperWordVTransformBuilder::make_vector_vtnode_for_pack`.
>>
>> One important step to making the the `VTransformGraph` less of a `PackSet` is to remove reliance on `nodes` for the vector nodes.
>>
>> What I did:
>> - Moving a lot of the logic in `VTransformElementWiseVectorNode::apply` to `SuperWordVTransformBuilder::make_vector_vtnode_for_pack`.
>> - Will make it easier to optimize and compute cost in future RFE's.
>> - `VTransformVectorNodePrototype`: packs a lot of the info for `VTransformVectorNode`.
>> - pass info about `bt`, `vlen`, `sopc` instead of the `pack` -> allows us to eventually remove the dependency on `nodes`.
>> - New vector nodes, they are special cases I split away from ...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
> fix typo
Changes requested by galder (Author).
src/hotspot/share/opto/vtransform.cpp line 795:
> 793:
> 794: VectorNode* vn = nullptr;
> 795: if (req() <= 3) {
I'm wondering if with this change, the `assert(2 <= req() && req() <= 4, "Must have 1-3 inputs");` call could moved and be made more specific for these 2 sides of the conditon.
For example, we know that if we go down the `req() <= 3` route, then we're in the 1-2 inputs? And if if we're in the other one we're at least 3 inputs.
Then, with that in mind, I wonder if we couldn't move ` Node* in3 = (req() >= 4) ? apply_state.transformed_node(in_req(3)) : nullptr;` to be computed only in the `else` and convert it to `Node* in3 = apply_state.transformed_node(in_req(3))`?
-------------
PR Review: https://git.openjdk.org/jdk/pull/27056#pullrequestreview-3204977764
PR Review Comment: https://git.openjdk.org/jdk/pull/27056#discussion_r2335935271
More information about the hotspot-compiler-dev
mailing list