RFR: 8342662: C2: Add new phase for backend-specific lowering [v2]

Thu Oct 24 11:54:08 UTC 2024

On Thu, 24 Oct 2024 02:08:14 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>> src/hotspot/share/opto/phaseX.cpp line 2277:
>> 
>>> 2275: 
>>> 2276:   // Try to find an existing version of the same node
>>> 2277:   Node* existing = _igvn->hash_find_insert(n);
>> 
>> I think it would be easier if you have a switch in `gvn` that says you passed the point of doing `Ideal`, moving forward you will probably want to have a `IdealLowering` to transform nodes during this phase. `Identity` I think is fine since it returns an existing node.
>
> Ah, do you mean having a method in `Node` that holds the lowering code? I was originally planning on doing it this way, but I think it would pose some problems where certain nodes' `Lower()` methods would only be overridden on certain backends, which would become messy. One of my goals was to keep the lowering code separate from shared code, so new lowerings could be implemented by just updating the main `lower_node` function in the backend.
> About GVN, I think it makes sense to do it in a separate phase because GVN is used quite generally whereas lowering is only done once. Since the `transform_old` function in IGVN is pretty complex as well, I think it's simpler to just implement `Value()` and GVN separately. Thinking on it more I think Identity is probably a good idea too, since as you mention it can't introduce new nodes into the graph. Mainly I wanted to avoid the case where `Ideal()` could fold a lowered graph back into the original form, causing an infinite loop.

I mean we might want to run another kind of `Ideal` that will replace the normal `Ideal` on a node after its lowering. For example, consider this:

    vector<int,8> v;
    u = v.withLane(0, a).withLane(1, b);

This will be parsed into:

    vector<int,8> v;
    v0 = InsertI(v, 4, a);
    u = InsertI(v0, 5, b);

And can be lowered to:

    vector<int,8> v;
    vector<int,4> v1 = VectorExtract(v, 1);
    v2 = InsertI(v1, 0, a);
    v0 = VectorInsert(v, 1, v2);
    vector<int,4> v3 = VectorExtract(v0, 1);
    v4 = InsertI(v3, 1, b);
    u = VectorInsert(v0, 1, v4);

Which represents this sequence:

    ymm0;
    vextracti128 xmm1, ymm0, 1;
    vpinsrd xmm1, xmm1, a, 0;
    vinserti128 ymm0, ymm0, xmm1, 1;
    vextracti128 xmm1, ymm0, 1;
    vpinsrd xmm1, xmm1, b, 1;
    vinserti128 ymm0, ymm0, xmm1, 1;

As you can imagine this sequence is pretty inefficient, what we really want is:

    ymm0;
    vextracti128 xmm1, ymm0, 1;
    vpinsrd xmm1, xmm1, a, 0;
    vpinsrd xmm1, xmm1, b, 1;
    vinserti128 ymm0, ymm0, xmm1, 1;

Looking back at the graph, we can `Identity` `v3` into `v2` since it is pretty obvious that we just do an insert and extract from the same place. However, to transform `u = VectorInsert(v0, 1, v4)` into `u = VectorInsert(v, 1, v4)`, we would need  an `Ideal`-like transformation to see that we just insert into a location twice and remove the intermediate `VectorInsert`.

As a result, in addition to ease of implementation, I think you may extend `PhaseIterGVN` and override its `PhaseGVN::apply_ideal` to return `nullptr` for now, and take advantages of `PhaseIterGVN::optimize` to do the iterative transformation for you.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21599#discussion_r1814828100