RFR: 8342662: C2: Add new phase for backend-specific lowering [v6]

Tue Jan 14 11:06:52 UTC 2025

On Tue, 14 Jan 2025 06:45:41 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Implement apply_identity
>
> src/hotspot/share/opto/phaseX.cpp line 2301:
> 
>> 2299:   // that may undo the changes done during lowering.
>> 2300: 
>> 2301:   return k->LoweredIdeal(this);
> 
> I'm sorry that I still cannot understand well what this method is expected to do for a node. For example, if we need to add some architecture specific optimization for `MulNode` like AArch64, we can add the lowering code in `lower_node_platform` for AArch64, right? Do we also need to override the `LoweredIdeal()` for `MulNode` ? Thanks!

`lower_node_transform` turns a node that must not reach matching into one that can, while `LoweredIdeal` rewrites a node that may already appear in matching into another such node, based on the pattern of its inputs.

For example, consider this Java code:

    Int256Vector v1;
    Int256Vector v2 = v1.withLane(4, x);
    Int256Vector v3 = v2.withLane(5, y);

Before lowering we would have (pseudocode for the graph):

    vector<int,8> v1;
    vector<int,8> v2 = VectorInsert(v1, x, 4);
    vector<int,8> v3 = VectorInsert(v2, y, 5);

x86 cannot insert an element directly into a 256-bit vector, so we need to extract the 128-bit lane, insert the element into that lane, then insert the lane back into the original vector. Currently, this is done during code emission. If we did it during lowering instead, we would have this:

    vector<int,8> v1; // [a, b, c, d, e, f, g, h]
    vector<int,4> v4 = ExtractVector(v1, 1); // [e, f, g, h]
    vector<int,4> v5 = VectorInsert(v4, x, 0); // [x, f, g, h]
    vector<int,8> v2 = VectorInsert(v1, v5, 1); // [a, b, c, d, x, f, g, h]
    vector<int,4> v6 = ExtractVector(v2, 1); // [x, f, g, h]
    vector<int,4> v7 = VectorInsert(v6, y, 1); // [x, y, g, h]
    vector<int,8> v3 = VectorInsert(v2, v7, 1); // [a, b, c, d, x, y, g, h]
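
To check the lane arithmetic in the comments above, here is a minimal value-level sketch in C++. It is not HotSpot code: `Vec8`, `Vec4`, `extract_vector`, `insert_element`, `insert_vector` and `with_lane_lowered` are hypothetical names that only mimic the pseudocode nodes on plain arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec8 = std::array<int, 8>;  // stands in for vector<int,8> (256 bits)
using Vec4 = std::array<int, 4>;  // stands in for one 128-bit lane

// Mimics ExtractVector(v, lane): copy out 128-bit lane 0 (low) or 1 (high).
Vec4 extract_vector(const Vec8& v, std::size_t lane) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) out[i] = v[lane * 4 + i];
    return out;
}

// Mimics VectorInsert(lane_vec, x, idx) on a 128-bit lane.
Vec4 insert_element(Vec4 v, int x, std::size_t idx) {
    v[idx] = x;
    return v;
}

// Mimics VectorInsert(v, lane_vec, lane): overwrite one 128-bit lane of v.
Vec8 insert_vector(Vec8 v, const Vec4& part, std::size_t lane) {
    for (std::size_t i = 0; i < 4; ++i) v[lane * 4 + i] = part[i];
    return v;
}

// withLane(idx, x) expressed as extract lane, insert element, insert lane.
Vec8 with_lane_lowered(const Vec8& v, std::size_t idx, int x) {
    assert(idx < 8);
    Vec4 lane = extract_vector(v, idx / 4);   // which 128-bit lane
    lane = insert_element(lane, x, idx % 4);  // position inside that lane
    return insert_vector(v, lane, idx / 4);
}
```

Calling `with_lane_lowered(v1, 4, x)` and then `with_lane_lowered(v2, 5, y)` follows the same six steps as the `v4` to `v3` sequence above.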

Using `Identity`, we may be able to establish that `v6 == v5`, which leaves us with:

    vector<int,8> v1; // [a, b, c, d, e, f, g, h]
    vector<int,4> v4 = ExtractVector(v1, 1); // [e, f, g, h]
    vector<int,4> v5 = VectorInsert(v4, x, 0); // [x, f, g, h]
    vector<int,8> v2 = VectorInsert(v1, v5, 1); // [a, b, c, d, x, f, g, h]
    vector<int,4> v7 = VectorInsert(v5, y, 1); // [x, y, g, h]
    vector<int,8> v3 = VectorInsert(v2, v7, 1); // [a, b, c, d, x, y, g, h]
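
The `v6 == v5` step can be sketched as a pattern match on a toy graph. This is only an illustration under my own assumptions: the `Node` struct, the `Op` enum and the `identity` function below are hypothetical and much simpler than C2's real node classes (the scalar operand of the element insert is left out entirely).

```cpp
// Toy node kinds; InsertElement and InsertVector split the two uses of
// VectorInsert in the pseudocode (scalar into lane, lane into vector).
enum class Op { Input, ExtractVector, InsertElement, InsertVector };

struct Node {
    Op op;
    Node* src;   // vector being read or updated (nullptr for Input)
    Node* part;  // inserted 128-bit lane (InsertVector only)
    int index;   // lane or element position
};

// Identity-style rule: extracting lane i from a vector whose lane i was
// just inserted yields the inserted lane itself.
//   ExtractVector(InsertVector(v, p, i), i)  ==>  p
// Returns the equivalent existing node, or n itself if nothing matches.
Node* identity(Node* n) {
    if (n->op == Op::ExtractVector &&
        n->src->op == Op::InsertVector &&
        n->src->index == n->index) {
        return n->src->part;
    }
    return n;
}
```

On the example, `v6 = ExtractVector(v2, 1)` with `v2 = VectorInsert(v1, v5, 1)` matches the pattern, so `v6` is replaced by `v5`, while `v4 = ExtractVector(v1, 1)` is left alone.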

Ideally, we would want to transform `v3` into `VectorInsert(v1, v7, 1)` because then we can elide `v2`. This can be done using `LoweredIdeal`.
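
A rough sketch of what such a rule could look like, again on a toy graph rather than real C2 nodes. The `Node` struct, the `Op` enum and `lowered_ideal_sketch` are hypothetical names of mine. I am assuming the same convention as `Ideal`, where the rule returns the changed node, or null when it made no progress.

```cpp
enum class Op { Input, ExtractVector, InsertElement, InsertVector };

struct Node {
    Op op;
    Node* src;   // vector being read or updated (nullptr for Input)
    Node* part;  // inserted 128-bit lane (InsertVector only)
    int index;   // lane or element position
};

// LoweredIdeal-style rule: an insert into lane i that sits directly on top
// of another insert into the same lane i makes the inner insert dead.
//   InsertVector(InsertVector(v, p1, i), p2, i)  ==>  InsertVector(v, p2, i)
// Rewrites n in place and returns it, or returns nullptr if no change.
Node* lowered_ideal_sketch(Node* n) {
    if (n->op == Op::InsertVector &&
        n->src->op == Op::InsertVector &&
        n->src->index == n->index) {
        n->src = n->src->src;  // bypass the overwritten inner insert
        return n;
    }
    return nullptr;
}
```

Applied to `v3 = VectorInsert(v2, v7, 1)` with `v2 = VectorInsert(v1, v5, 1)`, this rewires `v3` to read from `v1`, after which `v2` has no remaining use in the example and can be removed.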

So to your question, I think `LoweredIdeal` would be the better choice. It also aligns well with how we currently do this in `Ideal`.

> src/hotspot/share/opto/phaseX.cpp line 2310:
> 
>> 2308: Node* PhaseLowering::lower_node(Node* n) {
>> 2309:   // Apply shared lowering transforms
>> 2310: 
> 
> Per my understanding, this is a backend specific lowering phase, is there any scenario that a platform in-dependent lowering is needed here? As we already have the common GVN phase for common node idealize, is there any difference for such shared transformations? Thanks!

Lowering is not idealisation, so I think having backend-independent lowering is fine in case we need it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21599#discussion_r1914633526
PR Review Comment: https://git.openjdk.org/jdk/pull/21599#discussion_r1914634733