RFR: 8309893: Integrate ReplicateB/S/I/L/F/D nodes to Replicate node [v2]

Emanuel Peter epeter at openjdk.org
Tue Oct 31 11:40:34 UTC 2023


On Tue, 31 Oct 2023 11:04:43 GMT, Eric Liu <eliu at openjdk.org> wrote:

>> src/hotspot/share/opto/vectornode.cpp line 609:
>> 
>>> 607: // Check if input is loop invariant vector.
>>> 608: bool VectorNode::is_invariant_vector(Node* n) {
>>> 609:   // Only Replicate vector nodes are loop invariant for now.
>> 
>> Random discovery: does this not sound fishy? Can Replicate nodes never be used in a loop variant way? For example with the VectorAPI?
>
> Replicate can be variant in Vector API. E.g., we can create an IntVector in loop by IntVector.broadcast (https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java). We also have such matching rules supporting register as Replicate input.
> 
> I only find the use point of this method in Superword, I'm not sure if auto-vectorization has any special handle in vectorizing Replicate.

I think this method has no value now. I would remove it and use `n->Opcode() == Op_Replicate` directly at its two use sites:
`VectorNode::degenerate_vector_rotate` and `SuperWord::output`.
That would be easier to understand.

In SuperWord, we have so far only added Replicate nodes when a vector op has one input slot where every lane takes the same scalar input (e.g. `some_invariant_value` below).

for (int i = ...) {
  a[i] = b[i] * some_invariant_value;
}
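
To make that shape concrete, here is a rough hand-written model of what the vectorized loop could look like. It assumes a 4-lane vector, and all names (`Vec4`, `replicate`, `mul`, `scale`) are hypothetical helpers for illustration, not HotSpot code. The only point is that the broadcast depends on `some_invariant_value` alone, so it can sit before the loop.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical 4-lane vector, standing in for a machine vector register.
using Vec4 = std::array<int, 4>;

// Toy model of a Replicate node: every lane gets the same scalar.
Vec4 replicate(int s) { return {s, s, s, s}; }

// Toy model of a vector multiply: lane-wise product.
Vec4 mul(const Vec4& x, const Vec4& y) {
  Vec4 r{};
  for (std::size_t k = 0; k < r.size(); k++) r[k] = x[k] * y[k];
  return r;
}

// Computes a[i] = b[i] * inv, four lanes at a time, with a scalar tail.
std::vector<int> scale(const std::vector<int>& b, int inv) {
  std::vector<int> a(b.size());
  const Vec4 inv_v = replicate(inv);  // hoisted: built once, outside the loop
  std::size_t i = 0;
  for (; i + 4 <= b.size(); i += 4) {
    Vec4 bv = {b[i], b[i + 1], b[i + 2], b[i + 3]};  // vector load
    Vec4 av = mul(bv, inv_v);                        // Replicate in second slot
    for (std::size_t k = 0; k < 4; k++) a[i + k] = av[k];  // vector store
  }
  for (; i < b.size(); i++) a[i] = b[i] * inv;  // leftover elements
  return a;
}
```

If the scalar changed per iteration instead (as in the Vector API case mentioned above), `replicate` would have to move inside the loop, which is exactly the case where "invariant" would be the wrong word.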

So I guess the assumption is that the Replicate node can always float out of the loop, and hence it would be good to have it on the "right side" (second slot) of the Add/Mul node. This supposedly reduces register pressure. I do not know whether that is still true. Hence, I would do this:


        if (in1->Opcode() == Op_Replicate && (node_isa_reduction == false) && (n->is_Add() || n->is_Mul())) {
          // We have a Replicate node on in1. Since the input to the Replicate node is expected to be outside
          // the loop, the Replicate node can float out of the loop as well. To reduce register pressure, we
          // move the invariant input to the second slot.
          Node* tmp = in1;
          in1 = in2;
          in2 = tmp;
        }

Additionally, it would be interesting to add an assert to see if the input of Replicate is indeed loop invariant.
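
As a sketch of how the swap and such an assert could fit together, here is a self-contained toy model. `ToyNode`, `ToyOp`, `in_loop`, `is_commutative`, and `order_inputs` are all hypothetical and only mimic the shape of the C2 code; a real check would have to use whatever loop-membership information SuperWord has at that point.

```cpp
#include <cassert>
#include <utility>

// Hypothetical opcodes, loosely modelled on the vector nodes discussed here.
enum class ToyOp { Scalar, Load, Replicate, AddV, MulV, SubV };

// Hypothetical node: an opcode, a loop-membership flag, and two inputs.
struct ToyNode {
  ToyOp op;
  bool in_loop;         // true if the node is computed inside the loop body
  const ToyNode* in1;
  const ToyNode* in2;
};

// Stand-in for the n->is_Add() || n->is_Mul() test.
bool is_commutative(ToyOp op) {
  return op == ToyOp::AddV || op == ToyOp::MulV;
}

// Returns the inputs in the order they would be given to the new vector node.
std::pair<const ToyNode*, const ToyNode*>
order_inputs(ToyOp op, bool is_reduction, const ToyNode* in1, const ToyNode* in2) {
  if (in1->op == ToyOp::Replicate && !is_reduction && is_commutative(op)) {
    // The suggested assert: the scalar feeding Replicate is loop invariant.
    assert(in1->in1 != nullptr && !in1->in1->in_loop);
    std::swap(in1, in2);  // Replicate goes to the second slot
  }
  return {in1, in2};
}
```

In this model, a `MulV` with a Replicate on `in1` comes back with the Replicate on `in2`, while a non-commutative op or a reduction keeps its input order. If the assert ever fired in the real code, that would tell us the "Replicate is invariant" assumption does not hold for SuperWord either.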

`VectorNode::degenerate_vector_rotate` talks about doing something to replicate nodes anyway, so talking about "invariant" is not helpful.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14830#discussion_r1377450518

