RFR: 8369448: C2 SuperWord: refactor VTransform to do move_unordered_reduction_out_of_loop during VTransform::optimize [v2]
    Emanuel Peter 
    epeter at openjdk.org
       
    Mon Oct 13 10:58:47 UTC 2025
    
    
  
On Mon, 13 Oct 2025 08:49:26 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:
>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   For Vladimir K7
>
> src/hotspot/share/opto/vtransform.cpp line 97:
> 
>> 95: 
>> 96:   collect_nodes_without_strong_in_edges(stack);
>> 97:   int num_alive_nodes = count_alive_vtnodes();
> 
> Suggestion:
> 
>   const int num_alive_nodes = count_alive_vtnodes();
Applied!
> src/hotspot/share/opto/vtransform.cpp line 1070:
> 
>> 1068: //       outside the loop, and instead cheaper element-wise vector accumulations
>> 1069: //       are performed inside the loop.
>> 1070: bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop(const VLoopAnalyzer& vloop_analyzer, VTransform& vtransform) {
> 
> Any particular reason you chose the additional `optimize` prefix? I think the intent is already clear without it.
In my head, it is a bit like `Ideal` calling a `Ideal_....` method. I want to make clear that it is part of the `optimize`. Is that ok with you, or rather confusing?
> src/hotspot/share/opto/vtransform.cpp line 1074:
> 
>> 1072:   uint vlen    = vector_length();
>> 1073:   BasicType bt = element_basic_type();
>> 1074:   int ropc     = vector_reduction_opcode();
> 
> Can probably be made `const` for good measure:
> 
> Suggestion:
> 
>   const int sopc     = scalar_opcode();
>   const uint vlen    = vector_length();
>   const BasicType bt = element_basic_type();
>   const int ropc     = vector_reduction_opcode();
> 
> 
> And could they also be moved down to the definition of `vopc`?
Nice idea, applied!
> src/hotspot/share/opto/vtransform.cpp line 1113:
> 
>> 1111:   VTransformReductionVectorNode* last_red    = phi->in_req(2)->isa_ReductionVector();
>> 1112:   VTransformReductionVectorNode* current_red = last_red;
>> 1113:   while (true) {
> 
> The method is already quite big. IIUC, this only does some checking and we do not need to bookkeep for further down. Therefore, I suggest to extract this to a "is_looping_back_to_phi" method or something like that.
It seems you have 2 concerns here:
- Variables that could be limited to a smaller scope
- method too long
If I am going to refactor the code, then I'd probably have to split it into:
- preconditions
  - not just this loop, but all conditions above too
- transform
I'll struggle a bit to name the methods, the name is already insanely long:
`VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop`
 I'll get something like:
`VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop_preconditions`
Or even longer if you want me to split the preconditions.
There are some downsides to splitting the code: I'll have to either pass quite a lot of arguments around, or duplicate code that finds specific nodes. Example:
`VTransformLoopPhiNode* phi = in_req(1)->isa_LoopPhi();`
I'm honestly not convinced that refactoring is better here. At least splitting is hard. We could of course make it all a class... but that would be a little over-engineered I think. Keeping it procedural, a simple list of steps seems ok for me.
Besides: this is the same code structure as before this patch, I only moved it ;)
I gave it a try, and split the preconditions off. But it just leads to more code, so I'm not super satisfied.
diff --git a/src/hotspot/share/opto/vtransform.cpp b/src/hotspot/share/opto/vtransform.cpp
index 97d16739116..f987c2bee55 100644
--- a/src/hotspot/share/opto/vtransform.cpp
+++ b/src/hotspot/share/opto/vtransform.cpp
@@ -1072,7 +1072,7 @@ bool VTransformReductionVectorNode::requires_strict_order() const {
 //       become profitable, since the expensive reduction node is moved
 //       outside the loop, and instead cheaper element-wise vector accumulations
 //       are performed inside the loop.
-bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop(const VLoopAnalyzer& vloop_analyzer, VTransform& vtransform) {
+bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop_preconditions(VTransform& vtransform) {
   // We have a phi with a single use.
   VTransformLoopPhiNode* phi = in_req(1)->isa_LoopPhi();
   if (phi == nullptr) {
@@ -1167,6 +1167,18 @@ bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_ou
     // We expect another non strict reduction, verify it in the next iteration.
     current_red = scalar_input->isa_ReductionVector();
   }
+  return true; // success
+}
+
+bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop(const VLoopAnalyzer& vloop_analyzer, VTransform& vtransform) {
+  if (!optimize_move_non_strict_order_reductions_out_of_loop_preconditions(vtransform)) {
+    return false;
+  }
+
+  const int sopc     = scalar_opcode();
+  const uint vlen    = vector_length();
+  const BasicType bt = element_basic_type();
+  const int vopc     = VectorNode::opcode(sopc, bt);
 
   // All checks were successful. Edit the vtransform graph now.
   PhaseIdealLoop* phase = vloop_analyzer.vloop().phase();
@@ -1183,12 +1195,15 @@ bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_ou
   vtn_identity_vector->init_req(1, vtn_identity);
 
   // Turn the scalar phi into a vector phi.
+  VTransformLoopPhiNode* phi = in_req(1)->isa_LoopPhi();
   VTransformNode* init = phi->in_req(1);
   phi->set_req(1, vtn_identity_vector);
 
   // Traverse down the chain of reductions, and replace them with vector_accumulators.
   VTransformNode* current_vector_accumulator = phi;
-  current_red = first_red;
+  VTransformReductionVectorNode* first_red   = this;
+  VTransformReductionVectorNode* last_red    = phi->in_req(2)->isa_ReductionVector();
+  VTransformReductionVectorNode* current_red = first_red;
   while (true) {
     VTransformNode* vector_input = current_red->in_req(2);
     VTransformVectorNode* vector_accumulator = new (vtransform.arena()) VTransformElementWiseVectorNode(vtransform, 3, current_red->properties(), vopc);
diff --git a/src/hotspot/share/opto/vtransform.hpp b/src/hotspot/share/opto/vtransform.hpp
index 85f015db442..7ad7b432e9b 100644
--- a/src/hotspot/share/opto/vtransform.hpp
+++ b/src/hotspot/share/opto/vtransform.hpp
@@ -841,6 +841,7 @@ class VTransformReductionVectorNode : public VTransformVectorNode {
 private:
   int vector_reduction_opcode() const;
   bool requires_strict_order() const;
+  bool optimize_move_non_strict_order_reductions_out_of_loop_preconditions(VTransform& vtransform);
   bool optimize_move_non_strict_order_reductions_out_of_loop(const VLoopAnalyzer& vloop_analyzer, VTransform& vtransform);
 };
We can discuss it offline.
> src/hotspot/share/opto/vtransform.cpp line 1175:
> 
>> 1173:   // Create a vector of identity values.
>> 1174:   Node* identity = ReductionNode::make_identity_con_scalar(phase->igvn(), sopc, bt);
>> 1175:   phase->set_ctrl(identity, phase->C->root());
> 
> Any particular reason why you are no longer using `set_root_as_ctrl()`?
I think my proof-of-concept was still in an old state, before we had added that method. Fixed now.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2425740108
PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2425747258
PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2425753160
PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2425941651
PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2425938864
    
    
More information about the hotspot-compiler-dev
mailing list