RFR: 8369448: C2 SuperWord: refactor VTransform to do move_unordered_reduction_out_of_loop during VTransform::optimize [v2]

Mon Oct 13 13:14:23 UTC 2025

On Mon, 13 Oct 2025 12:14:27 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> It seems you have 2 concerns here:
>> - Variables that could be limited to a smaller scope
>> - method too long
>> 
>> If I am going to refactor the code, then I'd probably have to split it into:
>> - preconditions
>>   - not just this loop, but all conditions above too
>> - transform
>> 
>> I'll struggle a bit to name the methods, the name is already insanely long:
>> `VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop`
>>  I'll get something like:
>> `VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop_preconditions`
>> Or even longer if you want me to split the preconditions.
>> 
>> There are some downsides to splitting the code: I'll have to either pass quite a lot of arguments around, or duplicate code that finds specific nodes. Example:
>> `VTransformLoopPhiNode* phi = in_req(1)->isa_LoopPhi();`
>> 
>> I'm honestly not convinced that refactoring is better here. At least splitting is hard. We could of course make it all a class... but that would be a little over-engineered I think. Keeping it procedural, a simple list of steps seems ok for me.
>> 
>> Besides: this is the same code structure as before this patch, I only moved it ;)
>> 
>> I gave it a try, and split the preconditions off. But it just leads to more code, so I'm not super satisfied.
>> 
>> 
>> diff --git a/src/hotspot/share/opto/vtransform.cpp b/src/hotspot/share/opto/vtransform.cpp
>> index 97d16739116..f987c2bee55 100644
>> --- a/src/hotspot/share/opto/vtransform.cpp
>> +++ b/src/hotspot/share/opto/vtransform.cpp
>> @@ -1072,7 +1072,7 @@ bool VTransformReductionVectorNode::requires_strict_order() const {
>>  //       become profitable, since the expensive reduction node is moved
>>  //       outside the loop, and instead cheaper element-wise vector accumulations
>>  //       are performed inside the loop.
>> -bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop(const VLoopAnalyzer& vloop_analyzer, VTransform& vtransform) {
>> +bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_out_of_loop_preconditions(VTransform& vtransform) {
>>    // We have a phi with a single use.
>>    VTransformLoopPhiNode* phi = in_req(1)->isa_LoopPhi();
>>    if (phi == nullptr) {
>> @@ -1167,6 +1167,18 @@ bool VTransformReductionVectorNode::optimize_move_non_strict_order_reductions_ou
>>      // We expect another non strict reduction, verify it in the next iteration.
>>      current_red = scalar_input->isa_ReductionVec...
>
> I decided to apply the split diff from above. I hope that is a bit better for you. It is not yet a full-blown class-approach. But at least it separates the preconditions from the optimization.

Thanks for the summary and the interesting offline discussion! I think that's a good trade-off here as otherwise would need to refactor quite some more and probably fall back to builder classes to avoid having mutable fields. That's exciting but out of scope since you did not actually newly implement this but rather moved/updated it. So, I think it's a good choice you made here with splitting only, thanks! :-)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27704#discussion_r2426276872