RFR: 8340093: C2 SuperWord: implement cost model [v3]

Wed Nov 5 08:38:26 UTC 2025

On Mon, 3 Nov 2025 13:59:28 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> src/hotspot/share/opto/vectorization.cpp line 634:
>> 
>>> 632:   // Each reduction is composed of multiple instructions, each estimated with a unit cost.
>>> 633:   //                                Linear: shuffle and reduce    Recursive: shuffle and reduce
>>> 634:   float c = requires_strict_order ? 2 * vlen                    : 2 * exact_log2(vlen);
>> 
>> "unit cost" sounds a bit too simple given that there is some kind of estimation going on already. Maybe it would make sense to add some discussion of how strict order affects the shape of the resulting vectorized code?
>> 
>> I assume cases where the reduction can be moved after the loop are covered somewhere else?
>
> Thanks for the comment :)
> 
> By "unit cost" I mean unit cost per hardware instruction. Reduction ops use multiple instructions, so we count the instructions, and return that count.
> 
> Yes, if we move reductions out of the loop, then the reduction node is not in the loop anymore, and instead we have vector accumulators. And then we count the cost of the vector accumulators.
> 
> That's why I need methods like `VTransformGraph::mark_vtnodes_in_loop` to know which nodes are in the loop (the new vector accumulators, and not the reductions if moved out of the loop).
> 
> I think I'll improve the comments a little to make that more clear :)

Ah, when referring to hardware instructions this makes perfect sense. Somehow I had assumed "unit cost of a node". Thanks for clarifying!
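
To make the instruction counting concrete, here is a minimal standalone sketch of the formula quoted from line 634. It is not the HotSpot code: `reduction_cost_sketch` and `exact_log2_sketch` are hypothetical names, the latter standing in for HotSpot's `exact_log2`, and `vlen` is assumed to be a power of two.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's exact_log2; assumes x is a power of two.
static int exact_log2_sketch(unsigned x) {
  assert(x != 0 && (x & (x - 1)) == 0);
  int n = 0;
  while (x > 1) {
    x >>= 1;
    n++;
  }
  return n;
}

// Estimated number of hardware instructions for one reduction over vlen lanes,
// counting one shuffle and one reduce per step, each at unit cost.
//   Strict order (linear):  one step per lane     -> 2 * vlen
//   Otherwise (recursive):  one step per halving  -> 2 * log2(vlen)
static float reduction_cost_sketch(unsigned vlen, bool requires_strict_order) {
  return requires_strict_order ? 2.0f * vlen
                               : 2.0f * exact_log2_sketch(vlen);
}
```

With `vlen = 8`, this gives 16 for a strict-order reduction and 6 for a recursive one, which is where the ordering requirement shows up in the estimated cost.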

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27803#discussion_r2493503295