RFR: 8340093: C2 SuperWord: implement cost model [v4]

Emanuel Peter epeter at openjdk.org
Thu Nov 6 07:52:03 UTC 2025


On Thu, 6 Nov 2025 07:23:57 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> src/hotspot/share/opto/vectorization.cpp line 635:
>> 
>>> 633:   // Each reduction is composed of multiple instructions, each estimated with a unit cost.
>>> 634:   //                                Linear: shuffle and reduce    Recursive: shuffle and reduce
>>> 635:   float c = requires_strict_order ? 2 * vlen                    : 2 * exact_log2(vlen);
>> 
>> Can we ask for the cost of the element-wise opcode here, something like `(1 + element_wise_cost)` would be more accurate?
>
> To be a little more precise, the strict one should be something like:
> 
>     vlen * (1 + Matcher::vector_op_pre_select_sz_estimate(Op_Extract, bt, vlen)) + (vlen - 1) * (1 + Matcher::scalar_op_pre_select_sz_estimate(opcode, bt));
> 
> and the non-strict one would be:
> 
>     float c = Matcher::vector_op_pre_select_sz_estimate(Op_Extract, bt, 2) * 2 + Matcher::scalar_op_pre_select_sz_estimate(opcode, bt) + 3;
>     for (int i = 4; i <= vlen; i *= 2) {
>       c += 2 + Matcher::vector_op_pre_select_sz_estimate(Op_VectorRearrange, bt, i) + Matcher::vector_op_pre_select_sz_estimate(opcode, bt, i);
>     }
> 
> Maybe refactoring a little bit to make `Matcher::vector_op_pre_select_sz_estimate` less awkward would be welcome, too. Currently, it returns the estimated size - 1, which is unsettling.

@merykitty Can we do that in a follow-up RFE? For now, I'd like to keep it as simple as possible. Cost models can become arbitrarily complex, and there is a trade-off between simplicity and accuracy. We can certainly improve things in the future; this PR just lays the foundation.

My goal here is to start as simple as possible, and then add complexity if there is a proven need for it.
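
For reference, the simple estimate under discussion can be sketched as a standalone C++ snippet. This is only an illustration of the formula quoted above (`2 * vlen` for strict order, `2 * exact_log2(vlen)` otherwise), not the code in the PR: the names `exact_log2_sketch` and `reduction_cost_sketch` are hypothetical, and the helper merely stands in for HotSpot's `exact_log2` so the snippet compiles on its own.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's exact_log2.
// Assumes x is a positive power of two.
static int exact_log2_sketch(int x) {
  assert(x > 0 && (x & (x - 1)) == 0);
  int log = 0;
  while (x > 1) {
    x >>= 1;
    log++;
  }
  return log;
}

// Unit-cost estimate for a reduction over vlen lanes (hypothetical name).
// Strict order:     linear chain, one shuffle and one reduce per lane.
// Non-strict order: recursive halving, one shuffle and one reduce per level.
static float reduction_cost_sketch(bool requires_strict_order, int vlen) {
  return requires_strict_order ? 2.0f * vlen
                               : 2.0f * exact_log2_sketch(vlen);
}
```

With `vlen = 8`, this gives a cost of 16 for the strict-order case and 6 for the non-strict case. The more detailed proposal would replace the unit costs with per-opcode estimates.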

So if you/we can find a benchmark where the cost model is not yet accurate enough, as shown by comparing runs with `-XX:AutoVectorizationOverrideProfitability=0/2`, then we should make it more complex.

Would that be acceptable to you?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27803#discussion_r2497861417


More information about the hotspot-compiler-dev mailing list