RFR: 8307084: C2: Vectorized drain loop is not executed for some small trip counts [v4]
Fei Gao
fgao at openjdk.org
Thu Jan 22 16:33:15 UTC 2026
On Wed, 21 Jan 2026 10:37:24 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> test/micro/org/openjdk/bench/vm/compiler/VectorThroughputForIterationCount.java line 136:
>>
>>> 134: // effects of this patch unobservable.
>>> 135: @Param({"true", "false"})
>>> 136: public static boolean ENABLE_LARGE_LOOP_WARMUP;
>>
>> It would be nice to have some more comments here:
>> - for which benchmarks would the effect of "this patch" not be observable? Also: referring to "this patch" will require a future reader to trace things back in the "git blame" history, that's a bit unfortunate.
>> - Generally, it would now be nice to have a summary of which types of benchmarks show what kind of results, and why do we have all the variants.
>
> I'm asking for more comments because I fear the benchmark is becoming harder to use, with all the extra options and benchmark variants.
Really great suggestions.
I'll refine the comment along these lines:
// When enabled, run an additional warm-up phase using a large loop iteration
// count to encourage C2 to generate vectorized and unrolled loop bodies.
//
// Rationale:
// Some benchmarks in this suite use small, fixed trip-count loops. During
// early profiling, C2 may treat such loops as trivial, avoid vectorization,
// or optimize them away entirely. In those cases, changes that affect loop
// vectorization behavior, such as the improvement introduced by JDK-8307084,
// may not be observable in the generated code.
//
// As a result, this benchmark suite contains two main classes of
// microbenchmarks:
// 1) bench_xx_computeBound / bench_xx_memoryBound
//    These measure the performance of C2-generated code for the given
//    workload without relying on a special warm-up phase.
// 2) bench03xx_staticTripCount / bench03xx_dynamicTripCount
//    These benchmarks are sensitive to early profiling. Enabling a
//    large-loop warm-up forces the optimizer to observe the loop at scale,
//    making vectorized code generation more likely and allowing such
//    effects to be measured.
//
// Usage guidance:
// - Enable for microbenchmarks that rely on observing vectorization or
//   unrolling effects, especially when loop trip counts are small or
//   constant (e.g., bench03xx_staticTripCount and bench03xx_dynamicTripCount,
//   introduced by JDK-8307084).
// - Disable for general regression testing and for other microbenchmarks.
@Param({"true", "false"})
public static boolean ENABLE_LARGE_LOOP_WARMUP;
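To make the intent of the flag concrete, here is a rough sketch of how a setup step might honor it. This is not the code from the PR: the class name, the kernel, and the constants LARGE_ITERATIONS and WARMUP_ROUNDS are hypothetical, and plain static methods stand in for what would be JMH @Setup and @Benchmark methods. Running the kernel with a large trip count only makes vectorized code generation more likely; it does not guarantee it.

```java
// Hypothetical sketch; names and constants are assumptions, not taken from the PR.
final class LargeLoopWarmupSketch {
    // Mirrors the @Param field above; JMH would inject the value.
    public static boolean ENABLE_LARGE_LOOP_WARMUP = true;

    public static final int LARGE_ITERATIONS = 10_000; // assumed warm-up trip count
    public static final int WARMUP_ROUNDS = 2_000;     // assumed number of warm-up calls

    // Simple element-wise kernel: a[i] = b[i] + c[i] for i in [0, n).
    public static void addKernel(int[] a, int[] b, int[] c, int n) {
        for (int i = 0; i < n; i++) {
            a[i] = b[i] + c[i];
        }
    }

    // Stand-in for a JMH @Setup method. When the flag is set, the kernel is
    // called repeatedly with a large trip count so that the JIT profiles the
    // loop at scale. Returns the number of warm-up calls performed.
    public static int warmup(int[] a, int[] b, int[] c) {
        if (!ENABLE_LARGE_LOOP_WARMUP) {
            return 0;
        }
        int calls = 0;
        for (int r = 0; r < WARMUP_ROUNDS; r++) {
            addKernel(a, b, c, LARGE_ITERATIONS);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        int[] a = new int[LARGE_ITERATIONS];
        int[] b = new int[LARGE_ITERATIONS];
        int[] c = new int[LARGE_ITERATIONS];
        java.util.Arrays.fill(b, 1);
        java.util.Arrays.fill(c, 2);

        int calls = warmup(a, b, c);
        // The measured workload would then use a small trip count.
        addKernel(a, b, c, 7);
        System.out.println(calls + " warm-up calls, a[6] = " + a[6]);
    }
}
```

With the flag set to false, warmup returns immediately and the small-trip-count call sees only whatever profile the benchmark itself produces.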
WDYT?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2717360633