RFR: 8373344: Add support for min/max reduction operations for Float16 [v2]

Xiaohong Gong xgong at openjdk.org
Thu Jan 8 06:04:09 UTC 2026


On Wed, 7 Jan 2026 17:33:42 GMT, Yi Wu <duke at openjdk.org> wrote:

> You mean move it down, so that it uses `return !VM_Version::use_neon_for_vector(length_in_bytes);`, like `Op_AddReductionVI` and `Op_AddReductionVL` do?

Yes, that is what I meant.
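For context, here is a minimal sketch of the kind of grouping being discussed. It assumes the cases sit in a backend helper that decides whether a reduction needs the SVE partial (predicated) path; the helper name and the use of the generic min/max reduction opcodes for the Float16 case are placeholders, not the actual PR diff:

```c++
// Sketch only: group the new Float16 min/max reduction opcodes with the
// existing integer add-reduction cases, so that the SVE predicated
// (partial) path is requested only when the vector does not fit in a
// NEON register. Helper name and Float16 opcode mapping are assumptions.
static bool reduction_needs_partial_ops(int opcode, uint length_in_bytes) {
  switch (opcode) {
    case Op_AddReductionVI:
    case Op_AddReductionVL:
    case Op_MinReductionV:   // assumed to cover the Float16 min reduction
    case Op_MaxReductionV:   // assumed to cover the Float16 max reduction
      // NEON handles vectors up to 16 bytes; only larger vectors need
      // the SVE predicated (partial) operations.
      return !VM_Version::use_neon_for_vector(length_in_bytes);
    default:
      return true;
  }
}
```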

> It doesn't seem to make much of a difference.

So what does `8B/16B/32B` mean? I guess it refers to the actual vector size of the reduction operation? But how did you test these cases? I noticed the benchmark code does not have any parallelization differences. Is the vectorization factor decided by running with different `MaxVectorSize` VM options? If so, then I think the partial cases are not exercised. Could you please check whether the instruction for `VectorMaskGenNode` appears in the generated code? I assume there should be a difference, because for the partial cases (vector_size < MaxVectorSize), the SVE predicated instructions were used before this change, while NEON instructions are used after it. And the latency/throughput of the SVE reduction instructions is much worse than that of the NEON ones.
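For reference, one way to check this (the benchmark selector below is only a placeholder, not the PR's actual benchmark) would be to pin the vectorization width with `-XX:MaxVectorSize` and inspect the compiled code, e.g. with JMH's `perfasm` profiler, or with `-XX:+PrintIdeal` on a debug build, to see whether a `VectorMaskGen` node or an SVE predicate-generating instruction shows up for the partial cases:

```
# Placeholder benchmark name; the flags themselves are real.
# Compare the generated code at different -XX:MaxVectorSize values and look
# for mask/predicate generation in the reduction loop.
java -jar benchmarks.jar Float16ReductionBenchmark.minReduction \
     -jvmArgs "-XX:MaxVectorSize=16" -prof perfasm
```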

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28828#discussion_r2670981173

