RFR: 8346256: Optimize UMIN/UMAX reduction operations for x86 targets [v2]
Jatin Bhateja
jbhateja at openjdk.org
Tue Feb 17 06:36:05 UTC 2026
> Hi all,
>
> This patch adds an x86 backend implementation for the UMIN/UMAX reduction operations.
>
> The following are the performance numbers from the existing micro-benchmark test/micro/org/openjdk/bench/jdk/incubator/vector/VectorUMinUMaxReductionBenchmark.java
>
> System Configuration:
> Model name: AMD EPYC 9755 128-Core Processor (Turin)
> Fixed Frequency : 2.1GHz
>
>
>
> Baseline:-
> ----------
> Benchmark                                                  (size)   Mode  Cnt      Score  Error  Units
> VectorUMinUMaxReductionBenchmark.byteUMaxReduction           1024  thrpt    2   1183.300         ops/ms
> VectorUMinUMaxReductionBenchmark.byteUMaxReductionMasked     1024  thrpt    2   1426.570         ops/ms
> VectorUMinUMaxReductionBenchmark.byteUMinReduction           1024  thrpt    2   1186.889         ops/ms
> VectorUMinUMaxReductionBenchmark.byteUMinReductionMasked     1024  thrpt    2   1360.700         ops/ms
> VectorUMinUMaxReductionBenchmark.intUMaxReduction            1024  thrpt    2    967.264         ops/ms
> VectorUMinUMaxReductionBenchmark.intUMaxReductionMasked      1024  thrpt    2    767.641         ops/ms
> VectorUMinUMaxReductionBenchmark.intUMinReduction            1024  thrpt    2    969.714         ops/ms
> VectorUMinUMaxReductionBenchmark.intUMinReductionMasked      1024  thrpt    2    799.210         ops/ms
> VectorUMinUMaxReductionBenchmark.longUMaxReduction           1024  thrpt    2    410.210         ops/ms
> VectorUMinUMaxReductionBenchmark.longUMaxReductionMasked     1024  thrpt    2    452.717         ops/ms
> VectorUMinUMaxReductionBenchmark.longUMinReduction           1024  thrpt    2    470.575         ops/ms
> VectorUMinUMaxReductionBenchmark.longUMinReductionMasked     1024  thrpt    2    485.897         ops/ms
> VectorUMinUMaxReductionBenchmark.shortUMaxReduction          1024  thrpt    2    958.935         ops/ms
> VectorUMinUMaxReductionBenchmark.shortUMaxReductionMasked    1024  thrpt    2    937.805         ops/ms
> VectorUMinUMaxReductionBenchmark.shortUMinReduction          1024  thrpt    2    950.125         ops/ms
> VectorUMinUMaxReductionBenchmark.shortUMinReductionMasked    1024  thrpt    2    928.718         ops/ms
>
> With optimization:-
> -------------------
> Benchmark                                                  (size)   Mode  Cnt      Score  Error  Units
> VectorUMinUMaxReductionBenchmark.byteUMaxReduction           1024  thrpt    2  21391.700         ops/ms
> VectorUMinUMaxReductionBenchmark.byteUMaxReductionMasked     1024  thrpt    2  19865.073         ops/ms
> VectorUMinUMaxRed...
Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8346256
- 8346256: Optimize UMIN/UMAX reduction operations for x86 targets
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/29751/files
- new: https://git.openjdk.org/jdk/pull/29751/files/913f22a3..19a40fa0
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=29751&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=29751&range=00-01
Stats: 24143 lines in 508 files changed: 11385 ins; 2730 del; 10028 mod
Patch: https://git.openjdk.org/jdk/pull/29751.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/29751/head:pull/29751
PR: https://git.openjdk.org/jdk/pull/29751
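
For readers unfamiliar with the operation being optimized: a UMIN/UMAX reduction folds all lanes of a vector into a single value, comparing lanes as unsigned integers. The scalar sketch below only illustrates those semantics for int lanes; it is not code from the patch. The class and method names are hypothetical, and the identity values (0 for UMAX, all-ones for UMIN) are assumptions chosen so that an empty input reduces to the neutral element.

```java
public class UnsignedReductionSketch {

    // Scalar reference for an unsigned-max reduction over int lanes.
    // Assumption: 0 is used as the identity, since it is the smallest unsigned value.
    static int umaxReduce(int[] lanes) {
        int acc = 0;
        for (int v : lanes) {
            if (Integer.compareUnsigned(v, acc) > 0) {
                acc = v;
            }
        }
        return acc;
    }

    // Scalar reference for an unsigned-min reduction over int lanes.
    // Assumption: -1 (0xFFFFFFFF) is used as the identity, since it is the largest unsigned value.
    static int uminReduce(int[] lanes) {
        int acc = -1;
        for (int v : lanes) {
            if (Integer.compareUnsigned(v, acc) < 0) {
                acc = v;
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] lanes = {3, -1, 7, 0x80000000};
        // -1 is 0xFFFFFFFF when read as unsigned, so it wins the unsigned max
        // even though a signed max over the same data would pick 7.
        System.out.println(Integer.toUnsignedString(umaxReduce(lanes)));
        System.out.println(Integer.toUnsignedString(uminReduce(lanes)));
    }
}
```

The difference from a signed reduction shows up only when some lane has its top bit set, which is why the example input includes -1 and 0x80000000.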