RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v12]

Evgeny Astigeevich eastigeevich at openjdk.org
Wed Feb 19 19:54:05 UTC 2025


On Wed, 19 Feb 2025 17:43:54 GMT, Galder Zamarreño <galder at openjdk.org> wrote:

>> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 44 additional commits since the last revision:
>> 
>>  - Merge branch 'master' into topic.intrinsify-max-min-long
>>  - Fix typo
>>  - Renaming methods and variables and add docu on algorithms
>>  - Fix copyright years
>>  - Make sure it runs with cpus with either avx512 or asimd
>>  - Test can only run with 256 bit registers or bigger
>>    
>>    * Remove platform dependant check
>>    and use platform independent configuration instead.
>>  - Fix license header
>>  - Tests should also run on aarch64 asimd=true envs
>>  - Added comment around the assertions
>>  - Adjust min/max identity IR test expectations after changes
>>  - ... and 34 more: https://git.openjdk.org/jdk/compare/384bab03...a190ae68
>
> I will run a comparison next with the same batch of tests but looking at `int` and see if there are any differences compared with `long` or not.

Hi @galderz,
Here are the results from Graviton 3 (Neoverse-V1).
Without the patch:

Benchmark                       (probability)  (range)  (seed)  (size)   Mode  Cnt      Score    Error   Units
MinMaxVector.intClippingRange             N/A       90       0    1000  thrpt    8  12565.427 ± 37.538  ops/ms
MinMaxVector.intClippingRange             N/A      100       0    1000  thrpt    8  12462.072 ± 84.067  ops/ms
MinMaxVector.intLoopMax                    50      N/A     N/A    2048  thrpt    8   5113.090 ± 68.720  ops/ms
MinMaxVector.intLoopMax                    80      N/A     N/A    2048  thrpt    8   5129.857 ± 35.005  ops/ms
MinMaxVector.intLoopMax                   100      N/A     N/A    2048  thrpt    8   5116.081 ±  8.946  ops/ms
MinMaxVector.intLoopMin                    50      N/A     N/A    2048  thrpt    8   6174.544 ± 52.573  ops/ms
MinMaxVector.intLoopMin                    80      N/A     N/A    2048  thrpt    8   6110.884 ± 54.447  ops/ms
MinMaxVector.intLoopMin                   100      N/A     N/A    2048  thrpt    8   6178.661 ± 48.450  ops/ms
MinMaxVector.intReductionMax               50      N/A     N/A    2048  thrpt    8   5109.270 ± 10.525  ops/ms
MinMaxVector.intReductionMax               80      N/A     N/A    2048  thrpt    8   5123.426 ± 28.229  ops/ms
MinMaxVector.intReductionMax              100      N/A     N/A    2048  thrpt    8   5133.799 ±  7.693  ops/ms
MinMaxVector.intReductionMin               50      N/A     N/A    2048  thrpt    8   5130.209 ± 15.491  ops/ms
MinMaxVector.intReductionMin               80      N/A     N/A    2048  thrpt    8   5127.823 ± 27.767  ops/ms
MinMaxVector.intReductionMin              100      N/A     N/A    2048  thrpt    8   5118.217 ± 22.186  ops/ms
MinMaxVector.longClippingRange            N/A       90       0    1000  thrpt    8   1831.026 ± 15.502  ops/ms
MinMaxVector.longClippingRange            N/A      100       0    1000  thrpt    8   1827.194 ± 22.076  ops/ms
MinMaxVector.longLoopMax                   50      N/A     N/A    2048  thrpt    8   2643.383 ±  9.830  ops/ms
MinMaxVector.longLoopMax                   80      N/A     N/A    2048  thrpt    8   2640.417 ±  7.797  ops/ms
MinMaxVector.longLoopMax                  100      N/A     N/A    2048  thrpt    8   1244.321 ±  1.001  ops/ms
MinMaxVector.longLoopMin                   50      N/A     N/A    2048  thrpt    8   3239.234 ±  8.813  ops/ms
MinMaxVector.longLoopMin                   80      N/A     N/A    2048  thrpt    8   3252.713 ±  3.446  ops/ms
MinMaxVector.longLoopMin                  100      N/A     N/A    2048  thrpt    8   1204.370 ± 10.537  ops/ms
MinMaxVector.longReductionMax              50      N/A     N/A    2048  thrpt    8   2536.322 ±  0.127  ops/ms
MinMaxVector.longReductionMax              80      N/A     N/A    2048  thrpt    8   2536.318 ±  0.277  ops/ms
MinMaxVector.longReductionMax             100      N/A     N/A    2048  thrpt    8   1395.273 ± 13.862  ops/ms
MinMaxVector.longReductionMin              50      N/A     N/A    2048  thrpt    8   2536.325 ±  0.146  ops/ms
MinMaxVector.longReductionMin              80      N/A     N/A    2048  thrpt    8   2536.265 ±  0.272  ops/ms
MinMaxVector.longReductionMin             100      N/A     N/A    2048  thrpt    8   1389.982 ±  5.345  ops/ms


With the patch:

Benchmark                       (probability)  (range)  (seed)  (size)   Mode  Cnt      Score    Error   Units
MinMaxVector.intClippingRange             N/A       90       0    1000  thrpt    8  12598.201 ± 52.631  ops/ms
MinMaxVector.intClippingRange             N/A      100       0    1000  thrpt    8  12555.284 ± 62.472  ops/ms
MinMaxVector.intLoopMax                    50      N/A     N/A    2048  thrpt    8   5079.499 ± 16.392  ops/ms
MinMaxVector.intLoopMax                    80      N/A     N/A    2048  thrpt    8   5100.673 ± 30.376  ops/ms
MinMaxVector.intLoopMax                   100      N/A     N/A    2048  thrpt    8   5082.544 ± 23.540  ops/ms
MinMaxVector.intLoopMin                    50      N/A     N/A    2048  thrpt    8   6137.512 ± 30.198  ops/ms
MinMaxVector.intLoopMin                    80      N/A     N/A    2048  thrpt    8   6136.233 ±  7.726  ops/ms
MinMaxVector.intLoopMin                   100      N/A     N/A    2048  thrpt    8   6142.262 ± 96.510  ops/ms
MinMaxVector.intReductionMax               50      N/A     N/A    2048  thrpt    8   5116.055 ± 23.270  ops/ms
MinMaxVector.intReductionMax               80      N/A     N/A    2048  thrpt    8   5111.481 ± 12.236  ops/ms
MinMaxVector.intReductionMax              100      N/A     N/A    2048  thrpt    8   5106.367 ±  9.035  ops/ms
MinMaxVector.intReductionMin               50      N/A     N/A    2048  thrpt    8   5115.666 ± 15.539  ops/ms
MinMaxVector.intReductionMin               80      N/A     N/A    2048  thrpt    8   5133.127 ±  4.918  ops/ms
MinMaxVector.intReductionMin              100      N/A     N/A    2048  thrpt    8   5120.469 ± 24.355  ops/ms
MinMaxVector.longClippingRange            N/A       90       0    1000  thrpt    8   5094.259 ± 14.092  ops/ms
MinMaxVector.longClippingRange            N/A      100       0    1000  thrpt    8   5096.835 ± 16.517  ops/ms
MinMaxVector.longLoopMax                   50      N/A     N/A    2048  thrpt    8   2636.438 ± 18.760  ops/ms
MinMaxVector.longLoopMax                   80      N/A     N/A    2048  thrpt    8   2644.069 ±  3.933  ops/ms
MinMaxVector.longLoopMax                  100      N/A     N/A    2048  thrpt    8   2646.250 ±  2.007  ops/ms
MinMaxVector.longLoopMin                   50      N/A     N/A    2048  thrpt    8   2648.504 ± 18.294  ops/ms
MinMaxVector.longLoopMin                   80      N/A     N/A    2048  thrpt    8   2658.082 ±  3.362  ops/ms
MinMaxVector.longLoopMin                  100      N/A     N/A    2048  thrpt    8   2647.532 ±  5.600  ops/ms
MinMaxVector.longReductionMax              50      N/A     N/A    2048  thrpt    8   2536.254 ±  0.086  ops/ms
MinMaxVector.longReductionMax              80      N/A     N/A    2048  thrpt    8   2536.209 ±  0.129  ops/ms
MinMaxVector.longReductionMax             100      N/A     N/A    2048  thrpt    8   2536.342 ±  0.068  ops/ms
MinMaxVector.longReductionMin              50      N/A     N/A    2048  thrpt    8   2536.271 ±  0.203  ops/ms
MinMaxVector.longReductionMin              80      N/A     N/A    2048  thrpt    8   2536.250 ±  0.343  ops/ms
MinMaxVector.longReductionMin             100      N/A     N/A    2048  thrpt    8   2536.246 ±  0.179  ops/ms
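
To make the two tables easier to compare, here is a minimal sketch that computes the patched/baseline throughput ratio for the `long` rows whose scores differ noticeably. The class name `MinMaxSpeedup` and the helper `speedup` are hypothetical and are not part of the PR or of the MinMaxVector benchmark. The scores are copied from the tables above, and the error margins are ignored, so treat the ratios as rough indications only.

```java
import java.util.Locale;

public class MinMaxSpeedup {
    // Ratio of patched to baseline throughput (ops/ms).
    // A value above 1.0 means the patched build scored higher.
    public static double speedup(double baselineOpsPerMs, double patchedOpsPerMs) {
        return patchedOpsPerMs / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        // Row labels: benchmark name plus the probability or range parameter.
        String[] rows = {
            "longClippingRange (range=90)",
            "longLoopMax (probability=100)",
            "longLoopMin (probability=50)",
            "longLoopMin (probability=100)",
            "longReductionMax (probability=100)",
            "longReductionMin (probability=100)",
        };
        // Scores copied from the "Without the patch" table.
        double[] baseline = {1831.026, 1244.321, 3239.234, 1204.370, 1395.273, 1389.982};
        // Scores copied from the "With the patch" table.
        double[] patched = {5094.259, 2646.250, 2648.504, 2647.532, 2536.342, 2536.246};

        for (int i = 0; i < rows.length; i++) {
            System.out.printf(Locale.ROOT, "%-36s %.2fx%n",
                    rows[i], speedup(baseline[i], patched[i]));
        }
    }
}
```

Going by these numbers, the ratio is above 1.0 for `longClippingRange` and for the probability=100 rows, and below 1.0 for `longLoopMin` at probability=50. The `int` rows and the `long` reduction rows at probability 50 and 80 are left out because their scores are nearly identical in both tables.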

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2669613497