RFR: 8318650: Optimized subword gather for x86 targets.

Jatin Bhateja jbhateja at openjdk.org
Wed Oct 25 05:25:11 UTC 2023


On Wed, 25 Oct 2023 04:34:59 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> Hi All,
> 
> This patch optimizes the sub-word gather operation for x86 targets with AVX2 and AVX512 features.
> 
> The following is a summary of the changes:
> 
> 1) Intrinsify sub-word gather with a high-performance backend implementation based on a hybrid algorithm: a partially unrolled scalar loop first accumulates the values at the gather indices into a quadword (64-bit) slice, and a vector permutation then places the slice into the appropriate vector lanes. This prevents code bloat and generates a compact JIT sequence. Coupled with the savings from avoiding the expensive array allocation in the existing Java implementation, this translates into significant performance gains of 1.3-5x with the included micro-benchmark.
> 
> 
> ![image](https://github.com/openjdk/jdk/assets/59989778/e25ba4ad-6a61-42fa-9566-452f741a9c6d)
> 
> 
> 2) The patch was also compared against a modified Java fallback implementation that replaces the temporary array allocation with a zero-initialized vector and a scalar loop which inserts the gathered values into the vector. However, a vector insert into the higher vector lanes is a three-step process: it first extracts the upper 128-bit lane, updates it with the gathered sub-word value, and then inserts the lane back into its original position. This makes inserts into higher-order lanes costly compared to the proposed solution. In addition, the JIT code generated for the modified fallback implementation was very bulky, which may impact inlining decisions in caller contexts.
> 
> 3) Some minor adjustments to the existing gather instruction patterns for double/quad words.
> 
> 
> Kindly review and share your feedback.
> 
> 
> Best Regards,
> Jatin
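
The hybrid algorithm described in item 1 of the quoted description can be illustrated with a scalar Java emulation. This is only a minimal sketch of the idea, not the HotSpot intrinsic: the class name `SubwordGatherSketch` and the methods `gatherSlice` and `gatherBytes` are hypothetical, and the final per-slice copy into the result array merely stands in for the vector permutation that the backend would use to place each 64-bit slice into its lanes.

```java
import java.util.Arrays;

public class SubwordGatherSketch {

    // Accumulates up to 8 gathered bytes into one 64-bit slice.
    // The first gathered byte ends up in the lowest 8 bits of the slice.
    public static long gatherSlice(byte[] src, int offset, int[] indexMap,
                                   int mapStart, int count) {
        long slice = 0L;
        for (int i = 0; i < count; i++) {
            long value = src[offset + indexMap[mapStart + i]] & 0xFFL;
            slice |= value << (8 * i);
        }
        return slice;
    }

    // Gathers `lanes` bytes, one 64-bit slice at a time.
    // Unpacking the slice into the result array is a scalar stand-in
    // (an assumption of this sketch) for the vector permutation step.
    public static byte[] gatherBytes(byte[] src, int offset, int[] indexMap, int lanes) {
        byte[] result = new byte[lanes];
        for (int base = 0; base < lanes; base += 8) {
            int count = Math.min(8, lanes - base);
            long slice = gatherSlice(src, offset, indexMap, base, count);
            for (int i = 0; i < count; i++) {
                result[base + i] = (byte) (slice >>> (8 * i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] src = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        int[] indexMap = {9, 0, 3, 3, 7, 1, 2, 8, 4, 5, 6, 0, 1, 2, 3, 4};
        // Emulates a 128-bit byte vector (16 lanes) built from two slices.
        System.out.println(Arrays.toString(gatherBytes(src, 0, indexMap, 16)));
    }
}
```

Because each slice is built in a general-purpose register, no temporary array is allocated and no per-element insert into the upper vector lanes is needed, which is the cost that item 2 attributes to the modified fallback.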

**Detailed performance numbers with AVX2**

Benchmark | Size | Baseline Score (ops/ms) | WithOpt Score (ops/ms) | Gain Factor (opt/baseline)
-- | -- | -- | -- | --
GatherOperationsBenchmark.microByteGather128 | 64 | 15916.774 | 34288.944 | 2.154264677
GatherOperationsBenchmark.microByteGather128 | 256 | 4128.501 | 8793.293 | 2.12989969
GatherOperationsBenchmark.microByteGather128 | 1024 | 1027.606 | 2217.138 | 2.157575958
GatherOperationsBenchmark.microByteGather128 | 4096 | 264.002 | 554.603 | 2.100753025
GatherOperationsBenchmark.microByteGather128_MASK | 64 | 16729.183 | 26308.667 | 1.57262115
GatherOperationsBenchmark.microByteGather128_MASK | 256 | 4157.73 | 7312.934 | 1.758876599
GatherOperationsBenchmark.microByteGather128_MASK | 1024 | 1067.675 | 1828.035 | 1.712164282
GatherOperationsBenchmark.microByteGather128_MASK | 4096 | 268.538 | 462.191 | 1.721138163
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 64 | 16559.725 | 25355.415 | 1.531149521
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 256 | 4190.36 | 6596.82 | 1.574284787
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 1024 | 1070.641 | 1638.323 | 1.530226285
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 4096 | 274.703 | 415.345 | 1.511978391
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 64 | 15445.814 | 30518.41 | 1.975836948
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 256 | 4087.154 | 8075.382 | 1.975795872
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 1024 | 1035.527 | 2008.003 | 1.939112162
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 4096 | 262.936 | 501.675 | 1.907973804
GatherOperationsBenchmark.microByteGather256 | 64 | 18266.25 | 37549.708 | 2.05568784
GatherOperationsBenchmark.microByteGather256 | 256 | 4714.027 | 9894.099 | 2.098863456
GatherOperationsBenchmark.microByteGather256 | 1024 | 1147.282 | 2490.351 | 2.1706529
GatherOperationsBenchmark.microByteGather256 | 4096 | 286.935 | 622.153 | 2.16827156
GatherOperationsBenchmark.microByteGather256_MASK | 64 | 21992.019 | 27357.032 | 1.243952727
GatherOperationsBenchmark.microByteGather256_MASK | 256 | 5732.258 | 7760.398 | 1.353811709
GatherOperationsBenchmark.microByteGather256_MASK | 1024 | 1495.632 | 1964.343 | 1.313386582
GatherOperationsBenchmark.microByteGather256_MASK | 4096 | 386.313 | 480.509 | 1.243833368
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 64 | 19911.793 | 26818.552 | 1.346867758
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 256 | 5013.248 | 7040.98 | 1.404474704
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 1024 | 1289.123 | 1785.368 | 1.384947751
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 4096 | 332.791 | 452.568 | 1.359916584
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 64 | 17147.769 | 33913.351 | 1.977712144
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 256 | 4386.044 | 8640.734 | 1.970051828
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 1024 | 1097.485 | 2261.998 | 2.061074183
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 4096 | 277.155 | 565.051 | 2.038754488
GatherOperationsBenchmark.microByteGather64 | 64 | 13068.085 | 37960.616 | 2.904833876
GatherOperationsBenchmark.microByteGather64 | 256 | 3227.857 | 9935.642 | 3.078092369
GatherOperationsBenchmark.microByteGather64 | 1024 | 834.99 | 2530.696 | 3.03080995
GatherOperationsBenchmark.microByteGather64 | 4096 | 212.664 | 637.938 | 2.999746078
GatherOperationsBenchmark.microByteGather64_MASK | 64 | 13548.225 | 30755.634 | 2.27008586
GatherOperationsBenchmark.microByteGather64_MASK | 256 | 3347.844 | 8026.22 | 2.39742951
GatherOperationsBenchmark.microByteGather64_MASK | 1024 | 843.279 | 2072.913 | 2.458157976
GatherOperationsBenchmark.microByteGather64_MASK | 4096 | 213.316 | 544.853 | 2.554205967
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 64 | 12982.383 | 28193.925 | 2.171706458
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 256 | 3288.497 | 7483.684 | 2.275715623
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 1024 | 834.342 | 1860.542 | 2.229951267
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 4096 | 208.107 | 473.987 | 2.277611998
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 64 | 13079.567 | 32992.977 | 2.522482357
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 256 | 3321.098 | 8987.837 | 2.706284789
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 1024 | 865.324 | 2362.563 | 2.73026404
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 4096 | 216.768 | 575.35 | 2.65422018
GatherOperationsBenchmark.microShortGather128 | 64 | 12835.472 | 31370.111 | 2.44401694
GatherOperationsBenchmark.microShortGather128 | 256 | 3151.091 | 8603.442 | 2.730305789
GatherOperationsBenchmark.microShortGather128 | 1024 | 820.026 | 2158.645 | 2.632410436
GatherOperationsBenchmark.microShortGather128 | 4096 | 205.263 | 535.444 | 2.60857534
GatherOperationsBenchmark.microShortGather128_MASK | 64 | 13055.905 | 23957.317 | 1.834979421
GatherOperationsBenchmark.microShortGather128_MASK | 256 | 3234.501 | 6416.879 | 1.983885304
GatherOperationsBenchmark.microShortGather128_MASK | 1024 | 829.648 | 1578.415 | 1.902511668
GatherOperationsBenchmark.microShortGather128_MASK | 4096 | 206.04 | 416.303 | 2.02049602
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 64 | 12905.373 | 22475.815 | 1.74158585
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 256 | 3202.372 | 5695.988 | 1.778677805
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 1024 | 814.645 | 1412.466 | 1.733842349
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 4096 | 199.535 | 355.407 | 1.781176235
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 64 | 12329.793 | 27620.341 | 2.240130147
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 256 | 3146.016 | 7664.47 | 2.436246351
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 1024 | 794.335 | 1925.535 | 2.424084297
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 4096 | 195.754 | 485.942 | 2.482411598
GatherOperationsBenchmark.microShortGather256 | 64 | 15430.153 | 33050.636 | 2.141951282
GatherOperationsBenchmark.microShortGather256 | 256 | 4042.835 | 8901.664 | 2.201837077
GatherOperationsBenchmark.microShortGather256 | 1024 | 986.361 | 2180.195 | 2.210341853
GatherOperationsBenchmark.microShortGather256 | 4096 | 250.057 | 560.523 | 2.24158092
GatherOperationsBenchmark.microShortGather256_MASK | 64 | 16793.012 | 23516.915 | 1.400398868
GatherOperationsBenchmark.microShortGather256_MASK | 256 | 4249.641 | 6505.857 | 1.5309192
GatherOperationsBenchmark.microShortGather256_MASK | 1024 | 1105.868 | 1600.44 | 1.447225166
GatherOperationsBenchmark.microShortGather256_MASK | 4096 | 268.443 | 410.052 | 1.527519809
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 64 | 16107.265 | 22559.877 | 1.400602585
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 256 | 4035.417 | 5872.376 | 1.455209214
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 1024 | 1028.671 | 1469.825 | 1.428858206
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 4096 | 258.639 | 370.997 | 1.434420176
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 64 | 14761.16 | 27601.245 | 1.869856095
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 256 | 3905.104 | 7684.751 | 1.967873583
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 1024 | 986.575 | 1853.319 | 1.878538378
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 4096 | 248.541 | 485.734 | 1.954341537
GatherOperationsBenchmark.microShortGather64 | 64 | 7942.618 | 33097.908 | 4.167128269
GatherOperationsBenchmark.microShortGather64 | 256 | 2009.148 | 9039.775 | 4.499307667
GatherOperationsBenchmark.microShortGather64 | 1024 | 506.769 | 2198.022 | 4.33732529
GatherOperationsBenchmark.microShortGather64 | 4096 | 118.499 | 565.551 | 4.772622554
GatherOperationsBenchmark.microShortGather64_MASK | 64 | 7802.345 | 23559.186 | 3.019500676
GatherOperationsBenchmark.microShortGather64_MASK | 256 | 1917.049 | 6278.454 | 3.275061827
GatherOperationsBenchmark.microShortGather64_MASK | 1024 | 491.248 | 1569.524 | 3.194972804
GatherOperationsBenchmark.microShortGather64_MASK | 4096 | 117.255 | 398.438 | 3.398046992
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 64 | 7697.165 | 22599.8 | 2.936119987
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 256 | 1913.269 | 5986.04 | 3.128697533
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 1024 | 483.724 | 1491.969 | 3.084339417
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 4096 | 116.716 | 375.492 | 3.217142465
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 64 | 7882.26 | 29755.573 | 3.775005265
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 256 | 1992.655 | 7969.383 | 3.99937922
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 1024 | 498.249 | 1997.082 | 4.008200719
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 4096 | 117.764 | 497.177 | 4.221808023




**Detailed performance numbers with AVX512**


Benchmark | Size | Baseline Score (ops/ms) | WithOpt Score (ops/ms) | Gain Factor (opt/baseline)
-- | -- | -- | -- | --
GatherOperationsBenchmark.microByteGather128 | 64 | 15900.681 | 35745.941 | 2.248076104
GatherOperationsBenchmark.microByteGather128 | 256 | 4194.349 | 9931.187 | 2.36775409
GatherOperationsBenchmark.microByteGather128 | 1024 | 1064.611 | 2528.468 | 2.375015851
GatherOperationsBenchmark.microByteGather128 | 4096 | 270.486 | 633.351 | 2.341529691
GatherOperationsBenchmark.microByteGather128_MASK | 64 | 17836.944 | 30418.654 | 1.705373634
GatherOperationsBenchmark.microByteGather128_MASK | 256 | 4411.449 | 8451.317 | 1.915768946
GatherOperationsBenchmark.microByteGather128_MASK | 1024 | 1155.587 | 2119.895 | 1.8344746
GatherOperationsBenchmark.microByteGather128_MASK | 4096 | 287.88 | 538.807 | 1.871637488
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 64 | 16678.512 | 27223.074 | 1.632224385
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 256 | 4268.674 | 7395.33 | 1.732465398
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 1024 | 1119.764 | 1854.529 | 1.656178445
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 4096 | 276.836 | 469.102 | 1.694512274
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 64 | 15561.662 | 33674.023 | 2.163909163
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 256 | 4065.922 | 9427.52 | 2.318667205
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 1024 | 1030.027 | 2430.395 | 2.359544944
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 4096 | 261.51 | 609.811 | 2.331884058
GatherOperationsBenchmark.microByteGather256 | 64 | 17993.999 | 36026.071 | 2.002115872
GatherOperationsBenchmark.microByteGather256 | 256 | 4646.105 | 9695.417 | 2.086783876
GatherOperationsBenchmark.microByteGather256 | 1024 | 1131.979 | 2487.113 | 2.197137049
GatherOperationsBenchmark.microByteGather256 | 4096 | 278.159 | 624.745 | 2.24599959
GatherOperationsBenchmark.microByteGather256_MASK | 64 | 22898.291 | 30126.448 | 1.315663601
GatherOperationsBenchmark.microByteGather256_MASK | 256 | 5473.285 | 8843.556 | 1.615767496
GatherOperationsBenchmark.microByteGather256_MASK | 1024 | 1415.369 | 2230.048 | 1.575594774
GatherOperationsBenchmark.microByteGather256_MASK | 4096 | 358.725 | 556.882 | 1.552392501
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 64 | 20186.469 | 27915.464 | 1.382879988
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 256 | 5214.919 | 7578.939 | 1.453318642
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 1024 | 1360.825 | 1902.398 | 1.397974023
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 4096 | 359.569 | 487.59 | 1.356040148
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 64 | 17154.31 | 35904.295 | 2.093018897
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 256 | 4404.264 | 8997.564 | 2.042921133
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 1024 | 1098.961 | 2317.713 | 2.109003868
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 4096 | 275.722 | 576.866 | 2.092201565
GatherOperationsBenchmark.microByteGather512 | 64 | 18790.829 | 38455.649 | 2.046511572
GatherOperationsBenchmark.microByteGather512 | 256 | 4806.001 | 10023.706 | 2.085664568
GatherOperationsBenchmark.microByteGather512 | 1024 | 1164.771 | 2558.357 | 2.19644634
GatherOperationsBenchmark.microByteGather512 | 4096 | 286.714 | 640.06 | 2.232398836
GatherOperationsBenchmark.microByteGather512_MASK | 64 | 25265.683 | 32738.543 | 1.295771145
GatherOperationsBenchmark.microByteGather512_MASK | 256 | 6417.048 | 8900.835 | 1.387060686
GatherOperationsBenchmark.microByteGather512_MASK | 1024 | 1726.425 | 2231.39 | 1.29249171
GatherOperationsBenchmark.microByteGather512_MASK | 4096 | 438.445 | 562.29 | 1.282464163
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 64 | 22097.788 | 29326.431 | 1.32712066
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 256 | 5587.934 | 7937.573 | 1.420484387
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 1024 | 1485.739 | 1967.966 | 1.324570466
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 4096 | 350.395 | 502.295 | 1.433510752
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 64 | 17904.091 | 34883.849 | 1.948373084
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 256 | 4532.971 | 9354.373 | 2.063629571
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 1024 | 1135.769 | 2394.267 | 2.108058065
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 4096 | 285.823 | 588.764 | 2.059890212
GatherOperationsBenchmark.microByteGather64 | 64 | 13044.341 | 32947.355 | 2.525796819
GatherOperationsBenchmark.microByteGather64 | 256 | 3244.318 | 8817.036 | 2.717685504
GatherOperationsBenchmark.microByteGather64 | 1024 | 812.016 | 2205.047 | 2.715521615
GatherOperationsBenchmark.microByteGather64 | 4096 | 212.882 | 559.439 | 2.627930027
GatherOperationsBenchmark.microByteGather64_MASK | 64 | 13328.592 | 25055.284 | 1.879814762
GatherOperationsBenchmark.microByteGather64_MASK | 256 | 3294.445 | 6779.14 | 2.057748726
GatherOperationsBenchmark.microByteGather64_MASK | 1024 | 832.091 | 1693.255 | 2.034939688
GatherOperationsBenchmark.microByteGather64_MASK | 4096 | 213.049 | 432.162 | 2.028462936
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 64 | 12713.428 | 23246.19 | 1.828475373
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 256 | 3238.82 | 6168.581 | 1.904576667
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 1024 | 819.699 | 1548.811 | 1.889487483
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 4096 | 207.266 | 388.488 | 1.874345045
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 64 | 12740.659 | 28229.421 | 2.215695515
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 256 | 3255.884 | 7816.444 | 2.400713293
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 1024 | 831.691 | 1976.915 | 2.376982557
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 4096 | 210.812 | 503.111 | 2.386538717
GatherOperationsBenchmark.microShortGather128 | 64 | 12858.925 | 34710.696 | 2.699346641
GatherOperationsBenchmark.microShortGather128 | 256 | 3106.472 | 9171.459 | 2.952371372
GatherOperationsBenchmark.microShortGather128 | 1024 | 819.192 | 2278.838 | 2.781811834
GatherOperationsBenchmark.microShortGather128 | 4096 | 204.157 | 575.636 | 2.819575131
GatherOperationsBenchmark.microShortGather128_MASK | 64 | 12528.506 | 28202.741 | 2.251085724
GatherOperationsBenchmark.microShortGather128_MASK | 256 | 3236.653 | 7798 | 2.409278968
GatherOperationsBenchmark.microShortGather128_MASK | 1024 | 820.409 | 1991.597 | 2.427566007
GatherOperationsBenchmark.microShortGather128_MASK | 4096 | 203.635 | 509.145 | 2.500282368
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 64 | 12166.914 | 25418.26 | 2.089129585
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 256 | 3202.89 | 6914.467 | 2.158821252
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 1024 | 799.485 | 1752.541 | 2.192087406
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 4096 | 199.868 | 442.822 | 2.215572278
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 64 | 12531.197 | 31505.217 | 2.514142663
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 256 | 3150.884 | 9098.353 | 2.887555683
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 1024 | 806.932 | 2280.42 | 2.826037386
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 4096 | 197.351 | 571.426 | 2.895480641
GatherOperationsBenchmark.microShortGather256 | 64 | 15255.994 | 34013.422 | 2.22951202
GatherOperationsBenchmark.microShortGather256 | 256 | 3986.306 | 9196.43 | 2.307005533
GatherOperationsBenchmark.microShortGather256 | 1024 | 1003.058 | 2294.437 | 2.287442002
GatherOperationsBenchmark.microShortGather256 | 4096 | 257.45 | 560.259 | 2.176185667
GatherOperationsBenchmark.microShortGather256_MASK | 64 | 17194.868 | 26817.506 | 1.559622673
GatherOperationsBenchmark.microShortGather256_MASK | 256 | 4307.911 | 7252.799 | 1.683600009
GatherOperationsBenchmark.microShortGather256_MASK | 1024 | 1109.034 | 1803.594 | 1.626274758
GatherOperationsBenchmark.microShortGather256_MASK | 4096 | 274.104 | 458.023 | 1.670982547
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 64 | 15615.164 | 25553.091 | 1.636427962
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 256 | 4041.88 | 6826.642 | 1.688976912
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 1024 | 1037.741 | 1706.208 | 1.644155912
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 4096 | 257.732 | 439.843 | 1.706590567
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 64 | 14795.183 | 31449.844 | 2.125681311
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 256 | 3931.158 | 8463.181 | 2.15284682
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 1024 | 989.033 | 2120.528 | 2.144041705
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 4096 | 247.472 | 537.454 | 2.171777009
GatherOperationsBenchmark.microShortGather512 | 64 | 17122.803 | 33007.209 | 1.927675568
GatherOperationsBenchmark.microShortGather512 | 256 | 4378.354 | 9043.805 | 2.065571902
GatherOperationsBenchmark.microShortGather512 | 1024 | 1058.899 | 2292.852 | 2.165316994
GatherOperationsBenchmark.microShortGather512 | 4096 | 255.502 | 577.995 | 2.262193642
GatherOperationsBenchmark.microShortGather512_MASK | 64 | 21232.499 | 27840.109 | 1.311202652
GatherOperationsBenchmark.microShortGather512_MASK | 256 | 5520.535 | 8068.078 | 1.461466688
GatherOperationsBenchmark.microShortGather512_MASK | 1024 | 1424.798 | 2058.709 | 1.444912893
GatherOperationsBenchmark.microShortGather512_MASK | 4096 | 355.353 | 472.564 | 1.329843845
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 64 | 19155.458 | 25609.616 | 1.336935718
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 256 | 4927.708 | 7004.695 | 1.421491493
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 1024 | 1275.924 | 1773.667 | 1.390103956
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 4096 | 312.012 | 444.827 | 1.425672731
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 64 | 16195.665 | 33555.966 | 2.071910354
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 256 | 4123.235 | 7852.8 | 1.904523996
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 1024 | 1033.009 | 1722.473 | 1.667432714
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 4096 | 256.001 | 441.754 | 1.725594822
GatherOperationsBenchmark.microShortGather64 | 64 | 7717.303 | 34015.02 | 4.40763049
GatherOperationsBenchmark.microShortGather64 | 256 | 1940.575 | 9191.168 | 4.73631166
GatherOperationsBenchmark.microShortGather64 | 1024 | 490.29 | 2294.718 | 4.680327969
GatherOperationsBenchmark.microShortGather64 | 4096 | 117.59 | 579.407 | 4.927349264
GatherOperationsBenchmark.microShortGather64_MASK | 64 | 7325.501 | 26815.558 | 3.660576662
GatherOperationsBenchmark.microShortGather64_MASK | 256 | 1792.717 | 7265.925 | 4.053023985
GatherOperationsBenchmark.microShortGather64_MASK | 1024 | 456.88 | 1805.418 | 3.951624059
GatherOperationsBenchmark.microShortGather64_MASK | 4096 | 109.657 | 454.874 | 4.148152877
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 64 | 7157.769 | 25542.547 | 3.568506751
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 256 | 1763.398 | 6817.439 | 3.866080715
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 1024 | 453.254 | 1704.752 | 3.761140553
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 4096 | 108.622 | 439.762 | 4.0485537
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 64 | 7708.116 | 31334.47 | 4.065126939
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 256 | 1954.153 | 8460.249 | 4.329368785
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 1024 | 493.538 | 2120.589 | 4.296708663
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 4096 | 116.976 | 537.472 | 4.594720285
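
As a reading aid for the Gain Factor column in the tables above, the sketch below recomputes it as the optimized score divided by the baseline score, using the first AVX2 row (`microByteGather128`, size 64). The class name `GainFactor` and the method `gain` are hypothetical helpers introduced only for this illustration.

```java
import java.util.Locale;

public class GainFactor {

    // Gain factor as defined by the table header: opt / baseline.
    // Both scores are throughputs in ops/ms, so a value above 1 is a speedup.
    public static double gain(double baselineOpsPerMs, double optimizedOpsPerMs) {
        return optimizedOpsPerMs / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        // Scores taken from the first AVX2 row of the table.
        double g = gain(15916.774, 34288.944);
        System.out.println(String.format(Locale.ROOT, "gain = %.3f", g));
    }
}
```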

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16354#issuecomment-1778499766
PR Comment: https://git.openjdk.org/jdk/pull/16354#issuecomment-1778500434


More information about the hotspot-compiler-dev mailing list