KNL specific fix: disable generating INC and DEC instructions on Xeon Phi and Silvermont CPUs

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri Jun 16 21:28:46 UTC 2017


I don't see that the indentation is fixed:

+       FLAG_SET_DEFAULT(UseIncDec, false);
+       }
+#ifdef COMPILER2
+ if (FLAG_IS_DEFAULT(OptoScheduling)) {
+  OptoScheduling = true;
+ }
+#endif
+      if (supports_sse4_2()) { // Silvermont

+       if (FLAG_IS_DEFAULT(UseIncDec)){
+        FLAG_SET_DEFAULT(UseIncDec, false);
+        }

Vladimir

On 6/16/17 2:03 PM, Kandu, Rahul wrote:
> Hi Vladimir,
> 
> Thanks. I fixed the indentation; the code change no longer contains tabs.
> Please find the updated webrev below.
> 
> Openjdk bug location: https://bugs.openjdk.java.net/browse/JDK-8182138
> 
> Webrev for the code change: 
> http://cr.openjdk.java.net/~vdeshpande/8182138/webrev.01/
> 
> regards,
> 
> Rahul
> 
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, June 15, 2017 2:10 PM
> To: Kandu, Rahul <rahul.kandu at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: KNL specific fix: disable generating INC and DEC instructions on Xeon Phi and Silvermont CPUs
> 
> Hi Rahul
> 
> Please fix indents - don't use tabs.
> 
> Vladimir
> 
> On 6/15/17 1:14 PM, Kandu, Rahul wrote:
> 
>     Hi all,
> 
>     The following patch disables generation of INC and DEC instructions
>     on Xeon Phi and Silvermont (Atom) based CPUs. We have found that
>     INC and DEC can cause unexpected performance drops on certain
>     processors that do not optimize partial flag writes. The patch
>     disables generation of these two instructions because they are most
>     commonly used for loop increment/decrement.
> 
>     The patch improves the SPECjvm2008 composite score by 3.65% on a
>     Knights Landing CPU, as shown by the runs below on the latest
>     OpenJDK source.
> 
>     Openjdk bug location: https://bugs.openjdk.java.net/browse/JDK-8182138
> 
>     Webrev for the code change:
>     http://cr.openjdk.java.net/~vdeshpande/8182138/webrev.00/
> 
>     Scores:
>
>                    6/10 jdk10 code (no change)         |  6/10 jdk10 code with this patch
>     benchmark         run1     run2     run3  geomean  |     run1     run2     run3  geomean
>     lu.small       1503.79  1500.62  1494.98 1499.792  |  1478.48  1493.78   1509.2 1493.767
>     sor.small      2417.24   2372.1  2356.47 2381.798  |  2436.89  2434.46  2446.88 2439.404
>     sparse.small    606.35   635.19   595.44  612.099  |   681.96   728.02   671.41 693.3673
>     fft.small      1463.55  1406.43  1173.63 1341.793  |  1220.14  1425.19  1190.06 1274.335
>     monte_carlo     823.66   825.96   761.26 803.0575  |   939.53      923   934.76 932.4041
>     sparse.large    159.45   139.83   155.76 151.4352  |   100.66   150.22   179.79 139.5672
>     fft.large       419.19   425.81    432.6 425.8315  |   433.11   424.72   429.07  428.953
>     sor.large       416.31   262.98   271.31 309.6957  |    366.6   397.67   352.75 371.8725
>     lu.large        116.46   127.51   129.33 124.3007  |    124.2   122.69    124.1 123.6614
>     transform      1056.64   1066.6  1021.08 1047.923  |  1015.85  1056.42  1049.42 1040.412
>     validation     1371.86  1898.49  1971.28 1725.131  |  2088.81  2178.14  2112.95 2126.301
>     aes             276.67   255.84   299.78 276.8499  |    261.5   258.95   290.17 269.8444
>     rsa            1041.29  1069.51  1069.26 1059.937  |  1091.45  1089.15  1095.52 1092.037
>     signverify      2583.7  2592.98  2586.34  2587.67  |  2660.73  2664.17  2634.47  2653.09
>     compress        817.65   817.44   816.55 817.2132  |   852.55   847.61   894.59 864.6626
>     serial          608.48   586.62   615.37 603.3646  |   627.19   605.21   619.31 617.1695
>     sunflow         371.28   373.03   373.04 372.4491  |   368.59   381.78   369.64  373.289
>     mpegaudio       743.85   734.46   752.62 743.6064  |   775.45   773.35   776.98 775.2586
>     derby           1929.9  1901.28  1922.56 1917.875  |  1927.97  1865.47  1919.17 1904.002
>     Total           780.54   779.91   786.98 782.4702  |      801   812.98      819 810.9587
>
>     3.65% improvement
> 
>     regards,
> 
>     Rahul
> 

