8234863: Increase default value of MaxInlineLevel

Claes Redestad claes.redestad at oracle.com
Mon Feb 24 23:09:51 UTC 2020


Hi,

I was finally able to build the 1.3.9 version of the benchmark, but on
my local system (Haswell Xeon) I can not reliably reproduce the
regression.

The run-to-run variation in this microbenchmark seem very high
regardless of MaxInlineLevel value.

For size 100000 run over 10 forks, I'm getting anything from a stable
170ops/s in one fork to 330 ops/s in another (MaxInlineLevel=15, jdk/jdk
build), and exactly the same highs and lows for MaxInlineLevel=9, see
attached logs for details[1]. Not how the result per fork settles in
pretty quick, but at a more or less random level in the above range.

I'm leaning toward "Not an issue". Maybe there's a reason it's not there
anymore in the 3.x branch.

/Claes

On 2020-02-24 20:25, Jiva, Azeem wrote:
> Claes,
>    As we discussed, this is an issue with the RxJava 1.3.x microbenchmarks and the delta is quite significant.  Have you had a chance to dive into what's happening with the deeper inlining level?
> 


[1]

Run logs for MaxInlineLevel=15 followed by =9

$ numactl --cpunodebind=0 $jdk/bin/java -jar 
build/libs/rxjava-1.3.9-SNAPSHOT-benchmarks.jar 
ComputationSchedulerPerf.*observeOn -f 10  -p size=100000 -wi 5 -w 2 -i 
5 -r 2 -jvmArgs="-XX:MaxInlineLevel=15"
# JMH 1.11.3 (released 1502 days ago, please consider updating!)
# VM version: JDK 15-internal, VM 
15-internal+0-2020-02-24-1146405.clredest...
# VM invoker: build/linux-x64/images/jdk/bin/java
# VM options: -XX:MaxInlineLevel=15
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 2 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: rx.schedulers.ComputationSchedulerPerf.observeOn
# Parameters: (size = 100000)

# Run progress: 0.00% complete, ETA 00:03:20
# Fork: 1 of 10
# Warmup Iteration   1: 308.137 ops/s
# Warmup Iteration   2: 377.802 ops/s
# Warmup Iteration   3: 381.467 ops/s
# Warmup Iteration   4: 350.456 ops/s
# Warmup Iteration   5: 331.369 ops/s
Iteration   1: 308.591 ops/s
Iteration   2: 306.630 ops/s
Iteration   3: 310.932 ops/s
Iteration   4: 311.423 ops/s
Iteration   5: 331.754 ops/s

# Run progress: 10.00% complete, ETA 00:03:03
# Fork: 2 of 10
# Warmup Iteration   1: 247.713 ops/s
# Warmup Iteration   2: 312.164 ops/s
# Warmup Iteration   3: 273.882 ops/s
# Warmup Iteration   4: 237.406 ops/s
# Warmup Iteration   5: 234.439 ops/s
Iteration   1: 231.938 ops/s
Iteration   2: 225.449 ops/s
Iteration   3: 232.026 ops/s
Iteration   4: 221.079 ops/s
Iteration   5: 241.194 ops/s

# Run progress: 20.00% complete, ETA 00:02:43
# Fork: 3 of 10
# Warmup Iteration   1: 292.728 ops/s
# Warmup Iteration   2: 314.710 ops/s
# Warmup Iteration   3: 320.282 ops/s
# Warmup Iteration   4: 322.557 ops/s
# Warmup Iteration   5: 316.284 ops/s
Iteration   1: 327.343 ops/s
Iteration   2: 306.609 ops/s
Iteration   3: 317.059 ops/s
Iteration   4: 315.174 ops/s
Iteration   5: 283.809 ops/s

# Run progress: 30.00% complete, ETA 00:02:22
# Fork: 4 of 10
# Warmup Iteration   1: 150.172 ops/s
# Warmup Iteration   2: 190.462 ops/s
# Warmup Iteration   3: 197.778 ops/s
# Warmup Iteration   4: 229.885 ops/s
# Warmup Iteration   5: 199.934 ops/s
Iteration   1: 208.933 ops/s
Iteration   2: 203.446 ops/s
Iteration   3: 182.926 ops/s
Iteration   4: 198.219 ops/s
Iteration   5: 189.674 ops/s

# Run progress: 40.00% complete, ETA 00:02:02
# Fork: 5 of 10
# Warmup Iteration   1: 173.136 ops/s
# Warmup Iteration   2: 197.542 ops/s
# Warmup Iteration   3: 199.816 ops/s
# Warmup Iteration   4: 183.061 ops/s
# Warmup Iteration   5: 181.675 ops/s
Iteration   1: 182.893 ops/s
Iteration   2: 182.437 ops/s
Iteration   3: 170.715 ops/s
Iteration   4: 185.610 ops/s
Iteration   5: 186.175 ops/s

# Run progress: 50.00% complete, ETA 00:01:41
# Fork: 6 of 10
# Warmup Iteration   1: 253.295 ops/s
# Warmup Iteration   2: 286.376 ops/s
# Warmup Iteration   3: 248.205 ops/s
# Warmup Iteration   4: 217.620 ops/s
# Warmup Iteration   5: 226.234 ops/s
Iteration   1: 218.880 ops/s
Iteration   2: 219.477 ops/s
Iteration   3: 215.964 ops/s
Iteration   4: 225.023 ops/s
Iteration   5: 236.793 ops/s

# Run progress: 60.00% complete, ETA 00:01:21
# Fork: 7 of 10
# Warmup Iteration   1: 304.706 ops/s
# Warmup Iteration   2: 334.557 ops/s
# Warmup Iteration   3: 306.369 ops/s
# Warmup Iteration   4: 314.033 ops/s
# Warmup Iteration   5: 309.778 ops/s
Iteration   1: 287.821 ops/s
Iteration   2: 288.081 ops/s
Iteration   3: 302.696 ops/s
Iteration   4: 309.526 ops/s
Iteration   5: 319.038 ops/s

# Run progress: 70.00% complete, ETA 00:01:01
# Fork: 8 of 10
# Warmup Iteration   1: 187.162 ops/s
# Warmup Iteration   2: 228.934 ops/s
# Warmup Iteration   3: 185.007 ops/s
# Warmup Iteration   4: 183.127 ops/s
# Warmup Iteration   5: 183.610 ops/s
Iteration   1: 174.896 ops/s
Iteration   2: 177.716 ops/s
Iteration   3: 162.185 ops/s
Iteration   4: 175.696 ops/s
Iteration   5: 175.483 ops/s

# Run progress: 80.00% complete, ETA 00:00:40
# Fork: 9 of 10
# Warmup Iteration   1: 247.256 ops/s
# Warmup Iteration   2: 344.854 ops/s
# Warmup Iteration   3: 332.317 ops/s
# Warmup Iteration   4: 326.852 ops/s
# Warmup Iteration   5: 309.211 ops/s
Iteration   1: 280.466 ops/s
Iteration   2: 302.288 ops/s
Iteration   3: 267.403 ops/s
Iteration   4: 278.625 ops/s
Iteration   5: 263.112 ops/s

# Run progress: 90.00% complete, ETA 00:00:20
# Fork: 10 of 10
# Warmup Iteration   1: 158.022 ops/s
# Warmup Iteration   2: 189.021 ops/s
# Warmup Iteration   3: 188.279 ops/s
# Warmup Iteration   4: 185.716 ops/s
# Warmup Iteration   5: 185.880 ops/s
Iteration   1: 186.384 ops/s
Iteration   2: 186.226 ops/s
Iteration   3: 183.449 ops/s
Iteration   4: 173.699 ops/s
Iteration   5: 184.808 ops/s


Result "observeOn":
   239.155 ±(99.9%) 27.340 ops/s [Average]
   (min, avg, max) = (162.185, 239.155, 331.754), stdev = 55.229
   CI (99.9%): [211.815, 266.496] (assumes normal distribution)


# Run complete. Total time: 00:03:23

Benchmark                           (size)   Mode  Cnt    Score    Error 
  Units
ComputationSchedulerPerf.observeOn  100000  thrpt   50  239.155 ± 27.340 
  ops/s

$ numactl --cpunodebind=0 $jdk/bin/java -jar 
build/libs/rxjava-1.3.9-SNAPSHOT-benchmarks.jar 
ComputationSchedulerPerf.*observeOn -f 10  -p size=100000 -wi 5 -w 2 -i 
5 -r 2 -jvmArgs="-XX:MaxInlineLevel=9"
# JMH 1.11.3 (released 1503 days ago, please consider updating!)
# VM version: JDK 15-internal, VM 
15-internal+0-2020-02-24-1146405.clredest...
# VM invoker: build/linux-x64/images/jdk/bin/java
# VM options: 
--add-exports=jdk.management/com.sun.management.internal=ALL-UNNAMED 
-XX:MaxInlineLevel=9
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 2 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: rx.schedulers.ComputationSchedulerPerf.observeOn
# Parameters: (size = 100000)

# Run progress: 0.00% complete, ETA 00:03:20
# Fork: 1 of 10
# Warmup Iteration   1: 218.720 ops/s
# Warmup Iteration   2: 259.219 ops/s
# Warmup Iteration   3: 245.145 ops/s
# Warmup Iteration   4: 244.944 ops/s
# Warmup Iteration   5: 186.248 ops/s
Iteration   1: 180.263 ops/s
Iteration   2: 165.831 ops/s
Iteration   3: 178.660 ops/s
Iteration   4: 180.241 ops/s
Iteration   5: 176.193 ops/s

# Run progress: 10.00% complete, ETA 00:03:04
# Fork: 2 of 10
# Warmup Iteration   1: 267.636 ops/s
# Warmup Iteration   2: 332.165 ops/s
# Warmup Iteration   3: 325.828 ops/s
# Warmup Iteration   4: 325.195 ops/s
# Warmup Iteration   5: 333.456 ops/s
Iteration   1: 331.260 ops/s
Iteration   2: 310.459 ops/s
Iteration   3: 335.419 ops/s
Iteration   4: 328.600 ops/s
Iteration   5: 303.829 ops/s

# Run progress: 20.00% complete, ETA 00:02:43
# Fork: 3 of 10
# Warmup Iteration   1: 259.850 ops/s
# Warmup Iteration   2: 255.872 ops/s
# Warmup Iteration   3: 268.157 ops/s
# Warmup Iteration   4: 254.214 ops/s
# Warmup Iteration   5: 270.159 ops/s
Iteration   1: 240.874 ops/s
Iteration   2: 225.527 ops/s
Iteration   3: 254.826 ops/s
Iteration   4: 238.400 ops/s
Iteration   5: 243.383 ops/s

# Run progress: 30.00% complete, ETA 00:02:22
# Fork: 4 of 10
# Warmup Iteration   1: 188.073 ops/s
# Warmup Iteration   2: 258.390 ops/s
# Warmup Iteration   3: 224.278 ops/s
# Warmup Iteration   4: 228.948 ops/s
# Warmup Iteration   5: 218.226 ops/s
Iteration   1: 191.827 ops/s
Iteration   2: 181.651 ops/s
Iteration   3: 173.203 ops/s
Iteration   4: 183.152 ops/s
Iteration   5: 176.889 ops/s

# Run progress: 40.00% complete, ETA 00:02:02
# Fork: 5 of 10
# Warmup Iteration   1: 232.270 ops/s
# Warmup Iteration   2: 258.410 ops/s
# Warmup Iteration   3: 244.721 ops/s
# Warmup Iteration   4: 238.275 ops/s
# Warmup Iteration   5: 227.297 ops/s
Iteration   1: 217.916 ops/s
Iteration   2: 203.226 ops/s
Iteration   3: 198.320 ops/s
Iteration   4: 211.672 ops/s
Iteration   5: 231.153 ops/s

# Run progress: 50.00% complete, ETA 00:01:42
# Fork: 6 of 10
# Warmup Iteration   1: 154.199 ops/s
# Warmup Iteration   2: 188.968 ops/s
# Warmup Iteration   3: 200.437 ops/s
# Warmup Iteration   4: 205.786 ops/s
# Warmup Iteration   5: 193.097 ops/s
Iteration   1: 191.951 ops/s
Iteration   2: 190.814 ops/s
Iteration   3: 182.418 ops/s
Iteration   4: 183.914 ops/s
Iteration   5: 189.912 ops/s

# Run progress: 60.00% complete, ETA 00:01:21
# Fork: 7 of 10
# Warmup Iteration   1: 151.756 ops/s
# Warmup Iteration   2: 184.493 ops/s
# Warmup Iteration   3: 180.740 ops/s
# Warmup Iteration   4: 181.102 ops/s
# Warmup Iteration   5: 181.172 ops/s
Iteration   1: 181.011 ops/s
Iteration   2: 181.592 ops/s
Iteration   3: 187.456 ops/s
Iteration   4: 174.353 ops/s
Iteration   5: 186.561 ops/s

# Run progress: 70.00% complete, ETA 00:01:01
# Fork: 8 of 10
# Warmup Iteration   1: 187.211 ops/s
# Warmup Iteration   2: 194.340 ops/s
# Warmup Iteration   3: 210.136 ops/s
# Warmup Iteration   4: 202.079 ops/s
# Warmup Iteration   5: 198.685 ops/s
Iteration   1: 207.574 ops/s
Iteration   2: 221.047 ops/s
Iteration   3: 199.842 ops/s
Iteration   4: 184.712 ops/s
Iteration   5: 192.552 ops/s

# Run progress: 80.00% complete, ETA 00:00:40
# Fork: 9 of 10
# Warmup Iteration   1: 149.395 ops/s
# Warmup Iteration   2: 179.125 ops/s
# Warmup Iteration   3: 175.259 ops/s
# Warmup Iteration   4: 177.971 ops/s
# Warmup Iteration   5: 169.131 ops/s
Iteration   1: 174.573 ops/s
Iteration   2: 174.098 ops/s
Iteration   3: 171.107 ops/s
Iteration   4: 163.116 ops/s
Iteration   5: 174.914 ops/s

# Run progress: 90.00% complete, ETA 00:00:20
# Fork: 10 of 10
# Warmup Iteration   1: 284.148 ops/s
# Warmup Iteration   2: 352.798 ops/s
# Warmup Iteration   3: 306.848 ops/s
# Warmup Iteration   4: 250.787 ops/s
# Warmup Iteration   5: 266.872 ops/s
Iteration   1: 270.604 ops/s
Iteration   2: 272.181 ops/s
Iteration   3: 271.800 ops/s
Iteration   4: 269.214 ops/s
Iteration   5: 271.461 ops/s


Result "observeOn":
   214.631 ±(99.9%) 23.679 ops/s [Average]
   (min, avg, max) = (163.116, 214.631, 335.419), stdev = 47.833
   CI (99.9%): [190.952, 238.310] (assumes normal distribution)


# Run complete. Total time: 00:03:23

Benchmark                           (size)   Mode  Cnt    Score    Error 
  Units
ComputationSchedulerPerf.observeOn  100000  thrpt   50  214.631 ± 23.679 
  ops/s


More information about the hotspot-compiler-dev mailing list