8234863: Increase default value of MaxInlineLevel
Claes Redestad
claes.redestad at oracle.com
Mon Feb 24 23:09:51 UTC 2020
Hi,
I was finally able to build the 1.3.9 version of the benchmark, but on
my local system (Haswell Xeon) I can not reliably reproduce the
regression.
The run-to-run variation in this microbenchmark seem very high
regardless of MaxInlineLevel value.
For size 100000 run over 10 forks, I'm getting anything from a stable
170ops/s in one fork to 330 ops/s in another (MaxInlineLevel=15, jdk/jdk
build), and exactly the same highs and lows for MaxInlineLevel=9, see
attached logs for details[1]. Not how the result per fork settles in
pretty quick, but at a more or less random level in the above range.
I'm leaning toward "Not an issue". Maybe there's a reason it's not there
anymore in the 3.x branch.
/Claes
On 2020-02-24 20:25, Jiva, Azeem wrote:
> Claes,
> As we discussed, this is an issue with the RxJava 1.3.x microbenchmarks and the delta is quite significant. Have you had a chance to dive into what's happening with the deeper inlining level?
>
[1]
Run logs for MaxInlineLevel=15 followed by =9
$ numactl --cpunodebind=0 $jdk/bin/java -jar
build/libs/rxjava-1.3.9-SNAPSHOT-benchmarks.jar
ComputationSchedulerPerf.*observeOn -f 10 -p size=100000 -wi 5 -w 2 -i
5 -r 2 -jvmArgs="-XX:MaxInlineLevel=15"
# JMH 1.11.3 (released 1502 days ago, please consider updating!)
# VM version: JDK 15-internal, VM
15-internal+0-2020-02-24-1146405.clredest...
# VM invoker: build/linux-x64/images/jdk/bin/java
# VM options: -XX:MaxInlineLevel=15
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 2 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: rx.schedulers.ComputationSchedulerPerf.observeOn
# Parameters: (size = 100000)
# Run progress: 0.00% complete, ETA 00:03:20
# Fork: 1 of 10
# Warmup Iteration 1: 308.137 ops/s
# Warmup Iteration 2: 377.802 ops/s
# Warmup Iteration 3: 381.467 ops/s
# Warmup Iteration 4: 350.456 ops/s
# Warmup Iteration 5: 331.369 ops/s
Iteration 1: 308.591 ops/s
Iteration 2: 306.630 ops/s
Iteration 3: 310.932 ops/s
Iteration 4: 311.423 ops/s
Iteration 5: 331.754 ops/s
# Run progress: 10.00% complete, ETA 00:03:03
# Fork: 2 of 10
# Warmup Iteration 1: 247.713 ops/s
# Warmup Iteration 2: 312.164 ops/s
# Warmup Iteration 3: 273.882 ops/s
# Warmup Iteration 4: 237.406 ops/s
# Warmup Iteration 5: 234.439 ops/s
Iteration 1: 231.938 ops/s
Iteration 2: 225.449 ops/s
Iteration 3: 232.026 ops/s
Iteration 4: 221.079 ops/s
Iteration 5: 241.194 ops/s
# Run progress: 20.00% complete, ETA 00:02:43
# Fork: 3 of 10
# Warmup Iteration 1: 292.728 ops/s
# Warmup Iteration 2: 314.710 ops/s
# Warmup Iteration 3: 320.282 ops/s
# Warmup Iteration 4: 322.557 ops/s
# Warmup Iteration 5: 316.284 ops/s
Iteration 1: 327.343 ops/s
Iteration 2: 306.609 ops/s
Iteration 3: 317.059 ops/s
Iteration 4: 315.174 ops/s
Iteration 5: 283.809 ops/s
# Run progress: 30.00% complete, ETA 00:02:22
# Fork: 4 of 10
# Warmup Iteration 1: 150.172 ops/s
# Warmup Iteration 2: 190.462 ops/s
# Warmup Iteration 3: 197.778 ops/s
# Warmup Iteration 4: 229.885 ops/s
# Warmup Iteration 5: 199.934 ops/s
Iteration 1: 208.933 ops/s
Iteration 2: 203.446 ops/s
Iteration 3: 182.926 ops/s
Iteration 4: 198.219 ops/s
Iteration 5: 189.674 ops/s
# Run progress: 40.00% complete, ETA 00:02:02
# Fork: 5 of 10
# Warmup Iteration 1: 173.136 ops/s
# Warmup Iteration 2: 197.542 ops/s
# Warmup Iteration 3: 199.816 ops/s
# Warmup Iteration 4: 183.061 ops/s
# Warmup Iteration 5: 181.675 ops/s
Iteration 1: 182.893 ops/s
Iteration 2: 182.437 ops/s
Iteration 3: 170.715 ops/s
Iteration 4: 185.610 ops/s
Iteration 5: 186.175 ops/s
# Run progress: 50.00% complete, ETA 00:01:41
# Fork: 6 of 10
# Warmup Iteration 1: 253.295 ops/s
# Warmup Iteration 2: 286.376 ops/s
# Warmup Iteration 3: 248.205 ops/s
# Warmup Iteration 4: 217.620 ops/s
# Warmup Iteration 5: 226.234 ops/s
Iteration 1: 218.880 ops/s
Iteration 2: 219.477 ops/s
Iteration 3: 215.964 ops/s
Iteration 4: 225.023 ops/s
Iteration 5: 236.793 ops/s
# Run progress: 60.00% complete, ETA 00:01:21
# Fork: 7 of 10
# Warmup Iteration 1: 304.706 ops/s
# Warmup Iteration 2: 334.557 ops/s
# Warmup Iteration 3: 306.369 ops/s
# Warmup Iteration 4: 314.033 ops/s
# Warmup Iteration 5: 309.778 ops/s
Iteration 1: 287.821 ops/s
Iteration 2: 288.081 ops/s
Iteration 3: 302.696 ops/s
Iteration 4: 309.526 ops/s
Iteration 5: 319.038 ops/s
# Run progress: 70.00% complete, ETA 00:01:01
# Fork: 8 of 10
# Warmup Iteration 1: 187.162 ops/s
# Warmup Iteration 2: 228.934 ops/s
# Warmup Iteration 3: 185.007 ops/s
# Warmup Iteration 4: 183.127 ops/s
# Warmup Iteration 5: 183.610 ops/s
Iteration 1: 174.896 ops/s
Iteration 2: 177.716 ops/s
Iteration 3: 162.185 ops/s
Iteration 4: 175.696 ops/s
Iteration 5: 175.483 ops/s
# Run progress: 80.00% complete, ETA 00:00:40
# Fork: 9 of 10
# Warmup Iteration 1: 247.256 ops/s
# Warmup Iteration 2: 344.854 ops/s
# Warmup Iteration 3: 332.317 ops/s
# Warmup Iteration 4: 326.852 ops/s
# Warmup Iteration 5: 309.211 ops/s
Iteration 1: 280.466 ops/s
Iteration 2: 302.288 ops/s
Iteration 3: 267.403 ops/s
Iteration 4: 278.625 ops/s
Iteration 5: 263.112 ops/s
# Run progress: 90.00% complete, ETA 00:00:20
# Fork: 10 of 10
# Warmup Iteration 1: 158.022 ops/s
# Warmup Iteration 2: 189.021 ops/s
# Warmup Iteration 3: 188.279 ops/s
# Warmup Iteration 4: 185.716 ops/s
# Warmup Iteration 5: 185.880 ops/s
Iteration 1: 186.384 ops/s
Iteration 2: 186.226 ops/s
Iteration 3: 183.449 ops/s
Iteration 4: 173.699 ops/s
Iteration 5: 184.808 ops/s
Result "observeOn":
239.155 ±(99.9%) 27.340 ops/s [Average]
(min, avg, max) = (162.185, 239.155, 331.754), stdev = 55.229
CI (99.9%): [211.815, 266.496] (assumes normal distribution)
# Run complete. Total time: 00:03:23
Benchmark (size) Mode Cnt Score Error
Units
ComputationSchedulerPerf.observeOn 100000 thrpt 50 239.155 ± 27.340
ops/s
$ numactl --cpunodebind=0 $jdk/bin/java -jar
build/libs/rxjava-1.3.9-SNAPSHOT-benchmarks.jar
ComputationSchedulerPerf.*observeOn -f 10 -p size=100000 -wi 5 -w 2 -i
5 -r 2 -jvmArgs="-XX:MaxInlineLevel=9"
# JMH 1.11.3 (released 1503 days ago, please consider updating!)
# VM version: JDK 15-internal, VM
15-internal+0-2020-02-24-1146405.clredest...
# VM invoker: build/linux-x64/images/jdk/bin/java
# VM options:
--add-exports=jdk.management/com.sun.management.internal=ALL-UNNAMED
-XX:MaxInlineLevel=9
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 2 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: rx.schedulers.ComputationSchedulerPerf.observeOn
# Parameters: (size = 100000)
# Run progress: 0.00% complete, ETA 00:03:20
# Fork: 1 of 10
# Warmup Iteration 1: 218.720 ops/s
# Warmup Iteration 2: 259.219 ops/s
# Warmup Iteration 3: 245.145 ops/s
# Warmup Iteration 4: 244.944 ops/s
# Warmup Iteration 5: 186.248 ops/s
Iteration 1: 180.263 ops/s
Iteration 2: 165.831 ops/s
Iteration 3: 178.660 ops/s
Iteration 4: 180.241 ops/s
Iteration 5: 176.193 ops/s
# Run progress: 10.00% complete, ETA 00:03:04
# Fork: 2 of 10
# Warmup Iteration 1: 267.636 ops/s
# Warmup Iteration 2: 332.165 ops/s
# Warmup Iteration 3: 325.828 ops/s
# Warmup Iteration 4: 325.195 ops/s
# Warmup Iteration 5: 333.456 ops/s
Iteration 1: 331.260 ops/s
Iteration 2: 310.459 ops/s
Iteration 3: 335.419 ops/s
Iteration 4: 328.600 ops/s
Iteration 5: 303.829 ops/s
# Run progress: 20.00% complete, ETA 00:02:43
# Fork: 3 of 10
# Warmup Iteration 1: 259.850 ops/s
# Warmup Iteration 2: 255.872 ops/s
# Warmup Iteration 3: 268.157 ops/s
# Warmup Iteration 4: 254.214 ops/s
# Warmup Iteration 5: 270.159 ops/s
Iteration 1: 240.874 ops/s
Iteration 2: 225.527 ops/s
Iteration 3: 254.826 ops/s
Iteration 4: 238.400 ops/s
Iteration 5: 243.383 ops/s
# Run progress: 30.00% complete, ETA 00:02:22
# Fork: 4 of 10
# Warmup Iteration 1: 188.073 ops/s
# Warmup Iteration 2: 258.390 ops/s
# Warmup Iteration 3: 224.278 ops/s
# Warmup Iteration 4: 228.948 ops/s
# Warmup Iteration 5: 218.226 ops/s
Iteration 1: 191.827 ops/s
Iteration 2: 181.651 ops/s
Iteration 3: 173.203 ops/s
Iteration 4: 183.152 ops/s
Iteration 5: 176.889 ops/s
# Run progress: 40.00% complete, ETA 00:02:02
# Fork: 5 of 10
# Warmup Iteration 1: 232.270 ops/s
# Warmup Iteration 2: 258.410 ops/s
# Warmup Iteration 3: 244.721 ops/s
# Warmup Iteration 4: 238.275 ops/s
# Warmup Iteration 5: 227.297 ops/s
Iteration 1: 217.916 ops/s
Iteration 2: 203.226 ops/s
Iteration 3: 198.320 ops/s
Iteration 4: 211.672 ops/s
Iteration 5: 231.153 ops/s
# Run progress: 50.00% complete, ETA 00:01:42
# Fork: 6 of 10
# Warmup Iteration 1: 154.199 ops/s
# Warmup Iteration 2: 188.968 ops/s
# Warmup Iteration 3: 200.437 ops/s
# Warmup Iteration 4: 205.786 ops/s
# Warmup Iteration 5: 193.097 ops/s
Iteration 1: 191.951 ops/s
Iteration 2: 190.814 ops/s
Iteration 3: 182.418 ops/s
Iteration 4: 183.914 ops/s
Iteration 5: 189.912 ops/s
# Run progress: 60.00% complete, ETA 00:01:21
# Fork: 7 of 10
# Warmup Iteration 1: 151.756 ops/s
# Warmup Iteration 2: 184.493 ops/s
# Warmup Iteration 3: 180.740 ops/s
# Warmup Iteration 4: 181.102 ops/s
# Warmup Iteration 5: 181.172 ops/s
Iteration 1: 181.011 ops/s
Iteration 2: 181.592 ops/s
Iteration 3: 187.456 ops/s
Iteration 4: 174.353 ops/s
Iteration 5: 186.561 ops/s
# Run progress: 70.00% complete, ETA 00:01:01
# Fork: 8 of 10
# Warmup Iteration 1: 187.211 ops/s
# Warmup Iteration 2: 194.340 ops/s
# Warmup Iteration 3: 210.136 ops/s
# Warmup Iteration 4: 202.079 ops/s
# Warmup Iteration 5: 198.685 ops/s
Iteration 1: 207.574 ops/s
Iteration 2: 221.047 ops/s
Iteration 3: 199.842 ops/s
Iteration 4: 184.712 ops/s
Iteration 5: 192.552 ops/s
# Run progress: 80.00% complete, ETA 00:00:40
# Fork: 9 of 10
# Warmup Iteration 1: 149.395 ops/s
# Warmup Iteration 2: 179.125 ops/s
# Warmup Iteration 3: 175.259 ops/s
# Warmup Iteration 4: 177.971 ops/s
# Warmup Iteration 5: 169.131 ops/s
Iteration 1: 174.573 ops/s
Iteration 2: 174.098 ops/s
Iteration 3: 171.107 ops/s
Iteration 4: 163.116 ops/s
Iteration 5: 174.914 ops/s
# Run progress: 90.00% complete, ETA 00:00:20
# Fork: 10 of 10
# Warmup Iteration 1: 284.148 ops/s
# Warmup Iteration 2: 352.798 ops/s
# Warmup Iteration 3: 306.848 ops/s
# Warmup Iteration 4: 250.787 ops/s
# Warmup Iteration 5: 266.872 ops/s
Iteration 1: 270.604 ops/s
Iteration 2: 272.181 ops/s
Iteration 3: 271.800 ops/s
Iteration 4: 269.214 ops/s
Iteration 5: 271.461 ops/s
Result "observeOn":
214.631 ±(99.9%) 23.679 ops/s [Average]
(min, avg, max) = (163.116, 214.631, 335.419), stdev = 47.833
CI (99.9%): [190.952, 238.310] (assumes normal distribution)
# Run complete. Total time: 00:03:23
Benchmark (size) Mode Cnt Score Error
Units
ComputationSchedulerPerf.observeOn 100000 thrpt 50 214.631 ± 23.679
ops/s
More information about the hotspot-compiler-dev
mailing list