Effect of setting CPU quota on Java performance
Ashutosh Mehra1
asmehra1 at in.ibm.com
Mon Feb 5 05:48:17 UTC 2018
I have been trying to understand whether setting a CPU quota on a Docker
container has any impact on application performance, provided the number
of "effective" CPUs stays the same.
As an example, if my app is running on 4 CPUs @ 100% quota, would I get
the same performance if it were running on 8 CPUs @ 50% quota? Note that
the number of "effective" CPUs is 4 in both cases.
Since the OpenJDK early-access builds for Java 10 have improved support for
Docker containers (https://bugs.openjdk.java.net/browse/JDK-8146115), I
decided to do some measurements using such a build.
I got the OpenJDK build jdk-10-ea+40 from http://jdk.java.net/10/.
This build has container support enabled by default.
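The notion of "effective" CPUs used above can be sketched as follows. This is only an illustration, assuming that "50% quota" means a CFS quota worth half of the CPUs in the cpuset (for 8 CPUs and the default 100000 us period, a quota of 400000 us). The class and method names are hypothetical, and this is not the JVM's actual detection code.

```java
// Sketch: derive an "effective" CPU count from a cpuset size and a CFS quota.
// EffectiveCpus and effectiveCpus are hypothetical names for illustration.
final class EffectiveCpus {

    // quotaUs < 0 means "no quota" (cgroup v1 reports -1 in cpu.cfs_quota_us).
    static int effectiveCpus(int cpusetSize, long quotaUs, long periodUs) {
        if (quotaUs < 0 || periodUs <= 0) {
            return cpusetSize;
        }
        // Round up so that a fractional share still counts as one CPU.
        int quotaCpus = (int) ((quotaUs + periodUs - 1) / periodUs);
        return Math.max(1, Math.min(cpusetSize, quotaCpus));
    }

    public static void main(String[] args) {
        // 4 CPUs in the cpuset, no quota.
        System.out.println(effectiveCpus(4, -1, 100_000));       // prints 4
        // 8 CPUs in the cpuset, quota worth 4 CPUs of time per period (assumed).
        System.out.println(effectiveCpus(8, 400_000, 100_000));  // prints 4
        // What the JVM running this program actually detects.
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}
```

Running the same program inside each container configuration and comparing the last printed value is one way to check what the JVM itself sees.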
The system I used has 32 CPUs, counting 2 hyperthreads per core. I turned
off hyperthreading for this experiment, which leaves 16 cores on 2
sockets: 0-7 on the first socket and 8-15 on the second.
System details are:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
# uname -r
4.4.0-103-generic
For the measurements I used the AcmeAir benchmark at
https://github.com/sabkrish/acmeair/tree/microservice_changes.
I ran the AcmeAir benchmark with this build for the following cases:
1) Ran AcmeAir with the JVM bound to 4 CPUs (8-11) and no limit on quota.
Let's call this 4CPU@100.
This used the following JVM settings:
CICompilerCount = 3, ParallelGCThreads = 4, ConcGCThreads = 1
2) Ran AcmeAir with the JVM bound to 8 CPUs (8-15) and 50% quota. Let's
call this 8CPU@50.
In this case container support was enabled by default, and it used the
following JVM settings:
CICompilerCount = 3, ParallelGCThreads = 4, ConcGCThreads = 1
3) Ran AcmeAir with the JVM bound to 8 CPUs (8-15) and 50% quota, with the
-XX:-UseContainerSupport option to disable container support. Let's call
this 8CPU@50NoCS.
This used the following JVM settings:
CICompilerCount = 4, ParallelGCThreads = 8, ConcGCThreads = 2
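The settings listed for the three cases are consistent with the JVM sizing its thread pools from the CPU count it detects (4 in the first two cases, 8 in the third). The sketch below approximates the default HotSpot heuristics as they are commonly described. The formulas are an assumption and were not checked against the jdk-10-ea+40 sources, and the class and method names are hypothetical.

```java
// Sketch of (assumed) HotSpot default sizing heuristics, keyed on CPU count.
// ErgonomicsSketch and its methods are hypothetical names for illustration.
final class ErgonomicsSketch {

    // floor(log2(n)) for n >= 1.
    static int log2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // Assumed heuristic: log2(cpus) * log2(log2(cpus)) * 3 / 2, at least 2.
    static int ciCompilerCount(int cpus) {
        int logCpu = log2(cpus);
        int logLogCpu = log2(Math.max(logCpu, 1));
        return Math.max(logCpu * logLogCpu * 3 / 2, 2);
    }

    // Assumed heuristic: all CPUs up to 8, then 5/8 of each additional CPU.
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    // Assumed heuristic: roughly a quarter of the parallel GC threads.
    static int concGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {4, 8}) {
            int pgc = parallelGcThreads(cpus);
            System.out.println(cpus + " CPUs: CICompilerCount = " + ciCompilerCount(cpus)
                    + ", ParallelGCThreads = " + pgc
                    + ", ConcGCThreads = " + concGcThreads(pgc));
        }
    }
}
```

For 4 and 8 detected CPUs this reproduces the values reported above (3/4/1 and 4/8/2). The values a given JVM actually chose can be checked with -XX:+PrintFlagsFinal.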
Load on the server was applied using JMeter, which was running on the same
box but bound to CPUs 0-8. I applied the load for a few minutes to warm up
the JVM before starting the final "measure" run.
The throughput reported below is for the final "measure" run. All numbers
reported below are an average of 10 iterations.
Throughput result:
4CPU@100 | 8CPU@50 | 8CPU@50NoCS
9621.5   | 6970.6  | 7252.1
I also measured the total compilation time and the total GC pause time
(both in seconds) over the lifetime of the server, which includes the
warm-up phase:
Compilation time:
4CPU@100 | 8CPU@50 | 8CPU@50NoCS
79.8545  | 76.7041 | 100.085
GC pause time:
4CPU@100 | 8CPU@50 | 8CPU@50NoCS
1.829    | 1.886   | 1.927
I am quite surprised to see the drop in throughput between the 4CPU@100
and 8CPU@50 cases.
Looking deeper into the results, I do notice that the numbers for the
8CPU@50 case were not very consistent, but in general I can say they do
not match the 4CPU@100 case.
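To put the size of the drop in numbers, the relative change against the 4-CPU, no-quota case can be computed from the averaged throughput figures reported above. The snippet below is plain arithmetic on those figures. The class and method names are hypothetical.

```java
import java.util.Locale;

// Sketch: relative throughput change against the 4-CPU, no-quota baseline.
// ThroughputDelta and percentChange are hypothetical names for illustration.
final class ThroughputDelta {

    // Signed change of value relative to baseline, in percent.
    static double percentChange(double baseline, double value) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 9621.5; // 4 CPUs, no quota (average of 10 iterations)
        System.out.println(String.format(Locale.ROOT,
                "8 CPUs, 50%% quota: %.1f%%", percentChange(baseline, 6970.6)));
        System.out.println(String.format(Locale.ROOT,
                "8 CPUs, 50%% quota, no container support: %.1f%%",
                percentChange(baseline, 7252.1)));
    }
}
```

With the reported averages this works out to a drop of roughly 27.6% with container support and roughly 24.6% without it.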
I will be doing additional runs for this setup (and on a different
OS/kernel version) with a longer JVM warm-up time to see if that improves
the consistency.
Meanwhile, a couple of questions I wanted to put forward:
1) Has anyone else noticed this kind of difference in Java
application/JVM performance when CPU quota is used?
2) What other open-source benchmarks are available that I can use to
verify the behavior I am observing?
Any comments/feedback are welcome.
Thanks,
Ashutosh Mehra
More information about the jdk-dev mailing list