[EXTERNAL] Re: Is G1GC the default in JDK 15 for any machine class/size?

Severin Gehwolf sgehwolf at redhat.com
Mon Aug 24 09:32:33 UTC 2020


Hi Bruno,

On Fri, 2020-08-21 at 20:48 +0000, Bruno Borges wrote:
> Hi Thomas,
> 
> To clarify, this issue is being observed on Mac OS and Docker
> Desktop. And even though Docker Desktop for Mac runs a Linux VM
> behind the scenes, I was not able to reproduce this issue on a
> VirtualBox VM with Ubuntu, nor on a remote VM (also Ubuntu) running
> similar versions of containerd and Docker engine.
> 
> Charlie Gracie noticed the following change that may be causing this
> issue. 
> 
> 	src/hotspot/os/linux/cgroupSubsystem_linux.cpp
> 
> 	-- line 237: before, used to compare if mountinfo output is != 3
> 	++ line 292: now it compares whether it is equal to 4

OK. What does /proc/self/mountinfo look like on this setup? The same
goes for the /proc/cgroups and /proc/self/cgroup files. They might help
diagnose the issue. Please put them into a bug report if possible.
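
A minimal sketch for collecting all three files in one go, assuming a
POSIX shell inside the container. The function name dump_cgroup_info is
made up for illustration:

```shell
# Print the three files asked for above, each under a header line.
# Files that cannot be read are reported instead of aborting the dump.
dump_cgroup_info() {
  for f in /proc/self/mountinfo /proc/cgroups /proc/self/cgroup; do
    echo "==== $f"
    if [ -r "$f" ]; then
      cat "$f"
    else
      echo "(not readable)"
    fi
  done
}

dump_cgroup_info
```

Pasting it into a shell started from the same image and limits (for
example `docker run --rm -it --memory=256m --cpus=1 openjdk:15-jdk-slim sh`)
should show the files as the JVM sees them.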

> It is unclear why this is only happening on Docker Desktop for macOS,
> and not on an Ubuntu VM on VirtualBox on the same machine. The only
> differences I could spot between Docker on the VBox machine and
> Docker Desktop for macOS are the following:
> 
>  - Docker Engine
>    - ubuntu at vbox: 19.03.11
>    - dockerdesktop at macos: 19.03.12
> 
> - Go
>   - ubuntu at vbox: 1.13.12
>   - dockerdesktop at macos: 1.13.10
> 
>  - Linux Kernel
>   - ubuntu at vbox: 5.4.0-42
>   - dockerdesktop at macos: 4.19.76-linuxkit
> 
> This is the output from the Docker LinuxKit VM on my Mac. Further
> below you will see the -Xlog:os=trace output with JDK 15.

I'd suspect this might be caused by how these kernels set up cgroup
files. Your container trace output suggests that container detection
isn't working as expected in this kind of setup.
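
To see how a given kernel lays out its cgroup mounts, something like
the following could help. It is only a sketch: it relies on the
mountinfo field layout documented in proc(5), where the filesystem
type, mount source and super options follow the "-" separator, and
list_cgroup_mounts is a hypothetical helper name, not anything from
the JDK sources:

```shell
# List cgroup (v1) and cgroup2 mounts from a mountinfo-style file.
# Fields 1-6 are fixed, optional fields start at 7 and end at "-".
list_cgroup_mounts() {
  awk '{
    for (i = 7; i <= NF; i++) {
      if ($i == "-") {
        fstype = $(i + 1); source = $(i + 2); superopts = $(i + 3)
        if (fstype ~ /^cgroup2?$/)
          printf "%s mount=%s source=%s opts=%s\n", fstype, $5, source, superopts
        break
      }
    }
  }' "${1:-/proc/self/mountinfo}"
}

# Only run against the live file where it exists (Linux).
if [ -r /proc/self/mountinfo ]; then
  list_cgroup_mounts
fi
```

Comparing this output between the LinuxKit VM and the Ubuntu VM may
show where the two setups differ.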

> To access the Docker Desktop LinuxKit VM on Mac OS, you can do the
> following:
> 
> 	$ docker container run --rm -it -v /:/host alpine
> 	/ # chroot /host
> 
> Once inside, you can get these outputs:
> 
> ~ # docker version
> Client: Docker Engine - Community
>  Version:           19.03.12
>  API version:       1.40
>  Go version:        go1.13.10
>  Git commit:        48a66213fe
>  Built:             Mon Jun 22 15:42:52 2020
>  OS/Arch:           linux/amd64
>  Experimental:      false
> 
> Server: Docker Engine - Community
>  Engine:
>   Version:          19.03.12
>   API version:      1.40 (minimum version 1.12)
>   Go version:       go1.13.10
>   Git commit:       48a66213fe
>   Built:            Mon Jun 22 15:49:27 2020
>   OS/Arch:          linux/amd64
>   Experimental:     false
>  containerd:
>   Version:          v1.2.13
>   GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
>  runc:
>   Version:          1.0.0-rc10
>   GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
>  docker-init:
>   Version:          0.18.0
>   GitCommit:        fec3683
> 
> And this is the docker run command with os=trace enabled:
> 
> ~ # docker run -ti --memory=256m --cpus=1 openjdk:15-jdk-slim java -Xlog:os=trace,os+container=trace -version
> [0.000s][trace][os,container] OSContainer::init: Initializing Container Support
> [0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
> [0.001s][debug][os,container] Required cgroup v1 memory subsystem not found

This seems to indicate a failure in the container detection code.
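
One quick cross-check is whether the kernel itself lists the
controllers. The sketch below assumes the usual /proc/cgroups column
layout (subsys_name, hierarchy, num_cgroups, enabled) and checks the
four controllers that, as far as I recall, the cgroups v1 code path
looks for. check_controllers is a hypothetical name:

```shell
# Report hierarchy id and enabled flag for each controller of interest,
# or "not listed" when the kernel does not report it at all.
check_controllers() {
  file="${1:-/proc/cgroups}"
  if [ ! -r "$file" ]; then
    echo "$file not readable"
    return 1
  fi
  for c in memory cpu cpuacct cpuset; do
    awk -v c="$c" '
      $1 == c { found = 1; printf "%s: hierarchy=%s enabled=%s\n", c, $2, $4 }
      END     { if (!found) printf "%s: not listed\n", c }
    ' "$file"
  done
}
```

If memory shows up here as enabled but the JVM still reports it as not
found, the mount information is the more likely place to look.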

> [0.001s][trace][os          ] active_processor_count: using static path - configured processors: 4
> [0.001s][trace][os          ] active_processor_count: sched_getaffinity processor count: 4
> [0.001s][debug][os          ] Initial active processor count set to 4
> [0.001s][trace][os          ] active_processor_count: using static path - configured processors: 4
> [0.001s][trace][os          ] active_processor_count: sched_getaffinity processor count: 4
> [0.001s][trace][os          ] total system memory: 2087837696
> [0.001s][trace][os          ] total system memory: 2087837696
> [0.001s][info ][os          ] Use of CLOCK_MONOTONIC is supported
> [0.001s][info ][os          ] Use of pthread_condattr_setclock is supported
> [0.001s][info ][os          ] Relative timed-wait using pthread_cond_timedwait is associated with CLOCK_MONOTONIC
> [0.001s][info ][os          ] HotSpot is running with glibc 2.28, NPTL 2.28
> [0.001s][info ][os          ] SafePoint Polling address, bad (protected) page:0x00007f50ea001000, good (unprotected) page:0x00007f50ea002000
> [0.002s][info ][os          ] attempting shared library load of /usr/local/openjdk-15/lib/libjava.so
> [0.002s][info ][os          ] shared library load of /usr/local/openjdk-15/lib/libjava.so was successful
> [0.002s][trace][os          ] active_processor_count: using static path - configured processors: 4
> [0.002s][trace][os          ] active_processor_count: sched_getaffinity processor count: 4
> [0.007s][trace][os          ] total system memory: 2087837696
> [0.020s][trace][os          ] active_processor_count: using static path - configured processors: 4
> [0.020s][trace][os          ] active_processor_count: sched_getaffinity processor count: 4
> [0.030s][trace][os          ] available memory: 150142976
> [0.034s][trace][os          ] available memory: 149884928
> openjdk version "15" 2020-09-15
> OpenJDK Runtime Environment (build 15+36-1562)
> OpenJDK 64-Bit Server VM (build 15+36-1562, mixed mode, sharing)
> 
> 
> Running the same `docker run` command above with
> `openjdk:14-jdk-slim` instead will give the right
> active_processor_count = 1.

OK. Seems like a regression from JDK 14 then. In JDK 15 we've added
cgroups v2 support:
https://bugs.openjdk.java.net/browse/JDK-8230305
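
To compare the two releases side by side, a rough sketch along these
lines might do. It reuses the image tags and limits from your command
and only greps the collector flags. compare_gc_selection is a made-up
helper name:

```shell
# Run both images with the same container limits and show which
# collector flags each JVM ends up with.
compare_gc_selection() {
  for tag in 14-jdk-slim 15-jdk-slim; do
    echo "== openjdk:$tag"
    docker run --rm --memory=256m --cpus=1 "openjdk:$tag" \
      java -XX:+PrintFlagsFinal -version 2>&1 \
      | grep -E 'Use(Serial|Parallel|G1)GC ' \
      || echo "(no GC flags found)"
  done
}
```

Calling compare_gc_selection on the affected Docker Desktop setup and
on the Ubuntu VM would give four data points for the bug report.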

We should get a bug created for this container detection issue and
gather additional info there. Would you be willing to do this? Perhaps
a test failure would be reproducible on such a system with one of the
docker tests in:

test/hotspot/jtreg/containers/docker
test/jdk/jdk/internal/platform/docker

In the meantime, you should be able to override CPU count with
-XX:ActiveProcessorCount=1 and memory with -XX:MaxRAM=256m and should
see SerialGC being selected in docker as Thomas pointed out earlier.

./bin/java -XX:ActiveProcessorCount=1 -XX:MaxRAM=256m -XX:+PrintFlagsFinal -version 2>&1 | grep Use | grep GC
     bool UseAdaptiveGCBoundary                    = false                                     {product} {default}
     bool UseAdaptiveSizeDecayMajorGCCost          = true                                      {product} {default}
     bool UseAdaptiveSizePolicyWithSystemGC        = false                                     {product} {default}
     bool UseDynamicNumberOfGCThreads              = true                                      {product} {default}
     bool UseG1GC                                  = false                                     {product} {default}
     bool UseGCOverheadLimit                       = true                                      {product} {default}
     bool UseMaximumCompactionOnSystemGC           = true                                      {product} {default}
     bool UseParallelGC                            = false                                     {product} {default}
     bool UseSerialGC                              = true                                      {product} {ergonomic}
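
If changing the java command line inside the image is inconvenient,
the same flags could presumably be passed through the
JAVA_TOOL_OPTIONS environment variable, which the JVM picks up at
startup. A sketch, with WORKAROUND_OPTS and run_with_workaround being
hypothetical names:

```shell
# Workaround flags from above, handed to the JVM via the environment.
WORKAROUND_OPTS="-XX:ActiveProcessorCount=1 -XX:MaxRAM=256m"

run_with_workaround() {
  docker run --rm --memory=256m --cpus=1 \
    -e JAVA_TOOL_OPTIONS="$WORKAROUND_OPTS" \
    openjdk:15-jdk-slim \
    java -XX:+PrintFlagsFinal -version 2>&1 \
    | grep -E 'Use(Serial|Parallel|G1)GC '
}
```

This is only meant as a stopgap until the detection issue itself is
understood.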

Thanks,
Severin

> Hope this is helpful. 
> 
> bb.
> 
> On 2020-08-19, 2:16 AM, "discuss on behalf of Thomas Schatzl" <
> discuss-retn at openjdk.java.net on behalf of thomas.schatzl at oracle.com>
> wrote:
> 
>     Hi,
> 
>     On 19.08.20 00:42, Bruno Borges wrote:
>     > Hi,
>     > 
>     > Up until JDK 14, SerialGC would be picked by default under
>     > certain conditions when is_server_class_machine() returns
>     > false [1][2].
>     > 
>     > In JDK 15, at least on the binary available in the Docker image
>     > 'openjdk:15-jdk-slim', G1GC is being picked no matter how much
>     > memory or CPU is available to a container.
> 
>     Doesn't help you directly, but on bare metal, the detection
>     works:
> 
>     $ numactl --physcpubind=1 bin/java -XX:+PrintFlagsFinal -version | egrep "UseSerialGC"
> 
>     gives
> 
>          bool UseSerialGC                              = true                                      {product} {ergonomic}
> 
>     Can you enable os=trace logging to see what resources the VM
>     thinks are available?
> 
>     > 
>     > I was not able to find where in the JDK 15 source code it is
>     > indicated that G1GC should always be picked.
> 
>     You got the right location where the type of machine is
>     determined.
> 
>     Thomas
> 


