Java 8 + Docker container - CMS collector leaves around instances that have no GC roots

Jaikiran Pai jai.forums2013 at gmail.com
Sat Nov 24 01:35:31 UTC 2018


Hello Leo,

Thank you for responding. Replies inline.


On 23/11/18 10:37 PM, Leo Korinth wrote:
> Hi!
>
> On 23/11/2018 14:55, Jaikiran Pai wrote:
>>
>>
>> java -Xms2G -Xmx6G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC ....
>
> What is your limit? 

The docker cgroups limit (set via --memory) is 8G. Here's the docker
stats output for that container right now (it has been running for
around 16 hours):

CONTAINER ID    NAME            CPU %     MEM USAGE / LIMIT   MEM %     NET I/O     BLOCK I/O          PIDS
da5ee21b1d      elasticsearch   94.62%    3.657GiB / 8GiB     45.71%    0B / 0B     1.21MB / 18.2GB    227
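
(For what it's worth, that 8G --memory limit is the same 8388608kB
cgroup limit that shows up in the dmesg output further down in this
mail.)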
> You will need _more_ than 6 gig of memory; exactly how much is hard to
> say. You will probably find the limit faster if you use Xms==Xmx and
> maybe not need days of running the application.
That's interesting and useful - I'll give that a try. That will
certainly help speed up the investigation.
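For the record, the variant I plan to try is the same command line with
the heap pinned, i.e. only the -Xms value changed:

java -Xms6G -Xmx6G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC ....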

>
>>
>> Notice the vast difference between the live and non-live instances of
>> the same class. This isn't just in one "snapshot". I have been
>> monitoring this for more than a day and this pattern continues. Even
>> taking heap dumps and using tools like visualvm shows that these
>> instances have "no GC root" and I have even checked the gc log files to
>> see that the CMS collector does occasionally run. However these objects
>> never seem to get collected.
>
> What makes you believe they never get collected?
I have gc logs enabled on this setup. I notice that a Full GC gets
triggered once in a while. However, even after that Full GC completes, I
still see this vast number of "non-live" objects staying around in the
heap (via the jmap -histo output as well as a heap dump taken with
visualvm). This is the "total" on the heap via jmap -histo:

           #instances   #bytes     
Total      22880189     1644503392 (around 1.5 G)

and this is the -histo:live total:

Total       1292631      102790184 (around 98M, not even 100M)

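(For context on how these were collected: the first total is from plain
jmap -histo <pid>, which as far as I know counts every object currently
on the heap whether reachable or not, while the second is from
jmap -histo:live <pid>, which forces a full collection first and only
counts reachable objects - presumably that forced collection is the
"Heap Inspection Initiated GC" cause visible in the Full GC log below.)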

Some of these non-live objects hold on to ByteBuffer(s) which keep
filling up the non-heap memory too (right now the non-heap "mapped"
ByteBuffer memory, as shown in the JMX MBean, is around 2.5G). The Full
GC log message looks like this:

2018-11-24T00:57:00.665+0000: 59655.295: [Full GC (Heap Inspection Initiated GC)
2018-11-24T00:57:00.665+0000: 59655.295: [CMS: 711842K->101527K(989632K), 0.53220752 secs]
1016723K->101527K(1986432K), [Metaspace: 48054K->48054K(1093632K)], 0.5325692 secs]
[Times: user=0.53 sys=0.00, real=0.53 secs]
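
(Aside: the "mapped" ByteBuffer figure I quoted above comes from the
java.nio:type=BufferPool,name=mapped MBean. In case it helps anyone
reproduce the numbers without a JMX console, a small standalone sketch
along these lines should print the same pools - this is just
illustrative code on my side, not something from the application
itself:)

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class PrintBufferPools {
    public static void main(String[] args) {
        // Prints the "direct" and "mapped" java.nio buffer pools -- the same
        // numbers that the java.nio:type=BufferPool MBeans expose over JMX.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: count=%d, used=%d bytes, capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}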

The jmap -heap output is this:

Server compiler detected.
JVM version is 25.172-b11

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 6442450944 (6144.0MB)
   NewSize                  = 1134100480 (1081.5625MB)
   MaxNewSize               = 1134100480 (1081.5625MB)
   OldSize                  = 1013383168 (966.4375MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 1020723200 (973.4375MB)
   used     = 900988112 (859.2492218017578MB)
   free     = 119735088 (114.18827819824219MB)
   88.26958297802969% used
Eden Space:
   capacity = 907345920 (865.3125MB)
   used     = 879277776 (838.5446319580078MB)
   free     = 28068144 (26.767868041992188MB)
   96.9065663512324% used
From Space:
   capacity = 113377280 (108.125MB)
   used     = 21710336 (20.70458984375MB)
   free     = 91666944 (87.42041015625MB)
   19.148753612716764% used
To Space:
   capacity = 113377280 (108.125MB)
   used     = 0 (0.0MB)
   free     = 113377280 (108.125MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 1013383168 (966.4375MB)
   used     = 338488264 (322.8075637817383MB)
   free     = 674894904 (643.6299362182617MB)
   33.40180443968061% used

14231 interned Strings occupying 2258280 bytes.
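
(If I'm reading the jmap -heap output correctly, the committed heap at
this point is NewSize + OldSize = 1081.6MB + 966.4MB = 2048MB, i.e.
still at the -Xms2G floor and nowhere near the 6G -Xmx; give or take a
survivor space, that also lines up with the 1986432K capacity in the
Full GC line above.)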

So based on these various logs and data, it looks to me like they aren't
being collected. Even at the point when the process gets chosen to be
killed, I still see this vast number of non-live objects hanging around.
I don't have much knowledge in the GC area, so a genuine question - is
it just a wrong expectation that whenever a Full GC runs, it is supposed
to clear out the non-live objects? Or is it usual for the collector not
to choose them for collection for some reason?
>
>>
>> I realize this data may not be enough to narrow down the issue, but what
>> I am looking for is some kind of help/input/hints/suggestions on what I
>> should be trying to figure out why these instances aren't GCed. Is this
>> something that's expected in certain situations?
>
> Being killed by the oom-killer suggests that your limit is too low
> and/or your -Xmx is too large. If an increasing number of objects does
> not get collected, you would get an exception (not being killed by the
> oom-killer). What likely is happening is that your java heap slowly
> grows (not unusual with CMS that does not do compaction of old
> objects) and that the memory consumed by your docker image exceeds
> your limit.
You are right. The heap usage (taking the uncollected non-live objects
into consideration) does indeed grow very slowly. There aren't, however,
too many (live) objects on the heap (even after days, when the process
is about to be killed). I'll have to read up on the CMS compaction
behaviour you mentioned.

Here's one example output from dmesg when this process was killed
previously:

[Wed Aug  8 03:21:13 2018] java invoked oom-killer: gfp_mask=0xd0,
order=0, oom_score_adj=-600
[Wed Aug  8 03:21:13 2018] java
cpuset=390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5
mems_allowed=0
[Wed Aug  8 03:21:13 2018] CPU: 7 PID: 15827 Comm: java Not tainted
3.10.0-693.5.2.el7.x86_64 #1
...
[Wed Aug  8 03:21:13 2018]  ffff88081472dee0 00000000fe941297
ffff8808062cfcb8 ffffffff816a3e51
[Wed Aug  8 03:21:13 2018]  ffff8808062cfd48 ffffffff8169f246
ffff88080d6d3500 0000000000000001
[Wed Aug  8 03:21:13 2018]  0000000000000000 0000000000000000
ffff8808062cfcf8 0000000000000046
[Wed Aug  8 03:21:13 2018] Call Trace:
[Wed Aug  8 03:21:13 2018]  [<ffffffff816a3e51>] dump_stack+0x19/0x1b
[Wed Aug  8 03:21:13 2018]  [<ffffffff8169f246>] dump_header+0x90/0x229
[Wed Aug  8 03:21:13 2018]  [<ffffffff81185ef6>] ?
find_lock_task_mm+0x56/0xc0
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f14f8>] ?
try_get_mem_cgroup_from_mm+0x28/0x60
[Wed Aug  8 03:21:13 2018]  [<ffffffff811863a4>]
oom_kill_process+0x254/0x3d0
[Wed Aug  8 03:21:13 2018]  [<ffffffff812b7eec>] ? selinux_capable+0x1c/0x40
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f5216>]
mem_cgroup_oom_synchronize+0x546/0x570
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f4690>] ?
mem_cgroup_charge_common+0xc0/0xc0
[Wed Aug  8 03:21:13 2018]  [<ffffffff81186c34>]
pagefault_out_of_memory+0x14/0x90
[Wed Aug  8 03:21:13 2018]  [<ffffffff8169d60e>] mm_fault_error+0x68/0x12b
[Wed Aug  8 03:21:13 2018]  [<ffffffff816b02b1>] __do_page_fault+0x391/0x450
[Wed Aug  8 03:21:13 2018]  [<ffffffff816b03a5>] do_page_fault+0x35/0x90
[Wed Aug  8 03:21:13 2018]  [<ffffffff816ac5c8>] page_fault+0x28/0x30
[Wed Aug  8 03:21:13 2018] Task in
/docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5
killed as a result of limit of
/docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5
[Wed Aug  8 03:21:13 2018] memory: usage 8388608kB, limit 8388608kB,
failcnt 426669
[Wed Aug  8 03:21:13 2018] memory+swap: usage 8388608kB, limit
8388608kB, failcnt 152
[Wed Aug  8 03:21:13 2018] kmem: usage 5773512kB, limit
9007199254740988kB, failcnt 0
[Wed Aug  8 03:21:13 2018] Memory cgroup stats for
/docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5:
cache:14056KB rss:2601040KB rss_huge:2484224KB mapped_file:6428KB
swap:0KB inactive_anon:0KB active_anon:142640KB inactive_file:4288KB
active_file:3324KB unevictable:2464692KB
[Wed Aug  8 03:21:13 2018] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[Wed Aug  8 03:21:13 2018] [15582]  1000 15582  3049827   654465     1446        0          -600 java
[Wed Aug  8 03:21:13 2018] Memory cgroup out of memory: Kill process
12625 (java) score 0 or sacrifice child
[Wed Aug  8 03:21:13 2018] Killed process 15582 (java)
total-vm:12199308kB, anon-rss:2599320kB, file-rss:18540kB, shmem-rss:0kB


> How big the limit should be is hard to tell, but it _must_ be larger
> than your "-Xmx" (the JVM is using more memory than the java heap, so
> this would be true even without the addition of docker).
Agreed, and that's where the ByteBuffer(s) come into the picture (around
2.5 GB right now and growing steadily).
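
Doing some rough arithmetic with the numbers in this thread: if the heap
does eventually grow to the full -Xmx of 6G, then 6G of heap plus ~48M
of metaspace plus the 2.5G (and growing) of mapped buffers is already
past the 8G cgroup limit, before even counting thread stacks, the code
cache and other native allocations.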

-Jaikiran

