OpenJDK OOM issue -
Yu, Tim (NSB - CN/Chengdu)
tim.yu at nokia-sbell.com
Mon Sep 25 09:04:25 UTC 2017
Hi David
Thanks for your reply. I checked with your comments. Finding is as below, I have verified at least more than 2000 thousand process/thread can be created on the issue system with user "esbamin".( The issued JAVA program is executed by user "esbadmin")
Could you please help to explain why NPROC number in hs_err file is 1024? Many thanks.
1. check nproc number with below command, it can be seen the max user process number is 43497.
# su esbadmin
sh-4.1$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 43497
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 43497
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
2. Based on https://blog.dbi-services.com/linux-how-to-monitor-the-nproc-limit-1/, I compiled testnpron.c and launch 1500 more processes on the issued system with user "esbadmin".
# sudo -u esbadmin ./testnproc
3. Monitor the nproc number
//before testnproc triggered
[root at cloudyvm16 ~]# ps h -Led -o user | sort | uniq -c | sort -n|grep esbadmin
915 esbadmin
//after testnproc triggered and you can see 2419 process/thread is created with user "esbadmin"
[root at cloudyvm16 ~]# ps h -Led -o user | sort | uniq -c | sort -n|grep esbadmin
2419 esbadmin
-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Monday, September 25, 2017 3:29 PM
To: Yu, Tim (NSB - CN/Chengdu) <tim.yu at nokia-sbell.com>; Kirk Pepperdine <kirk.pepperdine at gmail.com>; Zhengyu Gu <zgu at redhat.com>
Cc: Poonam Parhar <poonam.bajaj at oracle.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>; Andrew Haley <aph at redhat.com>; Ray Hindman <hotspot-dev at openjdk.java.net>; Shen, David (NSB - CN/Chengdu) <david.shen at nokia-sbell.com>
Subject: Re: OpenJDK OOM issue -
Hi Tim,
You previously showed ulimit output that indicated a large user
process/threads limit, but the hs-err log shows:
rlimit: STACK 10240k, CORE 0k, NPROC 1024, NOFILE 4096, AS infinity
With NPROC = 1024 you very likely hit the maximum user processes/threads
limit.
Cheers,
David
On 25/09/2017 4:54 PM, Yu, Tim (NSB - CN/Chengdu) wrote:
> Hi Kirk & Zhengyu
>
> Thanks for your reply. This issue does not re-occur from Sep 22, but I will keep monitoring.
> Below is the NMT information for our JVM during normal situation. As the JAVA program is triggered periodically with one minute interval and the OOM happens during JVM initialization, could you please just reference it?
>
> For hs_err_pid10907.log, it can be seen clearly that lots of memory is available, but the OOM occurs due to "unable to create new native thread". It's quite strange to me and could you please help to explain what's the possible reasons? Many thanks.
> Memory: 4k page, physical 11163792k(274656k free), swap 16777212k(16756468k free)
>
> Native Memory Tracking:
>
> Total: reserved=4243930KB, committed=325642KB
> - Java Heap (reserved=2791424KB, committed=176128KB)
> (mmap: reserved=2791424KB, committed=176128KB)
>
> - Class (reserved=1064215KB, committed=16791KB)
> (classes #1800)
> (malloc=5399KB #1210)
> (mmap: reserved=1058816KB, committed=11392KB)
>
> - Thread (reserved=18582KB, committed=18582KB)
> (thread #18)
> (stack: reserved=18504KB, committed=18504KB)
> (malloc=57KB #96)
> (arena=21KB #36)
>
> - Code (reserved=249870KB, committed=2806KB)
> (malloc=270KB #919)
> (mmap: reserved=249600KB, committed=2536KB)
>
> - GC (reserved=107764KB, committed=99260KB)
> (malloc=5772KB #132)
> (mmap: reserved=101992KB, committed=93488KB)
>
> - Compiler (reserved=134KB, committed=134KB)
> (malloc=3KB #48)
> (arena=131KB #3)
>
> - Internal (reserved=5573KB, committed=5573KB)
> (malloc=5541KB #3125)
> (mmap: reserved=32KB, committed=32KB)
>
> - Symbol (reserved=3364KB, committed=3364KB)
> (malloc=1853KB #4257)
> (arena=1511KB #1)
>
> - Native Memory Tracking (reserved=160KB, committed=160KB)
> (malloc=4KB #45)
> (tracking overhead=156KB)
>
> - Arena Chunk (reserved=2844KB, committed=2844KB)
> (malloc=2844KB)
>
>
> Br,
> Tim
>
>
> -----Original Message-----
> From: Kirk Pepperdine [mailto:kirk.pepperdine at gmail.com]
> Sent: Thursday, September 21, 2017 10:30 PM
> To: Zhengyu Gu <zgu at redhat.com>
> Cc: Yu, Tim (NSB - CN/Chengdu) <tim.yu at nokia-sbell.com>; Poonam Parhar <poonam.bajaj at oracle.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>; david.holmes at oracle.com Holmes <david.holmes at oracle.com>; Andrew Haley <aph at redhat.com>; Ray Hindman <hotspot-dev at openjdk.java.net>; Shen, David (NSB - CN/Chengdu) <david.shen at nokia-sbell.com>
> Subject: Re: OpenJDK OOM issue -
>
> Have you tried running pmap?
>
> Kind regards,
> Kirk
>
>> On Sep 21, 2017, at 4:07 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>>
>> Hi Tim,
>>
>> Try to run with -XX:NativeMemoryTracking=summary , this should give you some hints on native memory side.
>>
>> In your case, not be able to create native thread, more likely to be on native side than heap.
>>
>> Thanks,
>>
>> -Zhengyu
>>
>> On 09/20/2017 11:22 PM, Yu, Tim (NSB - CN/Chengdu) wrote:
>>> Hi Poonam & Vladimir
>>> After add " -XX:-UseCompressedOops" flag, the OMM still happens. The corresponding GC log is as below and no heap is printed out. So, what's the next step to do? Please help on this and many thanks :)
>>> OpenJDK 64-Bit Server VM (25.131-b11) for linux-amd64 JRE (1.8.0_131-b11), built on Apr 13 2017 17:56:19 by "mockbuild" with gcc 4.4.7 20120313 (Red Hat 4.4.7-18)
>>> Memory: 4k page, physical 11163792k(551024k free), swap 16777212k(16722204k free)
>>> CommandLine flags: -XX:InitialHeapSize=178620672 -XX:MaxHeapSize=2857930752 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-UseCompressedOops -XX:+UseParallelGC
>>> Br,
>>> Tim
>>> -----Original Message-----
>>> From: Poonam Parhar [mailto:poonam.bajaj at oracle.com]
>>> Sent: Thursday, September 21, 2017 12:37 AM
>>> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; David Holmes <david.holmes at oracle.com>; Yu, Tim (NSB - CN/Chengdu) <tim.yu at nokia-sbell.com>; Andrew Haley <aph at redhat.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
>>> Cc: Shen, David (NSB - CN/Chengdu) <david.shen at nokia-sbell.com>
>>> Subject: RE: OpenJDK OOM issue -
>>> Hi Vladimir,
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov
>>>> Sent: Wednesday, September 20, 2017 9:30 AM
>>>> To: Poonam Parhar; David Holmes; Yu, Tim (NSB - CN/Chengdu); Andrew
>>>> Haley; hotspot-dev developers
>>>> Cc: Shen, David (NSB - CN/Chengdu)
>>>> Subject: Re: OpenJDK OOM issue -
>>>>
>>>> On Linux we should not have java heap's low address memory problem.
>>>>
>>>> Small swap space? Also memory left is less than 1Gb ("MemFree: 898332
>>>> kB").
>>>>
>>>> Also 5326 processes it a lot. Overloaded system?
>>>>
>>>> Poonam, do we have bug for this? Can you attached hs_err file to it.
>>>>
>>> No, there is no bug for this.
>>> Thanks,
>>> Poonam
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 9/20/17 9:15 AM, Poonam Parhar wrote:
>>>>> Hello Tim,
>>>>>
>>>>> From the hs_err_pid12678.log file, the java heap is based at
>>>> 0x715a00000 which is 28gb, so there should be plenty of space available
>>>> for the native heap.
>>>>>
>>>>> Memory map:
>>>>> ...
>>>>> 00600000-00601000 rw-p 00000000 fc:01 17950
>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-
>>>> 0.b11.el6_9.x86_64/jre/bin/java
>>>>> 019bc000-019dd000 rw-p 00000000 00:00 0
>>>> [heap]
>>>>> 715a00000-71cd00000 rw-p 00000000 00:00 0
>>>>> 71cd00000-787380000 ---p 00000000 00:00 0 ...
>>>>>
>>>>> Heap
>>>>> PSYoungGen total 51200K, used 13258K [0x0000000787380000,
>>>> 0x000000078ac80000, 0x00000007c0000000)
>>>>> eden space 44032K, 30% used
>>>> [0x0000000787380000,0x0000000788072ad0,0x0000000789e80000)
>>>>> from space 7168K, 0% used
>>>> [0x000000078a580000,0x000000078a580000,0x000000078ac80000)
>>>>> to space 7168K, 0% used
>>>> [0x0000000789e80000,0x0000000789e80000,0x000000078a580000)
>>>>> ParOldGen total 117760K, used 0K [0x0000000715a00000,
>>>> 0x000000071cd00000, 0x0000000787380000)
>>>>> object space 117760K, 0% used
>>>> [0x0000000715a00000,0x0000000715a00000,0x000000071cd00000)
>>>>> Metaspace used 10485K, capacity 10722K, committed 11008K,
>>>> reserved 1058816K
>>>>> class space used 1125K, capacity 1227K, committed 1280K,
>>>> reserved 1048576K
>>>>>
>>>>>
>>>>> To narrow down the issue, would it be possible for you to test with -
>>>> XX:-UseCompressedOops?
>>>>>
>>>>> Thanks,
>>>>> Poonam
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes
>>>>>> Sent: Wednesday, September 20, 2017 4:43 AM
>>>>>> To: Yu, Tim (NSB - CN/Chengdu); Andrew Haley; Poonam Parhar;
>>>> hotspot-
>>>>>> dev developers
>>>>>> Cc: Shen, David (NSB - CN/Chengdu)
>>>>>> Subject: Re: OpenJDK OOM issue -
>>>>>>
>>>>>> Tim,
>>>>>>
>>>>>> Please note attachments get stripped from the mailing lists.
>>>>>>
>>>>>> All - please drop the jdk8-dev and jdk8u-dev mailing lists from this
>>>>>> and leave it just on hotspot-dev. I've tried to bcc those lists.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 20/09/2017 6:44 PM, Yu, Tim (NSB - CN/Chengdu) wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> Thank you all for the quick response.
>>>>>>> The environment information is listed as below, could you please
>>>>>>> help
>>>>>> to further check?
>>>>>>>
>>>>>>> 1. What OS is this?
>>>>>>> # cat /etc/redhat-release
>>>>>>> Red Hat Enterprise Linux Server release 6.9 (Santiago) # uname -a
>>>>>>> Linux cloudyvm16 2.6.32-696.6.3.el6.x86_64 #1 SMP Fri Jun 30
>>>>>>> 13:24:18 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>
>>>>>>> 2.GC log is listed as below. The heap information cannot be printed
>>>>>> out in gc-2017_09_20-09_21_15.log when OOM happens. In gc-
>>>> 2017_09_20-
>>>>>> 09_21_17.log, you can see the heap begins with 0x0000000787380000
>>>> and
>>>>>> it should be not the first 4G virtual memory address.
>>>>>>> -rw-r--r-- 1 19477 Sep 20 09:21 hs_err_pid12678.log
>>>>>>> -rw-r--r-- 1 570 Sep 20 09:21 gc-2017_09_20-09_21_15.log
>>>>>>> -rw-r--r-- 1 17741 Sep 20 09:21 hs_err_pid12706.log
>>>>>>> -rw-r--r-- 1 1297 Sep 20 09:21 gc-2017_09_20-09_21_17.log
>>>>>>> -rw-r--r-- 1 1722 Sep 20 09:21 gc-2017_09_20-09_21_18.log
>>>>>>> -rw-r--r-- 1 1297 Sep 20 09:21 gc-2017_09_20-09_21_19.log
>>>>>>> -rw-r--r-- 1 1297 Sep 20 09:21 gc-2017_09_20-09_21_20.log
>>>>>>>
>>>>>>> 3. This issue happens occasionally but frequently. We periodically
>>>>>> launch a JAVA program to use JMX to monitor service status of
>>>> another
>>>>>> JAVA service.
>>>>>>>
>>>>>>> Br,
>>>>>>> Tim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Andrew Haley [mailto:aph at redhat.com]
>>>>>>> Sent: Tuesday, September 19, 2017 9:13 PM
>>>>>>> To: Yu, Tim (NSB - CN/Chengdu) <tim.yu at nokia-sbell.com>;
>>>>>>> jdk8-dev at openjdk.java.net; jdk8u-dev at openjdk.java.net
>>>>>>> Cc: Shen, David (NSB - CN/Chengdu) <david.shen at nokia-sbell.com>
>>>>>>> Subject: Re: OpenJDK OOM issue -
>>>>>>>
>>>>>>> On 19/09/17 09:50, Yu, Tim (NSB - CN/Chengdu) wrote:
>>>>>>>> Hi OpenJDK dev group
>>>>>>>>
>>>>>>>> We meet one issue that the VM failed to initialize. The error log
>>>>>>>> is
>>>>>> as below. We checked both memory usage and thread number. They do
>>>> not
>>>>>> hit the limit. So could you please help to confirm why
>>>>>> "java.lang.OutOfMemoryError: unable to create new native thread"
>>>>>> error occurs? Many thanks.
>>>>>>>
>>>>>>> What OS is this?
>>>>>>>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testnproc.c
URL: <http://mail.openjdk.java.net/pipermail/hotspot-dev/attachments/20170925/ece8bdf3/testnproc-0001.c>
More information about the hotspot-dev
mailing list