Metaspace OOM before reaching MaxMetaspaceSize

Shengpei Zhang shengpei92 at gmail.com
Mon Apr 29 18:05:59 UTC 2019


Thanks David and Thomas :)

I checked our mmap limit; it is 64M. The number of mappings in the hs_err
file is around 6K. I also checked the currently running process; it is about
8K. So I do not think we hit the mmap limit.
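For reference, this is how I compared the two numbers (a minimal sketch; the
shell's own PID stands in for the JVM pid here):

```shell
#!/bin/sh
# Compare a process's live mapping count against the kernel's
# per-process mapping limit. $$ (this shell) stands in for the JVM pid.
limit=$(cat /proc/sys/vm/max_map_count)
count=$(wc -l < /proc/$$/maps)
echo "mappings: $count, limit: $limit"
```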

@Thomas
Below is the rlimit output in the hs_err log; I am not sure how to
interpret it...

---------------  S Y S T E M  ---------------

OS:CentOS Linux release 7.5.1804 (Core)
uname:Linux 4.11.3-88_fbk23_4223_g7d3b7f39d8b1 #88 SMP Thu Jan 31 13:08:24
PST 2019 x86_64
libc:glibc 2.17 NPTL 2.17
rlimit: STACK 8192k, CORE infinity, NPROC 1030660, NOFILE 262144, AS
infinity, DATA infinity, FSIZE infinity
load average:99.10 107.68 103.71
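For what it's worth, the same limits can also be read from a live process
instead of waiting for an hs_err file (a minimal sketch, again with the
shell's own PID standing in for the JVM pid):

```shell
#!/bin/sh
# Read the rlimits of a running process from /proc; these are the
# same values the hs-err "rlimit:" line summarizes.
grep -E 'stack|address space|data size|open files|processes' /proc/$$/limits
```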

One interesting report I saw is
https://bugs.openjdk.java.net/browse/JDK-8187709. Someone mentioned that on
machines where compressed oops are disabled (-XX:-UseCompressedOops), native
memory might be allocated in the lower address space, so the actual
available memory is less than the free memory. Could this be the cause?

I wish I could provide more logs :( But I only saw this issue once. The
only things I have are the hs_err file and the stdout log.

Best,
Shengpei

On Fri, Apr 26, 2019 at 11:17 PM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Shengpei,
>
> (different Thomas :)
>
> I agree with David and Thomas 1, you probably hit a limit which prevents
> your mmap calls from succeeding. Both cases - failing to establish the
> guard page on the thread and failing to enlarge the Metaspace - can be
> caused by failing mmap calls. So the "out of metaspace" message is not
> wrong, it just is not the whole answer.
>
> My guess would be it is either the number-of-mappings limit Thomas
> mentioned, or a ulimit you hit, maybe the virtual size ulimit.
>
> The former is a sysctl setting: sysctl vm.max_map_count - as Thomas
> said, it is usually 64K. And yes, the list of "native libraries" in the
> hs-err file is really the list of memory mappings, aka the plain
> /proc/<pid>/maps.
>
> The ulimit you may be hitting could be the virtual size one, ulimit -v.
> ulimit -a gives you all limits. Limits are also printed as part of the
> hs-err file (look for rlimit).
>
> Still, it would also be helpful to have the output of VM.metaspace (via
> jcmd) to see metaspace usage in detail.
>
> Thanks, Thomas
>
>
> On Fri, Apr 26, 2019 at 10:09 PM Shengpei Zhang <shengpei92 at gmail.com>
> wrote:
>
>> Thanks David and Thomas,
>>
>> @David
>> My impression was also that this is not related to Metaspace, but I got
>> confused by the error msg "java.lang.OutOfMemoryError: Metaspace" here.
>> Maybe this is a misclassification?
>> If that is the case, I assume some part of native memory is OOMed? It
>> should be an area with a tight limit, as there is still free memory on the
>> machine. Do you know which part of memory the stack_guard_pages are using?
>>
>> @Thomas
>> Not sure how to check the number of memory mappings; from the example you
>> provided it seemed like you were counting the mappings under the "dynamic
>> libraries" section? If so, no - I checked, and it was ~6K.
>> I was not able to reproduce the issue. Below is an NMT diff I took just
>> now; I didn't see anything suspicious here:
>>
>> Total: reserved=228671672KB +24437546KB, committed=228134376KB +24813862KB
>>
>> -                 Java Heap (reserved=188743680KB, committed=188743680KB)
>>                             (mmap: reserved=188743680KB,
>> committed=188743680KB)
>>
>> -                     Class (reserved=851747KB +655100KB,
>> committed=830967KB +635128KB)
>>                             (classes #51634 +19878)
>>                             (malloc=142319KB +134088KB #3339917 +3183393)
>>                             (mmap: reserved=709428KB +521012KB,
>> committed=688648KB +501040KB)
>>                             (  Metadata:   )
>>                             (    reserved=0KB, committed=0KB)
>>                             (    used=0KB)
>>                             (    free=0KB)
>>                             (    waste=0KB =-nan%)
>>
>> -                    Thread (reserved=7543768KB +5003196KB,
>> committed=7543768KB +5003196KB)
>>                             (thread #3725 +2435)
>>                             (stack: reserved=7533768KB +4996620KB,
>> committed=7533768KB +4996620KB)
>>                             (malloc=5603KB +3691KB #18712 +12175)
>>                             (arena=4397KB +2885 #7434 +4870)
>>
>> -                      Code (reserved=1084696KB +18645KB,
>> committed=568180KB +414933KB)
>>                             (malloc=27920KB +18645KB #75586 +41577)
>>                             (mmap: reserved=1056776KB, committed=540260KB
>> +396288KB)
>>
>> -                        GC (reserved=22058971KB +12626721KB,
>> committed=22058971KB +12626721KB)
>>                             (malloc=12911575KB +12626721KB #3259193
>> +3182155)
>>                             (mmap: reserved=9147396KB,
>> committed=9147396KB)
>>
>> -                  Compiler (reserved=30365KB +26208KB, committed=30365KB
>> +26208KB)
>>                             (malloc=30235KB +26208KB #20637 +15823)
>>                             (arena=131KB #18)
>>
>> -                  Internal (reserved=8157510KB +5937989KB,
>> committed=8157510KB +5937989KB)
>>                             (malloc=8157478KB +5937989KB #2966920
>> +2914279)
>>                             (mmap: reserved=32KB, committed=32KB)
>>
>> -                    Symbol (reserved=30174KB +9428KB, committed=30174KB
>> +9428KB)
>>                             (malloc=26745KB +9332KB #252220 +78699)
>>                             (arena=3428KB +96 #1)
>>
>> -    Native Memory Tracking (reserved=158258KB +149406KB,
>> committed=158258KB +149406KB)
>>                             (malloc=1839KB +1179KB #25013 +15914)
>>                             (tracking overhead=156419KB +148226KB)
>>
>> -               Arena Chunk (reserved=1388KB +1130KB, committed=1388KB
>> +1130KB)
>>                             (malloc=1388KB +1130KB)
>>
>> -                   Logging (reserved=8KB, committed=8KB)
>>                             (malloc=8KB #164)
>>
>> -                 Arguments (reserved=27KB, committed=27KB)
>>                             (malloc=27KB #515)
>>
>> -                    Module (reserved=11073KB +9722KB, committed=11073KB
>> +9722KB)
>>                             (malloc=11073KB +9722KB #44352 +37576)
>>
>> -                   Unknown (reserved=8KB, committed=8KB)
>>                             (mmap: reserved=8KB, committed=8KB)
>>
>>
>> Best,
>> Shengpei
>>
>> On Thu, Apr 25, 2019 at 11:45 PM Thomas Schatzl <
>> thomas.schatzl at oracle.com>
>> wrote:
>>
>> > Hi,
>> >
>> >   some random comments:
>> >
>> > On Thu, 2019-04-25 at 17:19 -0700, Shengpei Zhang wrote:
>> > > Hi,
>> > >
>> > > We recently saw a weird Metaspace OOM. The hs_err_pid log showed the
>> > > metaspace usage was 325078K, while the MaxMetaspaceSize was set to
>> > > 4G.
>> > >
>> > > The machine was running JAVA10.
>> > >
>> > > Few observations:
>> > >
>> > > 1. It only happened at one machine, and we were not able to reproduce
>> > > it.
>> > >
>> > > 2. Error log indicated memory allocation failure in native memory.
>> > >
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > > Chunkmanager (non-class):
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > >  62578 specialized (128 bytes) chunks, total 64079872 bytes
>> > > [2019-03-29T11:18:01.298-0700][126538][info
>> > > ][gc,metaspace,freelist]   5778
>> > > small (512 bytes) chunks, total 23666688 bytes
>> > > [2019-03-29T11:18:01.298-0700][126538][info
>> > > ][gc,metaspace,freelist]   27
>> > > medium (8192 bytes) chunks, total 1769472 bytes
>> > > [2019-03-29T11:18:01.298-0700][126538][info
>> > > ][gc,metaspace,freelist]   13
>> > > humongous chunks, total 8890368 bytes
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > >  total size: 98406400 bytes.
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > > Chunkmanager (class):
>> > > [2019-03-29T11:18:01.298-0700][126538][info ][gc,metaspace,freelist]
>> > > unavailable.
>> > > java.lang.OutOfMemoryError: Metaspace
>> > > Dumping heap to java_pid240.hprof ...
>> > >           vmop                            [ threads:    total
>> > > initially_running wait_to_block ][ time:    spin   block    sync
>> > > cleanup
>> > > vmop ] page_trap_count
>> > > 66467.977: ICBufferFull                    [              3427
>> > >  154           353 ][             0      60      63       8       0 ]
>> > >         58
>> > > [2019-03-29T11:18:01.308-0700][459   ][info ][safepoint            ]
>> > > Total
>> > > time for which application threads were stopped: 0.0725507 seconds,
>> > > Stopping threads took: 0.0630158 seconds
>> > > [thread 384508 also had an error]
>> > > #
>> > > # There is insufficient memory for the Java Runtime Environment to
>> > > continue.
>> > > # Native memory allocation (mmap) failed to map 16384 bytes for
>> > > committing
>> > > reserved memory.
>> > > # An error report file with more information is saved as:
>> > > # /data/hs_err_pid240.log
>> >
>> > In Linux there is a limit on the number of memory mappings (IIRC 65K by
>> > default). If your application maps memory by itself and does not unmap
>> > them in time, you will run out of memory mappings.
>> >
>> > If the hs_err log contains lots and lots of memory mappings, this is a
>> > strong indication that this is your issue, not actual native memory
>> > pressure.
>> >
>> > See e.g. https://bugs.openjdk.java.net/browse/JDK-8216619 for more
>> > details on another example.
>> >
>> > >
>> > > 3. The hs_err_pid log showed the metaspace memory usage was 325078K,
>> > > it is far less than the reserved size 568176K. Also, the
>> > > MaxMetaspaceSize was set to 4G. I checked the machine had plenty of
>> > > free native memory at that time. If there was a huge allocation, I
>> > > assume JVM would try to reserve more memory first?
>> >
>> > Please check if your program leaks memory reservations or not, see
>> > above.
>> >
>> > The difference between usage and reservation size may be down to
>> > fragmentation: later VM versions included some patches that improve the
>> > situation.
>> >
>> > Thanks,
>> >   Thomas
>> >
>> >
>> >
>>
>


More information about the hotspot-dev mailing list