Native Memory Tracking Bug
Jacob Schlather
jschlather at hubspot.com
Thu Nov 14 23:22:33 UTC 2019
Okay, I'll continue digging into our memory profile. Regarding the NMT bug,
yes I have an application in production which so far on every instance,
everytime I take an NMT summary it produces the bug. But it doesn't happen
in QA, so I can't reproduce in a test. Here's another example I just took.
Compiler (reserved=6022KB, committed=6022KB)
<https://private.hubteam.com/singularity-iad/task/EngagementsService-web-63545_614_300-1573749739254-1-steamed_walrus.iad02.hubspot_networks.net-us_east_1b/tail/EngagementsService-web-63545_614_300-1573749739254-1-steamed_walrus.iad02.hubspot_networks.net-us_east_1b/executor.commands.native-memory-summary.1573773668812.log?offset=1951>
(malloc=6628KB #5024)
<https://private.hubteam.com/singularity-iad/task/EngagementsService-web-63545_614_300-1573749739254-1-steamed_walrus.iad02.hubspot_networks.net-us_east_1b/tail/EngagementsService-web-63545_614_300-1573749739254-1-steamed_walrus.iad02.hubspot_networks.net-us_east_1b/executor.commands.native-memory-summary.1573773668812.log?offset=2002>
(arena=18014398509481378KB #5)
On Thu, Nov 14, 2019 at 4:54 PM Zhengyu Gu <zgu at redhat.com> wrote:
> Hi Jacob,
>
> On 11/14/19 3:40 PM, Jacob Schlather wrote:
> > Hi Zhengyu,
> >
> > For another web service also seeing this issue, I ran NMT summary on
> > java 8 vs java 11. As you can see our RSS increases by about 1gb when we
> > run on Java 11. We don't make use of any JNI apis that I'm aware of, but
> > we do have three shaded versions of netty and open many sockets. Has the
> > memory behavior of any of the things you mentioned changed greatly in
> > between jdk8 and jdk11? I do notice that g1gc has undergone substantial
> > work and we've tuned that heavily. I'm going to run a test tomorrow
> > morning removing our g1gc tuning and see how that impacts memory usage
> > on java 11.
> >
> > Java 8
> > NMT Committed: 3560392KB
> > Actual Memory Usage: 3177185KB
> >
> > Java 11
> > NMT Committed: 3436458KB
> > Actual Memory Usage: 4320133 KB
>
> In case of g1gc changes, the memory usage should reflect in NMT outputs,
> which is not the case. Likely to be something outside of JVM.
>
> -Zhengyu
>
>
> >
> > On Thu, Nov 14, 2019 at 3:33 PM Zhengyu Gu <zgu at redhat.com
> > <mailto:zgu at redhat.com>> wrote:
> >
> >
> >
> > On 11/14/19 1:06 PM, Jacob Schlather wrote:
> > > Upon further investigation it appears that NMT is not accurately
> > > tracking the memory usage of the JVM. I ran NMT summary and got the
> > > following output
> > >
> > > Total: reserved=9095231KB, committed=8746919KB
> > >
> > > And then I ran pmap on the host to check memory usage
> > >
> > > sudo pmap -x 89186 | tail -n 1
> > > total kB 34497556 9397564 9352480
> > >
> > > Is this sort of gap expected?
> >
> > Sorry, I was preoccupied by tracking bug.
> >
> > As the memory gap, it really depends on individual application.
> >
> > For example, if it has jni methods that allocate a lot memory
> > outside of
> > JVM? How many sockets it opens, given each socket may have buffer(s)
> > associating with it, etc.
> >
> > The NMT output, you posted on github, shows 700+ threads, each thread
> > may have per-thread malloc pool, that is managed by c library and not
> > tracked by NMT, that can add up significant amount of memory.
> >
> > -Zhengyu
> >
> >
> >
> >
> >
> >
> >
> > >
> > >
> > > On Thu, Nov 14, 2019 at 12:18 PM Jacob Schlather
> > <jschlather at hubspot.com <mailto:jschlather at hubspot.com>
> > > <mailto:jschlather at hubspot.com <mailto:jschlather at hubspot.com>>>
> > wrote:
> > >
> > > Hi Zhengyu,
> > >
> > > This bug seems to happen every time for one of our web services
> > > running in production, but it doesn't happen in qa, so I don't think
> > > I'll be able to provide a reproducer. If there are some lightweight
> > > debugging options we could turn on and provide that output to you, I
> > > could do that.
> > >
> > > Having dug into the memory issue we're seeing, what it looks like is
> > > that in Java 8 the committed memory as shown by NMT is significantly
> > > less than the memory the JVM was actually using. Now on Java 11, the
> > > JVM seems to be using much closer to the committed memory. Do you
> > > happen to know of anything we could look into for that? Thanks.
> > >
> > > On Wed, Nov 13, 2019 at 10:14 PM Zhengyu Gu <zgu at redhat.com
> > <mailto:zgu at redhat.com>
> > > <mailto:zgu at redhat.com <mailto:zgu at redhat.com>>> wrote:
> > >
> > > Hi Jacob,
> > >
> > > It looks like JDK-8204128 [1] strikes again. It would be very
> > > helpful if
> > > you can provide a reproducer.
> > >
> > > Thanks,
> > >
> > > -Zhengyu
> > >
> > >
> > > [1] https://bugs.openjdk.java.net/browse/JDK-8204128
> <https://bugs.openjdk.java.net/browse/JDK-8204128>
> > >
> > >
> > > On 11/13/19 9:56 PM, Jacob Schlather wrote:
> > > > We're currently in the process of upgrading our Java
> > > applications from Java
> > > > 8 to Java 11. After deploying some of our production
> > > applications with Java
> > > > 11, we began to see the resident memory size grow without
> > > bound until our
> > > > orchestrator killed the applications for excessive memory
> > > usage. We've
> > > > started to debug this issue, but noticed that the NMT output
> > > appears to be
> > > > incorrect. In particular the Compiler section is displaying
> > > >
> > > > Compiler (reserved=4528KB +1010KB, committed=4528KB +1010KB)
> > > > (malloc=4896KB +1206KB #4132 +508)
> > > > (arena=18014398509481617KB -196 #5)
> > > >
> > > > Obviously the arena value here is quite wrong and there's no
> > > way the
> > > > reserved memory can be less than the malloc memory. Further
> > > there's
> > > > a 276305KB gap in the RSS size reported by our metrics and
> > > the amount of
> > > > memory NMT reports as committed. Here's our JVM args and JDK
> > > version, I've
> > > > additionally attached the full output of the NMT detailed diff.
> > > >
> > > > Running java11 with JVM arguments:
> > > -Djava.net.preferIPv4Stack=true
> > > > -Xms6144m -Xmx6g -Xss256k -XX:MetaspaceSize=128m
> > > > -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=128m
> > > > -XX:+UseG1GC -XX:MaxGCPauseMillis=350
> > > -XX:+UnlockExperimentalVMOptions
> > > > -XX:G1NewSizePercent=20 -XX:G1MaxNewSizePercent=45
> > > > -XX:ParallelGCThreads=8 -XX:ConcGCThreads=6
> > > > -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem
> > > > -XX:-UseBiasedLocking -XX:G1HeapRegionSize=4m
> > > > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=../logs
> > > > -Djava.io.tmpdir=REDACTED -XX:+ExitOnOutOfMemoryError
> > > > -XX:-OmitStackTraceInFastThrow
> > > >
> > >
> >
> -javaagent:/usr/local/appoptics-6.5.1/appoptics-agent.jar=config=appoptics-agent.json
> > > > -XX:-PreferContainerQuotaForCPUCount
> > > -XX:NativeMemoryTracking=detail
> > > > -jar REDACTED
> > > > openjdk version "11.0.5" 2019-10-15
> > > > OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.5+10)
> > > > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.5+10, mixed
> > > mode)
> > > >
> > >
> >
>
More information about the hotspot-dev
mailing list