high cpu usage caused by weak reference
shaobin zeng
zengshaobin2008 at gmail.com
Wed Nov 25 15:30:54 UTC 2020
Hi Per,
It is so kind of you to explain the details to me, I got the point now.
Then this problem may be resolved if the threads are not pooled or pooled
for a short time only, right? Since the innocent instances will not be
asked again if the thread is not active any more.
Per Liden <per.liden at oracle.com> 于2020年11月25日周三 下午10:49写道:
> On 11/24/20 5:41 PM, shaobin zeng wrote:
> > Hi Per,
> > Thanks for your reply, the thread locals are used in a lib so it is
> > hard to change. And did you mean zgc is one of the SATB collectors, and
> > there is some thing wrong when the collector try to collect these weak
> > references because of the Reference::get is called?
>
> By calling Reference.get() you create a strong reference to the
> referent, which means it will not be collected until a future GC cycle
> has concluded that it's no longer strongly reachable. ThreadLocal is
> backed by a hash table. ThreadLocal.get() will do a lookup in that
> table, which in turn can call Reference.get() on a number of "innocent"
> instances to figure out if it's the instance it is looking for. This
> causes those "innocent" instances to be kept alive. This is where
> Reference.refersTo() comes in (which I linked to). refersTo() provides a
> way to ask what the referent is pointing to without creating a strong
> reference to it. This allows ThreadLocal.get() to avoid keeping
> "innocent" instances alive.
>
> cheers,
> Per
>
> (Btw, ZGC does precise wavefront marking, not SATB)
>
> >
> > Per Liden <per.liden at oracle.com <mailto:per.liden at oracle.com>> 于2020年
> > 11月24日周二 下午7:35写道:
> >
> > Hi,
> >
> > On 11/24/20 2:40 AM, shaobin zeng wrote:
> > > Hi,
> > > I am trying to use zgc in product environment, so I updated
> > jdk from jdk8
> > > to openjdk 15, tomcat 8 to tomcat 8.5, and the gc related
> > options, but the
> > > cpu usage goes to 1000+% after a few hours later of the jvm
> > start(normal
> > > cpu usage should be 100-300%). If I make node offline about 30s,
> > the cpu
> > > goes down, and then make it online, it will work normally for
> > hours until
> > > the cpu goes high again. Here is the gc option:
> > >
> > >> export JAVA_OPTS='-Xms10g -Xmx10g -XX:+UseLargePages
> > >> -XX:ZAllocationSpikeTolerance=5 -XX:ParallelGCThreads=8
> > -XX:ConcGCThreads=4
> > >> -Xss2m -XX:+UseZGC
> > >>
> >
> -Xlog:gc,gc+phases,safepoint:file=/logs/gc.log:t:filecount=10,filesize=10m
> > >> -XX:+HeapDumpOnOutOfMemoryError'
> > >
> > >
> > > I profiled it with async-profile, the most hot method is
> > > java/lang/ThreadLocal$ThreadLocalMap.getEntryAfterMiss, maybe
> > there are too
> > > many thread local map entry which are weak referenced are not
> > reclaimed in
> > > time (jdk8 and cms works well on this)?
> > > The following gc logs show the discovered weak reference count
> > keep
> > > increasing after start, but the requests is almost constant from
> > > 11:00-17:00. Note that cpu dropped from 600% to 400%
> > automatically after
> > > GC(9821), the enqueued is ~250K. GC(10265) the node was offline,
> > enqueued
> > > was ~770K. I'm confused why the enqueued count is small in a long
> > time and
> > > the discovered count going up straightly?
> > > Thanks for any suggestions!
> >
> > I think you've identified the problem. WeakReferences are constantly
> > being resurrected and kept alive by calls to ThreadLocal.get(),
> > which in
> > turn calls getEntryAfterMiss(). Over time the table of WeakReferences
> > grows (because they are all alive) and becomes more and more
> expensive
> > to process. When you take the node off-line, and the calls
> > ThreadLocal.get() stop (i.e. resurrection stops), then the GC gets a
> > change to clean out the stale WeakReferences.
> >
> > This is a known problem with WeakReferences, which ThreadLocal makes
> > heavy use of. The good news is that work is in progress to fix this.
> > The
> > first part (https://bugs.openjdk.java.net/browse/JDK-8188055)
> recently
> > went into mainline, and the two follow up patches
> > (https://bugs.openjdk.java.net/browse/JDK-8256377 and
> > https://bugs.openjdk.java.net/browse/JDK-8256167) will hopefully go
> in
> > soon, and will resolve this issue for good.
> >
> > In the mean time, you might want to look into calling
> > ThreadLocal.remove() once a thread local is no longer needed. This is
> > not always feasible, since those thread locals might have be created
> by
> > a library you don't control, or it might be difficult to tell when a
> > thread local is no longer needed, etc. However, if it is feasible in
> > your case, then it might be a way for you to lower to cost of
> > processing
> > WeakReferences.
> >
> > (CMS in JDK 8 was less sensitive to this issue, because it had a
> > different marking strategy).
> >
> > cheers,
> > Per
> >
> > >
> > > [2020-11-19T11:00:00.245+0800] GC(992) Weak: 155658 encountered,
> > 72334
> > >> discovered, 0 enqueued
> > >> [2020-11-19T12:00:00.397+0800] GC(2194) Weak: 220462
> > encountered, 122216
> > >> discovered, 1380 enqueued
> > >> [2020-11-19T12:00:03.411+0800] GC(2195) Weak: 220598
> > encountered, 107228
> > >> discovered, 677 enqueued
> > >> [2020-11-19T13:00:00.497+0800] GC(3395) Weak: 222536
> > encountered, 82199
> > >> discovered, 1713 enqueued
> > >> [2020-11-19T14:00:00.647+0800] GC(4613) Weak: 443946
> > encountered, 291651
> > >> discovered, 292 enqueued
> > >> [2020-11-19T15:00:01.173+0800] GC(5819) Weak: 338065
> > encountered, 124351
> > >> discovered, 815 enqueued
> > >> [2020-11-19T16:00:01.283+0800] GC(7022) Weak: 459070
> > encountered, 298932
> > >> discovered, 353 enqueued
> > >> [2020-11-19T17:00:01.426+0800] GC(8222) Weak: 688162
> > encountered, 519369
> > >> discovered, 4648 enqueued
> > >> [2020-11-19T16:00:01.283+0800] GC(7022) Weak: 459070
> > encountered, 298932
> > >> discovered, 353 enqueued
> > >> [2020-11-19T17:00:01.426+0800] GC(8222) Weak: 688162
> > encountered, 519369
> > >> discovered, 4648 enqueued
> > >> [2020-11-19T18:00:01.556+0800] GC(9430) Weak: 1078757
> > encountered, 928748
> > >> discovered, 1691 enqueued
> > >> [2020-11-19T18:18:43.595+0800] GC(9821) Weak: 1022080
> > encountered, 841168
> > >> discovered, 247352 enqueued
> > >> [2020-11-19T18:18:46.592+0800] GC(9822) Weak: 774253
> > encountered, 568564
> > >> discovered, 3938 enqueued
> > >> [2020-11-19T18:40:49.616+0800] GC(10265) Weak: 842081
> > encountered, 788825
> > >> discovered, 767288 enqueued
> > >> [2020-11-19T18:40:52.593+0800] GC(10266) Weak: 74876
> > encountered, 18186
> > >> discovered, 1 enqueued
> > >
> > >
> >
> >
> >
> > --
> > 曾少斌
>
--
曾少斌
More information about the zgc-dev
mailing list