From prasanthmathialagan at gmail.com  Fri Jun  5 16:44:14 2020
From: prasanthmathialagan at gmail.com (Prasanth Mathialagan)
Date: Fri, 5 Jun 2020 09:44:14 -0700
Subject: Increased CPU time with G1GC
In-Reply-To:
References:
Message-ID:

Hi,
We recently switched our Java application from CMS to G1. Since then we
have observed increased CPU time (user CPU) and increased latency for
requests.

Observations

 * The count of GC pauses remains the same between CMS and G1, and so
   does the pause time.
 * My initial suspicion was that the application threads were competing
   with GC threads for CPU cycles. But I don't see any indication of
   increased concurrent time in the GC logs.

I suspect that the overhead associated with read/write barriers could be
the reason for the increased CPU cycles, but I want to confirm that. Are
there any GC flags that print statistics about read/write barriers? Or
is there a way to debug this?

java -version

openjdk version "1.8.0_222"
OpenJDK Runtime Environment Corretto-8.222.10.1 (build 1.8.0_222-b10)
OpenJDK 64-Bit Server VM Corretto-8.222.10.1 (build 25.222-b10, mixed mode)

These are the command line flags I find in the GC logs that the
application uses.

-XX:+UseG1GC -XX:CICompilerCount=3 -XX:CompressedClassSpaceSize=931135488
-XX:ConcGCThreads=1 -XX:G1HeapRegionSize=4194304
-XX:InitialCodeCacheSize=402653184 -XX:InitialHeapSize=8589934592
-XX:InitialTenuringThreshold=6 -XX:InitiatingHeapOccupancyPercent=50
-XX:MarkStackSize=4194304 -XX:MaxGCPauseMillis=200
-XX:MaxHeapSize=8589934592 -XX:MaxMetaspaceSize=939524096
-XX:MaxNewSize=5150605312 -XX:MaxTenuringThreshold=6
-XX:MetaspaceSize=268435456 -XX:MinHeapDeltaBytes=4194304
-XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy
-XX:+PrintClassHistogram -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
-XX:PrintSafepointStatisticsTimeout=1000 -XX:+PrintTenuringDistribution
-XX:ReservedCodeCacheSize=402653184 -XX:+ScavengeBeforeFullGC
-XX:SoftRefLRUPolicyMSPerMB=2048 -XX:StackShadowPages=20
-XX:ThreadStackSize=512 -XX:+TieredCompilation -XX:+UseBiasedLocking
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops
-XX:+UseFastAccessorMethods -XX:+UseLargePages -XX:+UseTLAB

Let me know if I need to provide any other information.

Regards,
Prasanth

From thomas.schatzl at oracle.com  Mon Jun  8 09:02:33 2020
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 8 Jun 2020 11:02:33 +0200
Subject: Increased CPU time with G1GC
In-Reply-To:
References:
Message-ID:

Hi Prasanth,

On 05.06.20 18:44, Prasanth Mathialagan wrote:
> Hi,
> We recently switched our Java application from CMS to G1. Since then we
> have observed increased CPU time (user CPU) and increased latency for
> requests.
> [...]
> I suspect that the overhead associated with read/write barriers could be
> the reason for the increased CPU cycles, but I want to confirm that. Are
> there any GC flags that print statistics about read/write barriers? Or
> is there a way to debug this?

The significantly larger write barriers (there are almost no read
barriers in G1) can have the effect you describe, although I would not
expect a direct impact on latency. There is no statistics option that
can be enabled to show the actual impact of the write barriers: they are
too small to measure by themselves without huge overhead. Tracing
throughput deficiencies back to the barriers is mostly done by
eliminating all other causes.
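To make the cost concrete, here is a conceptual sketch in plain Java of
what a G1-style post-write barrier has to do after every reference
store. This is an illustrative model, not HotSpot's actual code; the
class, method names, and table sizes below are invented for the example,
only the region size matches the -XX:G1HeapRegionSize value above.

    // Illustrative model only -- not HotSpot source. It shows why G1's
    // post-write barrier is heavier than CMS's unconditional card mark:
    // it filters the store, then conditionally dirties a card and hands
    // it off for concurrent refinement.
    final class G1PostWriteBarrierSketch {
        static final int LOG_REGION_SIZE = 22;   // log2(4 MB), as in the flags above
        static final int LOG_CARD_SIZE = 9;      // 512-byte cards
        static final byte CLEAN = 0, DIRTY = 1;
        static final byte[] cardTable = new byte[1 << 16]; // toy size for the sketch

        // Conceptually runs after every reference store "field = newValue";
        // fieldAddr and newValueAddr stand in for the objects' addresses.
        static void postWriteBarrier(long fieldAddr, long newValueAddr) {
            if (newValueAddr == 0)
                return;                          // storing null: nothing to remember
            if (((fieldAddr ^ newValueAddr) >>> LOG_REGION_SIZE) == 0)
                return;                          // same region: no remembered-set entry needed
            int card = (int) (fieldAddr >>> LOG_CARD_SIZE) & (cardTable.length - 1);
            if (cardTable[card] == CLEAN) {
                cardTable[card] = DIRTY;         // mark the card
                enqueueForRefinement(card);      // hand off to the refinement threads
            }
        }

        static void enqueueForRefinement(int card) {
            // In the VM this pushes onto a per-thread dirty card queue;
            // omitted in this sketch.
        }

        public static void main(String[] args) {
            postWriteBarrier(0x12345678L, 0x0ABCDEF0L); // a cross-"region" store
            System.out.println("cross-region store recorded");
        }
    }

Each step is only a few instructions, but it runs on every reference
store, which is where extra user CPU relative to CMS's single-store card
mark typically comes from.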
> java -version
>
> openjdk version "1.8.0_222"

In later JDKs the set of applications where G1 improves upon CMS
broadens. There will also always be some applications where CMS is very
hard to beat for a given throughput/latency goal, particularly ones
where the application and its options were previously tuned for CMS.

> OpenJDK Runtime Environment Corretto-8.222.10.1 (build 1.8.0_222-b10)
>
> OpenJDK 64-Bit Server VM Corretto-8.222.10.1 (build 25.222-b10, mixed mode)
>
> These are the command line flags I find in the GC logs that the
> application uses.

Thanks. Some thoughts on the options. I am not sure whether you spent
time tuning them for G1, but if not, it might be useful to reconsider
some of the GC-specific ones.

> -XX:+UseG1GC
> -XX:CICompilerCount=3
> -XX:CompressedClassSpaceSize=931135488
> -XX:ConcGCThreads=1

Not sure it makes a lot of sense to slow down concurrent operation like
that, but it might help eke out the last bit of throughput. Note that in
jdk8 the scalability of marking in G1 isn't that great, but that
typically only has an impact in the tens of threads.

> -XX:G1HeapRegionSize=4194304

That should be selected automatically given the initial/max heap size.

> -XX:InitialCodeCacheSize=402653184
> -XX:InitialHeapSize=8589934592
> -XX:InitialTenuringThreshold=6
> -XX:InitiatingHeapOccupancyPercent=50

If you increase the number of concurrent GC threads, you might be able
to increase this one to decrease the frequency of (old gen) collections.
50% seems pretty low on an 8 GB heap. (That also applies to CMS, I
think.)

> -XX:MarkStackSize=4194304

Curious about the reason for that? AFAIK even in jdk8, while G1 reserves
a lot of memory for the mark stack, it will not be allocated by the OS
unless actually used; I think the other collectors are the same.

> -XX:MaxGCPauseMillis=200

Default.

> -XX:MaxHeapSize=8589934592
> -XX:MaxMetaspaceSize=939524096
> -XX:MaxNewSize=5150605312
> -XX:MaxTenuringThreshold=6

Not sure that potentially pushing objects into the old gen prematurely
is a good idea, but I assume you tested that.

> -XX:MetaspaceSize=268435456
> -XX:MinHeapDeltaBytes=4194304
> -XX:+ParallelRefProcEnabled
[... lots of Print options ...]
> -XX:ReservedCodeCacheSize=402653184
> -XX:+ScavengeBeforeFullGC

That last one never had any effect in G1, afair.

> -XX:SoftRefLRUPolicyMSPerMB=2048
> -XX:StackShadowPages=20
> -XX:ThreadStackSize=512
> -XX:+TieredCompilation
> -XX:+UseBiasedLocking
> -XX:+UseCompressedClassPointers
> -XX:+UseCompressedOops
> -XX:+UseFastAccessorMethods
> -XX:+UseLargePages
> -XX:+UseTLAB

Given that you set the initial and max heap size to the same value and
use large pages, I recommend adding -XX:+AlwaysPreTouch.
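For example (the jar name is a placeholder; -Xms8g/-Xmx8g are just the
readable forms of the Initial/MaxHeapSize values above):

    java -Xms8g -Xmx8g -XX:+UseG1GC -XX:+UseLargePages -XX:+AlwaysPreTouch \
         [... the remaining flags as above ...] -jar yourapp.jar

That way the whole heap is touched, and with large pages actually
backed, during JVM startup rather than page fault by page fault while
the application is handling load.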
> Let me know if I need to provide any other information.

Sorry for not being of more help.

Thanks,
  Thomas

From robberphex at gmail.com  Sun Jun 21 11:59:58 2020
From: robberphex at gmail.com (Robert Lu)
Date: Sun, 21 Jun 2020 19:59:58 +0800
Subject: How could self link help GC?
Message-ID:

Hi,

In java.util.concurrent.LinkedBlockingQueue#dequeue
(https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/concurrent/LinkedBlockingQueue.java#L217):

    private E dequeue() {
        // assert takeLock.isHeldByCurrentThread();
        // assert head.item == null;
        Node<E> h = head;
        Node<E> first = h.next;
        h.next = h; // help GC
        head = first;
        E x = first.item;
        first.item = null;
        return x;
    }

Why does h.next = h help GC?

--
Robert Lu
About me: https://www.robberphex.com/about-me

From dhd at exnet.com  Sun Jun 21 16:18:30 2020
From: dhd at exnet.com (Damon Hart-Davis)
Date: Sun, 21 Jun 2020 17:18:30 +0100
Subject: How could self link help GC?
In-Reply-To:
References:
Message-ID: <03F8B2B0-79C0-4088-BEE7-37579DF843C1@exnet.com>

By avoiding spurious pointers through h.next keeping the 'next' item
alive longer than necessary.

Rgds

Damon

> On 21 Jun 2020, at 12:59, Robert Lu wrote:
>
> Why does h.next = h help GC?
> [...]

From robberphex at gmail.com  Mon Jun 22 03:40:08 2020
From: robberphex at gmail.com (Robert Lu)
Date: Mon, 22 Jun 2020 11:40:08 +0800
Subject: How could self link help GC?
In-Reply-To: <03F8B2B0-79C0-4088-BEE7-37579DF843C1@exnet.com>
References: <03F8B2B0-79C0-4088-BEE7-37579DF843C1@exnet.com>
Message-ID:

Hi, Damon.

But once dequeued, the old node is a dead object. A pointer (h.next)
from a dead object to another object, alive or dead, makes no difference
to the GC. So I think h.next = h is meaningless.

And why isn't it h.next = null?

On Mon, Jun 22, 2020 at 12:18 AM Damon Hart-Davis wrote:
> By avoiding spurious pointers through h.next keeping the 'next' item
> alive longer than necessary.
> [...]

--
Robert Lu
About me: https://www.robberphex.com/about-me
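The catch in the "dead object" reasoning is generational collection.
Here is a sketch of the scenario Damon describes, under the assumption
of a generational collector that uses a card table to find old-to-young
references; the class and method names are invented for illustration.

    // Illustrative only -- not JDK or VM code. Young collections do not
    // trace the whole old generation; old-to-young references recorded
    // via the card table act as extra roots. So a *dead* old-gen node
    // can still keep a young node alive:
    //
    //   old gen:  h --next--> first  (young gen)   <-- "nepotism"
    //
    // The GC does not know h is dead until the old gen is collected.
    // The self-link removes h's only outgoing edge, so a dead h cannot
    // drag its successor along in the meantime.
    final class NepotismSketch {
        static final class Node {
            Object item;
            Node next;
        }

        // What dequeue's "h.next = h; // help GC" accomplishes:
        static void severDeadEdge(Node h) {
            h.next = h; // h now references only itself; nothing younger is retained
        }
    }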
From ecki at zusammenkunft.net  Mon Jun 22 04:33:44 2020
From: ecki at zusammenkunft.net (Bernd Eckenfels)
Date: Mon, 22 Jun 2020 04:33:44 +0000
Subject: How could self link help GC?
In-Reply-To:
References: <03F8B2B0-79C0-4088-BEE7-37579DF843C1@exnet.com>
Message-ID:

It is probably not needed here, since the slot is freed up quite quickly
(at the end of the method) and the rest of the queue is kept alive
anyway. Setting it to self instead of null is typically done to reduce
the risk of NPEs (but it might, on the other hand, increase the risk of
livelocks).

Not sure if it's a good idea to remove the assignment for concurrency
reasons, but it does not seem to be really beneficial for the GC.

Gruss
Bernd
--
http://bernd.eckenfels.net

Von: hotspot-gc-use im Auftrag von Robert Lu
Gesendet: Monday, June 22, 2020 5:40:08 AM
An: Damon Hart-Davis
Cc: hotspot-gc-use at openjdk.java.net
Betreff: Re: How could self link help GC?

> Hi, Damon.
>
> But once dequeued, the old node is a dead object. A pointer (h.next)
> from a dead object to another object, alive or dead, makes no difference
> to the GC. So I think h.next = h is meaningless.
>
> And why isn't it h.next = null?
> [...]
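Bernd's NPE point is visible in how LinkedBlockingQueue's own iterator
walks the list. Below is a simplified paraphrase of that traversal
logic, not the verbatim JDK source; the surrounding class is reduced to
the bare minimum for the example.

    // Simplified paraphrase of the iterator's traversal step in
    // LinkedBlockingQueue (not the exact JDK source). A self-linked node
    // (s == p) means p was dequeued while the iterator still held it, so
    // the iterator rejoins the live list at the head instead of walking
    // off a stale node.
    final class SelfLinkTraversalSketch {
        static final class Node {
            Object item;
            Node next;
        }

        Node head = new Node(); // dummy head node, as in LinkedBlockingQueue

        Node succ(Node p) {
            Node s = p.next;
            return (s == p) ? head.next  // p was removed: restart from the live list
                            : s;         // null here still means "end of queue"
        }
    }

With h.next = null, "this node was removed" and "end of queue" would be
indistinguishable; the self-link keeps every next pointer non-null and
flags the removed node explicitly.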