From christopherberner at gmail.com  Mon Mar  2 17:44:38 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 09:44:38 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
Message-ID:

Hello,

I work on the Presto project (https://github.com/facebook/presto) and am trying to understand the behavior of G1. We run a 45GB heap on the worker machines with "-XX:G1HeapRegionSize=32M", and it works smoothly, except that every day a few machines hit a "to-space exhausted" failure and either die with an OutOfMemory error, or do a full GC with a pause so long that it fails our health checks and the process is restarted by our service manager.

Looking at the GC logs, the sequence of events is always the same. The young gen is quite large (~50% of the heap) and every collection is fast, but then a "to-space exhausted" failure hits, which appears to increase the heap used (see log below). After that the young gen is tiny and it never recovers.

Two questions:
1) Why does heap used increase in the middle of the GC cycle?
2) Looking at some of the logs, it appears that a full GC starts, but an OutOfMemory error is also thrown concurrently (they show up a hundred lines or so apart in stdout). Why would there be an OutOfMemory error before the full GC finished?

Thanks for any help!
Christopher

2015-03-02T00:56:32.131-0800: 199078.406: [GC pause (GCLocker Initiated GC) (young)
 199078.407: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 16136, predicted base time: 30.29 ms, remaining time: 169.71 ms, target pause time: 200.00 ms]
 199078.407: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 805 regions, survivors: 11 regions, predicted young region time: 56.53 ms]
 199078.407: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 805 regions, survivors: 11 regions, old: 0 regions, predicted pause time: 86.82 ms, target pause time: 200.00 ms]
, 0.0722119 secs]
   [Parallel Time: 46.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 199078406.9, Avg: 199078407.2, Max: 199078407.5, Diff: 0.6]
      [Ext Root Scanning (ms): Min: 0.8, Avg: 1.4, Max: 3.9, Diff: 3.1, Sum: 39.7]
      [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 3.4, Diff: 3.4, Sum: 58.9]
         [Processed Buffers: Min: 0, Avg: 6.5, Max: 22, Diff: 22, Sum: 182]
      [Scan RS (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 5.3]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.4, Diff: 0.4, Sum: 0.7]
      [Object Copy (ms): Min: 40.1, Avg: 41.3, Max: 43.7, Diff: 3.6, Sum: 1155.3]
      [Termination (ms): Min: 0.8, Avg: 0.9, Max: 1.1, Diff: 0.3, Sum: 25.8]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 3.9]
      [GC Worker Total (ms): Min: 45.7, Avg: 46.1, Max: 46.3, Diff: 0.6, Sum: 1289.7]
      [GC Worker End (ms): Min: 199078453.2, Avg: 199078453.3, Max: 199078453.4, Diff: 0.2]
   [Code Root Fixup: 0.3 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 3.0 ms]
   [Other: 22.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 18.0 ms]
      [Ref Enq: 0.5 ms]
      [Redirty Cards: 0.9 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 1.7 ms]
   [Eden: 25.2G(25.1G)->0.0B(25.2G) Survivors: 352.0M->320.0M Heap: 39.7G(45.0G)->14.6G(45.0G)]
 [Times: user=1.37 sys=0.00, real=0.08 secs]

2015-03-02T01:38:44.545-0800: 201610.820: [GC pause (GCLocker Initiated GC) (young)
 201610.820: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 56252, predicted base time: 35.00 ms, remaining time: 165.00 ms, target pause time: 200.00 ms]
 201610.820: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 807 regions, survivors: 10 regions, predicted young region time: 60.67 ms]
 201610.820: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 807 regions, survivors: 10 regions, old: 0 regions, predicted pause time: 95.67 ms, target pause time: 200.00 ms]
 201611.305: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 3058176 bytes]
 201611.319: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 3058176 bytes, attempted expansion amount: 33554432 bytes]
 201611.319: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
 201619.914: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 44291850240 bytes, allocation request: 0 bytes, threshold: 21743271900 bytes (45.00 %), source: end of GC]
 (to-space exhausted), 9.0961593 secs]
   [Parallel Time: 8209.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 201610864.0, Avg: 201610864.2, Max: 201610864.4, Diff: 0.5]
      [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.7, Diff: 3.6, Sum: 47.8]
      [Update RS (ms): Min: 0.0, Avg: 4.7, Max: 6.0, Diff: 6.0, Sum: 131.1]
         [Processed Buffers: Min: 0, Avg: 27.4, Max: 48, Diff: 48, Sum: 766]
      [Scan RS (ms): Min: 0.1, Avg: 0.3, Max: 1.2, Diff: 1.1, Sum: 7.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.5, Diff: 0.5, Sum: 0.8]
      [Object Copy (ms): Min: 8200.9, Avg: 8202.2, Max: 8207.2, Diff: 6.3, Sum: 229661.8]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 7.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 2.8]
      [GC Worker Total (ms): Min: 8209.0, Avg: 8209.2, Max: 8209.5, Diff: 0.6, Sum: 229858.3]
      [GC Worker End (ms): Min: 201619073.3, Avg: 201619073.4, Max: 201619073.5, Diff: 0.2]
   [Code Root Fixup: 0.3 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 3.0 ms]
   [Other: 883.1 ms]
      [Evacuation Failure: 788.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 45.0 ms]
      [Ref Enq: 0.6 ms]
      [Redirty Cards: 1.4 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.6 ms]
   [Eden: 25.2G(25.2G)->0.0B(32.0M) Survivors: 320.0M->3264.0M Heap: 39.8G(45.0G)->44.1G(45.0G)]
 [Times: user=46.07 sys=2.21, real=9.10 secs]

From simone.bordet at gmail.com  Mon Mar  2 18:15:53 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Mon, 2 Mar 2015 19:15:53 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Hi,

On Mon, Mar 2, 2015 at 6:44 PM, Christopher Berner wrote:
> I work on the Presto project (https://github.com/facebook/presto) and am
> trying to understand the behavior of G1. We run a 45GB heap on the worker
> machines with "-XX:G1HeapRegionSize=32M", and it works smoothly,

Just out of curiosity, you seem to have IHOP=45% and an eden that is 55% of the heap (25 GiB out of 45 GiB). Is there any reason why you keep IHOP this low, or are you just running with the defaults?

To the hotspot gc experts, is there any way to limit the Eden size without impacting the ergonomics? Does -XX:MaxNewSize impact ergonomics?

--
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From christopherberner at gmail.com  Mon Mar  2 18:52:45 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 10:52:45 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

We're just running with the default IHOP.

On Mon, Mar 2, 2015 at 10:15 AM, Simone Bordet wrote:
> Just out of curiosity, you seem to have IHOP=45% and an eden that is
> 55% of the heap (25 GiB out of 45 GiB).
> Is there any reason why you keep IHOP this low or you're just running
> with defaults ?
> [...]

From yu.zhang at oracle.com  Mon Mar  2 22:44:06 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 02 Mar 2015 14:44:06 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID: <54F4E7B6.90704@oracle.com>

Christopher,

8. What is 'to-space exhausted'? Why is it slow? How can it be avoided?

'to-space exhausted' happens when there is not enough space to copy objects to during evacuation. It is slow because G1 has to do a lot of work to make sure the heap is in a ready-to-use state. There are several ways you can try to avoid it: trigger the mixed GC earlier by decreasing -XX:InitiatingHeapOccupancyPercent (default 45), adjust the young gen size, increase G1ReservePercent, etc. There is no one-size-fits-all tuning.

From the log snippet you posted, my guess is that most of the time the objects die in the young gen.
But sometimes they live longer and are promoted to the old gen, and there is not enough space in the old gen.

Thanks,
Jenny

On 3/2/2015 10:52 AM, Christopher Berner wrote:
> We're just running with the default IHOP
> [...]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From yu.zhang at oracle.com  Mon Mar  2 22:45:08 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 02 Mar 2015 14:45:08 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID: <54F4E7F4.9030105@oracle.com>

I am starting a FAQ page; I added this question: https://blogs.oracle.com/g1gc/

9. What is the recommended way to limit Eden size for G1?

The recommended way is to set -XX:MaxGCPauseMillis. G1 will adjust the young gen size to try to meet the pause goal.
The young gen size is between 5 and 60 percent of the heap size. To control it further, you can use the experimental flags:

-XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=<5> -XX:G1MaxNewSizePercent=<60>

G1 will also pick up other settings, such as NewRatio, NewSize, MaxNewSize and -Xmn:

-Xmn: the same as NewSize=MaxNewSize.

If only -XX:NewSize is set, the young gen size is between the specified NewSize and G1MaxNewSizePercent.

If only -XX:MaxNewSize is set, the young gen size is between the specified G1NewSizePercent and MaxNewSize.

If both -XX:NewSize and -XX:MaxNewSize are used, the young gen will be between those two sizes; but when the heap size changes, the young gen size will not change accordingly.

If -XX:NewRatio is used, the young gen size is heap size / (NewRatio + 1). NewRatio is ignored if it is used with NewSize and MaxNewSize.

Thanks,
Jenny

On 3/2/2015 10:15 AM, Simone Bordet wrote:
> To the hotspot gc experts, is there any way to limit the Eden size
> without impacting on the ergonomics ?
> Does -XX:MaxNewSize impact ergonomics ?
> [...]

From christopherberner at gmail.com  Tue Mar  3 02:41:08 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 18:41:08 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <54F4E7F4.9030105@oracle.com>
References: <54F4E7F4.9030105@oracle.com>
Message-ID:

Thanks!
I'll try adjusting the pause target, and if that doesn't help I'll try those other settings.

On Mon, Mar 2, 2015 at 2:45 PM, Yu Zhang wrote:
> I am starting a FAQ page, I added this question
> https://blogs.oracle.com/g1gc/
> 9. What is the recommended way to limit Eden size for g1?
> [...]

From simone.bordet at gmail.com  Tue Mar  3 07:16:56 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Tue, 3 Mar 2015 08:16:56 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <54F4E7F4.9030105@oracle.com>
References: <54F4E7F4.9030105@oracle.com>
Message-ID:

Jenny,

On Mon, Mar 2, 2015 at 11:45 PM, Yu Zhang wrote:
> G1 will pick up other settings, such as NewRatio, NewSize, MaxNewSize, -Xmn
> [...]

I take it that all of these options disable the ergonomics, and therefore G1's attempts to respect MaxGCPauseMillis? Or will setting one or some of these still make G1 try to respect MaxGCPauseMillis?

Thanks!
--
Simone Bordet
http://bordet.blogspot.com

From thomas.schatzl at oracle.com  Tue Mar  3 11:18:01 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 03 Mar 2015 12:18:01 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References: <54F4E7F4.9030105@oracle.com>
Message-ID: <1425381481.3315.62.camel@oracle.com>

Hi Simone,

On Tue, 2015-03-03 at 08:16 +0100, Simone Bordet wrote:
> I take that all of these options disable ergonomics and therefore the
> attempts of G1 to respect MaxGCPauseMillis ?
> Or setting one or some of these will still make G1 try to respect
> MaxGCPauseMillis ?
> [...]

G1 will attempt to respect the pause time as much as possible, except if you set the min and max limits to the same value. That is the case if you do so explicitly for both, or if NewRatio is set.
Basically, by setting one or the other value, you fix that bound to a certain value.

I recommend setting only G1MaxNewSizePercent (or MaxNewSize) for the case mentioned in the original post. It is appropriate when there may be sudden changes in the survival rate of eden that the GC cannot predict, which seems to be the case here, because it avoids missing the pause time goal excessively.

Thanks,
Thomas

From simone.bordet at gmail.com  Tue Mar  3 11:23:49 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Tue, 3 Mar 2015 12:23:49 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <1425381481.3315.62.camel@oracle.com>
References: <54F4E7F4.9030105@oracle.com> <1425381481.3315.62.camel@oracle.com>
Message-ID:

Hi,

On Tue, Mar 3, 2015 at 12:18 PM, Thomas Schatzl wrote:
> G1 will attempt to respect pause time as much as possible, except if you
> set the min and max limits to the same value.
> [...]

Thanks for this clarification!

--
Simone Bordet
http://bordet.blogspot.com

From chkwok at digibites.nl  Tue Mar  3 11:43:25 2015
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Tue, 3 Mar 2015 12:43:25 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Hi,

When there are live objects during an eden collection, they must be copied to a new, empty region. With a huge eden size, this may require more space than there is available, causing a to-space exhaustion.
We always run with a fixed new generation size to avoid this kind of issue: when ergonomics thinks it can hit the pause time target with a very large eden, it will allocate a very large eden because that can be more efficient, and that's a bit too unpredictable for us in production.

We usually set a NewRatio of 4 to 10. When set to 4, the eden is fixed at 1/5th of the full heap, or ~9GB. This also pretty much guarantees a small, static eden collection pause; in your case, ~21ms (60/807 * 288 regions). Your promotion failure happened when 25.5G of eden produced 3.2G of survivors; with an eden of 9G, this would only be ~1.2G, which shouldn't be any issue if the old collector runs regularly. The old collector is only triggered after a young collection, by the way, so having young collections spaced closer together (smaller eden -> eden fills more quickly) gives it more chances to run and to add almost-empty regions to the next mixed GC run.

Con: GC will run more often, with smaller pauses, and promote more objects to the old generation, which require more work to collect (a concurrent scan is required). But as your collections run once per many minutes, this extra overhead is basically zero. Our prod young collectors run multiple times per second on a 4G eden, so you're not pushing the limits of the collector at all.

Kind regards,

--
Chi Ho Kwok
Digibites Technology
chkwok at digibites.nl

On 2 March 2015 at 18:44, Christopher Berner wrote:
> I work on the Presto project (https://github.com/facebook/presto) and am
> trying to understand the behavior of G1. We run a 45GB heap on the worker
> machines with "-XX:G1HeapRegionSize=32M", and it works smoothly, except
> that every day a few machines hit a "to-space exhausted" failure
> [...]

From christopherberner at gmail.com  Wed Mar  4 20:56:06 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Wed, 4 Mar 2015 12:56:06 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Adjusting -XX:G1MaxNewSizePercent seemed to work better than changing the target pause time, at least for us. Thanks for all the help!

On Tue, Mar 3, 2015 at 3:43 AM, Chi Ho Kwok wrote:
> When there are live objects during an eden collection, they must be copied
> to a new, empty region. With a huge eden size, this may require more space
> than there is available, causing a to-space exhaustion.
> [...]
>> Sum: 0.8] >> >> [Object Copy (ms): Min: 8200.9, Avg: 8202.2, Max: 8207.2, Diff: >> 6.3, Sum: 229661.8] >> >> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: >> 7.0] >> >> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, >> Sum: 2.8] >> >> [GC Worker Total (ms): Min: 8209.0, Avg: 8209.2, Max: 8209.5, Diff: >> 0.6, Sum: 229858.3] >> >> [GC Worker End (ms): Min: 201619073.3, Avg: 201619073.4, Max: >> 201619073.5, Diff: 0.2] >> >> [Code Root Fixup: 0.3 ms] >> >> [Code Root Purge: 0.0 ms] >> >> [Clear CT: 3.0 ms] >> >> [Other: 883.1 ms] >> >> [Evacuation Failure: 788.4 ms] >> >> [Choose CSet: 0.0 ms] >> >> [Ref Proc: 45.0 ms] >> >> [Ref Enq: 0.6 ms] >> >> [Redirty Cards: 1.4 ms] >> >> [Humongous Reclaim: 0.1 ms] >> >> [Free CSet: 0.6 ms] >> >> [Eden: 25.2G(25.2G)->0.0B(32.0M) Survivors: 320.0M->3264.0M Heap: >> 39.8G(45.0G)->44.1G(45.0G)] >> >> [Times: user=46.07 sys=2.21, real=9.10 secs] >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Thu Mar 5 19:00:53 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Thu, 05 Mar 2015 11:00:53 -0800 Subject: G1GC, Java8u40ea, Metaspace questions In-Reply-To: <54E5CD2B.7030201@finkzeit.at> References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> Message-ID: <54F8A7E5.5080606@oracle.com> Wolfgang, Thanks for reporting this. I can reproduce this behavior with a micro. After consulting with Stefan and Jon, it is the current behavior. For now you can keep MaxMetaspaceFreeRatio low to bring HWM down. 
We might file an enhancement bug on this.

You do not need a mixed gc to clean metaspace.
Thanks,
Jenny

On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
> One more, something just came to me:
>
> Class unloading happens during the concurrent marking-cycle so the
> mixed collects that would free up unused classloaders in oldGen happen
> after that, right?
> That would mean the classes can only be cleaned up at the next cycle
> and stay in Metaspace until then. My test causes only
> Metaspace-triggered concurrent cycles so the garbage collector is
> always behind by one cycle and therefore the amount of classes that can
> be unloaded can be different each time, regardless of the percentage
> of wasted heap. I guess I have to extend my test-scenario in a way
> that also causes at least some heap-driven concurrent cycles and see
> what happens then.
> Still does not explain why I hardly ever see HWM go down but it
> explains some of my more confusing test-results...
>
> regards
> Wolfgang
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wolfgang.pedot at finkzeit.at  Thu Mar  5 20:41:34 2015
From: wolfgang.pedot at finkzeit.at (Wolfgang Pedot)
Date: Thu, 05 Mar 2015 21:41:34 +0100
Subject: G1GC, Java8u40ea, Metaspace questions
In-Reply-To: <54F8A7E5.5080606@oracle.com>
References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> <54F8A7E5.5080606@oracle.com>
Message-ID: <54F8BF7E.2000805@finkzeit.at>

Jenny,

thanks for getting back to me with this info. I think I found a good
setting for now and I am letting a smaller system run with that under
more normal use (most concurrent cycles triggered by heap with only some
Metaspace-spikes).
Definitely looking forward to using this "for real" after 8u40 is released.
As for my thoughts below:
As far as I know otherwise unused classes are kept alive by their
ClassLoaders, which are stored in the heap, right?
So if ClassLoaders get promoted to oldGen, mixed GCs are required to
clean them up before the classes can be unloaded in the next concurrent
cycle. That would explain why it usually takes an additional concurrent
cycle (triggered by heap occupation) after a spike of class generation
before Metaspace usage returns to normal. Or maybe stuff that keeps the
ClassLoaders alive needs to be collected first...

regards
Wolfgang


Am 05.03.2015 20:00, schrieb Yu Zhang:
> Wolfgang,
>
> Thanks for reporting this. I can reproduce this behavior with a micro.
> After consulting with Stefan and Jon, it is the current behavior. For
> now you can keep MaxMetaspaceFreeRatio low to bring HWM down. We
> might file an enhancement bug on this.
>
> You do not need a mixed gc to clean metaspace.
> Thanks,
> Jenny
> On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
>> One more, something just came to me:
>>
>> Class unloading happens during the concurrent marking-cycle so the
>> mixed collects that would free up unused classloaders in oldGen
>> happen after that, right?
>> That would mean the classes can only be cleaned up at the next cycle
>> and stay in Metaspace until then. My test causes only
>> Metaspace-triggered concurrent cycles so the garbage collector is
>> always behind by one cycle and therefore the amount of classes that
>> can be unloaded can be different each time, regardless of the
>> percentage of wasted heap. I guess I have to extend my test-scenario
>> in a way that also causes at least some heap-driven concurrent cycles
>> and see what happens then.
>> Still does not explain why I hardly ever see HWM go down but it
>> explains some of my more confusing test-results...
>>
>> regards
>> Wolfgang
>>
>

-- 
Kind regards

Wolfgang Pedot
F&E
Fink Zeitsysteme GmbH | Möslestraße 19-21 | 6844 Altach | Österreich
Tel: +43 5576 72388 | Fax: +43 5576 72388 14
wolfgang.pedot at finkzeit.at | www.finkzeit.at
Landesgericht Feldkirch, 72223k | USt.Id: ATU36401407

We provide our services exclusively on the basis of our general terms and
conditions and our service and usage agreement, which we have published on
our website at www.finkzeit.at/rechtliches.

From stefan.karlsson at oracle.com  Fri Mar  6 08:14:32 2015
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 06 Mar 2015 09:14:32 +0100
Subject: G1GC, Java8u40ea, Metaspace questions
In-Reply-To: <54F8BF7E.2000805@finkzeit.at>
References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> <54F8A7E5.5080606@oracle.com> <54F8BF7E.2000805@finkzeit.at>
Message-ID: <54F961E8.1010000@oracle.com>

Hi Wolfgang,

On 2015-03-05 21:41, Wolfgang Pedot wrote:
> Jenny,
>
> thanks for getting back to me with this info. I think I found a good
> setting for now and I am letting a smaller system run with that under
> more normal use (most concurrent cycles triggered by heap with only
> some Metaspace-spikes).
> Definitely looking forward to using this "for real" after 8u40 is released.

8u40 has now been released.

>
> As for my thoughts below:
> As far as I know otherwise unused Classes are kept alive by their
> ClassLoaders which are stored in the heap, right?

There are different ways to hold classes alive:
1) You have a live reference to the java.lang.ClassLoader (or subclass)
object.
2) You have a live reference to any of the java.lang.Class objects
belonging to the ClassLoader.
3) You have an instance of a class that is described by any of the
java.lang.Class objects belonging to the ClassLoader.
4) You have a "dependency" between a class in another ClassLoader,
referring to a class in the ClassLoader that is kept alive. E.g. from
class resolution in the constant pool, super classes, interfaces,
JSR 292 specific code.

You have to break all of these chains before your classes and class
loader will be eligible for class unloading.

> So if ClassLoaders get promoted to oldGen, mixed GCs are required to
> clean them up before the classes can be unloaded in the next
> concurrent cycle. That would explain why it usually takes an
> additional concurrent cycle (triggered by heap occupation) after a
> spike of class generation before Metaspace usage returns to normal. Or
> maybe stuff that keeps the ClassLoaders alive needs to be collected
> first...

We only require one marking cycle to clean out metadata. Maybe something
is holding references to your class loader, classes, or instances, but
then gets cleaned out during the second GC. Things to look out for are,
for example, SoftReferences and Finalizers.

After the remark phase, at the end of the concurrent marking phase, we
have enough information to unload the classes. Most of the JVM internal
data structures are cleaned out during the remark phase; the actual
metaspace memory is handed back during the cleanup phase. If the JVM
manages to clean out an entire "virtual space area" of metadata, the
memory will be handed back to the OS and the amount of committed memory
will be decreased. If not, it puts the committed memory onto the free
lists so that it can be used by other metaspaces.

StefanK

>
> regards
> Wolfgang
>
>
>
> Am 05.03.2015 20:00, schrieb Yu Zhang:
>> Wolfgang,
>>
>> Thanks for reporting this. I can reproduce this behavior with a micro.
>> After consulting with Stefan and Jon, it is the current behavior.
>> For now you can keep MaxMetaspaceFreeRatio low to bring HWM down. We
>> might file an enhancement bug on this.
>>
>> You do not need a mixed gc to clean metaspace.
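The first three of the reference chains listed above can be seen directly from running Java code. A minimal sketch; the class and field names here are invented for illustration and do not come from the thread:

```java
// Illustrates three of the reference chains that keep a class, and with it
// its ClassLoader's metaspace, reachable: a reference to the loader itself,
// a reference to one of its java.lang.Class objects, and a live instance.
public class KeepAliveDemo {
    static Object pinnedInstance; // chain 3: instance -> Class -> ClassLoader

    public static void main(String[] args) {
        ClassLoader loader = KeepAliveDemo.class.getClassLoader(); // chain 1
        Class<?> clazz = KeepAliveDemo.class;                      // chain 2
        pinnedInstance = new KeepAliveDemo();

        // All three chains reach the same loader object, so holding on to
        // any one of them pins every class that loader has defined.
        System.out.println(clazz.getClassLoader() == loader);                    // true
        System.out.println(pinnedInstance.getClass().getClassLoader() == loader); // true
    }
}
```

Only after every such chain is dropped, including the cross-loader dependencies of item 4, does a concurrent cycle have a chance to unload the classes.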
>> Thanks,
>> Jenny
>> On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
>>> One more, something just came to me:
>>>
>>> Class unloading happens during the concurrent marking-cycle so the
>>> mixed collects that would free up unused classloaders in oldGen
>>> happen after that, right?
>>> That would mean the classes can only be cleaned up at the next cycle
>>> and stay in Metaspace until then. My test causes only
>>> Metaspace-triggered concurrent cycles so the garbage collector is
>>> always behind by one cycle and therefore the amount of classes that
>>> can be unloaded can be different each time, regardless of the
>>> percentage of wasted heap. I guess I have to extend my test-scenario
>>> in a way that also causes at least some heap-driven concurrent
>>> cycles and see what happens then.
>>> Still does not explain why I hardly ever see HWM go down but it
>>> explains some of my more confusing test-results...
>>>
>>> regards
>>> Wolfgang
>>>
>>
>

From narmak101 at gmail.com  Tue Mar 24 21:48:51 2015
From: narmak101 at gmail.com (Kamran Khawaja)
Date: Tue, 24 Mar 2015 17:48:51 -0400
Subject: Using G1 with Apache Solr
Message-ID: 

I'm running Solr 4.7.2 with Java 7u75 with the following JVM params:

-verbose:gc
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintReferenceGC
-Xmx3072m
-Xms3072m
-XX:+UseG1GC
-XX:+UseLargePages
-XX:+AggressiveOpts
-XX:+ParallelRefProcEnabled
-XX:G1HeapRegionSize=8m
-XX:InitiatingHeapOccupancyPercent=35

What I'm currently seeing is that many of the gc pauses are under an
acceptable 0.25 seconds, but I'm seeing way too many full GCs with an
average stop time of 3.2 seconds.

You can find the gc logs here:
https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0

I initially tested without specifying the HeapRegionSize but that
resulted in the "humongous" message in the gc logs and a ton of full gc
pauses.

Any pointers or areas to further investigate would be appreciated.
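For context on the "humongous" messages: G1 treats any single allocation of at least half a region as humongous, so the chosen G1HeapRegionSize directly decides which allocations take that path. A sketch of the arithmetic; the helper and the 5 MB example value are illustrative, not taken from the log:

```java
// G1 classifies an allocation as humongous when it is at least half the
// region size; humongous objects are placed in contiguous old regions,
// and frequent humongous allocation is a common trigger for full GCs.
public class HumongousThreshold {
    static boolean isHumongous(long allocationBytes, long regionBytes) {
        return allocationBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long fiveMb = 5L * 1024 * 1024;
        // With -XX:G1HeapRegionSize=8m the threshold is 4 MB:
        System.out.println(isHumongous(fiveMb, 8L * 1024 * 1024));  // true
        // With 32 MB regions the same allocation is an ordinary one:
        System.out.println(isHumongous(fiveMb, 32L * 1024 * 1024)); // false
    }
}
```

This is why raising the region size can make the "humongous" log messages disappear without changing the application's allocation pattern.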
Thanks, -- Kam -------------- next part -------------- An HTML attachment was scrubbed... URL: From java at elyograg.org Wed Mar 25 06:47:34 2015 From: java at elyograg.org (Shawn Heisey) Date: Wed, 25 Mar 2015 00:47:34 -0600 Subject: Using G1 with Apache Solr In-Reply-To: References: Message-ID: <55125A06.7080107@elyograg.org> On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > -verbose:gc > -XX:+PrintGCDateStamps > -XX:+PrintGCDetails > -XX:+PrintAdaptiveSizePolicy > -XX:+PrintReferenceGC > -Xmx3072m > -Xms3072m > -XX:+UseG1GC > -XX:+UseLargePages > -XX:+AggressiveOpts > -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8m > -XX:InitiatingHeapOccupancyPercent=35 > > > What I'm currently seeing is that many of the gc pauses are under an > acceptable 0.25 seconds but seeing way too many full GCs with an average > stop time of 3.2 seconds. > > You can find the gc logs > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > I initially tested without specifying the HeapRegionSize but that > resulted in the "humongous" message in the gc logs and a ton of full gc > pauses. When I replied the first time, I only sent it to Kamran. I quickly realized that I'd made that error, but I did not remember that the original message was on this list, so I sent the reply again, assuming that I saw the original on the solr-user mailing list. Now I am bringing the silliness full-circle by sending the same reply here. Some additional info: When I initially brought my settings up on this list a few months ago, I got the recommendation to try changing InitiatingHeapOccupancyPercent to 70-75 from the default of 45 ... so setting it to 35 might not be the best idea. I do currently have it set to 75 (not reflected on the wiki), but I haven't done any further analysis. 
I have now upgraded java on those machines to 8u40 with the following settings, I hope to have a useful gc.log soon for comparison purposes. -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+AggressiveOpts ---- original reply ---- This is similar to the settings I've been working on that I've documented on my wiki page, with better results than you are seeing, and a larger heap than you have configured: https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector You have one additional option that I don't -- InitiatingHeapOccupancyPercent. I would suggest running without that option to see how it affects your GC times. I'm curious what OS you're running under, whether the OS and Java are 64-bit, and whether you have actually enabled huge pages in your operating system. If it's Linux and you have enabled huge pages, have you turned off transparent huge pages as documented by Oracle: https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge On my servers, I do *not* have huge pages configured in the operating system, so the UseLargePages java option isn't doing anything. One final thing ... Oracle developers have claimed that Java 8u40 has some major improvements to the G1 collector, particularly for programs that allocate very large objects. Can you try 8u40? 
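On Linux the active THP mode is the bracketed token in /sys/kernel/mm/transparent_hugepage/enabled (older RHEL kernels expose it under /sys/kernel/mm/redhat_transparent_hugepage instead). A small sketch that reads it; the parsing helper is illustrative:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ThpStatus {
    // The kernel reports something like "always madvise [never]";
    // the bracketed token is the mode currently in effect.
    static String activeMode(String sysfsLine) {
        int open = sysfsLine.indexOf('[');
        int close = sysfsLine.indexOf(']');
        if (open < 0 || close < open) {
            return "unknown";
        }
        return sysfsLine.substring(open + 1, close);
    }

    public static void main(String[] args) throws Exception {
        Path p = Paths.get("/sys/kernel/mm/transparent_hugepage/enabled");
        if (Files.exists(p)) {
            String line = new String(Files.readAllBytes(p)).trim();
            System.out.println("THP mode: " + activeMode(line));
        } else {
            System.out.println("THP sysfs entry not found on this system");
        }
    }
}
```

A reading of "never" (or "madvise") means the always-on THP behavior discussed here is already disabled.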
Thanks,
Shawn

From thomas.schatzl at oracle.com  Wed Mar 25 14:28:12 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 25 Mar 2015 15:28:12 +0100
Subject: Using G1 with Apache Solr
In-Reply-To: 
References: 
Message-ID: <1427293692.3163.34.camel@oracle.com>

Hi Kamran,

On Tue, 2015-03-24 at 17:48 -0400, Kamran Khawaja wrote:
> I'm running Solr 4.7.2 with Java 7u75 with the following JVM params:
> -verbose:gc
> -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails
> -XX:+PrintAdaptiveSizePolicy
> -XX:+PrintReferenceGC
> -Xmx3072m
> -Xms3072m
> -XX:+UseG1GC
> -XX:+UseLargePages
> -XX:+AggressiveOpts
> -XX:+ParallelRefProcEnabled
> -XX:G1HeapRegionSize=8m
> -XX:InitiatingHeapOccupancyPercent=35
>
> What I'm currently seeing is that many of the gc pauses are under an
> acceptable 0.25 seconds but seeing way too many full GCs with an
> average stop time of 3.2 seconds.
>
> You can find the gc logs
> here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0
>
> I initially tested without specifying the HeapRegionSize but that
> resulted in the "humongous" message in the gc logs and a ton of full
> gc pauses.
>
> Any pointers or areas to further investigate would be appreciated.

The problem seems to be a somewhat inconsistent survival rate in the
young gen. Most of the time, >5% of the young gen survives, while every
now and then >33% (or more) survives. Just before these full GCs the
heap already seems to be fairly full, and the existing mechanisms cannot
handle this.

There are a few things you could try:

- disable PLAB resizing (-XX:-ResizePLAB), as this may decrease the
amount of space that is actually required for copying.

- increase the evacuation reserve (-XX:G1ReservePercent=15; default is
10), whose purpose is exactly to provide a safety buffer for such cases.

- cap the maximum young generation size, so that even when a large part
of the young generation survives, this part is not that big. E.g.
G1MaxNewSizePercent=25 (which limits young gen size to 768M which seems okay to me; default is 60; you also need to set -XX: +UnlockExperimentalVMOptions in front of that) Thanks, Thomas From narmak101 at gmail.com Wed Mar 25 18:05:46 2015 From: narmak101 at gmail.com (Kamran Khawaja) Date: Wed, 25 Mar 2015 14:05:46 -0400 Subject: Using G1 with Apache Solr In-Reply-To: <55125A06.7080107@elyograg.org> References: <55125A06.7080107@elyograg.org> Message-ID: Solr is being run on a CentOS 7 server. Both the os and java are 64 bit. I see that THP is enabled on the server. I'll have to discuss with the rest of my team about disabling THP and upgrading to java 8 but I'll post back when I have some results from my testing. Thanks, -- Kamran Khawaja On Wed, Mar 25, 2015 at 2:47 AM, Shawn Heisey wrote: > On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > > > -verbose:gc > > -XX:+PrintGCDateStamps > > -XX:+PrintGCDetails > > -XX:+PrintAdaptiveSizePolicy > > -XX:+PrintReferenceGC > > -Xmx3072m > > -Xms3072m > > -XX:+UseG1GC > > -XX:+UseLargePages > > -XX:+AggressiveOpts > > -XX:+ParallelRefProcEnabled > > -XX:G1HeapRegionSize=8m > > -XX:InitiatingHeapOccupancyPercent=35 > > > > > > What I'm currently seeing is that many of the gc pauses are under an > > acceptable 0.25 seconds but seeing way too many full GCs with an average > > stop time of 3.2 seconds. > > > > You can find the gc logs > > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > > > I initially tested without specifying the HeapRegionSize but that > > resulted in the "humongous" message in the gc logs and a ton of full gc > > pauses. > > When I replied the first time, I only sent it to Kamran. I quickly > realized that I'd made that error, but I did not remember that the > original message was on this list, so I sent the reply again, assuming > that I saw the original on the solr-user mailing list. 
Now I am > bringing the silliness full-circle by sending the same reply here. > > Some additional info: > > When I initially brought my settings up on this list a few months ago, I > got the recommendation to try changing InitiatingHeapOccupancyPercent to > 70-75 from the default of 45 ... so setting it to 35 might not be the > best idea. I do currently have it set to 75 (not reflected on the > wiki), but I haven't done any further analysis. > > I have now upgraded java on those machines to 8u40 with the following > settings, I hope to have a useful gc.log soon for comparison purposes. > > -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 > -XX:+UseLargePages -XX:+AggressiveOpts > > > ---- original reply ---- > > This is similar to the settings I've been working on that I've > documented on my wiki page, with better results than you are seeing, and > a larger heap than you have configured: > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector > > You have one additional option that I don't -- > InitiatingHeapOccupancyPercent. I would suggest running without that > option to see how it affects your GC times. > > I'm curious what OS you're running under, whether the OS and Java are > 64-bit, and whether you have actually enabled huge pages in your > operating system. If it's Linux and you have enabled huge pages, have > you turned off transparent huge pages as documented by Oracle: > > > https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge > > On my servers, I do *not* have huge pages configured in the operating > system, so the UseLargePages java option isn't doing anything. > > One final thing ... Oracle developers have claimed that Java 8u40 has > some major improvements to the G1 collector, particularly for programs > that allocate very large objects. Can you try 8u40? 
> > Thanks, > Shawn > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlie.hunt at oracle.com Wed Mar 25 20:24:38 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Wed, 25 Mar 2015 15:24:38 -0500 Subject: Using G1 with Apache Solr In-Reply-To: References: <55125A06.7080107@elyograg.org> Message-ID: <85F3242A-B523-49FE-BAF2-1F710D9BDC94@oracle.com> If on Linux, most definitely disable THP (transparent huge pages). You will likely not have a good experience with any GC with THP enabled. charlie > On Mar 25, 2015, at 1:05 PM, Kamran Khawaja wrote: > > Solr is being run on a CentOS 7 server. Both the os and java are 64 bit. I see that THP is enabled on the server. > I'll have to discuss with the rest of my team about disabling THP and upgrading to java 8 but I'll post back when I have some results from my testing. > > > Thanks, > > -- > Kamran Khawaja > > > On Wed, Mar 25, 2015 at 2:47 AM, Shawn Heisey > wrote: > On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > > > -verbose:gc > > -XX:+PrintGCDateStamps > > -XX:+PrintGCDetails > > -XX:+PrintAdaptiveSizePolicy > > -XX:+PrintReferenceGC > > -Xmx3072m > > -Xms3072m > > -XX:+UseG1GC > > -XX:+UseLargePages > > -XX:+AggressiveOpts > > -XX:+ParallelRefProcEnabled > > -XX:G1HeapRegionSize=8m > > -XX:InitiatingHeapOccupancyPercent=35 > > > > > > What I'm currently seeing is that many of the gc pauses are under an > > acceptable 0.25 seconds but seeing way too many full GCs with an average > > stop time of 3.2 seconds. 
> > > > You can find the gc logs > > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > > > I initially tested without specifying the HeapRegionSize but that > > resulted in the "humongous" message in the gc logs and a ton of full gc > > pauses. > > When I replied the first time, I only sent it to Kamran. I quickly > realized that I'd made that error, but I did not remember that the > original message was on this list, so I sent the reply again, assuming > that I saw the original on the solr-user mailing list. Now I am > bringing the silliness full-circle by sending the same reply here. > > Some additional info: > > When I initially brought my settings up on this list a few months ago, I > got the recommendation to try changing InitiatingHeapOccupancyPercent to > 70-75 from the default of 45 ... so setting it to 35 might not be the > best idea. I do currently have it set to 75 (not reflected on the > wiki), but I haven't done any further analysis. > > I have now upgraded java on those machines to 8u40 with the following > settings, I hope to have a useful gc.log soon for comparison purposes. > > -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 > -XX:+UseLargePages -XX:+AggressiveOpts > > > ---- original reply ---- > > This is similar to the settings I've been working on that I've > documented on my wiki page, with better results than you are seeing, and > a larger heap than you have configured: > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector > > You have one additional option that I don't -- > InitiatingHeapOccupancyPercent. I would suggest running without that > option to see how it affects your GC times. > > I'm curious what OS you're running under, whether the OS and Java are > 64-bit, and whether you have actually enabled huge pages in your > operating system. 
If it's Linux and you have enabled huge pages, have
> you turned off transparent huge pages as documented by Oracle:
>
> https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge
>
> On my servers, I do *not* have huge pages configured in the operating
> system, so the UseLargePages java option isn't doing anything.
>
> One final thing ... Oracle developers have claimed that Java 8u40 has
> some major improvements to the G1 collector, particularly for programs
> that allocate very large objects. Can you try 8u40?
>
> Thanks,
> Shawn
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ejones at twitter.com  Thu Mar 26 15:04:06 2015
From: ejones at twitter.com (Evan Jones)
Date: Thu, 26 Mar 2015 11:04:06 -0400
Subject: GC / safepoint pauses consuming more real time than user plus system
Message-ID: 

I finally figured out the source of problematic garbage collection
pauses that take more real time than user plus system time, on our
systems that are otherwise unloaded: it turns out that writes to the
mmap-ed hsperfdata file can block when the system is under heavy disk
IO. Since safepoint and GC threads increment counters in this file, it
causes long safepoint and garbage collection pauses.

In case anyone ever observes pauses that look like this, you may want
to add the -XX:+PerfDisableSharedMem JVM flag and see if that resolves
them. It has worked for our services. See the following for more
detail: http://www.evanjones.ca/jvm-mmap-pause.html

Here is an example "suspicious" pause.
I was seeing many of these, across basically all of Twitter's services, which caused me to investigate the issue. 2014-12-10T12:38:44.419+0000: 58758.830: [GC (Allocation Failure)[ParNew: 11868438K->103534K(13212096K), 0.7651580 secs] 12506389K->741669K(17406400K), 0.7652510 secs] [Times: user=0.36 sys=0.01, real=0.77 secs] -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Fri Mar 27 15:30:30 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Fri, 27 Mar 2015 15:30:30 +0000 (UTC) Subject: G1 root cause and tuning Message-ID: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Hi , I saw your G1 presentation and I found it good and interesting. I am new to G1 tuning and I would need you suggestions if you ?have time. In our app, when we have a FULL GC :? ? 1. it restarts the application? ? 2. we cannot get the right data to understand the root cause We switched from CMS to G1 in order to avoid long FULL GCs. JRE 1.17 update 17 it is being used. GC params: wrapper.java.additional.1=-serverwrapper.java.additional.2=-XX:+PrintCommandLineFlagswrapper.java.additional.3=-XX:+UseG1GC wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffewrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffewrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryErrorwrapper.java.additional.11=-verbose:gcwrapper.java.additional.12=-XX:+PrintGCDetailswrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% wrapper.java.additional.52=-XX:+PrintGCTimeStampswrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTimewrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy The error from wrapper log is: INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ? ?[Eden: 692M(968M)->0B(972M) Survivors: 56M->52M Heap: 8127M(22480M)->7436M(22480M)]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?[Times: user=1.51 sys=0.02, real=0.19 secs]?INFO ? 
| jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | Total time for which application threads were stopped: 0.2031307 secondsINFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 189267984 bytes, attempted expansion amount: 192937984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | 93238.285: [Full GCERROR ?| wrapper ?| 2015/03/25 15:25:57.694 | JVM appears hung: Timed out waiting for signal from JVM.ERROR ?| wrapper ?| 2015/03/25 15:25:58.021 | JVM did not exit? INFO ? | jvm 1 ? 
| 2015/03/25 15:14:41.289 |  92696.335: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 10603200512 bytes, allocation request: 14584896 bytes, threshold: 10607394780 bytes (45.00 %), source: concurrent humongous allocation]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.337: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.337: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 | 92696.337: [GC pause (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, target pause time: 2500.00 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.338: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 114 regions, survivors: 8 regions, predicted young region time: 32.04 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.338: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 regions, survivors: 8 regions, old: 0 regions, predicted pause time: 197.80 ms, target pause time: 2500.00 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.398 |  (initial-mark), 0.15117107 secs]

We increased the wrapper timeout but still got no useful data about the FULL GC.

Any suggestion is highly appreciated. Currently I suggested adding "PrintHeapAtGCExtended".

Best Regards,
Gabi Medan

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: logs.zip Type: application/octet-stream Size: 1295341 bytes Desc: not available URL:

From yu.zhang at oracle.com Fri Mar 27 23:18:31 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Fri, 27 Mar 2015 16:18:31 -0700 Subject: G1 root cause and tuning In-Reply-To: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5515E547.2060901@oracle.com>

Medan,

I could not find the humongous allocation in the logs you attached, but from the snippet you provided it seems the humongous object allocations (the biggest is ~10g) might be the issue. If you can provide a cleaner gc log (with -Xloggc:gc.log) without the wrapper information, it would be easier to analyze.

Thanks,
Jenny

On 3/27/2015 8:30 AM, Medan Gavril wrote:
> [snip: original message quoted in full above]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From rmohta.coder at gmail.com Mon Mar 30 09:47:05 2015 From: rmohta.coder at gmail.com (Rohit Mohta) Date: Mon, 30 Mar 2015 10:47:05 +0100 Subject: Java 7 Default GC for server Message-ID:

Hi All,

I have a few questions about the default GC in server mode:

(a) Is the formula for calculating the number of GC threads 3 + (5 * cores/8)?

(b) In the GC logs, I can see the lines below printed even when the application is idle. Is this something to do with JIT or some other JVM internal operation?
2015-03-05T14:42:18.320+0000: 520807.126: Total time for which application threads were stopped: 0.0000500 seconds
2015-03-05T14:42:18.320+0000: 520807.126: Application time: 0.0000240 seconds
2015-03-05T14:42:18.320+0000: 520807.126: Total time for which application threads were stopped: 0.0000500 seconds
2015-03-05T14:42:58.405+0000: 520847.212: Application time: 40.0857170 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Total time for which application threads were stopped: 0.0001980 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Application time: 0.0000250 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Total time for which application threads were stopped: 0.0000520 seconds
2015-03-05T14:43:28.406+0000: 520877.213: Application time: 30.0001550 seconds

(c) We have about 15 JVMs on a single server. The Linux server has 24 cores and about 37GB of RAM. When we restart all the JVMs, they start with a good heap size, about 700MB+. And we have no issues with that. After a day or so, some of the processes drop down to less than 100MB of heap size and they start doing very frequent minor and major GC. We have a lot of unused memory on the server. Why won't GC cause expansion? I know we can set Xms to a minimum value, but we are curious to know why a few of them go from 750MB to 100MB, whereas some of them stay around 500MB. Does this have to do with SizeIncrement, SizeSupplement or AdaptiveSize values?

Thanks,
Rohit

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From rmohta.coder at gmail.com Mon Mar 30 09:54:37 2015 From: rmohta.coder at gmail.com (Rohit Mohta) Date: Mon, 30 Mar 2015 10:54:37 +0100 Subject: Java 7 Print Tenuring Distribution Message-ID:

Hi,

We are using JDK 7 in server mode. There is no explicit GC configuration, so it's using the default GC collector. We are trying to print the tenuring distribution in the logs, but it won't print.
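For reference, the flag needs the full -XX: prefix on the command line (-XX:+PrintTenuringDistribution). A small sketch (hypothetical class name, not code from the thread) that asks the running VM how it actually parsed the flag:

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class TenuringFlagCheck {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hs =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Prints the value the VM parsed for the flag; "false" unless the JVM
        // was started with -XX:+PrintTenuringDistribution (note the -XX: prefix).
        System.out.println(hs.getVMOption("PrintTenuringDistribution").getValue());
    }
}
```

Note also that, as far as I can tell, the default parallel collector with adaptive sizing prints only the desired-survivor-size/threshold lines rather than the full per-age histogram.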
We have tried +PrintTenuringDistribution and also -PrintTenuringDistribution, but neither works. Is this not configured to work with the default Parallel GC?

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From jon.masamitsu at oracle.com Mon Mar 30 20:59:51 2015 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 30 Mar 2015 13:59:51 -0700 Subject: Java 7 Default GC for server In-Reply-To: References: Message-ID: <5519B947.4020906@oracle.com>

On 03/30/2015 02:47 AM, Rohit Mohta wrote:
> Hi All,
>
> I have a few questions about the default GC in server mode:
>
> (a) Is the formula for calculating the number of GC threads
> 3 + (5 * cores/8)?

For N hardware threads: if N <= 8, GC threads = N; if N > 8, GC threads = 8 + (N - 8) * 5 / 8.

> (b) In the GC logs, I can see the lines below printed even when the
> application is idle. Is this something to do with JIT or some other
> JVM internal operation?

Yes, some other (than GC) JVM operation that requires a safepoint.

> [safepoint log lines snipped; quoted in full above]
>
> (c) We have about 15 JVMs on a single server. The Linux server has 24
> cores and about 37GB of RAM. When we restart all the JVMs, they start
> with a good heap size, about 700MB+.
> And we have no issues with that.
> After a day or so, some of the processes drop down to less than 100MB
> of heap size and they start doing very frequent minor and major GC. We
> have a lot of unused memory on the server.

I don't recall seeing that happen before. Maybe another on the list has an idea.

> Why won't GC cause expansion?
> I know we can set Xms to a minimum value, but we are curious to know
> why a few of them go from 750MB to 100MB, whereas some of them stay
> around 500MB. Does this have to do with SizeIncrement, SizeSupplement
> or AdaptiveSize values?

Try -XX:+PrintAdaptiveSizePolicy and see if that tells you why the heap is not growing.

Jon

> Thanks,
> Rohit
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From yu.zhang at oracle.com Tue Mar 31 00:19:54 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 30 Mar 2015 17:19:54 -0700 Subject: G1 root cause and tuning In-Reply-To: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5519E82A.5050808@oracle.com>

Medan,

Thanks for the logs. The log messages are somewhat mangled; some of the records are incomplete. There is one full GC in wrapper.log.14; the others have no full GC. This workload has a lot of humongous object allocations, up to ~10g in size. Though G1 can reclaim some humongous objects at young GC, a lot of the reclamation is done during full GC.

But this log snip does not make sense to me.
152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 115.49 ms, remaining time: 2384.51 ms, target pause time: 2500.00 ms]
152549.805: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 155 regions, survivors: 23 regions, predicted young region time: 71.62 ms]
152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted pause time: 187.12 ms, target pause time: 2500.00 ms]
, 0.12006414 secs]
   [Parallel Time: 93.1 ms]
      [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2]
      [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 15.3 17.2 14.8 0.1 17.0 17.2 11.4 Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5]
      [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 50.0 50.7 47.1 49.1 49.6 52.2 Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4]
         [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21]
      [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3]
      [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 20.0 21.6 18.2 21.0 20.2 23.5 Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3]
      [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0]
         [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50]
      [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 0.1]
      [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 87.3 87.2 87.1 87.1 87.1 87.1 Avg: 87.2, Min: 87.1, Max: 87.3,
Diff: 0.2]
      [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 27.8 6.0 6.1 6.1 Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9]
   [Clear CT: 0.2 ms]
   [Other: 26.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 24.7 ms]
      [Ref Enq: 0.1 ms]
      [Free CSet: 1.0 ms]
   [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: 7037M(22480M)->6476M(22480M)]
   [Times: user=1.34 sys=0.00, real=0.12 secs]
152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes]
152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes]
152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
Total time for which application threads were stopped: 0.1240664 seconds
152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes]
152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes]
152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 305776936 bytes]
152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 305776936 bytes, attempted expansion amount: 306184192 bytes]
152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
152549.958: [Full GC

Before allocating the humongous object, only ~6g out of the 22g heap is used, yet allocating a ~300m object caused a full GC? I do not have an explanation for this.

Thanks,
Jenny

On 3/27/2015 8:30 AM, Medan Gavril wrote:
> [snip: original message quoted in full above]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlie.hunt at oracle.com Tue Mar 31 01:41:03 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Mon, 30 Mar 2015 20:59:03 -0500 Subject: G1 root cause and tuning In-Reply-To: <5519E82A.5050808@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> Message-ID: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com>

Hi Jenny,

One possibility is that there are not enough available contiguous regions to satisfy a 300+ MB humongous allocation.

If we assume a 22 GB Java heap (a little larger than the 22480M shown in the log), with 2048 G1 regions (the default, as you know), the region size would be about 11 MB.
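As a back-of-the-envelope check, that arithmetic can be sketched directly (an illustrative sketch using the 22480M heap and the 305776936-byte request from the log, not code from the thread):

```java
public class HumongousRegionMath {
    public static void main(String[] args) {
        long heapBytes = 22480L * 1024 * 1024;   // 22480M heap from the log
        long regionBytes = heapBytes / 2048;     // ~2048 regions by default
        long requestBytes = 305776936L;          // humongous request from the log
        // Humongous objects need contiguous regions; ceiling-divide to count them.
        long regionsNeeded = (requestBytes + regionBytes - 1) / regionBytes;
        System.out.println(regionBytes);         // 11509760 bytes, ~11 MB
        System.out.println(regionsNeeded);       // 27 contiguous regions
    }
}
```

G1 actually rounds the region size to a power of two, which here would be 8 MB and would raise the count to about 37 contiguous regions.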
That implies there needs to be about 30 contiguous G1 regions available to satisfy the humongous allocation request. An unrelated question ? do other GCs have a similar pattern of a rather large percentage of time in Ref Proc relative to the overall pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that?s the case, then if -XX:+ParallelRefProcEnabled is not already set, there may be some low hanging tuning fruit. But, it is not going to address the frequent humongous allocation problem. It is also interesting in that the pause time goal is 2500 ms, yet the actual pause time is 120 ms, and eden is being sized at less than 1 GB out of a 22 GB Java heap. Are the frequent humongous allocations messing with the heap sizing heuristics? hths, charlie > On Mar 30, 2015, at 7:19 PM, Yu Zhang wrote: > > Medan, > > Thanks for the logs. The log messages are somewhat mangled, some of the records are not complete. > There is 1 Full gc in wrapper.log.14. Others do not have full gc. This workload has a lot of humongous objects allocations, up to 10g size. Though g1 can reclaim some humongous objects at young gc, a lot of reclamations are done during full gc. > > But this log snip does not make sense to me. 
> > 152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 115.49 ms, remaining time: 2384.51 ms, target pause time: 2500.00 ms] > 152549.805: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 155 regions, survivors: 23 regions, predicted young region time: 71.62 ms] > 152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted pause time: 187.12 ms, target pause time: 2500.00 ms] > , 0.12006414 secs] > [Parallel Time: 93.1 ms] > [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 > Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2] > [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 15.3 17.2 14.8 0.1 17.0 17.2 11.4 > Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5] > [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 50.0 50.7 47.1 49.1 49.6 52.2 > Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4] > [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 > Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21] > [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 > Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3] > [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 20.0 21.6 18.2 21.0 20.2 23.5 > Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3] > [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 > Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] > [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 > Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50] > [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 > Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 0.1] > [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 87.3 87.2 87.1 
87.1 87.1 87.1 > Avg: 87.2, Min: 87.1, Max: 87.3, Diff: 0.2] > [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 27.8 6.0 6.1 6.1 > Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9] > [Clear CT: 0.2 ms] > [Other: 26.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 24.7 ms] > [Ref Enq: 0.1 ms] > [Free CSet: 1.0 ms] > [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: 7037M(22480M)->6476M(22480M)] > [Times: user=1.34 sys=0.00, real=0.12 secs] > 152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes] > 152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes] > 152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > Total time for which application threads were stopped: 0.1240664 seconds > 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 305776936 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 305776936 bytes, attempted expansion amount: 306184192 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > 152549.958: [Full GC > > Before allocating humongous object, 6 out of 22g heap is used, but allocation 300m object caused a full gc? I do not have an explanation for this. 
> > Thanks, > Jenny > On 3/27/2015 8:30 AM, Medan Gavril wrote: >> Hi , >> >> I saw your G1 presentation and I found it good and interesting. I am new to G1 tuning and I would need you suggestions if you have time. >> >> In our app, when we have a FULL GC : >> 1. it restarts the application >> 2. we cannot get the right data to understand the root cause >> >> We switched from CMS to G1 in order to avoid long FULL GCs. >> >> JRE 1.17 update 17 it is being used. >> >> GC params: >> >> wrapper.java.additional.1=-server >> wrapper.java.additional.2=-XX:+PrintCommandLineFlags >> wrapper.java.additional.3=-XX:+UseG1GC >> wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 >> wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffe >> wrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffe >> wrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryError >> wrapper.java.additional.11=-verbose:gc >> wrapper.java.additional.12=-XX:+PrintGCDetails >> wrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% >> >> wrapper.java.additional.52=-XX:+PrintGCTimeStamps >> wrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTime >> wrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy >> >> The error from wrapper log is: >> >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Eden: 692M(968M)->0B(972M) Survivors: 56M->52M Heap: 8127M(22480M)->7436M(22480M)] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Times: user=1.51 sys=0.02, real=0.19 secs] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion 
operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | Total time for which application threads were stopped: 0.2031307 seconds >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 189267984 bytes, attempted expansion amount: 192937984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [Full GC >> ERROR | wrapper | 2015/03/25 15:25:57.694 | JVM appears hung: Timed out waiting for signal from JVM. 
>> ERROR | wrapper | 2015/03/25 15:25:58.021 | JVM did not exit >> >> >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.335: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 10603200512 bytes, allocation request: 14584896 bytes, threshold: 10607394780 bytes (45.00 %), source: concurrent humongous allocation] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [GC pause (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, target pause time: 2500.00 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 114 regions, survivors: 8 regions, predicted young region time: 32.04 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 regions, survivors: 8 regions, old: 0 regions, predicted pause time: 197.80 ms, target pause time: 2500.00 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.398 | (initial-mark), 0.15117107 secs] >> >> We increased the wrapper timeout but still no useful data about the FULL GC. >> >> Any suggestion is highly appreciated. 
Currently I suggested to add "PrintHeapAtGCExtended " >> >> Best Regards, >> Gabi Medan >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Tue Mar 31 02:23:12 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 30 Mar 2015 19:23:12 -0700 Subject: G1 root cause and tuning In-Reply-To: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> Message-ID: <551A0510.4080604@oracle.com> Charlie, Thanks for the comments. please see my response inline. Thanks, Jenny On 3/30/2015 6:41 PM, charlie hunt wrote: > Hi Jenny, > > One possibility is that there is not enough available contiguous > regions to satisfy a 300+ MB humongous allocation. > > If we assume a 22 GB Java heap, (a little larger than the 22480M shown > in the log), with 2048 G1 regions (default as you know), the region > size would be about 11 MB. That implies there needs to be about 30 > contiguous G1 regions available to satisfy the humongous allocation > request. Good point! > > An unrelated question - do other GCs have a similar pattern of a > rather large percentage of time in Ref Proc relative to the overall > pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's > the case, then if -XX:+ParallelRefProcEnabled is not already set, > there may be some low hanging tuning fruit. But, it is not going to > address the frequent humongous allocation problem.
It is also > interesting in that the pause time goal is 2500 ms, yet the actual > pause time is 120 ms, and eden is being sized at less than 1 GB out of > a 22 GB Java heap. Are the frequent humongous allocations messing > with the heap sizing heuristics? Most of the time, the RefProc is below 10ms, but jumps to 20-60ms, so it might help with enabling parallelrefproc. I do not remember in jdk7, if it is on by default or not. This log is really strange, as most of the time, the heap usage is ~9g out of 22g, then the humongous allocations jumps in. As the log entries are mangled, it is hard to connect the dots. > > hths, > > charlie > >> On Mar 30, 2015, at 7:19 PM, Yu Zhang > > wrote: >> >> Medan, >> >> Thanks for the logs. The log messages are somewhat mangled, some of >> the records are not complete. >> There is 1 Full gc in wrapper.log.14. Others do not have full gc. >> This workload has a lot of humongous objects allocations, up to 10g >> size. Though g1 can reclaim some humongous objects at young gc, a lot >> of reclamations are done during full gc. >> >> But this log snip does not make sense to me. 
>> >> 152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet >> Construction) start choosing CSet, predicted base time: 115.49 ms, >> remaining time: 2384.51 ms, target pause time: 2500.00 ms] >> 152549.805: [G1Ergonomics (CSet Construction) add young regions to >> CSet, eden: 155 regions, survivors: 23 regions, predicted young >> region time: 71.62 ms] >> 152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, >> eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted >> pause time: 187.12 ms, target pause time: 2500.00 ms] >> , 0.12006414 secs] >> [Parallel Time: 93.1 ms] >> [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 >> 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 >> 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 >> Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2] >> [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 >> 15.3 17.2 14.8 0.1 17.0 17.2 11.4 >> Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5] >> [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 >> 50.0 50.7 47.1 49.1 49.6 52.2 >> * Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4] >> [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 >> Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21] >> [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 >> 0.0 0.0 0.0 0.0 >> Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3] >> [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 >> 20.0 21.6 18.2 21.0 20.2 23.5 >> Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3] >> [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 >> 0.0 0.0 0.0 0.0 >> Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] >> [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 >> Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50] >> [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 >> 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 >> 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 >> Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 
0.1] >> [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 >> 87.3 87.2 87.1 87.1 87.1 87.1 >> Avg: 87.2, Min: 87.1, Max: 87.3, Diff: 0.2] >> [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 >> 5.9 27.8 6.0 6.1 6.1 >> Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9] >> [Clear CT: 0.2 ms] >> [Other: 26.7 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 24.7 ms] >> [Ref Enq: 0.1 ms] >> [Free CSet: 1.0 ms] >> [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: >> 7037M(22480M)->6476M(22480M)] >> [Times: user=1.34 sys=0.00, real=0.12 secs] >> 152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: humongous allocation request failed, allocation request: >> 305776936 bytes] >> 152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 260046848 bytes, attempted expansion amount: >> 260046848 bytes] >> 152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> Total time for which application threads were stopped: 0.1240664 seconds >> 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: humongous allocation request failed, allocation request: >> 305776936 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 260046848 bytes, attempted expansion amount: >> 260046848 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: allocation request failed, allocation request: 305776936 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 305776936 bytes, attempted expansion amount: >> 306184192 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> 152549.958: [Full GC >> >> Before allocating humongous object, 6 out of 22g heap is used, but >> allocation 300m object 
caused a full gc? I do not have an >> explanation for this. >> >> * >> Thanks, >> Jenny >> On 3/27/2015 8:30 AM, Medan Gavril wrote: >>> Hi , >>> >>> I saw your G1 presentation and I found it good and interesting. I am >>> new to G1 tuning and I would need you suggestions if you have time. >>> >>> In our app, when we have a FULL GC : >>> 1. it restarts the application >>> 2. we cannot get the right data to understand the root cause >>> >>> We switched from CMS to G1 in order to avoid long FULL GCs. >>> >>> JRE 1.17 update 17 it is being used. >>> >>> GC params: >>> >>> wrapper.java.additional.1=-server >>> wrapper.java.additional.2=-XX:+PrintCommandLineFlags >>> wrapper.java.additional.3=-XX:+UseG1GC >>> wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 >>> wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffe >>> wrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffe >>> wrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryError >>> wrapper.java.additional.11=-verbose:gc >>> wrapper.java.additional.12=-XX:+PrintGCDetails >>> wrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% >>> >>> wrapper.java.additional.52=-XX:+PrintGCTimeStamps >>> wrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTime >>> wrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy >>> >>> The error from wrapper log is: >>> >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Eden: >>> 692M(968M)->0B(972M) Survivors: 56M->52M Heap: >>> 8127M(22480M)->7436M(22480M)]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Times: user=1.51 >>> sys=0.02, real=0.19 secs] / >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> humongous allocation request failed, allocation request: 189267984 >>> bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 188743680 bytes, attempted expansion amount: 
188743680 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | Total time for which >>> application threads were stopped: 0.2031307 seconds/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> humongous allocation request failed, allocation request: 189267984 >>> bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> allocation request failed, allocation request: 189267984 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 189267984 bytes, attempted expansion amount: 192937984 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> */INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [Full GC/* >>> /ERROR | wrapper | 2015/03/25 15:25:57.694 | JVM appears hung: >>> *Timed out waiting for signal from JVM.*/ >>> /ERROR | wrapper | 2015/03/25 15:25:58.021 | JVM did not exit / >>> >>> >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.335: >>> [G1Ergonomics (Concurrent Cycles) request concurrent cycle >>> initiation, reason: occupancy higher than threshold, occupancy: >>> 10603200512 bytes, allocation request: 14584896 bytes, threshold: >>> 10607394780 bytes (45.00 %), source: concurrent humongous 
allocation]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: >>> [G1Ergonomics (Concurrent Cycles) request concurrent cycle >>> initiation, reason: requested by GC cause, GC cause: G1 Humongous >>> Allocation]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: >>> [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: >>> concurrent cycle initiation requested]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [GC pause >>> (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing >>> CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, >>> target pause time: 2500.00 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: >>> [G1Ergonomics (CSet Construction) add young regions to CSet, eden: >>> 114 regions, survivors: 8 regions, predicted young region time: >>> 32.04 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: >>> [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 >>> regions, survivors: 8 regions, old: 0 regions, predicted pause time: >>> 197.80 ms, target pause time: 2500.00 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.398 | (initial-mark), >>> 0.15117107 secs]/ >>> / >>> / >>> /We increased the wrapper timeout but still no useful data about the >>> FULL GC./ >>> / >>> / >>> /Any suggestion is highly appreciated. Currently I suggested to add >>> "PrintHeapAtGCExtended "/ >>> / >>> / >>> /Best Regards, >>> Gabi Medan/ >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
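For reference, the region arithmetic discussed in this thread (a humongous object needs contiguous regions, so larger regions mean fewer contiguous regions to find) can be sketched in a few lines. The helper below is illustrative only, not a JVM API; the 305776936-byte allocation request is taken from the log above, and the half-region humongous threshold and ceiling division follow G1's documented behavior:

```java
// Illustrative sketch (not a JVM API): how many contiguous G1 regions a
// humongous allocation occupies for a given region size. In G1, an object
// is "humongous" when it is at least half a region, and it is laid out in
// a run of contiguous regions.
public class G1RegionMath {
    // Regions needed to hold an allocation of 'bytes' with 'regionSize'-byte regions.
    static long regionsNeeded(long bytes, long regionSize) {
        return (bytes + regionSize - 1) / regionSize; // ceiling division
    }

    // Is an allocation humongous for the given region size?
    static boolean isHumongous(long bytes, long regionSize) {
        return bytes >= regionSize / 2;
    }

    public static void main(String[] args) {
        long request = 305_776_936L; // failed allocation request from the log above
        for (long regionSize : new long[]{4L << 20, 8L << 20, 16L << 20, 32L << 20}) {
            System.out.println((regionSize >> 20) + "M regions: humongous="
                    + isHumongous(request, regionSize)
                    + ", contiguous regions needed=" + regionsNeeded(request, regionSize));
        }
    }
}
```

With 4M regions the 300+ MB request needs 73 contiguous free regions, versus only 10 with 32M regions, which is consistent with the advice in this thread to raise G1HeapRegionSize. Note that G1 rounds the region size to a power of two between 1M and 32M, so the "about 11 MB" estimate would in practice be 8M or 16M.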
URL: From thomas.schatzl at oracle.com Tue Mar 31 11:30:55 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 31 Mar 2015 13:30:55 +0200 Subject: G1 root cause and tuning In-Reply-To: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> Message-ID: <1427801455.3432.78.camel@oracle.com> Hi all, On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: > Hi Jenny, > > One possibility is that there is not enough available contiguous > regions to satisfy a 300+ MB humongous allocation. > > If we assume a 22 GB Java heap, (a little larger than the 22480M shown > in the log), with 2048 G1 regions (default as you know), the region > size would be about 11 MB. That implies there needs to be about 30 > contiguous G1 regions available to satisfy the humongous allocation > request. > > An unrelated question - do other GCs have a similar pattern of a > rather large percentage of time in Ref Proc relative to the overall > pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's > the case, then if -XX:+ParallelRefProcEnabled is not already set, > there may be some low hanging tuning fruit. But, it is not going to > address the frequent humongous allocation problem. It is also > interesting in that the pause time goal is 2500 ms, yet the actual > pause time is 120 ms, and eden is being sized at less than 1 GB out of > a 22 GB Java heap. Are the frequent humongous allocations messing > with the heap sizing heuristics? While I have no solution for the problem, we are aware of these problems: - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically enabling MT reference processing - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC instead of Full GC to clear out space for failing humongous object allocations. I am not sure about what jdk release "JRE 1.17 update 17" actually is.
From the given strings in the PrintGCDetails output, it seems to be something quite old, I would guess jdk6? In that case, if possible I would recommend trying a newer version that improves humongous object handling significantly (e.g. 8u40 is latest official). Another option that works in all versions I am aware of is increasing heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; it seems that 4M region size has been chosen by ergonomics. Start with the smaller of the suggested values. Thanks, Thomas From charlie.hunt at oracle.com Tue Mar 31 12:35:15 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Tue, 31 Mar 2015 07:35:15 -0500 Subject: G1 root cause and tuning In-Reply-To: <1427801455.3432.78.camel@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> <1427801455.3432.78.camel@oracle.com> Message-ID: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied. hths, charlie > On Mar 31, 2015, at 6:30 AM, Thomas Schatzl wrote: > > Hi all, > > On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: >> Hi Jenny, >> >> One possibility is that there is not enough available contiguous >> regions to satisfy a 300+ MB humongous allocation. >> >> If we assume a 22 GB Java heap, (a little larger than the 22480M shown >> in the log), with 2048 G1 regions (default as you know), the region >> size would be about 11 MB. That implies there needs to be about 30 >> contiguous G1 regions available to satisfy the humongous allocation >> request. >> >> An unrelated question - do other GCs have a similar pattern of a >> rather large percentage of time in Ref Proc relative to the overall >> pause time, i.e.
24.7 ms / 120 ms ~ 20% of the pause time. If that's >> the case, then if -XX:+ParallelRefProcEnabled is not already set, >> there may be some low hanging tuning fruit. But, it is not going to >> address the frequent humongous allocation problem. It is also >> interesting in that the pause time goal is 2500 ms, yet the actual >> pause time is 120 ms, and eden is being sized at less than 1 GB out of >> a 22 GB Java heap. Are the frequent humongous allocations messing >> with the heap sizing heuristics? > > While I have no solution for the problem, we are aware of these problems: > > - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically > enabling MT reference processing > > - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC > instead of Full GC to clear out space for failing humongous object > allocations. > > I am not sure about what jdk release "JRE 1.17 update 17" actually is. > From the given strings in the PrintGCDetails output, it seems to be > something quite old, I would guess jdk6? > > In that case, if possible I would recommend trying a newer version that > improves humongous object handling significantly (e.g. 8u40 is latest > official). > > Another option that works in all versions I am aware of is increasing > heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; > it seems that 4M region size has been chosen by ergonomics. > Start with the smaller of the suggested values. > > Thanks, > Thomas -------------- next part -------------- An HTML attachment was scrubbed...
URL: From charlie.hunt at oracle.com Tue Mar 31 12:52:13 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Tue, 31 Mar 2015 07:52:13 -0500 Subject: G1 root cause and tuning In-Reply-To: <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> References: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> Message-ID: <3A50F3D7-6C40-4D3E-B8D9-4822F111DB8A@oracle.com> Just as a clarification, the -XX:+ParallelRefProcEnabled will help reduce the time spent in reference processing. It will not help address the issue of seeing Full GCs as a result of frequent humongous object allocations, or a humongous allocation where there are not sufficient contiguous regions available to satisfy the humongous allocation request. Thomas's suggestion to increase the region size may help with the Full GCs as a result of humongous object allocations. thanks, charlie > On Mar 31, 2015, at 7:42 AM, Medan Gavril wrote: > > Hi Charlie, > > Currently we can only go to java 7 update 7x (latest). > > We will try the following changes: > 1. -XX:G1HeapRegionSize=8 (then increase) > 2. -XX:+ParallelRefProcEnabled > > Please let me know if you have any other suggestion. > > Best Regards, > Gabi Medan > > > On Tuesday, March 31, 2015 3:35 PM, charlie hunt wrote: > > > To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied. > > hths, > > charlie > >> On Mar 31, 2015, at 6:30 AM, Thomas Schatzl > wrote: >> >> Hi all, >> >> On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: >>> Hi Jenny, >>> >>> One possibility is that there is not enough available contiguous >>> regions to satisfy a 300+ MB humongous allocation.
>>> >>> If we assume a 22 GB Java heap, (a little larger than the 22480M shown >>> in the log), with 2048 G1 regions (default as you know), the region >>> size would be about 11 MB. That implies there needs to be about 30 >>> contiguous G1 regions available to satisfy the humongous allocation >>> request. >>> >>> An unrelated question - do other GCs have a similar pattern of a >>> rather large percentage of time in Ref Proc relative to the overall >>> pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's >>> the case, then if -XX:+ParallelRefProcEnabled is not already set, >>> there may be some low hanging tuning fruit. But, it is not going to >>> address the frequent humongous allocation problem. It is also >>> interesting in that the pause time goal is 2500 ms, yet the actual >>> pause time is 120 ms, and eden is being sized at less than 1 GB out of >>> a 22 GB Java heap. Are the frequent humongous allocations messing >>> with the heap sizing heuristics? >> >> While I have no solution for the problem, we are aware of these problems: >> >> - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically >> enabling MT reference processing >> >> - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC >> instead of Full GC to clear out space for failing humongous object >> allocations. >> >> I am not sure about what jdk release "JRE 1.17 update 17" actually is. >> From the given strings in the PrintGCDetails output, it seems to be >> something quite old, I would guess jdk6? >> >> In that case, if possible I would recommend trying a newer version that >> improves humongous object handling significantly (e.g. 8u40 is latest >> official). >> >> Another option that works in all versions I am aware of is increasing >> heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; >> it seems that 4M region size has been chosen by ergonomics. >> Start with the smaller of the suggested values.
>> >> Thanks, >> Thomas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Tue Mar 31 05:47:35 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Mon, 30 Mar 2015 22:47:35 -0700 Subject: G1 root cause and tuning In-Reply-To: <551A0510.4080604@oracle.com> Message-ID: <1427780855.80050.YahooMailAndroidMobile@web161701.mail.bf1.yahoo.com> Hi guys, Thanks a lot for your comments. So what should be the next step? Enable -XX:+ParallelRefProcEnabled? Any other suggestions? About the logs... were they parsed ok? Do you want to add them to gc.log? However I do not want it to be overwritten at restart. Best regards, Gabi Medan Sent from Yahoo Mail on Android From:"Yu Zhang" Date:Tue, Mar 31, 2015 at 5:23 am Subject:Re: G1 root cause and tuning Charlie, Thanks for the comments. please see my response inline. Thanks, Jenny On 3/30/2015 6:41 PM, charlie hunt wrote: -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Tue Mar 31 12:42:56 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Tue, 31 Mar 2015 12:42:56 +0000 (UTC) Subject: G1 root cause and tuning In-Reply-To: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> References: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> Message-ID: <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> Hi Charlie, Currently we can only go to java 7 update 7x (latest). We will try the following changes: 1. -XX:G1HeapRegionSize=8 (then increase) 2. -XX:+ParallelRefProcEnabled Please let me know if you have any other suggestion. Best Regards, Gabi Medan On Tuesday, March 31, 2015 3:35 PM, charlie hunt wrote: To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied.
hths, charlie On Mar 31, 2015, at 6:30 AM, Thomas Schatzl wrote: Hi all, On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: Hi Jenny, One possibility is that there is not enough available contiguous regions to satisfy a 300+ MB humongous allocation. If we assume a 22 GB Java heap, (a little larger than the 22480M shown in the log), with 2048 G1 regions (default as you know), the region size would be about 11 MB. That implies there needs to be about 30 contiguous G1 regions available to satisfy the humongous allocation request. An unrelated question - do other GCs have a similar pattern of a rather large percentage of time in Ref Proc relative to the overall pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's the case, then if -XX:+ParallelRefProcEnabled is not already set, there may be some low hanging tuning fruit. But, it is not going to address the frequent humongous allocation problem. It is also interesting in that the pause time goal is 2500 ms, yet the actual pause time is 120 ms, and eden is being sized at less than 1 GB out of a 22 GB Java heap. Are the frequent humongous allocations messing with the heap sizing heuristics? While I have no solution for the problem, we are aware of these problems: - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically enabling MT reference processing - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC instead of Full GC to clear out space for failing humongous object allocations. I am not sure about what jdk release "JRE 1.17 update 17" actually is. From the given strings in the PrintGCDetails output, it seems to be something quite old, I would guess jdk6? In that case, if possible I would recommend trying a newer version that improves humongous object handling significantly (e.g. 8u40 is latest official).
Another option that works in all versions I am aware of is increasing heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; it seems that 4M region size has been chosen by ergonomics. Start with the smaller of the suggested values. Thanks, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL:
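As a footnote to the suggestion of breaking up the 300+ MB allocation on the application side: one common shape for this is a chunked buffer, where many small arrays stand in for one giant one, so no single allocation crosses the humongous threshold (half a region). The class below is a hypothetical sketch of that idea, not code from the application under discussion; the 1 MB chunk size is an assumption chosen to stay below half of even a 4M region:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: replace one 300+ MB byte[] (a humongous allocation
// needing many contiguous G1 regions) with a list of small chunks, each an
// ordinary young-gen allocation well below the humongous threshold.
public class ChunkedBuffer {
    static final int CHUNK_SIZE = 1 << 20; // 1 MB, below half of a 4 MB region

    final List<byte[]> chunks = new ArrayList<>();
    final long size;

    ChunkedBuffer(long totalBytes) {
        this.size = totalBytes;
        long remaining = totalBytes;
        while (remaining > 0) {
            int n = (int) Math.min(CHUNK_SIZE, remaining);
            chunks.add(new byte[n]); // each chunk is a small, non-humongous allocation
            remaining -= n;
        }
    }

    byte get(long index) {
        return chunks.get((int) (index / CHUNK_SIZE))[(int) (index % CHUNK_SIZE)];
    }

    void set(long index, byte value) {
        chunks.get((int) (index / CHUNK_SIZE))[(int) (index % CHUNK_SIZE)] = value;
    }

    public static void main(String[] args) {
        // ~5 MB logical buffer backed by 6 small chunks instead of one array.
        ChunkedBuffer buf = new ChunkedBuffer(5L * CHUNK_SIZE + 123);
        buf.set(buf.size - 1, (byte) 42);
        System.out.println(buf.chunks.size() + " chunks, last byte = " + buf.get(buf.size - 1));
    }
}
```

The trade-off is an extra indirection per access; whether that is acceptable depends on the application, which is why this is an alternative to, not a replacement for, the region-size tuning suggested above.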