Fwd: Bug in G1GC it performs Full GC when code cache is full resulting in overkill

Charlie Hunt chunt at salesforce.com
Wed May 29 09:31:28 PDT 2013


forgot to cc hotspot-gc-use ... will try to remember on future replies.

Begin forwarded message:

> From: Charlie Hunt <chunt at salesforce.com>
> Subject: Re: Bug in G1GC it performs Full GC when code cache is full resulting in overkill
> Date: May 29, 2013 11:28:35 AM CDT
> To: Martin Makundi <martin.makundi at koodaripalvelut.com>
> 
> Couple comments below.
> 
> charlie ...
> 
> On May 29, 2013, at 10:49 AM, Martin Makundi wrote:
> 
>> Hi!
>> 
>>> A bit of constructive criticism ;-)  It would be good practice to set one option at a time and measure its effect to determine whether it improves performance, rather than choosing an option because of something you read somewhere.  In short, always measure, and reason about whether what you've observed as an improvement or regression makes sense.  And also run multiple times to get a sense of noise versus real improvement or regression.
>> 
>> 
>> Thanks. That's one of the reasons we never changed our options. Once
>> we found something that works very well, we know that it's always n!
>> work to test changes, and the system was running very nicely indeed
>> before the previous tweak ;)
>> 
>>>>> -XX:+UseFastAccessorMethod  (the default is disabled)
>>>> 
>>>> Fast sounds good, the description of it is "Use optimized versions of
>>>> Get<Primitive>Field" which sounds good. I see no harm in this.
>>> 
>>> These would be JNI operations.
>>> 
>>> A quick look at the HotSpot source suggests UseFastAccessorMethods
>>> is mostly confined to interpreter operations.
>> 
>> Thanks for the info. It doesn't say much to me, but it does not seem to
>> harm anything. Will try switching it off at some point.
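> 
> For illustration, the kind of method this flag targets is a trivial accessor whose bytecode does nothing but load a field; a minimal sketch (the class is hypothetical):
> 
>     // Hypothetical example: a getter compiled to just
>     // aload_0; getfield; ireturn -- the shape the interpreter can
>     // serve with a fast accessor entry when the flag is enabled.
>     public class Point {
>         private int x;
>         public int getX() { return x; }
>     }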
>> 
>>>>> -XX:+UseNUMA  (Are you running a JVM that spans NUMA memory nodes?
>>>>> Or, do you have multiple JVMs running on a NUMA or non-NUMA system?)
>>>> 
>>> 
>>> Key point is that you should use -XX:+UseNUMA only when you are
>>> deploying a JVM that spans NUMA nodes.
>> 
>> Thanks for the info. It doesn't say much to me, but it does not seem to
>> harm anything. Will try switching it off at some point.
> 
> You either have a NUMA system and are deploying a single JVM on it, or you're on a non-NUMA system.
> 
> Come to think of it, you're on a Linux system. I don't recall the exact details of how the NUMA allocator works on Linux.  I was first thinking of it in terms of how it's handled on Solaris with Solaris lgroups.  I won't go into that. :-)  On Linux it may just do round robin ... I would have to go look again at the HotSpot source to see what it does on Linux.  Depending on what it does, you may not see any difference with it.  However, that could change in the future and you could be caught off guard by such a change. ;-)
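> 
> If you do want to experiment with it, a quick sanity check along these lines (commands shown for illustration) tells you whether the box even has multiple NUMA nodes.  Also, as far as I recall, -XX:+UseNUMA mainly has an effect with the throughput collector (-XX:+UseParallelGC), so under G1 it may well be a no-op:
> 
>     $ numactl --hardware     # lists the NUMA nodes the OS sees
>     $ java -XX:+UseNUMA -XX:+UseParallelGC -Xms26g -Xmx26g ...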
> 
>> 
>>>>> -XX:+UseStringCache (Do you have evidence that this helps?  And, do you know what it does?)
>>>> 
>>>> I assume it is some sort of string interning solution. Don't know
>>>> exactly what it does, but our application uses high amount of
>>>> redundant strings, smaller memory footprint is a good idea. Again,
>>>> very little documentation about this available but seems
>>>> straightforward. Haven't benchmarked it personally.
>>> 
>>> I won't go into the details of what it does.  I don't think I can say what it does without possibly being at risk of violating a binding separation agreement.
>>> 
>>> I'll just say that you should measure the perf difference with it off versus on if you think it might help.
>> 
>> No visible impact on performance in production, really. If we test it
>> with a bogus test case, well... the results are only as informative as
>> our tests are close to production. Which is unlikely.
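> 
> Fwiw, if the goal is just collapsing redundant strings, you don't need an undocumented flag; interning them yourself is portable.  A minimal sketch (the helper class is hypothetical, not part of any JDK API):
> 
>     import java.util.concurrent.ConcurrentHashMap;
> 
>     // Hypothetical helper: canonicalize frequently repeated strings so
>     // duplicates share a single instance, shrinking the live footprint.
>     final class StringPool {
>         private static final ConcurrentHashMap<String, String> POOL =
>                 new ConcurrentHashMap<>();
>         static String canonical(String s) {
>             String prev = POOL.putIfAbsent(s, s);
>             return prev != null ? prev : s;
>         }
>     }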
>> 
>>>>> -XX:CMSInitiatingOccupancyFraction=70 (This is applicable to CMS GC, and not applicable to G1 GC)
>>>> 
>>>> Again, it's not thoroughly documented where it applies and where it
>>>> doesn't; the JVM gave no warning/error about it, so we assumed it was valid.
>>> 
>>> There's always the HotSpot source code ;-)
>>> 
>>> It's also quite well documented in various slide ware on the internet.
>>> It's also quite well documented in the Java Performance book. :-)
>> 
>> Uh... does it say somewhere "Do not use
>> -XX:CMSInitiatingOccupancyFraction with G1GC"? ;) I know performance
>> tuning is your bread and butter, but it is not ours... it's more like we
>> are just driving the car and you are the mechanic... different
>> perspective... just trying to fill 'er up to go... leaded or unleaded...
>> ;)
> 
> Well, uh, the command line option says "CMS" in it.  Isn't that enough to imply that it's CMS specific?  Additionally, if the description says, "The percent of old generation space occupancy at which the first CMS garbage collection cycle should start. Subsequent starts of the CMS cycle are determined at a HotSpot ergonomically computed occupancy.", isn't that enough to imply it's CMS GC specific?  That description comes directly from Java Performance.
> 
> To use your analogy, would you put diesel fuel in your gasoline powered vehicle when the label at the pump says "diesel fuel"?
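> 
> To spell it out: -XX:CMSInitiatingOccupancyFraction only pairs with CMS, and G1 has its own analogue.  Roughly (values illustrative; 45 is G1's default):
> 
>     $ java -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 ...
>     $ java -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=45 ...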
> 
>> 
>>>>> -XX:InitiatingHeapOccupancyPercent=0  (You realize this will force G1's concurrent cycle to run continuously?)
>>>> 
>>>> Yes, that's what we figured: we don't want it to sit idle and end
>>>> up in a situation where it is required to do a Full GC. This switch
>>>> was specifically chosen in a situation where we had a memory leak and we
>>>> tried to aggressively fight it before we found the root cause. Maybe
>>>> we should try without this switch now and see what effect it has.
>>> 
>>> Having GC logs to see how much head room you have between the
>>> initiation of a G1 concurrent cycle and the available regions / heap space would be most appropriate.
>> 
>> Hmm... I don't thoroughly understand the logs either, but here is a snapshot:
>> 
>> 2013-05-29T17:28:56.119+0300: 38905.407: [GC pause (young), 0.29261000 secs]
>>  [Parallel Time: 288.8 ms]
>>     [GC Worker Start (ms):  38905407.6  38905407.7  38905407.7  38905407.9
>>      Avg: 38905407.7, Min: 38905407.6, Max: 38905407.9, Diff:   0.3]
>>     [Ext Root Scanning (ms):  22.8  16.3  18.1  22.0
>>      Avg:  19.8, Min:  16.3, Max:  22.8, Diff:   6.6]
>>     [SATB Filtering (ms):  0.0  0.1  0.0  0.0
>>      Avg:   0.0, Min:   0.0, Max:   0.1, Diff:   0.1]
>>     [Update RS (ms):  31.9  37.3  35.1  33.3
>>      Avg:  34.4, Min:  31.9, Max:  37.3, Diff:   5.5]
>>        [Processed Buffers : 102 106 119 104
>>         Sum: 431, Avg: 107, Min: 102, Max: 119, Diff: 17]
>>     [Scan RS (ms):  0.0  0.0  0.1  0.0
>>      Avg:   0.0, Min:   0.0, Max:   0.1, Diff:   0.1]
>>     [Object Copy (ms):  228.2  229.1  229.5  227.3
>>      Avg: 228.5, Min: 227.3, Max: 229.5, Diff:   2.2]
>>     [Termination (ms):  0.0  0.0  0.0  0.0
>>      Avg:   0.0, Min:   0.0, Max:   0.0, Diff:   0.0]
>>        [Termination Attempts : 4 1 11 4
>>         Sum: 20, Avg: 5, Min: 1, Max: 11, Diff: 10]
>>     [GC Worker End (ms):  38905690.5  38905690.5  38905690.5  38905690.5
>>      Avg: 38905690.5, Min: 38905690.5, Max: 38905690.5, Diff:   0.0]
>>     [GC Worker (ms):  282.9  282.8  282.8  282.6
>>      Avg: 282.8, Min: 282.6, Max: 282.9, Diff:   0.3]
>>     [GC Worker Other (ms):  5.9  6.0  6.0  6.2
>>      Avg:   6.0, Min:   5.9, Max:   6.2, Diff:   0.3]
>>  [Complete CSet Marking:   0.0 ms]
>>  [Clear CT:   0.1 ms]
>>  [Other:   3.7 ms]
>>     [Choose CSet:   0.0 ms]
>>     [Ref Proc:   2.8 ms]
>>     [Ref Enq:   0.1 ms]
>>     [Free CSet:   0.3 ms]
>>  [Eden: 48M(5032M)->0B(5032M) Survivors: 288M->288M Heap:
>> 15790M(26624M)->15741M(26624M)]
>> [Times: user=1.14 sys=0.00, real=0.29 secs]
>> Heap after GC invocations=575 (full 157):
>> garbage-first heap   total 27262976K, used 16119181K
>> [0x0000000160000000, 0x00000007e0000000, 0x00000007e0000000)
>> region size 8192K, 36 young (294912K), 36 survivors (294912K)
>> compacting perm gen  total 524288K, used 164479K [0x00000007e0000000,
>> 0x0000000800000000, 0x0000000800000000)
>>  the space 524288K,  31% used [0x00000007e0000000,
>> 0x00000007ea09ff68, 0x00000007ea0a0000, 0x0000000800000000)
>> No shared spaces configured.
>> }
>> {Heap before GC invocations=575 (full 157):
>> garbage-first heap   total 27262976K, used 16119181K
>> [0x0000000160000000, 0x00000007e0000000, 0x00000007e0000000)
>> region size 8192K, 37 young (303104K), 36 survivors (294912K)
>> compacting perm gen  total 524288K, used 164479K [0x00000007e0000000,
>> 0x0000000800000000, 0x0000000800000000)
>>  the space 524288K,  31% used [0x00000007e0000000,
>> 0x00000007ea09ff68, 0x00000007ea0a0000, 0x0000000800000000)
>> No shared spaces configured.
>> 2013-05-29T17:28:56.413+0300: 38905.701: [Full GC
>> 15742M->14497M(26624M), 56.7731320 secs]
>> 
>> That's the third Full GC today since the change to 26G and the change
>> away from occupancypercent=0. Tomorrow we will be trying again with
>> occupancypercent=0.
> 
> I saw your follow-up post with additional GC logs.  Thanks!  That'll really help!
> 
>> 
>>>>> -noclassgc (This is rarely needed and haven't seen an app that required it for quite some time)
>>>> 
>>>> JVM 1.6 stopped the world for a couple of minutes several times per day
>>>> while unloading classes, so we used -noclassgc to disable that. We do
>>>> not know if this is still necessary on the latest 1.7 to avoid the class
>>>> unload pause, but we continued to use this switch and found no harm in
>>>> it. Can't afford to test that in production ;)
>>> 
>>> I haven't seen a case where unloading classes causes a several-minute pause.
>>> Are you sure your system is not swapping?  And do you have GC logs you
>>> can share that illustrate the behavior and show that -noclassgc fixed it?
>> 
>> We deleted the swap partition a long time ago; we simply do not risk swapping at all.
> 
> You may need some additional swap, or you'll need additional memory for backing reserved space even though the application may not use it.
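> 
> A quick way to see where you stand (commands for illustration):
> 
>     $ free -m       # the Swap: row shows total/used/free swap
>     $ swapon -s     # lists active swap devices; empty if none configured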
> 
>> 
>> We had this class unloading problem several times per day about half a
>> year ago, and fixed it with -noclassgc; that was a no-brainer, a single
>> parameter that made the difference.
> 
> Ok, if you're convinced it fixes your issue, then use it. :-)  Class unloading issues generally imply that the perm gen size needs to be increased, or, as an alternative, that the initial perm gen size could use increasing.
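> 
> E.g., something along these lines (sizes purely illustrative) gives class metadata more room instead of turning class GC off entirely:
> 
>     $ java -XX:PermSize=512m -XX:MaxPermSize=1g ...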
> 
>> 
>> It is also discussed here (they do not discuss -noclassgc though; we
>> figured that out somehow):
>> http://stackoverflow.com/questions/2833983/meaning-of-the-unloading-class-messages
>> 
>>>>> -XX:+UseGCOverheadLimit
>>>>> -XX:ReservedCodeCacheSize=48m, which is the default for 7u21.  You might consider setting it higher if you have the available space, and more importantly if you think you're running out of code cache space.
>>>> 
>>>> For our Sun JVM on 64-bit Linux, 48m is the maximum; the JVM won't start with a higher value.
>>> 
>>> If you can't go larger than -XX:ReservedCodeCacheSize=48m, that may suggest you have memory constraints; it may also suggest you don't have enough swap space defined and you may be experiencing swapping during JVM execution.  I've got a Linux system with 32 GB of RAM; I can set ReservedCodeCacheSize=256m with no issues, even with -Xms30g and -Xmx30g.
>> 
>> It is also documented that 48m is the maximum:
>> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
>> "maximum code cache size. [Solaris 64-bit, amd64, and -server x86:
>> 48m"
>> 
> 
> The documentation means that the command line option sets the maximum code cache size, not that 48m is the absolute maximum you can set.  Rather, 48m is the default maximum code cache size.  It's no different than setting -Xmx: there is a default -Xmx value if you don't set one, and you can specify a larger one using -Xmx.
> 
> Fwiw, here's the output from 7u21 on my Linux x64 system with 32 GB of RAM.  Notice I also specified -XX:+AlwaysPreTouch, which forces every page to be touched as part of the command execution, to illustrate that the memory has been reserved, committed, and touched:
> $  java -Xmx26g -Xms26g -XX:+AlwaysPreTouch -XX:ReservedCodeCacheSize=256m -version
> java version "1.7.0_21"
> Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
> 
> 
>> **
>> Martin
>> 
>>> 
>>> 
>>>>> 
>>>>> On May 26, 2013, at 10:20 AM, Martin Makundi wrote:
>>>>> 
>>>>>> Sorry, forgot to mention, using:
>>>>>> 
>>>>>> java version "1.7.0_21"
>>>>>> Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
>>>>>> 
>>>>>> Linux version 3.0.1.stk64 (dfn at localhost.localdomain) (gcc version
>>>>>> 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) #1 SMP Sat Aug 13 12:53:46
>>>>>> EDT 2011
>>>>>> 
>>>>>> -Dclassworlds.conf=/usr/share/maven/maven/bin/m2.conf
>>>>>> -Dmaven.home=/usr/share/maven/maven
>>>>>> -Duser.timezone=EET
>>>>>> -XX:+AggressiveOpts
>>>>>> -XX:+DisableExplicitGC
>>>>>> -XX:+ParallelRefProcEnabled
>>>>>> -XX:+PrintGCDateStamps
>>>>>> -XX:+PrintGCDetails
>>>>>> -XX:+PrintHeapAtGC
>>>>>> -XX:+UseAdaptiveSizePolicy
>>>>>> -XX:+UseCompressedOops
>>>>>> -XX:+UseFastAccessorMethods
>>>>>> -XX:+UseG1GC
>>>>>> -XX:+UseGCOverheadLimit
>>>>>> -XX:+UseNUMA
>>>>>> -XX:+UseStringCache
>>>>>> -XX:CMSInitiatingOccupancyFraction=70
>>>>>> -XX:GCPauseIntervalMillis=10000
>>>>>> -XX:InitiatingHeapOccupancyPercent=0
>>>>>> -XX:MaxGCPauseMillis=500
>>>>>> -XX:MaxPermSize=512m
>>>>>> -XX:PermSize=512m
>>>>>> -XX:ReservedCodeCacheSize=48m
>>>>>> -Xloggc:gc.log
>>>>>> -Xmaxf1
>>>>>> -Xms30G
>>>>>> -Xmx30G
>>>>>> -Xnoclassgc
>>>>>> -Xss4096k
>>>>>> 
>>>>>> 
>>>>>> **
>>>>>> Martin
>>>>>> 
>>>>>> 2013/5/26 Charlie Hunt <chunt at salesforce.com>:
>>>>>>> Which version of the JDK/JRE are you using?
>>>>>>> 
>>>>>>> One of the links you referenced below was using JDK 6, where there is no official support for G1. The other link suggests it could have been RMI DGC or a System.gc().
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Sent from my iPhone
>>>>>>> 
>>>>>>> On May 25, 2013, at 11:43 PM, "Martin Makundi" <martin.makundi at koodaripalvelut.com> wrote:
>>>>>>> 
>>>>>>>> it occurs daily.
>>>>> 
>>> 
> 
