Bug in G1GC it performs Full GC when code cache is full resulting in overkill

Martin Makundi martin.makundi at koodaripalvelut.com
Wed May 29 08:49:46 PDT 2013


Hi!

> A bit of constructive criticism ;-)  It would be good practice to set one option at a time and measure its performance to determine whether it improves performance rather than choosing an option because of something you read in text.  In short, always measure and reason about whether what you've observed for an improvement or regression makes sense.  And, also run multiple times to get a sense of noise versus real improvement or regression.


Thanks. That's one of the reasons we never changed our options. Once
we found someting that works very well, we know that its always n!
work to test changes and the system was running very nice indeed
before the previous tweak ;)

>>> -XX:+UseFastAccessorMethod  (the default is disabled)
>>
>> Fast sounds good, the description of it is "Use optimized versions of
>> Get<Primitive>Field" which sounds good. I see no harm in this.
>
> These would be JNI operations.
>
> A quick at the HotSpot source suggests UseFastAccessorMethods
> is mostly confined to interpreter operations.

Thanks for the info. Doesn't say much to me, but does not seem to harm
anything. Will try setting it off at some point in time.

>>> -XX:+UseNUMA  (Are you running a JVM that spans NUMA memory nodes?
>>> Or, do you have multiple JVMs running on a NUMA or non-NUMA system?)
>>
>
> Key point is that you should use -XX:+UseNUMA only when you are
> deploying  a JVM that spans NUMA nodes.

Thanks for the info. Doesn't say much to me, but does not seem to harm
anything. Will try setting it off at some point in time.

>>> -XX:+UseStringCache (Do you have evidence that this helps?  And, do you know what it does?)
>>
>> I assume it is some sort of string interning solution. Don't know
>> exactly what it does, but our application uses high amount of
>> redundant strings, smaller memory footprint is a good idea. Again,
>> very little documentation about this available but seems
>> straightforward. Haven't benchmarked it personally.
>
> I won't go into the details of what it does.  I don't think I can say what it does without possibly being at risk of binding separation agreement.
>
> I'll just say that you should measure the perf difference with it off versus on if you think it might help.

No visible impact on performance, really, in production. If we test it
with a bogus test case, well.. results are as informative as our tests
are close to the production. Which is unlikely.

>>> -XX:CMSInitiatingOccupancyFraction=70 (This is applicable to CMS GC, and not applicable to G1 GC)
>>
>> Again, not documented thoroughly where it applies and where not, jvm
>> gave no warning/error about it so we assumed it's valid.
>
> There's always the HotSpot source code ;-)
>
> It's also quite well documented in various slide ware on the internet.
> It's also quite well documented in the Java Performance book. :-)

Uh.. does it say somewhere that Do not use
XX:CMSInitiatingOccupancyFraction with G1GC? ;) I know performance
tuning is your bread and butter but is not ours... is more like we are
just driving the car and you are the mechanic...different
perspective.. just trying to fill'er'up to go..leaded or unleaded...
;)

>>> -XX:InitiatingHeapOccupancyPercent=0  (You realize this will force G1's concurrent cycle to run continuously?)
>>
>> Yes, that's what we figured out, we don't want it to sit lazy and end
>> up in a situation where it is required to do a Full GC. This switch
>> was specifically chosen in a situation we had a memory leak and tried
>> to aggressively fight against it before we found the root cause. Maybe
>> we should try without this switch now, and see what effect it has.
>
> Having GC logs to see what available head room you have between the
> initiating of a G1 concurrent cycle and available regions / heap space would be most appropriate.

Hmm.. I don't thoroughly understand the logs either, but, here is a snap:

2013-05-29T17:28:56.119+0300: 38905.407: [GC pause (young), 0.29261000 secs]
   [Parallel Time: 288.8 ms]
      [GC Worker Start (ms):  38905407.6  38905407.7  38905407.7  38905407.9
       Avg: 38905407.7, Min: 38905407.6, Max: 38905407.9, Diff:   0.3]
      [Ext Root Scanning (ms):  22.8  16.3  18.1  22.0
       Avg:  19.8, Min:  16.3, Max:  22.8, Diff:   6.6]
      [SATB Filtering (ms):  0.0  0.1  0.0  0.0
       Avg:   0.0, Min:   0.0, Max:   0.1, Diff:   0.1]
      [Update RS (ms):  31.9  37.3  35.1  33.3
       Avg:  34.4, Min:  31.9, Max:  37.3, Diff:   5.5]
         [Processed Buffers : 102 106 119 104
          Sum: 431, Avg: 107, Min: 102, Max: 119, Diff: 17]
      [Scan RS (ms):  0.0  0.0  0.1  0.0
       Avg:   0.0, Min:   0.0, Max:   0.1, Diff:   0.1]
      [Object Copy (ms):  228.2  229.1  229.5  227.3
       Avg: 228.5, Min: 227.3, Max: 229.5, Diff:   2.2]
      [Termination (ms):  0.0  0.0  0.0  0.0
       Avg:   0.0, Min:   0.0, Max:   0.0, Diff:   0.0]
         [Termination Attempts : 4 1 11 4
          Sum: 20, Avg: 5, Min: 1, Max: 11, Diff: 10]
      [GC Worker End (ms):  38905690.5  38905690.5  38905690.5  38905690.5
       Avg: 38905690.5, Min: 38905690.5, Max: 38905690.5, Diff:   0.0]
      [GC Worker (ms):  282.9  282.8  282.8  282.6
       Avg: 282.8, Min: 282.6, Max: 282.9, Diff:   0.3]
      [GC Worker Other (ms):  5.9  6.0  6.0  6.2
       Avg:   6.0, Min:   5.9, Max:   6.2, Diff:   0.3]
   [Complete CSet Marking:   0.0 ms]
   [Clear CT:   0.1 ms]
   [Other:   3.7 ms]
      [Choose CSet:   0.0 ms]
      [Ref Proc:   2.8 ms]
      [Ref Enq:   0.1 ms]
      [Free CSet:   0.3 ms]
   [Eden: 48M(5032M)->0B(5032M) Survivors: 288M->288M Heap:
15790M(26624M)->15741M(26624M)]
 [Times: user=1.14 sys=0.00, real=0.29 secs]
Heap after GC invocations=575 (full 157):
 garbage-first heap   total 27262976K, used 16119181K
[0x0000000160000000, 0x00000007e0000000, 0x00000007e0000000)
  region size 8192K, 36 young (294912K), 36 survivors (294912K)
 compacting perm gen  total 524288K, used 164479K [0x00000007e0000000,
0x0000000800000000, 0x0000000800000000)
   the space 524288K,  31% used [0x00000007e0000000,
0x00000007ea09ff68, 0x00000007ea0a0000, 0x0000000800000000)
No shared spaces configured.
}
{Heap before GC invocations=575 (full 157):
 garbage-first heap   total 27262976K, used 16119181K
[0x0000000160000000, 0x00000007e0000000, 0x00000007e0000000)
  region size 8192K, 37 young (303104K), 36 survivors (294912K)
 compacting perm gen  total 524288K, used 164479K [0x00000007e0000000,
0x0000000800000000, 0x0000000800000000)
   the space 524288K,  31% used [0x00000007e0000000,
0x00000007ea09ff68, 0x00000007ea0a0000, 0x0000000800000000)
No shared spaces configured.
2013-05-29T17:28:56.413+0300: 38905.701: [Full GC
15742M->14497M(26624M), 56.7731320 secs]

That's the third Full GC today after the change to 26G and change from
occupancypercent=0. Tomorrow will be trying again with
occupancypercent=0

>>> -noclassgc (This is rarely needed and haven't seen an app that required it for quite some time)
>>
>> Jvm 1.6 stopped the world for couple of minutes several times per day
>> while unloading classes, so we used noclassgc to disable that. We do
>> not know if this is necessary for latest 1.7 to avoid class unload
>> pause, but we continued to use this switch and found no harm in it.
>> Can't afford testing that in production ;)
>
> Haven't seen a case where unloading classes cause a several minute pause.
> Are you sure your system is not swapping?  And, do you have GC logs you
> can share that illustrate the behavior and that -noclassgc fixed it?

We deleted swap partition long time ago, we simply do not risk swapping at all.

We had this class unloading problem several times per day like half a
year ago, and fixed it with noclasssgc, that was a no-brainer, single
parameter that made the difference.

It is also discussed here (they do not discuss noclassgc though, we
figured that out somehow)
http://stackoverflow.com/questions/2833983/meaning-of-the-unloading-class-messages

>>> -XX:+ UseGCOverheadLimit
>>> -XX:ReservedCodeCacheSize=48, that is the default for 7u21.  You might consider setting it higher if you have the available space, and more importantly if you think you're running out of code space.
>>
>> For our sun jvm linux 64bit 48m is maximum, jvm won't start if higher value.
>
> If you can't go larger than -XX:ReservedCodeCacheSize=48m, that may suggest you have memory constraints and may also suggest you don't have enough swap space defined, and you may be experiencing swapping during JVM execution.  I've got a Linux system that has 32 GB of RAM, I can set ReservedCodeCacheSize=256m with no issues, even with -Xms30g and -Xms30g.

It is also documented that 48m is maximum
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
"maximum code cache size. [Solaris 64-bit, amd64, and -server x86:
48m"

**
Martin

>
>
>>>
>>> On May 26, 2013, at 10:20 AM, Martin Makundi wrote:
>>>
>>>> Sorry, forgot to mention, using:
>>>>
>>>> java version "1.7.0_21"
>>>> Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
>>>> Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
>>>>
>>>> Linux version 3.0.1.stk64 (dfn at localhost.localdomain) (gcc version
>>>> 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) #1 SMP Sat Aug 13 12:53:46
>>>> EDT 2011
>>>>
>>>> -Dclassworlds.conf=/usr/share/maven/maven/bin/m2.conf
>>>> -Dmaven.home=/usr/share/maven/maven
>>>> -Duser.timezone=EET
>>>> -XX:+AggressiveOpts
>>>> -XX:+DisableExplicitGC
>>>> -XX:+ParallelRefProcEnabled
>>>> -XX:+PrintGCDateStamps
>>>> -XX:+PrintGCDetails
>>>> -XX:+PrintHeapAtGC
>>>> -XX:+UseAdaptiveSizePolicy
>>>> -XX:+UseCompressedOops
>>>> -XX:+UseFastAccessorMethods
>>>> -XX:+UseG1GC
>>>> -XX:+UseGCOverheadLimit
>>>> -XX:+UseNUMA
>>>> -XX:+UseStringCache
>>>> -XX:CMSInitiatingOccupancyFraction=70
>>>> -XX:GCPauseIntervalMillis=10000
>>>> -XX:InitiatingHeapOccupancyPercent=0
>>>> -XX:MaxGCPauseMillis=500
>>>> -XX:MaxPermSize=512m
>>>> -XX:PermSize=512m
>>>> -XX:ReservedCodeCacheSize=48m
>>>> -Xloggc:gc.log
>>>> -Xmaxf1
>>>> -Xms30G
>>>> -Xmx30G
>>>> -Xnoclassgc
>>>> -Xss4096k
>>>>
>>>>
>>>> **
>>>> Martin
>>>>
>>>> 2013/5/26 Charlie Hunt <chunt at salesforce.com>:
>>>>> Which version of the JDK/JRE are you using?
>>>>>
>>>>> One of the links you referenced below was using JDK 6, where there is no official support for G1. The other link suggests it could have been RMI DGC or a System.gc().
>>>>>
>>>>>
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On May 25, 2013, at 11:43 PM, "Martin Makundi" <martin.makundi at koodaripalvelut.com> wrote:
>>>>>
>>>>>> it occurs daily.
>>>
>


More information about the hotspot-gc-use mailing list