1.7 G1GC significantly slower than 1.6 Mark and Sweep?

John Hendrikx hjohn at xs4all.nl
Thu Apr 28 23:10:08 PDT 2011


I tried many -XX:MaxGCPauseMillis settings, including not setting it at 
all, 20, 10, 5, 2.  The results were similar each time -- it didn't 
really have much of an effect.  In retrospect you might say that the 
total CPU use is what is causing the problems, not necessarily the 
length of the pauses -- whether this extra CPU use is caused by the 
collector or because of some other change in Java 7 I donot know; the 
program is the same.  Is there perhaps another collector that I could 
try to see if this lowers CPU use?  Or settings (even non-GC related) 
that could lower CPU use?

Java 6's CMS I didn't need to tune.  After determining that the length 
of GC pauses was causing problems in the application, I tried turning 
CMS on and it resolved the problems.

What I observe is that even though with Java 7 the pauses seem (are?) 
very short, the CPU use is a lot higher (from 65% under Java 6 to 95% 
with 7).  This could be related to other causes (perhaps threading 
overhead, debug code in Java 7, etc) but I doubt it is in any specific 
Java code that I wrote as most of the heavy lifting is happening in 
native methods.  It could for example be that several ByteBuffers being 
used are being copied under Java 7 while under 6 direct access was possible.

John.

Jon Masamitsu wrote:
> John,
>
> You're telling G1 (UseG1GC) to limit pauses to 2ms.
> (-XX:MaxGCPauseMillis=2) but seemed to have tuned
> CMS (UseConcMarkSweepGC) toward a 20ms goal.
> G1 is trying to do very short collections and needs to do many
> of them to keep up with the allocation rate.  Did you
> mean you are setting MaxGCPauseMillis to 20?
>
> Jon
>
> On 4/26/2011 7:59 AM, John Hendrikx wrote:
>   
>> Hi list,
>>
>> I've been testing Java 1.6 performance vs Java 1.7 performance with a
>> timing critical application -- it's essential that garbage collection
>> pauses are very short.  What I've found is that Java 1.6 seems to
>> perform significantly better than 1.7 (b137) in this respect, although
>> with certain settings 1.6 will also fail catastrophically.   I've used
>> the following options:
>>
>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>> For 1.7.0b137: -Xms256M -Xmx256M  -XX:+UseG1GC -XX:MaxGCPauseMillis=2
>>
>> The amount of garbage created is roughly 150 MB/sec.  The application
>> demands a response time of about 20 ms and uses half a dozen threads
>> which deal with buffering and decoding of information.
>>
>> With the above settings, the 1.6 VM will meet this goal over a 2 minute
>> period>99% of the time (with an average CPU consumption of 65% per CPU
>> core for two cores) -- from verbosegc I gather that the pause times are
>> around 0.01-0.02 seconds:
>>
>> [GC 187752K->187559K(258880K), 0.0148198 secs]
>> [GC 192156K(258880K), 0.0008281 secs]
>> [GC 144561K->144372K(258880K), 0.0153497 secs]
>> [GC 148965K(258880K), 0.0008028 secs]
>> [GC 166187K->165969K(258880K), 0.0146546 secs]
>> [GC 187935K->187754K(258880K), 0.0150638 secs]
>> [GC 192344K(258880K), 0.0008422 secs]
>>
>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit.
>> It can also introduce OutOfMemory conditions and other catastrophic
>> failures (one time the GC took 10 seconds after the application had only
>> been running 20 seconds).  How stable 1.6 will perform with the initial
>> settings remains to be seen; the results with more RAM worry me somewhat.
>>
>> The 1.7 VM however performs significantly worse.  Here is some of its
>> output (over roughtly a one second period):
>>
>> [GC concurrent-mark-end, 0.0197681 sec]
>> [GC remark, 0.0030323 secs]
>> [GC concurrent-count-start]
>> [GC concurrent-count-end, 0.0060561]
>> [GC cleanup 177M->103M(256M), 0.0005319 secs]
>> [GC concurrent-cleanup-start]
>> [GC concurrent-cleanup-end, 0.0000676]
>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs]
>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs]
>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs]
>> [GC concurrent-mark-start]
>> [GC concurrent-mark-end, 0.0152915 sec]
>> [GC remark, 0.0033085 secs]
>> [GC concurrent-count-start]
>> [GC concurrent-count-end, 0.0085232]
>> [GC cleanup 163M->129M(256M), 0.0004847 secs]
>> [GC concurrent-cleanup-start]
>> [GC concurrent-cleanup-end, 0.0000363]
>>
>>   From the above output one would not expect the performance to be worse,
>> however, the application fails to meet its goals 10-20% of the time.
>> The amount of garbage created is the same.  CPU time however is hovering
>> around 90-95%, which is likely the cause of the poor performance.  The
>> GC seems to take a significantly larger amount of time to do its work
>> causing these stalls in my test application.
>>
>> I've experimented with memory sizes and max pause times with the 1.7 VM,
>> and although it seemed to be doing better with more RAM, it never comes
>> even close to the performance observed with the 1.6 VM.
>>
>> I'm not sure if there are other useful options I can try to see if I can
>> tune the 1.7 VM performance a bit better. I can provide more
>> information, although not any (useful) source code at this time due to
>> external dependencies (JNA/JNI) of this application.
>>
>> I'm wondering if I'm missing something as it seems strange to me that
>> 1.7 is actually underperforming for me when in general most seem to
>> agree that the G1GC is a huge improvement.
>>
>> --John
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>     
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>   



More information about the hotspot-gc-use mailing list