Minor GC difference Java 7 vs Java 8

Jon Masamitsu jon.masamitsu at oracle.com
Mon Jun 2 22:52:14 UTC 2014


On 06/01/2014 05:14 PM, Chris Hurst wrote:
> Hi,
>
> Thanks for the replies. I've been stuck on this issue for about a year
> and had raised it with Oracle support without getting anywhere, but last
> weekend I managed to get a lot further with it.
>
> I wrote a trivial program that continuously fills the young generation and
> releases the garbage, for use with some DTrace tests, and it showed similar
> spikes. From there I could work out that the issue was the parallel GC
> threads: setting them to anything less than the number of cores (i.e.
> cores-1) removed the spike. For the test program, normal young GCs
> oscillated around 1ms but spiked at about 15ms (need to check). Reducing
> the parallel threads worked on the real application in a similar way.
> We managed to identify some very minor tasks (so small they weren't showing
> up on some of our less fine-grained CPU monitoring) that occasionally
> competed for CPU. The effect was surprisingly large, but now that we
> understand the cause we can tune the Java 6 GC better.
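
A minimal sketch of that kind of allocation-churn test, for illustration only
(this is not the actual program; the live-window size, allocation size, and
GC flags below are arbitrary):

    // YoungGenChurn.java -- keep filling the young generation with
    // short-lived objects so that minor collections run continuously;
    // pause times can then be read from the GC log or watched with DTrace.
    public class YoungGenChurn {
        public static void main(String[] args) {
            byte[][] window = new byte[64][];   // small set of live objects
            long i = 0;
            while (true) {
                // anything that falls out of the window becomes garbage
                window[(int) (i++ % window.length)] = new byte[16 * 1024];
            }
        }
    }

Run with, for example:

    java -XX:+UseParallelGC -verbose:gc -XX:+PrintGCDetails YoungGenChurn
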
> The spikes were again larger than I would have expected and all appear to
> be very close in size; I wouldn't have predicted this from the issue, but
> that's fine ;-)
>
> Currently the Java 7 version is still not quite as good on overall
> throughput when not spiking, though I will recheck these results, as our
> most recent tests were around tuning Java 6 with the new info. We're happy
> with our Java 6 GC performance.
>
> Although we can now reduce these already rare spikes (potentially to
> zero), I can't 100% guarantee they won't occur, so I would still like to
> understand why Java 7 appears to handle this scenario less efficiently.
>
> Using DTrace we were mostly seeing yields, and looking at a stack trace
> pointed us toward some JDK 7 changes and some newer Java options that
> might be related:
>
> taskqueue.cpp
>
> libc.so.1`lwp_yield+0x15
> libjvm.so`__1cCosFyield6F_v_+0x257
> libjvm.so`__1cWParallelTaskTerminatorFyield6M_v_+0x18
> libjvm.so`__1cWParallelTaskTerminatorRoffer_termination6MpnUTerminatorTerminator__b_+0xe8
> libjvm.so`__1cJStealTaskFdo_it6MpnNGCTaskManager_I_v_+0x378
> libjvm.so`__1cMGCTaskThreadDrun6M_v_+0x19f
> libjvm.so`java_start+0x1f2
> libc.so.1`_thr_setup+0x4e
> libc.so.1`_lwp_start
>
> uintx WorkStealingHardSpins            = 4096     {experimental}
> uintx WorkStealingSleepMillis          = 1        {experimental}
> uintx WorkStealingSpinToYieldRatio     = 10       {experimental}
> uintx WorkStealingYieldsBeforeSleep    = 5000     {experimental}
>
> I haven't had a chance to play with these as yet, but could they be
> involved, e.g. JDK 7 tuned to be more friendly to other applications at
> the cost of latency (spin to yield)? Would that make sense?
>

Chris,

My best recollection is that there was a performance regression
reported internally and the change to 5000 was made to fix
that regression.  Increasing the number of yields done before
a sleep made this code work more like the previous behavior.
Let me know if you need better information and I can see what
I can dig up.
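
For reference, these are experimental flags, so they have to be unlocked
before they can be displayed or overridden; something along these lines
(the 1000 below is just an illustrative value):

    java -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal -version | grep WorkStealing
    java -XX:+UnlockExperimentalVMOptions -XX:WorkStealingYieldsBeforeSleep=1000 ...
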

By the way,  when you tuned down the number of ParallelGCThreads,
you saw little or no increase in the STW pause times?

You're using UseParallelGC?

Jon

> We would like to move to Java 7 for support reasons; also, as we are on
> Solaris, the extra memory overhead of Java 8 (64-bit only), even with
> compressed oops, gives us another latency hit.
>
> Chris
>
> PS -XX:+AlwaysPreTouch is on.
>
> > Date: Thu, 29 May 2014 13:47:39 -0700
> > From: Peter.B.Kessler at Oracle.COM
> > To: christopherhurst at hotmail.com; hotspot-gc-use at openjdk.java.net
> > Subject: Re: Minor GC difference Java 7 vs Java 8
> >
> > Are the -XX:+PrintGCDetails "[Times: user=0.01 sys=0.00, real=0.03 
> secs]" reports for the long pauses different from the short pauses? 
> I'm hoping for some anomalous sys time, or user/real ratio, that would 
> indicate it was something happening on the machine that is interfering 
> with the collector. But you'd think that would show up as occasional 
> 15ms blips in your message processing latency outside of when the 
> collector goes off.
> >
> > Does -XX:+PrintHeapAtGC show anything anomalous about the space 
> occupancy after the long pauses? E.g., more objects getting copied to 
> the survivor space, or promoted to the old generation? You could infer 
> the numbers from -XX:+PrintGCDetails output if you didn't want to deal 
> with the volume produced by -XX:+PrintHeapAtGC.
> >
> > You don't say how large or how stable your old generation size is. 
> If you have to get new pages from the OS to expand the old generation, 
> or give pages back to the OS because the old generation can shrink, 
> that's extra work. You can infer this traffic from -XX:+PrintHeapAtGC 
> output by looking at the "committed" values for the generations. E.g., 
> in "ParOldGen total 43008K, used 226K [0xba400000, 0xbce00000, 
> 0xe4e00000)" those three hex numbers are the start address for the 
> generation, the end of the committed memory for that generation, and 
> the end of the reserved memory for that generation. There's a similar 
> report for the young generation. Running with -Xms equal to -Xmx 
> should prevent pages from being acquired from or returned to the OS 
> during the run.
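
Concretely, for that example line the committed and reserved sizes are just
differences of those addresses:

    0xbce00000 - 0xba400000 = 0x02a00000 = 44,040,192 bytes = 43008K committed
    0xe4e00000 - 0xba400000 = 0x2aa00000 = 715,128,832 bytes = 698368K reserved
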
> >
> > Are you running with -XX:+AlwaysPreTouch? Even if you've reserved 
> and committed the address space, the first time you touch new pages 
> the OS wants to zero them, which takes time. That flag forces all the
> zeroing at initialization. If you know your page size, you should be 
> able to see the generations (mostly the old generation) crossing a 
> page boundary for the first time in the -XX:+PrintHeapAtGC output.
> >
> > Or it could be some change in the collector between JDK-6 and JDK-7.
> >
> > Posting some log snippets might let sharper eyes see something.
> >
> > ... peter
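
Putting those suggestions together, a diagnostic run might look something
like the following (heap sizes, log path, and the application name are
illustrative):

    java -XX:+UseParallelGC -Xms512m -Xmx512m \
         -XX:+AlwaysPreTouch \
         -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC \
         -Xloggc:gc.log MyApp
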
> >
> > On 04/30/14 07:58, Chris Hurst wrote:
> > > Hi,
> > >
> > > Has anyone seen anything similar to this ...
> > >
> > > On a Java 6 application (range of versions, 32-bit Solaris), using
> parallel old GC, non-adaptive: under a very heavy test performance load we
> see minor GCs around the 5ms mark and, very rarely, say 3 or 4 instances
> in 12 hours, pauses of about 20ms. The number of such pauses is random
> (though always few compared with the total number of GCs) and the ~20ms
> value appears to be the same for all such points. We have a large number
> of minor GCs in our runs, and only a full GC at startup. These freak GCs
> can be bunched or spread out, and we can run for many hours without one
> (though still doing minor GCs).
> > >
> > > What's odd is that if I use Java 7 (range of versions, 32-bit) the
> result is very close, but the spikes (1 or 2, arguably fewer) are now
> 30-40ms (depending on the run, and arguably even rarer). Has anyone
> experienced anything similar? Why would Java 7 up to double a minor GC?
> The GC throughput is approximately the same (arguably Java 7's throughput
> is slightly better), but that freak minor GC makes it unusable for us due
> to latency.
> > >
> > > In terms of the change in spike height (20ms on Java 6 vs 40ms on
> Java 7), this is very reproducible, though the number of points and when
> they occur vary slightly. The overall GC graph and throughput are otherwise
> similar, as is the resultant memory dump at the end. The test should be a
> constant load: multiple clients just doing the same thing over and over.
> > >
> > > Has anyone seen anything similar? I was hoping someone might know of
> a change in defaults, a thread timeout, or a default data structure size
> change that would account for this. I was hoping the marked increase might
> be a giveaway to someone, as it's way off our average minor GC time.
> > >
> > > We have looked at GC logs, heap dumps, processor activity, background
> processes, amount of disk access, safepoints, etc.; we trace the message
> rate into and out of the application for variation, compare heap dumps at
> the end, etc. Nothing stands out so far.
> > >
> > > Chris
> > >
> > >
> > > _______________________________________________
> > > hotspot-gc-use mailing list
> > > hotspot-gc-use at openjdk.java.net
> > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> > >
