Minor GC difference Java 7 vs Java 8
Jon Masamitsu
jon.masamitsu at oracle.com
Tue Jun 3 17:37:24 UTC 2014
On 06/03/2014 06:51 AM, Chris Hurst wrote:
> Hi,
>
> Reducing the parallel GC threads on the simple testbed code actually
> reduced the minor GC STW pauses (presumably multithreading is
> inefficient in this scenario because the task is so simple). The
> reverse was true for the real application, though for a single-thread
> reduction the difference is hard to measure: sub-millisecond, if
> present at all. I'd expect the difference to be because the real
> application has a more complex object graph and is tenuring with a
> very small amount of promotion to old. If a spike does occur with the
> parallel GC threads reduced, i.e. with other process activity on the
> box increased, the duration of the spike appears unaffected by the
> reduction in parallel GC threads, which I guess I would expect.
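>
> For reference, capping the thread count is just a command-line flag;
> a minimal sketch (the value 7 is only an example, i.e. cores - 1 on
> an 8-core box, and MainClass stands in for the application):
>
>     java -XX:+UseParallelOldGC -XX:ParallelGCThreads=7 MainClass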
>
> Any suggestions as to what value to use for
> WorkStealingYieldsBeforeSleep? Otherwise I'll just try doubling it.
> If you have any additional notes or details on this change, that
> would be most helpful.
WorkStealingYieldsBeforeSleep is one of those flags whose best value
just depends on your workload, so I don't have any good
suggestion.
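
If you do try doubling it, note that the flag is experimental, so a
product build needs the unlock flag as well; a sketch (10000 being
double the 5000 default you listed):

    java -XX:+UnlockExperimentalVMOptions \
         -XX:WorkStealingYieldsBeforeSleep=10000 ...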
Jon
>
> Chris
>
> PS This particular application uses the following (we regularly test
> other GC permutations, including G1 and CMS; our goal is lower
> latency):
>
> -Xms1536m
> -Xmx1536m
> -XX:+PrintCommandLineFlags
> -XX:+UseParallelOldGC
> -XX:+UnlockDiagnosticVMOptions
> -XX:PermSize=50M
> -XX:-UseAdaptiveSizePolicy
> -XX:+AlwaysPreTouch
> -XX:MaxTenuringThreshold=15
> -XX:InitialTenuringThreshold=15
> -XX:+DTraceMonitorProbes
> -XX:+ExtendedDTraceProbes
> -Dsun.rmi.dgc.client.gcInterval=604800000
> -Dsun.rmi.dgc.server.gcInterval=604800000
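>
> (For context: 604800000 ms is 7 days, so RMI distributed-GC-triggered
> full collections happen at most weekly rather than at the much more
> frequent default interval.)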
>
> ------------------------------------------------------------------------
> Date: Mon, 2 Jun 2014 15:52:14 -0700
> From: jon.masamitsu at oracle.com
> To: christopherhurst at hotmail.com
> CC: hotspot-gc-use at openjdk.java.net
> Subject: Re: Minor GC difference Java 7 vs Java 8
>
>
> On 06/01/2014 05:14 PM, Chris Hurst wrote:
>
> Hi,
>
> Thanks for the replies. I've been stuck on this issue for about a
> year and had raised it with Oracle support without getting anywhere,
> but last weekend I managed to get a lot further with it.
>
> I wrote a trivial program that continuously fills the young gen and
> releases the garbage, for use with some DTrace tests, and it showed
> similar spikes. From there I could pin the issue on the parallel GC
> threads: anything less than the number of cores (i.e. cores - 1)
> removed the spike. For the test program, normal young GC oscillated
> around 1ms but spiked at about 15ms (need to check). Reducing the
> parallel threads worked on the real application in a similar way.
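>
> For the record, a minimal sketch of the kind of churn loop I mean
> (class name, object size and count are arbitrary):
>
>     import java.util.ArrayList;
>     import java.util.List;
>
>     // Continuously fill the young gen with short-lived objects and
>     // drop them, so each minor GC finds almost nothing live to copy.
>     public class YoungGenChurn {
>         public static void main(String[] args) {
>             List<byte[]> buffer = new ArrayList<byte[]>();
>             while (true) {
>                 for (int i = 0; i < 10000; i++) {
>                     buffer.add(new byte[1024]); // ~10 MB of garbage per cycle
>                 }
>                 buffer.clear(); // release it all
>             }
>         }
>     }
>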
> We managed to identify some very minor tasks (so small that they
> weren't showing up in our coarser-grained CPU monitoring) that
> occasionally competed for CPU. The size of the effect was surprising,
> but now that we understand the cause we can tune the Java 6 GC
> better.
> The spikes were again larger than I would have expected, and they all
> appear to be very close in size; I wouldn't have predicted that from
> the issue, but that's fine ;-)
>
> Currently the Java 7 version is still not quite as good on overall
> throughput when not spiking, though I will recheck those results, as
> our most recent tests were around tuning Java 6 with the new info.
> We're happy with our Java 6 GC performance.
>
> Although we can now reduce these already rare spikes (potentially to
> zero), I can't guarantee 100% that they won't occur, so I would still
> like to understand why Java 7 appears to handle this scenario less
> efficiently.
>
> Using DTrace we were mostly seeing yields, and a stack trace pointed
> us toward some JDK 7 changes and some newer Java options that might
> be related ...
>
> taskqueue.cpp
>
> libc.so.1`lwp_yield+0x15
> libjvm.so`__1cCosFyield6F_v_+0x257
> libjvm.so`__1cWParallelTaskTerminatorFyield6M_v_+0x18
> libjvm.so`__1cWParallelTaskTerminatorRoffer_termination6MpnUTerminatorTerminator__b_+0xe8
> libjvm.so`__1cJStealTaskFdo_it6MpnNGCTaskManager_I_v_+0x378
> libjvm.so`__1cMGCTaskThreadDrun6M_v_+0x19f
> libjvm.so`java_start+0x1f2
> libc.so.1`_thr_setup+0x4e
> libc.so.1`_lwp_start
>
> uintx WorkStealingHardSpins          = 4096    {experimental}
> uintx WorkStealingSleepMillis        = 1       {experimental}
> uintx WorkStealingSpinToYieldRatio   = 10      {experimental}
> uintx WorkStealingYieldsBeforeSleep  = 5000    {experimental}
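>
> (To dump the current values on any given build, something like the
> following should work; the grep is just for convenience:)
>
>     java -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal \
>         -version | grep WorkStealing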
>
> I haven't had a chance to play with these yet, but could they be
> involved, e.g. JDK 7 tuned to be friendlier to other applications at
> the cost of latency (spin to yield)? Would that make sense?
>
>
> Chris,
>
> My best recollection is that there was a performance regression
> reported internally, and the change to 5000 was made to fix that
> regression. Increasing the number of yields done before a sleep made
> this code behave more like it had previously. Let me know if you need
> better information and I'll see what I can dig up.
>
> By the way, when you tuned down the number of ParallelGCThreads,
> you saw little or no increase in the STW pause times?
>
> You're using UseParallelGC?
>
> Jon
>
> We would like to move to Java 7 for support reasons. Also, as we are
> on Solaris, the extra memory overhead of Java 8 (64-bit only), even
> with compressed oops, gives us another latency hit.
>
> Chris
>
> PS -XX:+AlwaysPreTouch is on.
>
> > Date: Thu, 29 May 2014 13:47:39 -0700
> > From: Peter.B.Kessler at Oracle.COM
> > To: christopherhurst at hotmail.com; hotspot-gc-use at openjdk.java.net
> > Subject: Re: Minor GC difference Java 7 vs Java 8
> >
> > Are the -XX:+PrintGCDetails "[Times: user=0.01 sys=0.00,
> real=0.03 secs]" reports for the long pauses different from the
> short pauses? I'm hoping for some anomalous sys time, or user/real
> ratio, that would indicate it was something happening on the
> machine that is interfering with the collector. But you'd think
> that would show up as occasional 15ms blips in your message
> processing latency outside of when the collector goes off.
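> >
> > One way to read those numbers (a rough sanity check, not a rule):
> > with N parallel GC threads all doing useful work, user time should
> > be roughly N times real time. A report like user=0.01 real=0.03 on
> > a multi-core box means the GC threads were mostly off-CPU while the
> > wall clock ran, which points at scheduling interference rather than
> > extra collection work.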
> >
> > Does -XX:+PrintHeapAtGC show anything anomalous about the space
> occupancy after the long pauses? E.g., more objects getting copied
> to the survivor space, or promoted to the old generation? You
> could infer the numbers from -XX:+PrintGCDetails output if you
> didn't want to deal with the volume produced by -XX:+PrintHeapAtGC.
> >
> > You don't say how large or how stable your old generation size
> is. If you have to get new pages from the OS to expand the old
> generation, or give pages back to the OS because the old
> generation can shrink, that's extra work. You can infer this
> traffic from -XX:+PrintHeapAtGC output by looking at the
> "committed" values for the generations. E.g., in "ParOldGen total
> 43008K, used 226K [0xba400000, 0xbce00000, 0xe4e00000)" those
> three hex numbers are the start address for the generation, the
> end of the committed memory for that generation, and the end of
> the reserved memory for that generation. There's a similar report
> for the young generation. Running with -Xms equal to -Xmx should
> prevent pages from being acquired from or returned to the OS
> during the run.
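> >
> > As a worked check on that example: 0xbce00000 - 0xba400000 =
> > 0x2a00000 = 44040192 bytes = 43008K, matching the "total" figure,
> > and 0xe4e00000 - 0xba400000 = 0x2aa00000 = 698368K of reserved
> > space the generation could still grow into.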
> >
> > Are you running with -XX:+AlwaysPreTouch? Even if you've reserved
> > and committed the address space, the first time you touch new pages
> > the OS wants to zero them, which takes time. That flag forces all
> > the zeroing at initialization. If you know your page size, you
> > should be able to see the generations (mostly the old generation)
> > crossing a page boundary for the first time in the
> > -XX:+PrintHeapAtGC output.
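> >
> > For scale: every 1g of heap on 8K pages (a common Solaris page
> > size) is 131072 pages to zero, which is why that cost is better
> > paid once at initialization than spread across pauses.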
> >
> > Or it could be some change in the collector between JDK-6 and JDK-7.
> >
> > Posting some log snippets might let sharper eyes see something.
> >
> > ... peter
> >
> > On 04/30/14 07:58, Chris Hurst wrote:
> > > Hi,
> > >
> > > Has anyone seen anything similar to this ...
> > >
> > > On Java 6 (a range of versions, 32-bit Solaris), with parallel
> > > old GC and adaptive sizing off, under a very heavy test load we
> > > see minor GCs around the 5ms mark plus some very rare ~20ms
> > > pauses, say 3 or 4 instances in 12 hours. The number of these
> > > pauses is random (though always tiny compared with the total
> > > number of GCs), and their size, ~20ms, appears the same for all
> > > such points. We have a large number of minor GCs in our runs and
> > > only a full GC at startup. These freak GCs can be bunched or
> > > spread out, and we can run for many hours without one (while
> > > still doing minor GCs).
> > >
> > > What's odd is that if I use Java 7 (a range of versions, 32-bit)
> > > the result is very close, but the spikes (1 or 2, arguably fewer)
> > > are now 30-40ms (depending on the run; arguably even rarer). Has
> > > anyone experienced anything similar? Why would Java 7 take up to
> > > double the time for a minor GC? The GC throughput is
> > > approximately the same, arguably slightly better on 7, but that
> > > freak minor GC makes it unusable for us because of the latency.
> > >
> > > The change in spike height (20ms on Java 6 vs 40ms on Java 7) is
> > > very reproducible, though the number of spikes and when they
> > > occur varies slightly. The overall GC graph and throughput are
> > > otherwise similar, as is the memory dump at the end of a run. The
> > > test should be constant load: multiple clients just doing the
> > > same thing over and over.
> > >
> > > Has anyone seen anything similar? I was hoping someone might know
> > > of a change in defaults, a thread timeout, or a default data
> > > structure size that would account for this, and that the marked
> > > increase might be a giveaway to someone, as it's way off our
> > > average minor GC time.
> > >
> > > We have looked at GC logs, heap dumps, processor activity,
> > > background processes, amount of disc access, safepoints, etc.; we
> > > trace the message rate into and out of the application to check
> > > for variation, compare heap dumps at the end of runs, and so on.
> > > Nothing stands out so far.
> > >
> > > Chris
> > >
>
>