Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
Krystal Mok
rednaxelafx at gmail.com
Sun Nov 4 01:14:49 PDT 2012
Hi Andreas,
UseParallelOldGC has been turned on by default since the fix for 6679764 [1].
You should find it enabled in JDK7u4 and above. I just ran a test with
JDK7u6 and it worked.
This change was never backported to JDK6.
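
[A quick way to verify on a given JDK, assuming java is on the PATH:

    java -XX:+PrintFlagsFinal -version | grep UseParallelOldGC

On 7u4 and later this should report the flag as true by default; on
earlier releases it reports false unless the flag is set explicitly.]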
- Kris
[1]:
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-February/004045.html
On Sun, Nov 4, 2012 at 3:13 PM, Andreas Müller
<Andreas.Mueller at mgm-tp.com>wrote:
> Hi,
>
> I can confirm Vitaly's observation that ParallelOldGC in many cases does
> not bring much benefit.
>
> Sometimes I saw the usr/real time ratio stay close to 1, and sometimes it
> was higher but with very little effect on the Full GC pause times.
>
> BTW, do you expect much effect from that option on a 2-CPU machine? In
> what percentage range?
>
>
> I also found the presentation again which claimed "-XX:+UseParallelOldGC
> (on by default with ParallelGC in JDK 6)":
>
> http://www.austinjug.org/presentations/JDK6PerfUpdate_Dec2009.pdf
>
> which had confused me for a while, because I could not get usr/real > 1
> during Full GC runs without adding -XX:+UseParallelOldGC explicitly.
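>
> [For readers of the thread: the ratio being discussed comes from the
> [Times: ...] suffix that -XX:+PrintGCDetails appends to each GC entry,
> e.g. (values illustrative only):
>
>     [Times: user=6.10 sys=0.04, real=0.82 secs]
>
> user/real well above 1 during a Full GC indicates that several GC
> threads were doing work in parallel; a ratio near 1 indicates an
> essentially single-threaded collection.]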
>
>
> Best regards
>
> Andreas
>
> From: Srinivas Ramakrishna [mailto:ysr1729 at gmail.com]
> Sent: Saturday, 3 November 2012 08:04
> To: Vitaly Davidovich
> Cc: Charlie Hunt; Andreas Müller; hotspot-gc-use; Simone Bordet
> Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full
> GC)
>
>
> [Edited subject line to show actual subject of discussion in last few
> emails in the thread]
>
> One issue I have found with ParallelOld vs Serial for sufficiently large
> heaps is that if there are large oop-rich objects, the deferred-updates
> phase, which is single-threaded and slow, greatly dominates the pause
> time. There's discussion of this in an earlier thread (late last year or
> early this year), and I promised to work on a patch but never got around
> to it. We partially worked around it by preventing full compaction (i.e.
> compaction below the dense prefix), but that doesn't work in all cases,
> for instance when an application churns large oop-rich objects (i.e.
> object arrays) through the old generation. I don't know whether a CR was
> filed tracking that sighting and discussion.
>
> Other than those anomalies, I have usually seen user/elapsed time ratios
> of 10-12 using 18 worker threads in the cases I recall. That does not,
> however, mean a speedup of 10-12x versus serial; more like 5-6x. YMMV,
> of course.
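>
> [A hypothetical worked example of the distinction: user=120s against
> real=10s gives a user/elapsed ratio of 12 with 18 workers, i.e. roughly
> 12/18 ~= 0.67 of each worker's time spent doing useful work. The speedup
> over serial is lower still because the parallel collector does more
> total work (stealing, synchronization, termination), which is how a
> ratio of 10-12 can correspond to only a 5-6x speedup.]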
>
> -- ramki
>
> On Fri, Nov 2, 2012 at 4:29 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>
> To be honest, I haven't dug in yet; I got the setup running in our plant
> toward the end of the day and only casually looked at basic GC timestamps
> for the full GCs.
>
> We do use some weak refs (no soft/phantom though), but I wouldn't call
> the usage heavy, or even medium for that matter. I'd have to look at what
> the GC reports, as you mention, to make sure, but I'm pretty confident
> it's not heavy. :)
>
> The server is dedicated to this sole Java process; nothing else of
> significance (memory- or CPU-wise) is running on it.
>
> I'll try to investigate next week to see if anything sticks out. The
> regular (serial) old GC is sufficient for my use case right now, so I'm
> merely trying to see if I can get some really cheap gains purely by
> enabling the parallel collector. :)
>
> Generally speaking, though, what sort of (ballpark) speedup is expected
> for parallel old vs single-threaded? Let's say on a machine with a
> modest CPU count (8-16 hardware threads). I'd imagine contention would
> significantly reduce the speedup factor on hugely parallel machines, but
> I'm curious about the modest space. Are there any known issues/scenarios
> that would nullify its benefit, other than what you've already
> mentioned?
>
> Thanks for all the advice and info.
>
> Sent from my phone
>
> On Nov 2, 2012 7:14 PM, "Charlie Hunt" <chunt at salesforce.com> wrote:
>
> Do you have GC logs you could share?
>
> We're probably going to need more info on what's going on within
> ParallelOld. We might get some additional info from
> +PrintGCTaskTimeStamps or +PrintParallelOldGCPhaseTimes; I don't recall
> how intrusive they are, though. If you've got a lot of threads, we'll
> probably get a lot of data too. But hopefully there's something in there
> that lends a clue as to the issue. If there's contention, that suggests
> to me contention in work stealing. IIRC, there's a way to get work
> stealing info out of ParallelOld GC, but my mind is drawing a blank. :-|
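>
> [A minimal sketch of the kind of invocation being suggested; the jar
> name is a placeholder:
>
>     java -XX:+UseParallelOldGC -XX:+PrintGCDetails \
>          -XX:+PrintGCTimeStamps -XX:+PrintGCTaskTimeStamps \
>          -Xloggc:gc.log -jar app.jar
>
> -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps produce the baseline
> log, and -XX:+PrintGCTaskTimeStamps adds per-GC-worker task timings on
> top of it.]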
>
>
> Just off the top of my head, do you know if this app makes heavy use of
> Reference objects, i.e. < Weak | Soft | Phantom | Final > References?
>
>
> Adding +PrintReferenceGC will tell us what kind of overhead you're
> seeing from reference processing. If reference-processing times are
> high, you'll probably want to add -XX:+ParallelRefProcEnabled.
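>
> [A minimal sketch of the combination described above, again with a
> placeholder jar name:
>
>     java -XX:+UseParallelOldGC -XX:+PrintGCDetails \
>          -XX:+PrintReferenceGC -XX:+ParallelRefProcEnabled \
>          -jar app.jar
>
> With -XX:+PrintReferenceGC, each collection logs the time spent on each
> reference type (SoftReference, WeakReference, FinalReference,
> PhantomReference, JNI Weak Reference); consistently high times there
> are what would justify -XX:+ParallelRefProcEnabled.]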
>
>
> I'd look at reference processing first, before looking at the
> +PrintParallelOldGCPhaseTimes or +PrintGCTaskTimeStamps output.
>
>
> Ooh, another thought: are there other Java apps running on the same
> system? If so, how many GC threads and application threads tend to be
> active at any given time?
>
> hths,
>
> charlie ...
>
> On Nov 2, 2012, at 5:42 PM, Vitaly Davidovich wrote:
>
> Thanks Charlie. At a quick glance, I didn't see it benefit my case today
> (~5 GB old gen): wall-clock time was roughly the same as single-threaded,
> but user time was quite high (7 secs wall, 37 secs user). This is on an
> 8-way Xeon Linux server.
>
> I seem to vaguely recall reading that parallel old sometimes performs
> worse than single-threaded old, perhaps due to contention between GC
> threads.
>
> Anyway, I'll keep monitoring.
>
> Thanks
>
> Sent from my phone
>
> On Nov 2, 2012 10:15 AM, "Charlie Hunt" <chunt at salesforce.com> wrote:
>
> Yes, I'd recommend +UseParallelOldGC on 6u23 even though it's not
> auto-enabled.
>
> hths,
>
> charlie ...
>
> On Nov 2, 2012, at 8:04 AM, Vitaly Davidovich wrote:
>
> Hi Charlie,
>
> Out of curiosity, is UseParallelOldGC advisable on, say, 6u23? It's off
> by default, as you say, until 7u4, so I'm unsure whether that's for some
> good/specific reason or not.
>
> Thanks
>
> Sent from my phone
>
> On Nov 2, 2012 8:36 AM, "Charlie Hunt" <chunt at salesforce.com> wrote:
>
> Jumping in a bit late ...
>
> I strongly suggest that anyone evaluating G1 not use anything prior to
> 7u4. Even better, use (as of this writing) 7u9, or the latest production
> Java 7 HotSpot VM.
>
> Fwiw, I'm really liking what I'm seeing in 7u9, with the exception of
> one issue (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858),
> whose fix is currently slated to be backported to a future Java 7 update
> (thanks Monica, John Cuthbertson, and Bengt for tackling this!).
>
> From looking at your observations and others' comments thus far, my
> initial reaction is that with a 1G Java heap, you might get the best
> results with -XX:+UseParallelOldGC. Are you using -XX:+UseParallelGC or
> -XX:+UseParallelOldGC? Or are you not setting a GC at all? Not until 7u4
> is -XX:+UseParallelOldGC automatically set on what's called "server
> class" machines when you don't specify a GC.
>
> The lengthy concurrent mark could be the result of the implementation of
> G1 in 6u*, or it could be that your system is swapping. Could you check
> whether your system is swapping? On Solaris you can monitor this with
> vmstat, observing not just free memory but also sr (scan rate) along
> with pi (pages in) and po (pages out). Seeing sr (page-scan activity)
> together with low free memory and pi & po activity strongly suggests
> swapping. Low free memory with no sr activity is okay, i.e. no swapping.
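>
> [For reference, a sketch of the monitoring described above; the
> 5-second interval is arbitrary:
>
>     vmstat 5
>
> Watch the free-memory column together with sr (page-scan rate) and
> pi/po (pages in/out): sustained sr activity alongside low free memory
> and pi/po traffic is the swapping signature described here.]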
>
> Additionally, you are right: "partial" was changed to "mixed" in the GC
> logs. For those interested in a bit of history, this change was made
> because we felt "partial" was misleading. "Partial" was intended to mean
> a partial old gen collection, which did occur, but the same GC event
> also included a young gen GC. As a result, we renamed the GC event to
> "mixed", since it was really a combination of a young gen GC and a
> collection of a portion of the old gen.
>
> Simone also has a good suggestion: include -XX:+PrintFlagsFinal and
> -showversion output as part of the GC log data you collect, especially
> with G1 continuing to improve and evolve.
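>
> [A minimal sketch of collecting that alongside the GC log, assuming a
> placeholder jar name:
>
>     java -showversion -XX:+PrintFlagsFinal \
>          -Xloggc:gc.log -XX:+PrintGCDetails -jar app.jar
>
> -showversion prints the exact VM build before the application starts,
> and -XX:+PrintFlagsFinal dumps the final value of every VM flag, so the
> GC log can be read against the precise configuration that produced it.]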
>
> Look forward to seeing your GC logs!
>
> hths,
>
> charlie ....
>
> On Nov 2, 2012, at 5:46 AM, Andreas Müller wrote:
>
> > Hi Simone,
> >
> >> 4972.437: [GC pause (partial), 1.89505180 secs]
> >> that I cannot decipher (to Monica - what does "partial" mean?), and no
> >> mixed GCs, which seems unusual as well.
> > Oops, I understand that now: 'partial' used to be what 'mixed' is now!
> > Our portal usually runs on Java 6u33. For the G1 tests I switched to
> > 7u7 because I had learned that G1 is far from mature in 6u33.
> > But automatic deployments can overwrite the start script and thus
> > switch back to 6u33.
> >
> >> Are you sure you are actually using 1.7.0_u7 ?
> > I have checked the archived start scripts and the answer,
> > unfortunately, is: no.
> > The 'good case' was actually running on 7u7 (that's why it was good),
> > but the 'bad case' was unwittingly run on 6u33 again.
> > That's the true reason why the results were so much worse and so
> > incomprehensible.
> > Thank you very much for looking at the log and for asking good questions!
> >
> > I'll try to repeat the test and post the results on this list.
> >
> > Regards
> > Andreas
>
>
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>