JEP 248: Make G1 the Default Garbage Collector

Ben Evans ben at jclarity.com
Mon Jun 1 14:42:36 UTC 2015


Hi Vitaly,

(I've added hotspot-dev back on to the To: line as I think it's
important this discussion is had in public).

In general, Mark has outlined a design philosophy for the platform
that is conservative, and where, if features are not ready, then they
are slipped to the next major release. Features shouldn't be rushed or
releases delayed; instead, production-quality features should be
shipped when done.

So, to my mind, this issue comes down to whether the proposed benefit
is such that it outweighs the risks of changing the behaviour of
millions upon millions of installations. We don't have any systematic
data (which I argue should be a huge red flag in itself), and the
experience of consultants and performance engineers, including Kirk
and myself, is not exactly encouraging. So, does this change really
justify the risk?

I would also question the conclusion that all we can organise before
Java 10 is: "some reports from the field". For Java 8, the community
was able to engage with a pretty good group of F/OSS libraries & help
them to test on betas of 8, so they (& their users) could have
confidence that they would "just work" with 8 straight out of the box.

I see no reason why a similar approach could not work for G1 becoming
default - we can approach relevant partners in the ecosystem (e.g.
Cloudbees, Blazemeter, etc) and see if they can help, and we can
directly reach out and get people testing with G1. However, there is
an issue of timing and available resources here - there's a lot going
on for JDK 9 as it is, and I don't know how easy it would be to get
this programme running as well.
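
As a starting point for that kind of field testing, here is a minimal
sketch (not anything agreed on this list) of a Java harness that reports
which collectors the JVM actually selected, along with their cumulative
collection counts and times. The class name GcReport and the allocation
loop are hypothetical stand-ins for a real workload; the only platform
APIs used are the standard java.lang.management MXBeans.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcReport {

    /** Names of the collectors the running JVM actually selected. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** One line per collector: cumulative collection count and time so far. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value is -1 if the JVM does not report it for this collector.
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", total time ms=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload: churn through short-lived allocations.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] chunk = new byte[1024];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }
        System.out.println("checksum=" + checksum);
        System.out.print(report());
    }
}
```

Running it once with -XX:+UseG1GC and once with -XX:+UseParallelGC gives
a crude side-by-side. Accumulated collection time says nothing about
worst-case pauses, so GC logs would still be needed for any latency
comparison.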

Finally, the other issue that I'd like to address is that of scope
creep. I'd always been under the impression that G1 was thought of as
the CMS replacement. However, (and admittedly a lot of the systems I
see are either financial or gaming) in its current state there is no
way that G1 is a general replacement for CMS. The pauses for G1 are
simply too long for a big class of low-latency systems.

Instead, G1 is now being talked of as a replacement for the default
collector. If that's the case, then I think we need to acknowledge it,
and have a conversation about where G1 is actually supposed to be
used. Are we saying we want a "reasonably high throughput with reduced
STW, but not low pause time" collector? If we are, that's fine, but
that's not where we started.

Thanks,

Ben

On Mon, Jun 1, 2015 at 3:05 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> Kirk,
>
> I don't dispute that some people aren't tuning/touching the GC controls, and
> may get negatively impacted (but perhaps positively too).  My main point,
> however, is I don't see waiting until java 10 as adding sufficient safety
> guards; certainly there will be more lab time and benchmarking at oracle,
> some reports from the field but inevitably there will be unknown workloads
> in the wild that still don't work well even after more "due diligence".  If
> G1 is truly the successor to CMS, kicking the can further down the road
> isn't helping achieve that.  Anyone seeing a regression has an easy way to
> opt out.  Any such change will always weed out some outliers, java 9, 10 or
> 15.  The longer we wait, the harder it may be to fix some of them.
>
> sent from my phone
>
> On Jun 1, 2015 9:43 AM, "Kirk Pepperdine" <kirk at kodewerk.com> wrote:
>>
>> Hi Vitaly,
>>
>> Ben has only re-iterated what I’ve already said but in a more concise way.
>> And, I don’t mean to be insulting but I don’t really buy into the argument
>> that people will be specifying a collector anyways because there are still a
>> significant number that use the parallel collector. In fact, just today, I
>> recommended that someone move away from G1 to the parallel collector as that
>> use case clearly favored the recommendation.
>>
>> And I should add, I’ve now backed a number of deployments off of
>> tiered-compilation as IME it is impacting performance in a negative way.
>>
>> Regards,
>> Kirk
>>
>> On Jun 1, 2015, at 3:05 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>> > Ben,
>> >
>> > The customers using CMS won't be impacted since they're explicitly
>> > specifying the GC.  Java 9 will already require extensive testing for
>> > people, and GC performance is luckily one of the more introspectable
>> > facilities.  Furthermore, people who are keen on staying with the
>> > default
>> > collector should/can lock that in before moving to Java 9 since
>> > presumably
>> > there will be enough visibility of this change in release notes and
>> > such.
>> >
>> > Personally, I find changing default JIT compilation policy to tiered in
>> > java 8 a more risky change, but I don't recall seeing such fervor around
>> > it
>> > :).
>> >
>> > sent from my phone
>> > On Jun 1, 2015 6:37 AM, "Ben Evans" <ben at jclarity.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm somewhat late to this, having missed the original discussion
>> >> whilst travelling.
>> >>
>> >> Mark targeted this JEP to JDK 9 but has since put that on hold to
>> >> allow more discussion.
>> >>
>> >> I made this comment to Mark on jdk9-dev:
>> >>
>> >> "I have been working with G1 for ~5 years, ever since it was
>> >> experimental (& highly crash-prone in JDK 6).
>> >>
>> >> In the intervening time, I have seen dozens (if not hundreds) of
>> >> installations, across a wide range of customers. I have participated
>> >> in, or been consulted on at least a dozen direct trials of GC
>> >> alternatives.
>> >>
>> >> It is only in the last 18 months that I have seen *any* real-life
>> >> workload on G1 beat the alternatives, and only in the last 12 months
>> >> that I've had any customer prepared to go live with G1 in production.
>> >>
>> >> From my experience, I think that G1 is a fine collector, with a bright
>> >> future that should be pursued. However, I haven't seen anything that
>> >> would make a switch to it as default collector seem compelling in the
>> >> JDK 9 timeframe.
>> >>
>> >> Obviously, my experience is not universal, so I'd like to ask you /
>> >> Oracle:
>> >>
>> >> 1) Can you explain the survey methodology and customer testing that
>> >> you performed to arrive at the conclusion that G1 is ready to become
>> >> default?
>> >>
>> >> 2) Can you share aggregate results of the surveying ("We worked with X
>> >> customers and ran Y tests of G1 vs alternatives, and in Z% of cases,
>> >> G1 worked better by W margin")?
>> >>
>> >> 3) Can you ask some of the customers you worked with to speak publicly
>> >> about the trials you ran with them?"
>> >>
>> >> From reading this thread, am I right to conclude that no formal study
>> >> of this issue has been done?
>> >>
>> >> If that's the case, then are we really happy to make G1 default
>> >> without some more systematic efforts and attempts to obtain actual
>> >> numbers?
>> >>
>> >> The questions that I'd like to see answered are:
>> >>
>> >> a) How short a pause time can G1 support being tuned to? 50ms? 20?
>> >> Personally, I haven't seen it getting close to CMS in terms of STW
>> >> time.
>> >>
>> >> b) What is the impact on throughput due to G1?
>> >>
>> >> I do like G1 as a collector, but can we really organise enough field
>> >> tests in the pre-9 timeframe to justify such a large and potentially
>> >> breaking change? We managed to do some good community compatibility
>> >> testing for JDK 8, and we could think about a similar effort for "make
>> >> G1 default". However, with modules, HTTP/2 and JShell all happening
>> >> for 9, I question whether there is simply enough community bandwidth
>> >> to do a decent effort for G1 as well, whereas, if we were targeting
>> >> JDK 10 we'd have a lot more time to plan and to try to improve the
>> >> quality and range of the field data to hopefully de-risk a potential
>> >> large, high-profile failure.
>> >>
>> >> Thanks,
>> >>
>> >> Ben
>> >>
>> >>
>> >> On Thu, Apr 30, 2015 at 2:55 PM, Monica Beckwith
>> >> <monica at beckwithclan.com> wrote:
>> >>> I am also FOR the change in the default GC. Charlie and Mattis bring
>> >>> up
>> >>> great points. It's about time G1 gets put out there (as the default
>> >>> GC)
>> >>> since most of the development work is going into G1. As for
>> >> documentation,
>> >>> we not only need to document the change in the default collector but
>> >>> also
>> >>> the defaults for the collector that are enabled as soon as G1 is
>> >>> employed, e.g. MaxGCPauseMillis, IHOP, etc.
>> >>>
>> >>> With more and more input coming in, G1 is only going to get better and
>> >>> hopefully more adaptive :)
>> >>>
>> >>> And as for Charlie's question - I don't remember the last time that I
>> >> didn't
>> >>> see an explicit GC mentioned on the command line (even if it was the
>> >> default
>> >>> GC).
>> >>>
>> >>> These are just my two cents.
>> >>>
>> >>> -Monica
>> >>>
>> >>>
>> >>> On 4/30/15 8:17 AM, charlie hunt wrote:
>> >>>>
>> >>>> Fwiw, we should not forget that anyone who is currently specifying an
>> >>>> explicit GC to use in his or her JVM command line args will not
>> >> experience
>> >>>> any difference in behavior. They will still get the collector they
>> >> specify
>> >>>> to use. The (potential) impact will be on those who do not specify a
>> >>>> GC
>> >> to
>> >>>> use.
>> >>>>
>> >>>> What I would like to hear from Kirk and others who frequently work
>> >>>> with
>> >>>> customers on GC, what’s the percentage of Java applications they have
>> >> worked
>> >>>> with that do not explicitly specify a GC?  And, of those, what
>> >> percentage of
>> >>>> those apps fall into the categories of small heap and desire low
>> >> latency, or
>> >>>> desire high throughput even at the cost of frequent full GCs?
>> >>>>
>> >>>> thanks,
>> >>>>
>> >>>> charlie
>> >>>>
>> >>>>> On Apr 30, 2015, at 7:27 AM, Mattis Castegren
>> >>>>> <mattis.castegren at oracle.com> wrote:
>> >>>>>
>> >>>>> Hi.
>> >>>>>
>> >>>>> I also work with customers but I would like to give an argument FOR
>> >>>>> changing the default.
>> >>>>>
>> >>>>> I don't think we will ever come to a point where G1 is better for
>> >>>>> ALL
>> >>>>> users. Even with a near perfect G1 implementation there may be cases
>> >> where
>> >>>>> the parallel collector gives better throughput.
>> >>>>>
>> >>>>> Right now, I think G1 will be better for most users. There are
>> >>>>> probably
>> >>>>> also corner cases where G1 COULD be better, but where small issues
>> >>>>> reduce performance. By changing the default to G1, we will be able to
>> >>>>> find these more easily, as we will expose more users to G1.
>> >>>>>
>> >>>>> Finally, there will be a set of users who only care about
>> >>>>> throughput,
>> >> and
>> >>>>> who will see a performance regression. In those cases, they can go
>> >> back to
>> >>>>> using parallel. But hopefully, there will be far fewer users who
>> >>>>> need
>> >> to
>> >>>>> tune their application to run with parallel GC than there are users
>> >> who have
>> >>>>> to (or should) tune their application to run with G1.
>> >>>>>
>> >>>>> In the case of huge, business critical, applications, we will always
>> >>>>> introduce a risk by changing default collectors. This is true if we
>> >> change
>> >>>>> to G1 in JDK 9, 10 or 11. I prefer to just rip the band aid off. We
>> >> know
>> >>>>> that the collector we will focus on going forward is G1, so we
>> >>>>> should
>> >> let as
>> >>>>> many people use it as possible.
>> >>>>>
>> >>>>> Of course we should document this a lot, so that users who go up to
>> >> JDK 9
>> >>>>> and see performance regressions can at least try to run with
>> >>>>> Parallel
>> >> to see
>> >>>>> if it is due to the GC.
>> >>>>>
>> >>>>> Kind Regards
>> >>>>> /Mattis
>> >>>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Kirk Pepperdine [mailto:kirk at kodewerk.com]
>> >>>>> Sent: den 30 april 2015 13:18
>> >>>>> To: Stefan Johansson
>> >>>>> Cc: hotspot-dev at openjdk.java.net Source Developers
>> >>>>> Subject: Re: JEP 248: Make G1 the Default Garbage Collector
>> >>>>>
>> >>>>> Hi Stefan,
>> >>>>>
>> >>>>> Indeed, the improvements have been amazing. I have been getting many
>> >>>>> clients to bench with it and although the results have been mixed,
>> >> overall
>> >>>>> many have been able to move forward. However I still would not
>> >> recommend G1
>> >>>>> to anyone who can't move to 1.8.0_40. Of course this change will
>> >> obviously
>> >>>>> come post _40 but still, the recent emergence of the G1 as a viable
>> >>>>> production-ready collector suggests that making it a default may be a
>> >>>>> wee bit optimistic.
>> >>>>>
>> >>>>> The change is based on the assumption that limiting latency is often
>> >> more
>> >>>>> important than maximizing throughput. If this assumption is
>> >>>>> incorrect
>> >> then
>> >>>>> this change might need to be reconsidered.
>> >>>>>
>> >>>>> I would agree with this assumption. In most cases latency is more
>> >>>>> important. However G1 doesn't always provide lowest latency
>> >>>>> especially
>> >> in
>> >>>>> smaller heaps.
>> >>>>>
>> >>>>>
>> >>>>> G1 is seen as a robust and well-tested collector. It is not expected
>> >>>>> to
>> >>>>> have stability problems, but becoming the default collector will
>> >> increase
>> >>>>> its visibility and may reveal previously-unknown issues.
>> >>>>> I'm not sure it's prudent to treat the entire Java ecosystem as guinea
>> >>>>> pigs. I believe it's more prudent to have the willing take that first
>> >>>>> step rather than have it unwittingly dropped on everyone.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> At the end of the day, I don't have any say in any of this (as it
>> >> should
>> >>>>> be). All I can do is let you know what I'm seeing through my straw
>> >> with the
>> >>>>> hope that you'll find the information useful. From what I see, there
>> >>>>> is not nearly enough experience in tuning the G1, and that is
>> >>>>> especially true in the general population, to make this type of
>> >>>>> change at this point in time.
>> >>>>> I'm also not sure that we have all the tuning options we need to
>> >>>>> ensure
>> >>>>> "happy apps" in the wild. For example, I think the incremental
>> >> accumulated
>> >>>>> waste in tenured regions is a problem that I'm not sure we have the
>> >> tools to
>> >>>>> solve. I'm not even sure if it's a recognized problem. In fact I'm
>> >>>>> not
>> >> even
>> >>>>> sure it's a real problem as at the moment it's only a theory based
>> >>>>> on
>> >>>>> observations I'm making by looking at numbers of GC logs produced by
>> >>>>> applications using recent releases of the G1.
>> >>>>>
>> >>>>> I would suggest that making Tiered the default config for 8 was also a
>> >>>>> bit premature. I've had a number of clients have to roll back off of
>> >>>>> it.
>> >>>>>
>> >>>>> - Kirk
>> >>>>>
>> >>>>> On Apr 29, 2015, at 3:03 PM, Stefan Johansson
>> >>>>> <stefan.johansson at oracle.com> wrote:
>> >>>>>
>> >>>>>> Hi Kirk,
>> >>>>>>
>> >>>>>> A lot of effort is put into G1, it has been continuously improving
>> >> over
>> >>>>>> the last couple of years and we now believe that G1 is ready to
>> >> become the
>> >>>>>> default. G1 will not improve all use cases, but the same is true for
>> >> the
>> >>>>>> other collectors. For users where throughput is the main concern,
>> >> Parallel
>> >>>>>> GC can still be used by specifying -XX:+UseParallelGC on the
>> >> command-line.
>> >>>>>>
>> >>>>>> Regards,
>> >>>>>> Stefan
>> >>>>>>
>> >>>>>> On 2015-04-29 09:10, Kirk Pepperdine wrote:
>> >>>>>>>
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> Is the G1 ready for this? I see many people moving to G1 but I'm also
>> >>>>>>> not sure that we've got the tunables correct. I've been sorting
>> >> through a
>> >>>>>>> number of recent tuning engagements and my  conclusion is that I
>> >> would like
>> >>>>>>> the collector to be aggressive about collecting tenured regions at
>> >> the
>> >>>>>>> beginning of a JVM's life time but then become less aggressive
>> >>>>>>> over
>> >> time.
>> >>>>>>> The reason is the residual waste that I see left behind because
>> >> certain
>> >>>>>>> regions never hit the threshold needed to be included in the CSET.
>> >> But, on
>> >>>>>>> aggregate, the number of regions in this state does start to retain a
>> >>>>>>> significant amount of dead data. The only way to see the effects is to
>> >>>>>>> run regular Full GCs... which of course you don't really want to do.
>> >> However, the
>> >>>>>>> problem seems to settle down a wee bit over time which is why I
>> >>>>>>> was
>> >> thinking
>> >>>>>>> that being aggressive about what is collected in the early stages of a
>> >>>>>>> JVM's life should lead to better packing and hence less waste.
>> >>>>>>>
>> >>>>>>> Note, I don't really care about the memory waste, only its effect on
>> >>>>>>> cycle frequencies and pause times.
>> >>>>>>>
>> >>>>>>> Sorry but I don't have anything formal about this as I (and I believe
>> >>>>>>> many others) am still sorting out what to make of the G1 in prod.
>> >> Generally
>> >>>>>>> the overall results are good but sometimes it's not that way up
>> >> front and
>> >>>>>>> how to improve things is sometimes challenging.
>> >>>>>>>
>> >>>>>>> On a side note, the move to Tiered in 8 has also caused a bit of
>> >> grief.
>> >>>>>>> Metaspace has caused a bit of grief and even parallelStream, which works,
>> >>>>>>> has come with some interesting side effects. Everyone has been so
>> >> enamored
>> >>>>>>> with Lambdas (rightfully so) that the other stuff has been
>> >>>>>>> completely
>> >>>>>>> forgotten and some of it has surprised people. I guess I'll be
>> >> submitting a
>> >>>>>>> talk for J1 on some of the field experience I've had with the
>> >>>>>>> other
>> >> stuff.
>> >>>>>>>
>> >>>>>>> Regards,
>> >>>>>>> Kirk
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Apr 28, 2015, at 11:02 PM, mark.reinhold at oracle.com wrote:
>> >>>>>>>
>> >>>>>>>> New JEP Candidate: http://openjdk.java.net/jeps/248
>> >>>>>>>>
>> >>>>>>>> - Mark
>> >>>>>>
>> >>>>>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Ben Evans, Co-founder jClarity @jclarity
>> >>
>>
>



-- 
Ben Evans, Co-founder jClarity @jclarity

