JEP 248: Make G1 the Default Garbage Collector

Vitaly Davidovich vitalyd at gmail.com
Mon Jun 1 19:16:50 UTC 2015


>
> I suppose it is worth mentioning that the population of apps that don’t
> stress GC is pretty small compared to those that do. ;-)


Sadly, that's true :).

On Mon, Jun 1, 2015 at 3:12 PM, charlie hunt <charlie.hunt at oracle.com>
wrote:

> Yep, that’s right.
>
> I suppose it is worth mentioning that the population of apps that don’t
> stress GC is pretty small compared to those that do. ;-)
>
> On Jun 1, 2015, at 2:01 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> Also, G1 has heavier write barriers than Parallel GC, so some existing
> workloads that don't stress the GC (e.g., code written purposely to avoid
> GC during uptime by, say, object pooling) and that wouldn't have tweaked
> the default may experience some degradation.
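
A minimal sketch of the pooling pattern being described (the class name and
sizes are made up for illustration): an application like this allocates up
front and recycles objects, so it rarely triggers a collection, yet every
reference store still pays the collector's write barrier, which is heavier
under G1 than Parallel GC's card-table barrier.

    import java.util.ArrayDeque;

    // Hypothetical buffer pool: allocate everything up front, then recycle.
    // Little garbage is created at steady state, so the collector is rarely
    // exercised, but the reference stores below still go through the GC
    // write barrier on every call.
    final class BufferPool {
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();
        private final int bufferSize;

        BufferPool(int count, int bufferSize) {
            this.bufferSize = bufferSize;
            for (int i = 0; i < count; i++) {
                free.push(new byte[bufferSize]);
            }
        }

        byte[] acquire() {
            byte[] b = free.poll();
            return (b != null) ? b : new byte[bufferSize]; // pool empty: fall back to allocation
        }

        void release(byte[] b) {
            free.push(b); // reference store into the deque's backing array -> write barrier
        }
    }

An application like this that was sized around the Parallel GC default can of
course keep that behavior by passing -XX:+UseParallelGC explicitly.
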
>
> On Mon, Jun 1, 2015 at 2:53 PM, charlie hunt <charlie.hunt at oracle.com>
> wrote:
>
>> Hi Jenny,
>>
>> A couple questions and comments below.
>>
>> thanks,
>>
>> charlie
>>
>> > On Jun 1, 2015, at 1:28 PM, Yu Zhang <yu.zhang at oracle.com> wrote:
>> >
>> > Hi,
>> >
>> > I have done some performance comparisons of G1/CMS/Parallel GC internally
>> > at Oracle.  I would like to post my observations here to get some
>> > feedback, as I have limited benchmarks and hardware.  These are
>> > out-of-the-box performance results.
>> >
>> > Memory footprint/startup:
>> > G1 has a bigger memory footprint and a longer startup time. The overhead
>> > comes from more GC threads and from the internal data structures that
>> > track the remembered sets.
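
For what it's worth, one way to see and bound that overhead, sketched with a
placeholder application name and made-up heap sizes (the flags themselves are
standard HotSpot options):

    # Native Memory Tracking breaks down the JVM's own footprint; the "GC"
    # category includes the remembered-set and other collector data structures,
    # and "Thread" includes the GC worker thread stacks.
    java -XX:+UseG1GC -XX:NativeMemoryTracking=summary -Xms4g -Xmx4g MyApp
    jcmd <pid> VM.native_memory summary

    # GC thread counts can be capped if the defaults are too high for the machine.
    java -XX:+UseG1GC -XX:ParallelGCThreads=4 -XX:ConcGCThreads=1 -Xms4g -Xmx4g MyApp
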
>>
>> This is the memory footprint of the JVM itself when using the same size
>> Java heap, right?
>>
>> I don’t recall if it has been your observation?  One observation I have
>> had with G1 is that it tends to be able to operate within tolerable
>> throughput and latency with a smaller Java heap than with Parallel GC.  I
>> have seen cases where G1 may not use the entire Java heap because it was
>> able to keep enough free regions available yet still meet pause time goals.
>> But, Parallel GC always use the entire Java heap, and once its occupancy
>> reach capacity, it would GC. So they are cases where between the JVM’s
>> footprint overhead, and taking into account the amount of Java heap
>> required, G1 may actually require less memory.
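
A sketch of that comparison, with made-up heap sizes and pause goal rather
than recommendations, and a placeholder application name:

    # Parallel GC runs until the heap fills and then collects it.
    java -XX:+UseParallelGC -Xms8g -Xmx8g -verbose:gc MyApp

    # G1 collects regions incrementally against a pause goal, so it may meet
    # the same latency target with a smaller heap.
    java -XX:+UseG1GC -Xms6g -Xmx6g -XX:MaxGCPauseMillis=100 -verbose:gc MyApp
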
>>
>> >
>> > G1 vs Parallel GC:
>> > If the workload involves young GCs only, G1 could be slightly slower.
>> > G1 can also consume more CPU, which might slow down the benchmark if the
>> > SUT is CPU-saturated.
>> >
>> > If there are promotions from young to old gen that lead to full GCs with
>> > Parallel GC, then for smaller heaps the parallel full GC can finish within
>> > an acceptable pause time and still outperforms G1.  But for bigger heaps,
>> > G1 mixed GCs can clean the heap with pause times that are a fraction of a
>> > parallel full GC, improving both throughput and response time.  Extreme
>> > cases are big data workloads (for example YCSB) with a 100 GB heap.
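
To illustrate the large-heap case, something like the following (the
application name is a placeholder; the two G1 values shown are simply its
defaults, 200 ms and 45%):

    # On a ~100 GB heap a parallel full GC is one long stop-the-world pause,
    # while G1 reclaims old regions in mixed collections sized to the pause goal.
    java -XX:+UseG1GC -Xms100g -Xmx100g \
         -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45 \
         MyBigDataApp
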
>>
>> I think what you are saying here is that if one can tune Parallel GC to
>> avoid a lengthy collection of the old generation, or the live occupancy of
>> the old gen is small enough that the time to collect it can be tolerated,
>> then Parallel GC will offer a better experience.
>>
>> However, if the live data in the old generation at the time of its
>> collection is large enough that the time it takes to collect exceeds a
>> tolerable pause time, then G1 will offer a better experience.
>>
>> Would you also say that G1 offers a better experience in the presence of
>> (wide) swings in object allocation rates, since there would likely be a
>> larger number of promotions during the allocation spikes?  In other words,
>> G1 may offer more predictable pauses.
>>
>> >
>> > G1 vs CMS:
>> > I will focus on response-time-oriented workloads.
>> > Ben mentioned:
>> >
>> > "Having said that, there is definitely a decent-sized class of systems
>> > (not just in finance) that cannot really tolerate any more than about
>> > 10-15ms of STW. So, what usually happens is that they live with the
>> > young collections, use CMS and tune out the CMFs as best they can (by
>> > clustering, rolling restart, etc, etc). I don't see any possibility of
>> > G1 becoming a viable solution for those systems any time soon."
>> >
>> > Can you give more details, like the live data set size, how big the heap
>> > is, etc.?  I did some cache tests (Oracle Coherence) to compare CMS vs.
>> > G1.  G1 is better than CMS when there is fragmentation; if you tune CMS
>> > well enough to have little fragmentation, then G1 is behind CMS.  But in
>> > those cases CMS has already been tuned very carefully, so changing the
>> > default to G1 won't impact them.
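
For context, "tuning CMS well" here typically means settings along these
lines (a sketch with placeholder heap size, occupancy value, and application
name):

    # Start concurrent old-gen collections at a fixed, early occupancy so the
    # old gen is cleaned before fragmentation or a concurrent mode failure
    # forces a full, compacting stop-the-world collection.
    java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:CMSInitiatingOccupancyFraction=65 -XX:+UseCMSInitiatingOccupancyOnly \
         -Xms16g -Xmx16g MyCacheApp
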
>> >
>> > For big data workloads (YCSB, Spark in-memory computing), G1 is much
>> > better than CMS.
>> >
>> > Thanks,
>> > Jenny
>> >
>> > On 6/1/2015 10:06 AM, Ben Evans wrote:
>> >> Hi Vitaly,
>> >>
>> >>>> Instead, G1 is now being talked of as a replacement for the default
>> >>>> collector. If that's the case, then I think we need to acknowledge it,
>> >>>> and have a conversation about where G1 is actually supposed to be
>> >>>> used. Are we saying we want a "reasonably high throughput with reduced
>> >>>> STW, but not low pause time" collector? If we are, that's fine, but
>> >>>> that's not where we started.
>> >>> That's a fair point, and one I'd be interested in hearing an answer to
>> >>> as well.  FWIW, the only GC I know of that's actually used in low
>> >>> latency systems is Azul's C4, so I'm not even sure Oracle is trying to
>> >>> target the same use cases.  So when we talk about "low latency" GCs, we
>> >>> should probably also be clear on what "low" actually means.
>> >> Well, when I started playing with them, "low latency" meant a
>> >> sub-10-ms transaction time with 100ms STW as acceptable, if not ideal.
>> >>
>> >> These days, the same sort of system needs a sub-500µs transaction
>> >> time, and ideally no GC pause at all. But that leads to Zing, or
>> >> non-JVM solutions, and I think that takes us too far into a specialised
>> >> use case.
>> >>
>> >> Having said that, there is definitely a decent-sized class of systems
>> >> (not just in finance) that cannot really tolerate any more than about
>> >> 10-15ms of STW. So, what usually happens is that they live with the
>> >> young collections, use CMS and tune out the CMFs as best they can (by
>> >> clustering, rolling restart, etc, etc). I don't see any possibility of
>> >> G1 becoming a viable solution for those systems any time soon.
>> >>
>> >> Thanks,
>> >>
>> >> Ben
>> >
>>
>>
>
>

