JEP 291: Deprecate the Concurrent Mark Sweep (CMS) Garbage Collector

Ben Evans benjamin.john.evans at gmail.com
Sun Apr 16 11:38:59 UTC 2017


Hi Thomas,

I'm very glad the community has been able to provide an example of an
experimental configuration that's able to meet your publicised claims
- and all it took was unlocking experimental options and several
months of tuning work!

However, if I'm reading your mail correctly, your point is that the
version of G1 that will be delivered & made default in Java 9 is
substantially different from anything that has been seen so far in
shipping versions of Java 8?

Is that correct? Oracle are planning to make default a GC which has
had, essentially, zero testing on actual production workloads?

Thanks,

Ben


On Sat, Apr 15, 2017 at 4:57 PM, Thomas Schatzl
<thomas.schatzl at oracle.com> wrote:
> Hi Daniel,
>
> On Tue, 2017-04-11 at 22:04 -0400, Daniel Ennis wrote:
>> Hello,
>> I've pretty much been the lead on GC research as it applies to
>> Minecraft servers. I spent a few months researching everything I
>> could on GC tuning, and analyzing my server using VisualGC inside of
>> VisualVM.
>>
>> I chose to study G1 as the CMS fragmentation issue constantly causes
>> MC servers problems with large spikes (which is extremely visible for
>> a realtime game) when the full collections trigger.
>>
>> In the end, I was able to optimize G1 to provide 25ms pause times on
>> smaller heaps such as 2G, and 40 to 100ms on a 10G heap.
>>
> [...]
>
>> I use a 10GB heap on my production server and average 50-100ms pauses
>> with the following flags:
>>
>>  -Dfile.encoding=UTF-8 -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions
>> -XX:MaxGCPauseMillis=50 -XX:+DisableExplicitGC
>> -XX:G1HeapRegionSize=8M
>> -XX:TargetSurvivorRatio=90 -XX:+AggressiveOpts
>> -XX:HeapDumpPath=crash-reports/ -XX:+HeapDumpOnOutOfMemoryError
>> -Xloggc:logs/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
>> -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=20M
>> -XX:LargePageSizeInBytes=2M
>> -XX:+UseLargePages -XX:+AlwaysPreTouch -XX:+UseLargePagesInMetaspace
>> -XX:G1NewSizePercent=50 -XX:G1MaxNewSizePercent=80
>> -XX:InitiatingHeapOccupancyPercent=20
>> -XX:G1MixedGCLiveThresholdPercent=40
>> -XX:ParallelGCThreads=6 -DprintSaveStats=60 -Xmx10G -Xms10
>>
>> The key here really is the experimental options G1NewSizePercent and
>> G1MaxNewSizePercent. These options let me constrain G1's predictive
>> calculations. You give it a closer-to-normal window, and it still
>> does its predictive calculations inside of that window.
>
> From the logs, it looks like G1NewSizePercent may be just a tad too
> high to keep the 50ms pause time all the time, but I may be wrong.
>
> Maybe -XX:+PrintGCDetails could give details about the reason.
>
>> With this setup, most collections are young. A few are mixed to keep
>> old somewhat maintained.
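One way to check the split between collection types from inside a running JVM, besides reading the GC log, is the standard `GarbageCollectorMXBean` API. The sketch below is an illustration rather than anything from the original thread. The bean names depend on the collector and JDK version, so it prints whatever the JVM reports. The reported time is accumulated collection time as the JVM accounts for it, which may not match individual pause times.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Locale;

public class GcTally {
    /** Formats one collector's totals; count may be -1 if the JVM does not report it. */
    static String summarize(String name, long count, long totalMillis) {
        double avg = count > 0 ? (double) totalMillis / count : 0.0;
        return String.format(Locale.ROOT,
                "%s: %d collections, %d ms total, %.1f ms avg",
                name, count, totalMillis, avg);
    }

    public static void main(String[] args) {
        // One bean per collector the running JVM exposes.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(
                    summarize(gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
    }
}
```

Run inside the server process (or adapted to a remote JMX connection), this gives a coarse per-collector tally. The GC log remains the place to look for per-pause detail.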
>>
>> I'm not confident I have the best G1MixedGCLiveThresholdPercent or
>> InitiatingHeapOccupancyPercent configurations for maintaining old,
>> but things are so buttery smooth for me now I haven't had
>> justification to mess with them.
>
> If you do, particularly when you move to JDK 9, feel free to drop a
> note about your experience on hotspot-gc-use, or post there if you
> just want to discuss your thoughts.
>
>> My ultimate focus is on Young Generation. Minecraft servers allocate
>> a ton of short lived memory.
>>
>> My end goal is to avoid EVER triggering a full collection. I focus on
>> keeping the intervals between young collections as long as possible,
>> giving old only enough room that it ultimately needs + room for
>> unexpected load (if I gain more than expected, then full collections
>> are triggered)
>>
>> If your allocation rate is high, and the memory is short lived,
>> having less space on Eden results in survivor filling up fast, and
>> premature promotions.
>>
>> Eden is cheap to collect, so my tuning aims to let the memory die
>> while still in Eden and before age 15 of survivor, providing these
>> consistently low pause times.
>
> Unfortunately the logs do not show the tenuring distribution; if you
> enabled -XX:+PrintTenuringDistribution we would know more.
>
> Note that G1 can manage allocation in old gen, and data there that
> dies off quickly can also be reclaimed very quickly. The
> concurrent marking on the 10g heap only takes half a second and you are
> very far from running out of memory.
>
> So some throughput may be reclaimed by increasing
> the InitiatingHeapOccupancyPercent.
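To make the effect of raising InitiatingHeapOccupancyPercent concrete, here is a minimal arithmetic sketch. It assumes the threshold is simply a percentage of the 10G heap. Exactly which occupancy G1 compares against that threshold differs between JDK versions, so treat the figures as rough. The class name is made up. 45 is the HotSpot default for this flag.

```java
public class IhopThreshold {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Occupancy in bytes at which concurrent marking would be requested. */
    static long thresholdBytes(long heapBytes, int ihopPercent) {
        return heapBytes * ihopPercent / 100;
    }

    public static void main(String[] args) {
        long heap = 10 * GB;                // -Xmx10G
        int[] candidates = {20, 30, 45};    // 20 is the posted setting, 45 the default

        for (int percent : candidates) {
            long mb = thresholdBytes(heap, percent) / MB;
            System.out.println("IHOP=" + percent + "% -> marking starts near " + mb + " MB");
        }
    }
}
```

With the posted value of 20, marking would begin at around 2G of occupancy. A higher value delays marking cycles, which is where the throughput gain mentioned above would come from, at the cost of less headroom before old fills up.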
>
> Note that JDK 9 will try to automatically determine one, most of the
> time giving a very good baseline.
>
>> I believe the key to maintaining the low pauses is to keep most of
>> the focus on young and avoid ever having to do much work in old.
>
> Thanks a lot for stepping in,
>   Thomas
>


More information about the jdk9-dev mailing list