G1 GC for 100GB+ heaps

Thu Jul 23 14:41:41 UTC 2015

Dear Thomas,

Thank you for the helpful response and for the links.

>> Marcus Lagergren suggested I post these questions on this list. We
>> are considering switching to using the G1 GC for a decently sized
>> HBase cluster, and ran into some questions. Hope you can help me
>> out, or point me to the place where I should ask.
> 
> This place is fine, although hotspot-gc-use might be more appropriate.

Moved CC there.

> However you do not mention what your goals are (throughput or latency or
> a mix of that), so it is hard to say whether G1 can meet your
> expectations.

Our goal is to limit pause times. Most traffic on the HBase cluster comes from background jobs such as generating indexes and searching, but occasionally we retrieve a document synchronously from the web front-end, and we want to serve those requests quickly.

The maximum pause time we aim for is 100ms, which looks entirely doable. Maybe we should set our goals a little more aggressively. ;-)

We have a test cluster running with -XX:MaxGCPauseMillis=100, but I found that this actually results in an *average* pause of 100ms, not a maximum. Is that observation correct, or am I misinterpreting something?
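To check that observation against a log, something like the following could report the average and the maximum pause side by side. This is only a sketch: it assumes JDK 7/8-style G1 log lines that contain "GC pause" and end in "N.NNNNNNN secs]", and both the function name pause_stats and the file name gc.log are made up for illustration.

```shell
# Summarise G1 pause times from a GC log (assumed JDK 7/8 log format).
pause_stats() {
  awk '
    # Only look at pause lines that carry a duration such as "0.0987650 secs]".
    /GC pause/ && match($0, /[0-9]+\.[0-9]+ secs\]/) {
      # awk converts the leading number and ignores the trailing " secs]".
      ms = substr($0, RSTART, RLENGTH) * 1000
      sum += ms
      n++
      if (ms > max) max = ms
    }
    END {
      if (n > 0) printf "pauses=%d avg_ms=%.1f max_ms=%.1f\n", n, sum / n, max
    }
  ' "$1"
}

# Usage (hypothetical file name):
#   pause_stats gc.log
```

If the average sits near 100ms while the maximum is well above it, that would match what I am seeing on the test cluster.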

>> What kind of region sizing
>> should we use, or should we just let G1 do what it does?
> 
> Initially we recommend just setting heap size (Xms/Xmx) and pause time
> goals (-XX:MaxGCPauseMillis). Depending on your results, consider
> decreasing G1NewSizePercent and increasing the number of marking
> threads (see the first few links above).

Right, that’s what I heard from a few sources: set the size and pause time target and just leave it alone.
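As a minimal sketch of that starting point, the options would look roughly like this. The variable name G1_OPTS is just for illustration, and 64g is taken from the heap size we run today rather than being a recommendation.

```shell
# Heap size plus pause time goal only; everything else is left to G1.
# G1_OPTS is a hypothetical variable name, 64g is our current heap size.
G1_OPTS="-XX:+UseG1GC -Xms64g -Xmx64g -XX:MaxGCPauseMillis=100"
echo "$G1_OPTS"
```

For HBase these would presumably be appended to HBASE_REGIONSERVER_OPTS in hbase-env.sh.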

> Consider that G1 needs some extra space for operation. So at 100G
> Java heap, and 128G RAM, the system might start to swap/thrash,
> particularly if other stuff is running there. I.e. monitor that using
> e.g. vmstat. Should be avoided :)

Yes, these are dedicated machines and should never hit swap, and we’ll monitor them to make sure they don’t. Today the machines are running with 64GB heaps for that reason.
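A rough sketch of how the vmstat monitoring could be done: the filter below prints only the samples that show swap activity. It assumes the usual Linux (procps) vmstat layout, where si and so are columns 7 and 8 and the first two lines are headers; swap_lines is a made-up name.

```shell
# Print vmstat samples with non-zero swap-in (si) or swap-out (so).
# Assumes procps vmstat: two header lines, si in column 7, so in column 8.
swap_lines() {
  awk 'NR > 2 && ($7 > 0 || $8 > 0) { print "swap activity:", $0 }'
}

# Usage: sample every 5 seconds, print the header only once (-n):
#   vmstat -n 5 | swap_lines
```

Note that the first sample vmstat prints is an average since boot, so a single hit on that line does not necessarily mean the machine is swapping now.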

> If you are running on Linux, completely disable Transparent Huge Pages
> (use a search engine to get to know how it is done on your
> particular distro). Always, we have found no exceptions.

Thank you for that advice. I got the same advice from Kirk Pepperdine this week. Our systems actually run with transparent huge pages enabled and I’ll ask the guys to switch that off.

$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
$ _
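For reference, a small sketch of how the active mode could be read and switched off. The helper name thp_mode is made up; the sysfs paths are the standard kernel ones shown above, and writing to them requires root and only lasts until the next reboot, so a persistent setting still has to be done the distro's way.

```shell
# Print the active THP mode, i.e. the bracketed word in the sysfs file.
# Defaults to the "enabled" file; another file can be passed as argument.
thp_mode() {
  sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' \
    "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# To switch THP off until the next reboot (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

On our machines thp_mode would currently print "always"; after the change it should print "never".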

> Other than that the above recommendations should be okay. If there are
> particular issues you may want to come back with a log of a problematic
> run with at least -XX:+PrintGCTimeStamps -XX:+PrintGCDetails set.

Thank you for the kind offer. It will be a few weeks before we get into the thick of this as the summer holidays are settling over us.
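When we get to that point, the logging options would look roughly like the sketch below. It assumes a JDK 8-era JVM; the variable name GC_LOG_OPTS and the log path are placeholders, and -Xloggc is my addition to the two flags you named, to send the output to a file.

```shell
# GC logging for a problematic run (JDK 8-era flags).
# GC_LOG_OPTS and /tmp/gc.log are placeholders, not our real setup.
GC_LOG_OPTS="-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/tmp/gc.log"
echo "$GC_LOG_OPTS"
```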

--
Kees Jan

kjkoster at java-monitor.com
http://java-monitor.com/
+31651838192

The secret of success lies in the stability of the goal. -- Benjamin Disraeli

