AW: Why G1 doesn't cut it for our application

Thu Apr 10 13:11:23 UTC 2014

Hi,

On Thu, 2014-04-10 at 13:17 +0200, Cornelius Riemenschneider wrote:
> Hey,
>
> first, we run two different types of servers – one of them is very
> allocation heavy, with the possibility of huge objects whcih live
> longer.
>
> The other one usually has only small allocations, but some requests
> may trigger the allocation of up to ~150MB of data at once.
>
> This data is used at whole, but very short-lived. Explanation:

That's the perfect case for the change in JDK-8027959.

> We sometimes ask the question: Given two unsorted sets of integer IDs,
> how many common ids are in both?
>
> Aka the cardinal number of the intersection of the two sets.
>
> The fastest way we found to determine is this:
>
[...]

> I even found this gem in our log:
> 
> 485471.317: [G1Ergonomics (Heap Sizing) attempt heap expansion,
> reason: humongous allocation request failed, allocation request:
> 81000016 bytes]
> 485471.317: [G1Ergonomics (Heap Sizing) expand the heap, requested
> expansion amount: 67108864 bytes, attempted expansion amount: 67108864
> bytes]
> 485471.317: [G1Ergonomics (Heap Sizing) did not expand the heap,
> reason: heap expansion operation failed]
> 485471.319: [G1Ergonomics (Heap Sizing) attempt heap expansion,
> reason: humongous allocation request failed, allocation request:
> 81000016 bytes]
> 485471.319: [G1Ergonomics (Heap Sizing) expand the heap, requested
> expansion amount: 67108864 bytes, attempted expansion amount: 67108864
> bytes]
> 485471.319: [G1Ergonomics (Heap Sizing) did not expand the heap,
> reason: heap expansion operation failed]
> 485471.319: [G1Ergonomics (Heap Sizing) attempt heap expansion,
> reason: allocation request failed, allocation request: 81000016 bytes]
> 485471.319: [G1Ergonomics (Heap Sizing) expand the heap, requested
> expansion amount: 81000016 bytes, attempted expansion amount: 83886080
> bytes]
> 485471.319: [G1Ergonomics (Heap Sizing) did not expand the heap,
> reason: heap expansion operation failed]

Note that these log messages come from different threads, so an
"allocation request: 81000016 bytes" followed by "attempted expansion
amount: 67108864 bytes]" is perfectly natural :)

> 2014-04-10T11:58:26.028+0200: 485471.319: [Full GC 11G->7884M(20G),
> 20.8011100 secs]
>    [Eden: 0.0B(1584.0M)->0.0B(1600.0M) Survivors: 8192.0K->0.0B Heap:
> 11.3G(20.0G)->7884.1M(20.0G)]
> [Times: user=37.88 sys=0.00, real=20.81 secs]

Not sure why G1 immediately tries to do a full GC here.

> about 80MB allocation, 9GB free but it still requires a fullgc (though
> still with 7u51).

Imo there is a bug (or at least some documentation issue) in total size
accounting for large objects: only the space that is actually occupied
by the object is counted as such, not the actual space it takes (object
size rounded up to regions).

So, if, for example you have lots of objects that are region_size + 1
byte large, the value shown sums up these region_size + 1 bytes, but in
reality 2 * region_size bytes are taken up.

So in extreme cases, this value can be off by almost 50% :) This seems
like one of these cases. You can find out with the PrintHeapAtGCExtended
flag by looking if the HC (humongous object) regions are almost empty.

Unfortunately typical sizing rules-of-thumb tend to create these
objects: i.e. start with an element size that is a power of two, and
double that every time a reallocation is required.

Taking the object header into account, you end up with objects of
region_size + 16 bytes size :/

But, this is what JDK-8031381 is about :)

> The settings used there are:
[...]
> I can’t increase the heap further without deploying more RAM.
>  
> I tested a lot more different options with out allocation heavy
> servers, because restarting these has no direct impact on our
> customers.
> 
> The configuration which worked best used the following: A high pause
> time goal (600ms), but a limit to the newsize (6g), so that we still
> have enough oldspace for the big
> objects.                                                                                                                
> 
> I also tried setting the regionsize to 32mb, but this didn’t cause a
> noticable difference from the default. Setting InitiatingHeapSize to
> 0, 1 or 45 didn’t impact the amount of fullgcs either (at least not
> noticable).

The allocation rate is too high so that InitiatingHeapSize makes a
difference. Not exactly sure why CMS would work better in this case, but
probably because it just allocates (most of) these objects into the
young gen (not sure if there is some large object threshold, if so it is
probably higher than 32M), getting rid of them the next young collection
easily.

> When i’ll test with 8u20, I’ll run with +PrintHeapAtGCExtended.

You also need to add -XX:+PrintHeapAtGC so that the "extended" output is
shown btw.

Thomas