[G1] Why is max. region size restricted to 32m?
Thomas Schatzl
thomas.schatzl at oracle.com
Mon Aug 31 11:50:18 UTC 2020
Hi,
On 25.08.20 07:46, Vishal Chand wrote:
> Hi,
>
> While running G1 with 128g heap, I am observing # of regions close to 4k,
> which is twice the target number of regions G1 tries to achieve (2048). I
> would like to know any major reason/challenges for keeping the max region
> size restricted to 32m, apart from what is mentioned in
> src/hotspot/share/gc/g1/heapRegionBounds.hpp.
The main reason is that 32m regions have been found to be a good tradeoff
between region size and space usage in the remembered sets.
G1 stores the locations of references to other regions as card
indices relative to the start of a region. With a card size of 512 bytes
(the currently compiled-in default), a 32m region contains exactly
2^16 such cards, which means that G1 can store such an index as a 16-bit
integer instead of something larger, saving (lots of) space.
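To make the arithmetic concrete, here is a minimal standalone sketch (not
HotSpot code; the names are made up for illustration) of why 32m is the
largest region size whose per-region card indices still fit into 16 bits:

  #include <cstdint>
  #include <cstdio>

  int main() {
    const size_t card_size   = 512;              // compiled-in default card size
    const size_t region_size = 32 * 1024 * 1024; // 32m region
    const size_t cards_per_region = region_size / card_size;

    // 32m / 512 = 65536 = 2^16, so a card index within a region is at
    // most 65535 and fits exactly into an unsigned 16-bit integer.
    printf("cards per region: %zu\n", cards_per_region);
    printf("fits in uint16_t: %s\n",
           cards_per_region - 1 <= UINT16_MAX ? "yes" : "no");
    return 0;
  }
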
Saving space not only frees that memory for other purposes; most of the
time during a gc pause is spent waiting for memory to be loaded into
registers/caches, so smaller remembered sets also speed up the pause itself.
Off the top of my head there are several options to work around this; no
guarantees that these changes will not trip up something somewhere else:
1) use a larger data type for card indices, i.e. change the line
typedef uint16_t card_elem_t;
in sparsePRT.hpp to something larger, e.g. uint32_t ;)
2) use a larger card size; for every doubling here you can double the
region size. See CardTable::card_shift and related code; the sketch after
2a below puts numbers on both this and option 1.
Particularly with larger heaps a larger card size might even be
advantageous for several reasons. 1024 bytes per card certainly seems
okay with today's heap sizes, maybe even 2048 is still good.
2a) decouple "remembered set cards" from "card table cards" - afaik
they have been set to the same value for historical reasons, which
resulted in this problem in the first place.
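To put rough numbers on options 1 and 2, here is a small standalone sketch
(again not HotSpot code; the function name and constants are illustrative
only) of how the card size and the width of the stored card index together
bound the maximum region size:

  #include <cstdio>

  // With 'index_bits'-bit card indices a region can hold at most
  // 2^index_bits cards, so the largest representable region size is
  // card_size << index_bits.
  static unsigned long long max_region_size(unsigned long long card_size,
                                            unsigned index_bits) {
    return card_size << index_bits;
  }

  int main() {
    const unsigned long long M = 1024 * 1024;
    printf("%llum\n", max_region_size(512, 16) / M);   // 32m  - current limit
    printf("%llum\n", max_region_size(1024, 16) / M);  // 64m  - option 2, 1k cards
    printf("%llum\n", max_region_size(2048, 16) / M);  // 128m - option 2, 2k cards
    printf("%llum\n", max_region_size(512, 32) / M);   // option 1: 32-bit indices
    return 0;
  }
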
Note that while 1) is fairly quick to do, it will not only make the memory
consumption of the sparse remembered set representation larger
(effectively doubling it), which may be sort-of-okay'ish (these are like
~100 entries for every region having references into the remembered set
owner's region), but the fine remembered set representation will also
double with every doubling of the region size (e.g. from 8k per fine
remembered set table to 16k when going from 32 to 64m regions).
That one will hurt *a lot*.
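For reference, those fine table sizes fall out of simple arithmetic if one
assumes (as the 8k figure above suggests) that a fine remembered set table
is a bitmap with one bit per card of the referencing region; a quick sketch:

  #include <cstdio>

  // Assumption for illustration: a fine remembered set table is a per-card
  // bitmap over the referencing region, i.e. (region_size / card_size) bits.
  static unsigned long long fine_table_bytes(unsigned long long region_size,
                                             unsigned long long card_size) {
    return (region_size / card_size) / 8;
  }

  int main() {
    const unsigned long long M = 1024 * 1024, K = 1024;
    printf("%lluk\n", fine_table_bytes(32 * M, 512) / K); //  8k per fine table
    printf("%lluk\n", fine_table_bytes(64 * M, 512) / K); // 16k per fine table
    return 0;
  }
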
Unless you want to overhaul the remembered set data structures completely
(there is some long-ongoing internal effort - always something else
getting in the way), my personal opinion is that 2a above gets the most
impact with the "least" effort using the current data structures.
I.e. as soon as you have untangled card table cards and remembered set
cards, you can scale the region size quite some ways without affecting
lots of other things. Option 2 would be interesting and likely much less
effort if you only want 64m regions cheaply. (Not sure that a 2k+ card
size is still "good" for 128m+ regions, but without measurements...)
Thanks,
Thomas