RFR: JDK-8310111: Shenandoah wastes memory when running with very large page sizes [v2]

Aleksey Shipilev shade at openjdk.org
Fri Jun 30 09:40:54 UTC 2023


On Tue, 20 Jun 2023 15:21:24 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> This proposal changes the reservation of bitmaps and region storage to reduce the wastage associated with running with very large page sizes (e.g. 1GB on x64, 512M on arm64) for both non-THP- and THP-mode.
>> 
>> This patch does the following:
>> - introduces the notion of an "allowed overdraft factor": allocations for a given page size are rejected if they would cause more wastage than the factor allows
>> - if it makes sense, places the mark and aux bitmaps into a contiguous region to let them share a ginormous page. E.g. for a 16G heap, both bitmaps would now share a single 1GB page.
>> 
>> Examples:
>> 
>> Note: annoyingly, huge page usage does not show up in RSS. I therefore use a script that parses /proc/pid/smaps and tallies hugepage usage to count the cost for the following examples:
>> 
>> Example 1:
>> 
>> A machine configured with 1G pages (and nothing else). Heap is allocated with 1G pages, the bitmaps fall back to 4K pages because JVM figures 1GB would be wasted:
>> 
>> 
>> thomas at starfish$ ./images/jdk/bin/java -Xmx4600m -Xlog:pagesize -XX:+UseShenandoahGC -XX:+UseLargePages
>> ...
>> [0.028s][info][pagesize] Mark Bitmap: req_size=160M req_page_size=4K base=0x00007f8149fff000 size=160M page_size=4K
>> [0.028s][info][pagesize] Aux Bitmap: req_size=160M req_page_size=4K base=0x00007f813fffe000 size=160M page_size=4K
>> [0.028s][info][pagesize] Region Storage: req_size=320K req_page_size=4K base=0x00007f817c06f000 size=320K page_size=4K
>> 
>> 
>> Cost before: 8GB. Cost now: 5GB + (2*160M)
>> 
>> Example 2: JVM with 14GB heap: mark and aux bitmap together are large enough to justify another 1G page, so they share it. Notice how we also place the region storage on this page:
>> 
>> 
>> thomas at starfish:/shared/projects/openjdk/jdk-jdk/output-release$ ./images/jdk/bin/java -Xmx14g -Xlog:pagesize 
>> -XX:+UseShenandoahGC -XX:+UseLargePages -cp $REPROS_JAR de.stuefe.repros.Simple
>> [0.003s][info][pagesize] Heap: req_size=14G req_page_size=1G base=0x0000000480000000 size=14G page_size=1G
>> [0.003s][info][pagesize] Mark+Aux Bitmap: req_size=896M req_page_size=1G base=0x00007fee00000000 size=1G page_size=1G
>> [0.003s][info][pagesize] Region Storage: piggy-backing on Mark Bitmap: base=0x00007fee38000000 size=1792
>> <press key>
>> 
>> 
>> Cost before: 17GB. Cost now: 15GB.
>> 
>> For bang-for-hugepages-buck, heap sizes that are multiples of 16GB are a sweet spot here, since (on x64 with 1GB pages) this allows us to put both 512M bitmaps onto a single huge page.
>> 
>> -----------
>> 
>> No test yet, since I wanted to se...
>
> Thomas Stuefe has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   start

I do wonder if we want to cobble together all Shenandoah bulk memory structures into a single `ReservedSpace` unconditionally, which could then use a single (large) page size, if needed. It feels redundant to allocate some data structures separately and then deal with the overdraft factor, splitting, and piggy-backing on top of that. I think that logic would become even more complicated when GenShen comes in with an additional RemSet data structure.

I.e. do a ReservedSpace, which _always_ gets the largest page size, and always contains:
  1. Collection set bitmap
  2. Regions storage
  3. Remembered set (from GenShen)
  4. Mark bitmap
  5. Aux bitmap

The RS base selection would be driven by (1), which wants a specific address.
(1)..(4) are committed at init. (5) is committed on demand, if page sizes allow.
The "waste" on large pages for smaller (1) and (2) would be subsumed by (4) and (5) that are in the same RS.

So if we are lucky, we would be able to fit everything into a 1G page, as one of the examples above shows, but in a simpler way.

The problem I see is that the mark bitmap uncommit code would need to be adjusted a bit if mark bitmap bases were no longer aligned to page sizes. But since the bitmaps are last in the RS, we can do a bit of internal alignment to make bulk uncommits fit better.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14559#issuecomment-1614402279


More information about the hotspot-gc-dev mailing list