RFR: 8329203: Parallel: Investigate Mark-Compact for Full GC to decrease memory usage

Roman Kennke rkennke at openjdk.org
Mon May 6 11:17:52 UTC 2024


On Mon, 6 May 2024 10:31:48 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> Refactor Parallel full-gc to use the same algorithm (mark-compact) as Serial and G1 full-GC. This removes the obj-end bitmap. When GC threads are few, the old implementation can be more efficient because it requires fewer heap iterations. The new full-GC implementation, on the other hand, is more scalable because it introduces more phases (`forward_to_new_addr` and `adjust_pointers`) that can partition work effectively.
> 
> The diff is rather large, so reading the new code directly from `invoke_no_policy` is probably easier.
> 
> Test: tier1-6; some improvement in Dacapo-h2, CacheStresser, but no difference in specjbb2015, specjvm2008.

Thank you, that's a great change. I actually intended to do something similar soon, because the existing Parallel Full GC would not be compatible with some stuff that I need for Lilliput 2 (namely, the Compact Identity Hashcode).

Before I start reviewing, could you describe in a bit more detail how the implementation works? In particular, G1 (and Shenandoah) parallelize work by assigning regions to GC workers. (It does not matter for Serial, obviously.) Parallel GC has no such concept of regions. How does your implementation parallelize the work? (My idea was to divide the heap into regions for the purpose of the full-GC and deal with overlapping objects somehow, maybe by using a block-offset-table. But I haven't spent too much thought on it so far.)
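
To make the region idea a bit more concrete, here is a rough sketch of what I have in mind. It is illustrative Python rather than HotSpot C++, and it is not a description of what the PR does. The names `REGION_WORDS`, `build_first_start_table` and `objects_owned_by_region` are made up for this example. The assumption is that each object belongs to the region in which it starts, so an object that overlaps a region boundary is handled by exactly one worker, and a per-region table (a much simplified stand-in for a block-offset-table) tells a worker where the first object start in its region is.

```python
from bisect import bisect_left

REGION_WORDS = 8  # hypothetical region size, in heap words


def build_first_start_table(obj_starts, heap_words):
    """For each region, the index of the first object that starts in it.

    obj_starts is a sorted list of object start addresses (in words).
    A region fully covered by an object that started earlier gets None.
    """
    n_regions = (heap_words + REGION_WORDS - 1) // REGION_WORDS
    table = []
    for r in range(n_regions):
        lo = r * REGION_WORDS
        hi = min(lo + REGION_WORDS, heap_words)
        i = bisect_left(obj_starts, lo)
        starts_here = i < len(obj_starts) and obj_starts[i] < hi
        table.append(i if starts_here else None)
    return table


def objects_owned_by_region(r, table, obj_starts, heap_words):
    """Indices of the objects a worker claiming region r would process.

    Ownership is decided by the start address only, so an object that
    spills into the next region still belongs to region r.
    """
    i = table[r]
    if i is None:
        return []
    hi = min((r + 1) * REGION_WORDS, heap_words)
    owned = []
    while i < len(obj_starts) and obj_starts[i] < hi:
        owned.append(i)
        i += 1
    return owned
```

With objects starting at words 0, 3, 10, 12 and 24 in a 28-word heap, the object at 3 crosses from region 0 into region 1 and the object at 12 covers all of region 2, yet each object is still claimed by exactly one region. Whether this kind of ownership rule would balance work well enough for Parallel is an open question to me.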

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19101#issuecomment-2095773182


More information about the hotspot-gc-dev mailing list