RFR: 8338440: Parallel: Improve fragmentation mitigation in Full GC
Guoxiong Li
gli at openjdk.org
Thu Aug 22 23:28:07 UTC 2024
On Thu, 15 Aug 2024 06:59:35 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:
> Extend `SplitInfo` to support more fine-grained splitting to mitigate the fragmentation issue during full GC. Added comments and diagrams in the process.
>
> For easier review, it's best to start with `SplitInfo` and then proceed to see how it is constructed in `summarize_split_space` and consumed in `first_src_addr`. The accompanying diagrams should help create a clear mental image.
>
> With this patch, the exec time of `runtime/ClassInitErrors/TestOutOfMemoryDuringInit.java` using Parallel drops from ~30s to ~8s, the same as other GCs, and gc-log shows similar number of GC cycles as well.
>
> Test: tier1-8, systemgc micro bm, CacheStress, dacapo, specjbb2005, specjvm2008
Nice improvement. Several suggestions/questions.
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 204:
> 202: assert(_split_point == nullptr, "inv");
> 203: assert(_preceding_live_words == 0, "inv");
> 204: assert(_split_destination_count == 0, "inv");
May be better to use `not clear` uniformly?
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 309:
> 307: // The total live words on src_region would overflow the target space, so find
> 308: // the overflowing object and recorde the split point. The invariant is that an
> 309: // obj should not cross space boundary.
Typo `recorde`.
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 364:
> 362: HeapWord* new_top = destination - pointer_delta(src_region_start, overflowing_obj);
> 363:
> 364: // If the overflowing obj were to relocated to its original destination,
Typo `were to relocated to`.
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 393:
> 391: if (cur_addr >= region_end) {
> 392: break;
> 393: }
What about using a `for` statement here? For example:
--- a/src/hotspot/share/gc/parallel/psParallelCompact.cpp
+++ b/src/hotspot/share/gc/parallel/psParallelCompact.cpp
@@ -382,15 +382,11 @@ HeapWord* ParallelCompactData::summarize_split_space(size_t src_region,
// Obj-iteration to locate the overflowing obj
HeapWord* region_start = region_to_addr(src_region);
HeapWord* region_end = region_start + RegionSize;
- HeapWord* cur_addr = region_start + partial_obj_size;
size_t live_words = partial_obj_size;
- while (true) {
- assert(cur_addr < region_end, "inv");
- cur_addr = PSParallelCompact::mark_bitmap()->find_obj_beg(cur_addr, region_end);
- if (cur_addr >= region_end) {
- break;
- }
+ for (HeapWord* cur_addr = region_start + partial_obj_size;
+ cur_addr < region_end;
+ cur_addr = PSParallelCompact::mark_bitmap()->find_obj_beg(cur_addr, region_end)) {
oop obj = cast_to_oop(cur_addr);
size_t obj_size = obj->size();
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 482:
> 480: assert(source_next != nullptr, "source_next is null when splitting");
> 481: *source_next = summarize_split_space(cur_region, split_info, dest_addr,
> 482: target_end, target_next);
Actually, the last parameter `target_next` of the method `summarize_split_space` won't be used at any other places. The bottom of the new space will be used instead (see `PSParallelCompact::summary_phase`). So I think the last parameter `target_next` of the method `summarize_split_space` can be **removed**. Then in `PSParallelCompact::summary_phase`, we can pass a null pointer in this situation (See code below) so that the meaning becomes clearer.
// method `PSParallelCompact::summary_phase`
} else if (live > 0) {
// Attempt to fit part of the source space into the target space.
HeapWord* next_src_addr = nullptr;
bool done = _summary_data.summarize(_space_info[id].split_info(),
space->bottom(), space->top(),
&next_src_addr,
*new_top_addr, dst_space_end,
new_top_addr); // <--- here, can pass `nullptr`
assert(!done, "space should not fit into old gen");
assert(next_src_addr != nullptr, "sanity");
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 524:
> 522: log_warning(gc)("Uncleared Region: %u", cur_idx);
> 523: region(cur_idx)->verify_clear();
> 524: }
In `PSParallelCompact::clear_data_covering_space --> ParallelCompactData::clear_range`, the `ParallelCompactData::_region_data` is set to `0` directly. I don't know whether it is worth adding two methods `ParallelCompactData::RegionData::is_clear/verify_clear` to verify. And the previous implementation can verify the field `RegionData::_pushed` as well.
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1560:
> 1558: HeapWord* end,
> 1559: HeapWord* destination,
> 1560: size_t live_words) {
We can pass the precise destination and then the last parameter `live_words` of the method `PSParallelCompact::forward_to_new_addr` can be removed. Just like:
--- a/src/hotspot/share/gc/parallel/psParallelCompact.cpp
+++ b/src/hotspot/share/gc/parallel/psParallelCompact.cpp
@@ -1556,8 +1556,7 @@ void PSParallelCompact::forward_to_new_addr() {
static void forward_objs_in_range(ParCompactionManager* cm,
HeapWord* start,
HeapWord* end,
- HeapWord* destination,
- size_t live_words) {
+ HeapWord* destination) {
HeapWord* cur_addr = start;
while (cur_addr < end) {
@@ -1566,14 +1565,13 @@ void PSParallelCompact::forward_to_new_addr() {
return;
}
assert(mark_bitmap()->is_marked(cur_addr), "inv");
- HeapWord* new_addr = destination + live_words;
oop obj = cast_to_oop(cur_addr);
- if (new_addr != cur_addr) {
+ if (destination != cur_addr) {
cm->preserved_marks()->push_if_necessary(obj, obj->mark());
- obj->forward_to(cast_to_oop(new_addr));
+ obj->forward_to(cast_to_oop(destination));
}
size_t obj_size = obj->size();
- live_words += obj_size;
+ destination += obj_size;
cur_addr += obj_size;
}
}
@@ -1613,14 +1611,14 @@ void PSParallelCompact::forward_to_new_addr() {
// Part 1: will be relocated to space-1
HeapWord* split_destination = split_info.split_destination();
HeapWord* split_point = split_info.split_point();
- forward_objs_in_range(cm, region_start + live_words, split_point, split_destination, live_words);
+ forward_objs_in_range(cm, region_start + live_words, split_point, split_destination + live_words);
// Part 2: will be relocated to space-2
HeapWord* destination = region_ptr->destination();
- forward_objs_in_range(cm, split_point, region_end, destination, 0);
+ forward_objs_in_range(cm, split_point, region_end, destination);
} else {
HeapWord* destination = region_ptr->destination();
- forward_objs_in_range(cm, region_start + live_words, region_end, destination, live_words);
+ forward_objs_in_range(cm, region_start + live_words, region_end, destination + live_words);
}
}
}
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1611:
> 1609: HeapWord* region_end = region_start + ParallelCompactData::RegionSize;
> 1610:
> 1611: const SplitInfo& split_info = _space_info[space_id(region_start)].split_info();
Could the variable `SplitInfo` be moved outside/before the `for` loop to avoid duplicated allocations?
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1998:
> 1996: // source-region contains this location. This location is retrieved by calling
> 1997: // `first_src_addr` on a dest-region.
> 1998: // Conversely, a source-region has a dest-region which holds the destinatino of
Typo `destinatino`.
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 2271:
> 2269: if (partial_obj_start == obj_start) {
> 2270: // This obj extends to next region.
> 2271: obj_end = partial_obj_end(next_region_start);
Question: We know the object start position in this branch. Why can't we use object size (in new line 2271) directly (like new line 2274 shown below)? Why is it not safe?
// Completely contained in this region; safe to use size().
obj_end = obj_start + cast_to_oop(obj_start)->size();
src/hotspot/share/gc/parallel/psParallelCompact.hpp line 152:
> 150: HeapWord* _split_point;
> 151: size_t _preceding_live_words;
> 152: uint _split_destination_count;
The names `_split_destination` and `_split_destination_count` may be ambiguous. In `_split_destination`, the prefix `split` means two parts whose destinations locate in different spaces. But in `_split_destination_count`, the prefix `split` means two parts whose destinations locate in different regions but in the same space. I suggest to change the field `_split_destination_count` to `_preceding_destination_count` (or other more appropriate name).
-------------
Changes requested by gli (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/20590#pullrequestreview-2253703015
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1728036224
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1726498140
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1726911573
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1726908715
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1726876009
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1726965885
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1727427367
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1727325953
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1727453862
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1727998329
PR Review Comment: https://git.openjdk.org/jdk/pull/20590#discussion_r1727773965
More information about the hotspot-gc-dev
mailing list