RFR: Convert old-gen single threaded pretouch to multi-threaded during
Amit Pawar
amith.pawar at gmail.com
Mon Jan 18 15:46:20 UTC 2021
On Fri, Jan 8, 2021 at 6:38 PM Amit Pawar <amith.pawar at gmail.com> wrote:
> Hi
>
> I am trying to reduce the pre-touch time taken during old-gen resizing.
> Please let me know whether the following change would be acceptable.
>
> What is happening?
> A GC thread expands the old-gen during object promotion if there is not
> enough room for the object. After expanding, the GC thread pre-touches the
> pages on its own; it cannot pre-touch in parallel using PretouchTask because
> it is already executing a GC task. The total GC pause time therefore depends
> on the size and the number of the resizes.
>
> What is the fix?
> Create another WorkGang so that the GC thread can run the pre-touch task on
> this new WorkGang, which reduces the pre-touch time. The code change is
> given below.
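
To illustrate the idea only (this is not the HotSpot patch): a minimal sketch that uses std::thread in place of a WorkGang. A small set of dedicated workers claim fixed-size chunks through an atomic counter and touch one byte per page. The name parallel_pretouch and its parameters are hypothetical, and the sketch assumes chunk_size is a multiple of page_size.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, not a HotSpot API: touches one byte per page in
// [start, start + total_bytes) using at most max_workers threads.
// Returns the number of pages touched.
std::size_t parallel_pretouch(char* start, std::size_t total_bytes,
                              std::size_t page_size, std::size_t chunk_size,
                              unsigned max_workers) {
  assert(page_size > 0 && chunk_size > 0 && max_workers > 0);
  assert(chunk_size % page_size == 0);

  // Same split as in the patch: round up to whole chunks, and never start
  // more workers than there are chunks.
  const std::size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;
  const std::size_t num_workers = std::min<std::size_t>(num_chunks, max_workers);

  std::atomic<std::size_t> next_chunk{0};
  std::atomic<std::size_t> pages_touched{0};

  auto worker = [&]() {
    for (;;) {
      // Claim the next unprocessed chunk; stop when none are left.
      const std::size_t chunk = next_chunk.fetch_add(1);
      if (chunk >= num_chunks) return;
      const std::size_t begin = chunk * chunk_size;
      const std::size_t end = std::min(begin + chunk_size, total_bytes);
      std::size_t touched = 0;
      for (std::size_t off = begin; off < end; off += page_size) {
        // Read and write back one byte so the page is faulted in
        // without changing its contents.
        volatile char* p = start + off;
        *p = *p;
        ++touched;
      }
      pages_touched.fetch_add(touched);
    }
  };

  std::vector<std::thread> pool;
  pool.reserve(num_workers);
  for (std::size_t i = 0; i < num_workers; ++i) pool.emplace_back(worker);
  for (std::thread& t : pool) t.join();
  return pages_touched.load();
}
```

Unlike this sketch, the patch creates its worker pool once and reuses it, so thread start-up cost is not paid on every resize.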
>
> Improvement:
> 1. Pre-touch time improved by 50-70% for the SPECjbb composite test.
> 2. The gain depends on the number of resize requests and the resize size.
> SPECjbb composite testing shows the old-gen being resized by 2MB-32MB with
> G1GC and by up to 64MB with ParallelGC. The number of resizes is more than
> 100-200.
> 3. The PretouchTask class uses PreTouchParallelChunkSize (current default
> 4MB on x86) to split the pre-touch task. The time taken therefore depends on
> the old-gen resize size, and this change won't help if the resize is smaller
> than the PreTouchParallelChunkSize value.
> 4. Please refer to the Excel file in the bug report for more details on the
> improvement for different sizes. https://bugs.openjdk.java.net/browse/JDK-8254699
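
Point 3 can be made concrete with the same arithmetic that PretouchTask::pretouch uses in the diff (num_chunks rounded up, num_workers capped by the gang size). The sketch below is illustrative only; PretouchPlan and plan_pretouch are hypothetical names, not HotSpot code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the chunk/worker arithmetic in the patch.
struct PretouchPlan {
  std::size_t num_chunks;   // work units of chunk_size bytes (last may be short)
  std::size_t num_workers;  // threads that will actually be used
};

PretouchPlan plan_pretouch(std::size_t total_bytes, std::size_t chunk_size,
                           std::size_t gang_size) {
  assert(chunk_size > 0 && gang_size > 0);
  // Round up so a partial trailing chunk still counts as one work unit.
  const std::size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;
  // Never use more workers than there are chunks to hand out.
  const std::size_t num_workers = std::min(num_chunks, gang_size);
  return {num_chunks, num_workers};
}
```

With a 4MB chunk size and 8 workers, a 2MB resize yields one chunk and therefore one worker (no speed-up), while a 32MB resize yields 8 chunks and can use all 8 workers.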
>
> Although this reduces the pre-touch time, I am not sure whether
> adding another WorkGang is acceptable. Please advise.
>
> diff --git a/src/hotspot/share/gc/shared/gc_globals.hpp b/src/hotspot/share/gc/shared/gc_globals.hpp
> index aca8d6b6c34..b5d40b47480 100644
> --- a/src/hotspot/share/gc/shared/gc_globals.hpp
> +++ b/src/hotspot/share/gc/shared/gc_globals.hpp
> @@ -200,6 +200,12 @@
>    product(bool, AlwaysPreTouch, false,                                      \
>            "Force all freshly committed pages to be pre-touched")            \
>                                                                              \
> +  product(size_t, OldGenPreTouchWorkers, 1,                                 \
> +          "During object promotion old-gen can be expanded as required by"  \
> +          "ParallelGCThreads. OldGenPreTouchWorkers can be used to "        \
> +          "pre-touch the pages by ParallelGCThreads")                       \
> +          range(1, 1024)                                                    \
> +                                                                            \
>    product_pd(size_t, PreTouchParallelChunkSize,                             \
>            "Per-thread chunk size for parallel memory pre-touch.")           \
>            range(4*K, SIZE_MAX / 2)                                          \
> diff --git a/src/hotspot/share/gc/shared/pretouchTask.cpp b/src/hotspot/share/gc/shared/pretouchTask.cpp
> index 4398d3924cc..435ec2ee76f 100644
> --- a/src/hotspot/share/gc/shared/pretouchTask.cpp
> +++ b/src/hotspot/share/gc/shared/pretouchTask.cpp
> @@ -27,6 +27,7 @@
>  #include "runtime/atomic.hpp"
>  #include "runtime/globals.hpp"
>  #include "runtime/os.hpp"
> +#include "utilities/ticks.hpp"
>
>  PretouchTask::PretouchTask(const char* task_name,
>                             char* start_address,
> @@ -62,6 +63,8 @@ void PretouchTask::work(uint worker_id) {
>    }
>  }
>
> +#define TIME_FORMAT "%0.3lfms"
> +
>  void PretouchTask::pretouch(const char* task_name, char* start_address, char* end_address,
>                              size_t page_size, WorkGang* pretouch_gang) {
>
> @@ -83,14 +86,30 @@ void PretouchTask::pretouch(const char* task_name, char* start_address, char* en
>      size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;
>
>      uint num_workers = (uint)MIN2(num_chunks, (size_t)pretouch_gang->total_workers());
> -    log_debug(gc, heap)("Running %s with %u workers for " SIZE_FORMAT " work units pre-touching " SIZE_FORMAT "B.",
> -                        task.name(), num_workers, num_chunks, total_bytes);
> -
> +    Ticks mark_start = Ticks::now();
>      pretouch_gang->run_task(&task, num_workers);
> +    Ticks mark_end = Ticks::now();
> +    log_debug(gc, heap)("Running %s with %u workers for " SIZE_FORMAT " work units pre-touching " SIZE_FORMAT "B. " TIME_FORMAT ,
> +                        task.name(), num_workers, num_chunks, total_bytes, (mark_end-mark_start).seconds());
> +
>    } else {
> -    log_debug(gc, heap)("Running %s pre-touching " SIZE_FORMAT "B.",
> -                        task.name(), total_bytes);
> -    task.work(0);
> +    if(OldGenPreTouchWorkers > 1) {
> +      const char *oldgen_workers="Old-gen Pre-touch workers";
> +      static WorkGang *pretouch_workers= NULL ;
> +      if (! pretouch_workers) {
> +        // pretouch_workers are used when pretouch_gang is null. This usually happens during old-gen
> +        // resizing due to object promotion.
> +        pretouch_workers = new WorkGang(oldgen_workers, OldGenPreTouchWorkers, true, false);
> +        pretouch_workers->initialize_workers();
> +      }
> +      pretouch(oldgen_workers, start_address, end_address, page_size, pretouch_workers);
> +    } else {
> +      Ticks mark_start = Ticks::now();
> +      task.work(0);
> +      Ticks mark_end = Ticks::now();
> +      log_debug(gc, heap)("Running %s pre-touching " SIZE_FORMAT "B. " TIME_FORMAT,
> +                          task.name(), total_bytes, (mark_end-mark_start).seconds());
> +    }
>    }
>  }
>
>
>
>
Ping!