RFR: Convert old-gen single threaded pretouch to multi-threaded during

Amit Pawar amith.pawar at gmail.com
Fri Jan 8 13:08:54 UTC 2021


Hi

I am trying to reduce the pre-touch time taken during old-gen resizing.
Please let me know whether the following change would be acceptable.

What is happening?
During object promotion, a GC thread expands the old-gen if there is not
enough room for the object. After expanding, that GC thread pre-touches the
new pages by itself; it cannot pre-touch in parallel using PretouchTask
because it is already executing a GC task. The total GC pause time therefore
depends on the resize size and the number of resizes.

What is the fix?
Create another WorkGang so that the GC thread can run the pre-touch task on
this new WorkGang and reduce the pre-touch time. The code change is given
below.
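
To illustrate the idea outside HotSpot, here is a minimal standalone sketch
in standard C++. It is not the patch itself: touch_range and
parallel_pretouch are hypothetical names, std::thread stands in for the
extra WorkGang, and claiming chunks through an atomic counter is an
assumption of this sketch. Threads are created per call here for brevity,
whereas the patch keeps one gang alive and reuses it.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Touch one byte in every page of [start, start + len) so the OS has to back
// each page with memory. Contents are preserved (read, then write back).
// Returns the number of pages touched.
std::size_t touch_range(char* start, std::size_t len, std::size_t page_size) {
  std::size_t pages = 0;
  for (std::size_t off = 0; off < len; off += page_size) {
    volatile char* p = start + off;
    *p = *p;
    ++pages;
  }
  return pages;
}

// Split the range into chunk_size pieces and let up to max_workers helper
// threads claim chunks until none are left. The caller only waits, like a
// GC thread handing the task to a separate gang.
// chunk_size is assumed to be a non-zero multiple of page_size.
std::size_t parallel_pretouch(char* start, std::size_t total_bytes,
                              std::size_t page_size, std::size_t chunk_size,
                              unsigned max_workers) {
  if (total_bytes == 0) return 0;
  const std::size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;
  const std::size_t num_workers =
      std::max<std::size_t>(1, std::min<std::size_t>(num_chunks, max_workers));

  std::atomic<std::size_t> next_chunk{0};
  std::atomic<std::size_t> pages_touched{0};

  auto worker = [&]() {
    for (;;) {
      const std::size_t chunk = next_chunk.fetch_add(1);
      if (chunk >= num_chunks) break;
      const std::size_t begin = chunk * chunk_size;
      const std::size_t end = std::min(begin + chunk_size, total_bytes);
      pages_touched += touch_range(start + begin, end - begin, page_size);
    }
  };

  std::vector<std::thread> helpers;
  for (std::size_t i = 0; i < num_workers; ++i) helpers.emplace_back(worker);
  for (auto& t : helpers) t.join();
  return pages_touched.load();
}
```

Each chunk is claimed by exactly one thread, so the helpers never write to
the same page, and a range smaller than one chunk ends up with a single
helper.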

Improvement:
1. Pre-touch time improved by 50-70% for the SPECjbb composite test.
2. The gain depends on the number of resize requests and the resize size.
SPECjbb composite testing shows old-gen resizes of 2MB-32MB with G1GC and
up to 64MB with ParallelGC. The number of resizes is also more than
100-200.
3. PretouchTask uses PreTouchParallelChunkSize to split the pre-touch task;
the current default is 4MB on x86. The time taken therefore depends on the
old-gen resize size, and this change won't help if the resize is smaller
than PreTouchParallelChunkSize.
4. Please refer to the Excel file attached to the bug report for more
details on the improvement for different sizes:
https://bugs.openjdk.java.net/browse/JDK-8254699
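
The effect of the chunk size in point 3 can be checked with a small
standalone sketch. plan_pretouch and PretouchPlan are hypothetical names;
the arithmetic mirrors the num_chunks and num_workers computation in
PretouchTask::pretouch, and the worker count of 8 is only an example value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of HotSpot: how many chunks a pre-touch
// request is split into and how many workers can be kept busy.
struct PretouchPlan {
  std::size_t num_chunks;
  std::size_t num_workers;
};

PretouchPlan plan_pretouch(std::size_t total_bytes, std::size_t chunk_size,
                           std::size_t gang_workers) {
  // Round up so that a partial trailing chunk still counts as one work unit.
  const std::size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;
  // There is no point in waking more workers than there are chunks.
  const std::size_t num_workers = std::min(num_chunks, gang_workers);
  return {num_chunks, num_workers};
}
```

With a 4MB chunk size and 8 workers, a 2MB resize gives 1 chunk and 1
worker (no parallelism), a 32MB resize gives 8 chunks and 8 workers, and a
64MB resize gives 16 chunks shared among 8 workers.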

Although this reduces the pre-touch time, I am not sure whether adding
another WorkGang is allowed. Please advise.

diff --git a/src/hotspot/share/gc/shared/gc_globals.hpp b/src/hotspot/share/gc/shared/gc_globals.hpp
index aca8d6b6c34..b5d40b47480 100644
--- a/src/hotspot/share/gc/shared/gc_globals.hpp
+++ b/src/hotspot/share/gc/shared/gc_globals.hpp
@@ -200,6 +200,12 @@
   product(bool, AlwaysPreTouch, false,                                      \
           "Force all freshly committed pages to be pre-touched")            \
                                                                             \
+  product(size_t, OldGenPreTouchWorkers, 1,                                 \
+          "During object promotion old-gen can be expanded as required by " \
+          "ParallelGCThreads. OldGenPreTouchWorkers can be used to "        \
+          "pre-touch the pages by ParallelGCThreads")                       \
+          range(1,  1024)                                                   \
+                                                                            \
   product_pd(size_t, PreTouchParallelChunkSize,                             \
           "Per-thread chunk size for parallel memory pre-touch.")           \
           range(4*K, SIZE_MAX / 2)                                          \
diff --git a/src/hotspot/share/gc/shared/pretouchTask.cpp b/src/hotspot/share/gc/shared/pretouchTask.cpp
index 4398d3924cc..435ec2ee76f 100644
--- a/src/hotspot/share/gc/shared/pretouchTask.cpp
+++ b/src/hotspot/share/gc/shared/pretouchTask.cpp
@@ -27,6 +27,7 @@
 #include "runtime/atomic.hpp"
 #include "runtime/globals.hpp"
 #include "runtime/os.hpp"
+#include "utilities/ticks.hpp"

 PretouchTask::PretouchTask(const char* task_name,
                            char* start_address,
@@ -62,6 +63,8 @@ void PretouchTask::work(uint worker_id) {
   }
 }

+#define TIME_FORMAT "%0.3lfms"
+
 void PretouchTask::pretouch(const char* task_name, char* start_address, char* end_address,
                             size_t page_size, WorkGang* pretouch_gang) {

@@ -83,14 +86,30 @@ void PretouchTask::pretouch(const char* task_name, char* start_address, char* en
     size_t num_chunks = (total_bytes + chunk_size - 1) / chunk_size;

     uint num_workers = (uint)MIN2(num_chunks, (size_t)pretouch_gang->total_workers());
-    log_debug(gc, heap)("Running %s with %u workers for " SIZE_FORMAT " work units pre-touching " SIZE_FORMAT "B.",
-                        task.name(), num_workers, num_chunks, total_bytes);
-
+    Ticks mark_start = Ticks::now();
     pretouch_gang->run_task(&task, num_workers);
+    Ticks mark_end = Ticks::now();
+    log_debug(gc, heap)("Running %s with %u workers for " SIZE_FORMAT " work units pre-touching " SIZE_FORMAT "B. " TIME_FORMAT,
+                        task.name(), num_workers, num_chunks, total_bytes, (mark_end - mark_start).seconds() * 1000.0);
+
   } else {
-    log_debug(gc, heap)("Running %s pre-touching " SIZE_FORMAT "B.",
-                        task.name(), total_bytes);
-    task.work(0);
+    if(OldGenPreTouchWorkers > 1) {
+      const char *oldgen_workers="Old-gen Pre-touch workers";
+      static WorkGang *pretouch_workers= NULL ;
+      if (! pretouch_workers) {
+ // pretouch_workers are used when pretouch_gang is null. This usually happens during old-gen
+ // resizing due to object promotion.
+        pretouch_workers = new WorkGang(oldgen_workers, OldGenPreTouchWorkers, true, false);
+        pretouch_workers->initialize_workers();
+      }
+      pretouch(oldgen_workers, start_address, end_address, page_size, pretouch_workers);
+    } else {
+      Ticks mark_start = Ticks::now();
+      task.work(0);
+      Ticks mark_end = Ticks::now();
+      log_debug(gc, heap)("Running %s pre-touching " SIZE_FORMAT "B. " TIME_FORMAT,
+                          task.name(), total_bytes, (mark_end - mark_start).seconds() * 1000.0);
+    }
   }
 }


Thanks,
Amit Pawar
