RFR: JDK-8306441: Two phase segmented heap dump [v27]

Kevin Walls kevinw at openjdk.org
Tue Aug 8 11:12:40 UTC 2023


On Tue, 8 Aug 2023 09:22:07 GMT, Yi Yang <yyang at openjdk.org> wrote:

>> ### Motivation and proposal
>> Hi, a heap dump pauses the application's execution (STW), which is a well-known pain point. JDK-8252842 added parallel support to heap dumping in an attempt to alleviate this issue. However, all concurrent threads compete to write heap data to the same file, and more memory is required to maintain the concurrent buffer queue. In our experiments, we did not observe a significant performance improvement from it.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-phase segmented heap dump:
>> 
>> - Phase 1 (STW): Concurrent threads write data directly to multiple heap files.
>> - Phase 2 (Non-STW): Merge the multiple heap files into one complete heap dump file. This process can happen outside a safepoint.
>> 
>> Concurrent worker threads no longer need to maintain a buffer queue, which would add memory overhead, nor do they need to compete for locks. The changes in the overall design are as follows:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/77e4764a-62b5-4336-8b45-fc880ba14c4a)
>> <p align="center">Fig1. Before</p>
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/931ab874-64d1-4337-ae32-3066eed809fc)
>> <p align="center">Fig2. After this patch</p>
>> 
>> ### Performance evaluation
>> | memory | numOfThread | CompressionMode | STW | Total |
>> | -------| ----------- | --------------- | --- | ---- |
>> | 8g | 1 T | N | 15.612 | 15.612 |
>> | 8g | 32 T | N | 2.561725 | 14.498 |
>> | 8g | 32 T | C1 | 2.3084878 | 14.198 |
>> | 8g | 32 T | C2 | 10.9355128 | 21.882 |
>> | 8g | 96 T | N | 2.6790452 | 14.012 |
>> | 8g | 96 T | C1 | 2.3044796 | 3.589 |
>> | 8g | 96 T | C2 | 9.7585151 | 20.219 |
>> | 16g | 1 T | N | 26.278 | 26.278 |
>> | 16g | 32 T | N | 5.231374 | 26.417 |
>> | 16g | 32 T | C1 | 5.6946983 | 6.538 |
>> | 16g | 32 T | C2 | 21.8211105 | 41.133 |
>> | 16g | 96 T | N | 6.2445556 | 27.141 |
>> | 16g | 96 T | C1 | 4.6007096 | 6.259 |
>> | 16g | 96 T | C2 | 19.2965783 | 39.007 |
>> | 32g | 1 T | N | 48.149 | 48.149 |
>> | 32g | 32 T | N | 10.7734677 | 61.643 |
>> | 32g | 32 T | C1 | 10.1642097 | 10.903 |
>> | 32g | 32 T | C2 | 43.8407607 | 88.152 |
>> | 32g | 96 T | N | 13.1522042 | 61.432 |
>> | 32g | 96 T | C1 | 9.0954641 | 9.885 |
>> | 32g | 96 T | C2 | 38.9900931 | 80.574 |
>> | 64g | 1 T | N | 100.583 | 100.583 |
>> | 64g | 32 T | N | 20.9233744 | 134.701 |
>> | 64g | 32 T | C1 | 18.5023784 | 19.358 |
>> | 64g | 32 T | C2 | 86.4748377 | 172.707 |
>> | 64g | 96 T | N | 26.7374116 | 126.08 |
>> | 64g | ...
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use OutputAnalyzer in HeapDumpParallelTest
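
For readers who want a concrete picture of the two phases described in the quoted proposal, here is a minimal sketch in plain Java. It is not the HotSpot implementation: the class name `TwoPhaseDumpSketch`, the segment file names, and the string "chunks" standing in for heap data are all assumptions made for illustration. It only shows the shape of the idea, where each worker writes its own segment file and a later step concatenates the segments.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: names and file layout are hypothetical, not taken from the patch.
public class TwoPhaseDumpSketch {

    // Phase 1 (would run during the pause): every worker writes to its own
    // segment file, so there is no shared buffer queue and no lock on a common file.
    static List<Path> writeSegments(Path dir, List<String> chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (int i = 0; i < chunks.size(); i++) {
                final int id = i;
                pending.add(pool.submit(() -> {
                    Path segment = dir.resolve("dump.segment" + id);
                    Files.write(segment, chunks.get(id).getBytes(StandardCharsets.UTF_8));
                    return segment;
                }));
            }
            // Collect the segment paths in submission order.
            List<Path> segments = new ArrayList<>();
            for (Future<Path> f : pending) {
                segments.add(f.get());
            }
            return segments;
        } finally {
            pool.shutdown();
        }
    }

    // Phase 2 (would run after the pause): append the segments, in order,
    // to one target file and delete each segment once it has been copied.
    static void mergeSegments(List<Path> segments, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path segment : segments) {
                Files.copy(segment, out);
                Files.delete(segment);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("two-phase-dump");
        List<Path> segments = writeSegments(dir, List.of("A", "B", "C"));
        Path merged = dir.resolve("dump.merged");
        mergeSegments(segments, merged);
        System.out.println(new String(Files.readAllBytes(merged), StandardCharsets.UTF_8));
    }
}
```

The point of the split is that only `writeSegments` needs to happen while the application is stopped; the merge is ordinary file I/O that can be done afterwards.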

Thanks for being responsive to all these points.  
(I did worry also about starting so many test apps but I don't see a better way right now.  We can revisit if this is a problem.)

I think we should get this integrated, but first, sorry: the test fails with UseSerialGC, which will happen in some testing tier.
We do want it to run with all the different options that the test harness may request, and with all GCs.
So the test needs to realize when the dump is not parallel: we can make it check the options and force expectSerial to true.  The following does that; I tested it with the different collectors:


bash-4.2$ git diff
diff --git a/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpParallelTest.java b/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpParallelTest.java
index 247916c93db..5a9d40fbb30 100644
--- a/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpParallelTest.java
+++ b/test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpParallelTest.java
@@ -24,10 +24,12 @@
 import java.io.File;
 import java.io.IOException;
 import java.nio.file.Files;
+import java.util.Arrays;
 import java.util.List;

 import jdk.test.lib.Asserts;
 import jdk.test.lib.JDKToolLauncher;
+import jdk.test.lib.Utils;
 import jdk.test.lib.apps.LingeredApp;
 import jdk.test.lib.dcmd.PidJcmdExecutor;
 import jdk.test.lib.process.OutputAnalyzer;
@@ -40,7 +42,7 @@ import jdk.test.lib.hprof.HprofParser;
  * @bug 8306441
  * @summary Verify the integrity of generated heap dump and capability of parallel dump
  * @library /test/lib
- * @run driver HeapDumpParallelTest
+ * @run main HeapDumpParallelTest
  */

 public class HeapDumpParallelTest {
@@ -50,6 +52,11 @@ public class HeapDumpParallelTest {
         dcmdOut.shouldContain("Heap dump file created");
         OutputAnalyzer appOut = new OutputAnalyzer(app.getProcessStdout());
         appOut.shouldContain("[heapdump]");
+        String opts = Arrays.asList(Utils.getTestJavaOpts()).toString();
+        if (opts.contains("-XX:+UseSerialGC")) {
+            System.out.println("UseSerialGC detected.");
+            expectSerial = true;
+        }
         if (!expectSerial && Runtime.getRuntime().availableProcessors() > 1) {
             appOut.shouldContain("Dump heap objects in parallel");
             appOut.shouldContain("Merge heap files complete");
@@ -137,4 +144,5 @@ public class HeapDumpParallelTest {
             Asserts.fail("Could not parse dump file " + dump.getAbsolutePath());
         }
     }
-}
\ No newline at end of file
+}
+
bash-4.2$
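
As a standalone illustration of the option check in the diff above, here is a small sketch that can be run without the test library. The class `SerialGcOptionCheck` and the helper `requestsSerialGC` are hypothetical names, and the option list is hard-coded; in the actual test the options would come from `Utils.getTestJavaOpts()`, as the diff shows. This variant compares whole option strings in a list instead of searching a joined string, which is one possible way to write the same check.

```java
import java.util.List;

// Illustrative only: hypothetical helper, not part of the proposed patch.
public class SerialGcOptionCheck {

    // Returns true if the given JVM options explicitly enable the serial collector.
    static boolean requestsSerialGC(List<String> javaOpts) {
        return javaOpts.contains("-XX:+UseSerialGC");
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the options the test harness would pass.
        List<String> opts = List.of("-Xmx1g", "-XX:+UseSerialGC");

        boolean expectSerial = false;
        if (requestsSerialGC(opts)) {
            System.out.println("UseSerialGC detected.");
            expectSerial = true;
        }
        System.out.println("expectSerial = " + expectSerial);
    }
}
```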

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1669411337


More information about the hotspot-runtime-dev mailing list