From ksakata at openjdk.java.net Thu Apr 1 02:34:23 2021
From: ksakata at openjdk.java.net (Koichi Sakata)
Date: Thu, 1 Apr 2021 02:34:23 GMT
Subject: RFR: 8176026: SA: Huge heap sizes cause a negative value to be displayed in the jhisto heap total [v2]
References: <-OxCoBAqyBOqOS5DP3C4NPkq8PFLwwCL6wBXmiqeEzU=.f62e8e0f-418e-4129-a91a-c10e1571466e@github.com>

On Fri, 26 Mar 2021 13:57:00 GMT, Koichi Sakata wrote:

>> Marked as reviewed by mgkwill at github.com (no known OpenJDK username).
>
> I just updated the copyright.

Could someone sponsor this pull request? It is ready.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3087

From ysuenaga at openjdk.java.net Thu Apr 1 04:09:25 2021
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Thu, 1 Apr 2021 04:09:25 GMT
Subject: RFR: 8176026: SA: Huge heap sizes cause a negative value to be displayed in the jhisto heap total [v3]
In-Reply-To: <2nwnR_h9DuOC_2rdT9z7j3ickPchjK9ozWv9obWCP_M=.018c90ea-1fc8-4ee5-b1a6-6b1a65a308e4@github.com>
References: <2nwnR_h9DuOC_2rdT9z7j3ickPchjK9ozWv9obWCP_M=.018c90ea-1fc8-4ee5-b1a6-6b1a65a308e4@github.com>

On Fri, 26 Mar 2021 13:51:40 GMT, Koichi Sakata wrote:

>> When more than about 2.1GB of heap is in use, clhsdb jhisto shows a negative number in the total field.
>>
>> $ java -Xmx20g Sample
>>
>> $ jhsdb clhsdb --pid 5773
>> Attaching to process 5773, please wait...
>> hsdb> jhisto
>> ...
>> 299:            1       16      jdk.internal.misc.Unsafe
>> 300:            3402    10737610256     byte[]
>> Total :         15823   -2146661280
>> Heap traversal took 1.793 seconds.
>>
>> (Incidentally, the Sample is a program that only allocates many objects.)
>>
>> #### Details
>> This is because the totalSize variable in the ObjectHistogram class is of type int.
>>
>> The total size is the sum of the ObjectHistogramElement#getSize() values, and getSize() returns long. So I changed int to long in the ObjectHistogram class.
>>
>> Additionally, I changed the type of totalCount. This doesn't cause a bug, but ObjectHistogramElement#getCount() also returns long, so I think there is no need to treat it as int. ObjectHistogramElement#compare() also does the same thing, so a class with a huge size is shown at the bottom of the list.
>>
>> #### Tests
>> The jtreg test was successful.
>>
>> $ sudo make run-test TEST=serviceability/sa/ClhsdbJhisto.java
>>
>> $ cat build/linux-x86_64-server-fastdebug/test-results/jtreg_test_hotspot_jtreg_serviceability_sa_ClhsdbJhisto_java/text/summary.txt
>> serviceability/sa/ClhsdbJhisto.java Passed. Execution successful
>>
>> I confirmed the output with the same program.
>>
>> $ ./jdk/build/linux-x86_64-server-fastdebug/jdk/bin/java -Xmx20g Sample
>> $ ./jdk/build/linux-x86_64-server-fastdebug/jdk/bin/jhsdb clhsdb --pid 10196
>> Attaching to process 10196, please wait...
>> hsdb> jhisto
>> Object Histogram:
>>
>> num       #instances    #bytes  Class description
>> --------------------------------------------------------------------------
>> 1:              3405    13958838400     byte[]
>> 2:              887     109032  java.lang.Class
>> ...
>> 300:            1       16      jdk.internal.misc.Unsafe
>> Total :         15827   13959470288
>> Heap traversal took 1.72 seconds.
>
> Koichi Sakata has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update copyright

Looks good.
I will sponsor you.

-------------

Marked as reviewed by ysuenaga (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3087

From ksakata at openjdk.java.net Thu Apr 1 04:13:30 2021
From: ksakata at openjdk.java.net (Koichi Sakata)
Date: Thu, 1 Apr 2021 04:13:30 GMT
Subject: Integrated: 8176026: SA: Huge heap sizes cause a negative value to be displayed in the jhisto heap total

On Fri, 19 Mar 2021 11:49:39 GMT, Koichi Sakata wrote:

> When more than about 2.1GB of heap is in use, clhsdb jhisto shows a negative number in the total field.

This pull request has now been integrated.
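
The overflow described in the PR text (long sizes summed into an int total) can be reproduced outside the SA. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the old code accumulated with a compound assignment, which the thread does not show. The two sizes are per-class byte counts taken from the jhisto output quoted in this thread.

```java
final class HistogramTotalSketch {
    /** Mimics the pre-fix behaviour: long sizes accumulated into an int total (assumed shape). */
    static int totalAsInt(long[] sizes) {
        int total = 0;
        for (long size : sizes) {
            total += size; // compound assignment silently narrows the long sum back to int
        }
        return total;
    }

    /** Mimics the fix: the accumulator has the same type as the per-class size. */
    static long totalAsLong(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        // Byte counts of jdk.internal.misc.Unsafe and byte[] from the quoted jhisto output.
        long[] sizes = {16L, 10737610256L};
        System.out.println("int total:  " + totalAsInt(sizes));
        System.out.println("long total: " + totalAsLong(sizes));
    }
}
```

The int version compiles without a warning because `+=` includes an implicit cast, which is probably why a field of the wrong width can go unnoticed until the heap is large enough.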
Changeset: 39f0b27a
Author:    Koichi Sakata
Committer: Yasumasa Suenaga
URL:       https://git.openjdk.java.net/jdk/commit/39f0b27a
Stats:     5 lines in 2 files changed: 0 ins; 0 del; 5 mod

8176026: SA: Huge heap sizes cause a negative value to be displayed in the jhisto heap total

Reviewed-by: cjplummer, kevinw, ysuenaga

-------------

PR: https://git.openjdk.java.net/jdk/pull/3087

From shade at openjdk.java.net Thu Apr 1 07:22:25 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 1 Apr 2021 07:22:25 GMT
Subject: RFR: 8264484: Replace uses of StringBuffer with StringBuilder in jdk.hotspot.agent [v3]

On Wed, 31 Mar 2021 23:47:41 GMT, Yasumasa Suenaga wrote:

>> Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   8264484: Replace uses of StringBuffer with StringBuilder in jdk.hotspot.agent
>>
>>   fix indentation
>
> Looks good!
> BTW have you found sponsor? I can sponsor you if you need.

@YaSuenag, you can just sponsor without asking, I think.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3275

From github.com+741251+turbanoff at openjdk.java.net Thu Apr 1 07:27:08 2021
From: github.com+741251+turbanoff at openjdk.java.net (Andrey Turbanov)
Date: Thu, 1 Apr 2021 07:27:08 GMT
Subject: Integrated: 8264484: Replace uses of StringBuffer with StringBuilder in jdk.hotspot.agent

On Tue, 30 Mar 2021 18:58:24 GMT, Andrey Turbanov wrote:

> Found by IntelliJ IDEA inspection `Java | Java language level migration aids | Java 5 | 'StringBuffer' may be 'StringBuilder'`
> As suggested in https://github.com/openjdk/jdk/pull/1507#issuecomment-757369003 I've created separate PR for module `jdk.hotspot.agent`
> Similar cleanups in the past: https://bugs.openjdk.java.net/browse/JDK-8041679 https://bugs.openjdk.java.net/browse/JDK-8264029

This pull request has now been integrated.
Changeset: 6cf10950
Author:    Andrey Turbanov
Committer: Yasumasa Suenaga
URL:       https://git.openjdk.java.net/jdk/commit/6cf10950
Stats:     116 lines in 37 files changed: 0 ins; 7 del; 109 mod

8264484: Replace uses of StringBuffer with StringBuilder in jdk.hotspot.agent

Reviewed-by: kevinw, amenkov, ysuenaga

-------------

PR: https://git.openjdk.java.net/jdk/pull/3275

From pliden at openjdk.java.net Thu Apr 1 07:59:18 2021
From: pliden at openjdk.java.net (Per Liden)
Date: Thu, 1 Apr 2021 07:59:18 GMT
Subject: RFR: 8260267: vmTestbase/gc/gctests/FinalizeLock/FinalizeLock.java fails with fatal error: Mark stack space exhausted

On Tue, 30 Mar 2021 10:58:56 GMT, 王超 wrote:

> The ZGC mark stack will be exhausted when the live object graph is very large, which is due to the pseudo-BFS style of visiting the object graph (a ZGC GC worker pushes children to other stripes, which behaves like BFS, so the maximum mark stack size is O(num_objects)). But other GCs like G1 or Shenandoah use DFS to visit the object graph, and the mark stack usage rarely exceeds 32M, which may be because the maximum mark stack size is related to the depth of the object graph, O(depth_of_object_graph).
>
> The following is the test case used to reproduce the crash,
> import java.util.HashMap;
> import java.util.HashSet;
>
> class Test2 {
>     public static int NODE_COUNT = 25000000;
>     public static int NODE_COUNT_UB = 1 << 25;
>     public static int STRING_GEN_INDEX = 1;
>     public static int WATCHER_COUNT = 100;
>     public static int MAP_COUNT = 8;
>     public static int BIN_COUNT = 16;
>     public static Object[] watchTable = new Object[MAP_COUNT];
>
>     public static class Watcher {
>         public static long counter = 0;
>         public long index;
>         Watcher() {
>             index = counter;
>             counter++;
>         }
>     }
>
>     public static int gen_name() {
>         int cur = STRING_GEN_INDEX++;
>         int remainder = cur % BIN_COUNT;
>         cur = cur - remainder + NODE_COUNT_UB * remainder;
>         return cur;
>     }
>
>     public static HashMap<Integer, HashSet<Watcher>> get_map(int index) {
>         return (HashMap<Integer, HashSet<Watcher>>)watchTable[index];
>     }
>
>     public static void gen_watcher() {
>         int name = gen_name();
>         HashSet<Watcher> set = new HashSet<Watcher>();
>         for (int i = 0; i < WATCHER_COUNT; i++) {
>             set.add(new Watcher());
>         }
>         for (int i = 0; i < MAP_COUNT; i++) {
>             get_map(i).put(name, set);
>         }
>     }
>
>     public static void setup() {
>         for (int i = 0; i < MAP_COUNT; i++) {
>             watchTable[i] = new HashMap<Integer, HashSet<Watcher>>();
>         }
>         for (int i = 0; i < NODE_COUNT; i++) {
>             gen_watcher();
>         }
>     }
>
>     public static void main(String[] args) {
>         setup();
>         for (int i = 0; i < 10; i++) {
>             System.gc();
>         }
>         System.out.println("pass");
>     }
> }
>
> The command to run the testcase is,
> java -XX:+UseZGC -Xmx280g -XX:ZCollectionInterval=10 -Xlog:gc*=debug:file=gc.log:time,level,tags:filesize=5g Test2

As you know, there's an ongoing discussion about what a proper fix for this could look like here https://bugs.openjdk.java.net/browse/JDK-8260267, please participate in that discussion.

I tested the reproducer provided in this PR, and it seems like https://github.com/openjdk/jdk/compare/master...pliden:reduce_mark_stack_usage_v3 handles it better than the patch proposed here. The patch proposed here seems to roughly double the marking time for the provided reproducer.

I've updated the patch I proposed, to also take mark stack usage into account. Please see https://bugs.openjdk.java.net/browse/JDK-8260267?focusedCommentId=14410719&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14410719

-------------

PR: https://git.openjdk.java.net/jdk/pull/3262

From mli at openjdk.java.net Thu Apr 1 14:43:42 2021
From: mli at openjdk.java.net (Hamlin Li)
Date: Thu, 1 Apr 2021 14:43:42 GMT
Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13]

> Summary
> -----------
>
> Improve G1 Full GC by skipping compaction for regions with high survival ratio.
>
> Background
> -----------
>
> There are 4 steps in the full gc of G1 GC.
>  - mark live objects
>  - prepare forwardee
>  - adjust pointers
>  - compact
>
> When a full gc occurs, there may be a very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there is little space to save but many objects to copy.
>
> Description
> -----------
>
> We enhance the full gc implementation for the above situation through the following steps:
>  - accumulate live bytes of every hr in the mark phase; (already done by JDK-8263495)
>  - skip adding regions with high survival ratio, and set the region with high survival ratio as pinned in _region_attr_table during the prepare phase;
>  - nothing special is done in the adjust phase, regions with high survival ratio are skipped because of the pin setting in the above step;
>  - nothing special is done in the compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to the compaction set in the prepare phase;
>
> VM options related
> -----------
>
>  - MarkSweepDeadRatio: we reuse this existing vm option to indicate the high survival ratio threshold (100-MarkSweepDeadRatio) in G1.
>  - default value of MarkSweepDeadRatio: 5
>
> Test
> -----------
>
>  - specjbb2015: no regression
>  - dacapo: (Attachment is the dacapo h2 full gc pause.)
>    - 95% of full gc pauses: 10%-19% improvement.
>    - 5% of full gc pauses: 1.2% improvement.
>    - 0.1% of full gc pauses: -6.16% improvement.
>
> $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2

Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:

  minor code improvement.
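
The threshold rule in the quoted description (a region is skipped when its survival ratio reaches 100-MarkSweepDeadRatio percent) can be sketched as below. This is an illustration only, not the HotSpot code: the class name, the method name, the use of integer byte arithmetic and the choice of `>=` rather than `>` are all assumptions.

```java
final class SkipCompactionSketch {
    /**
     * Returns true when a region has so many live bytes that compacting it
     * would reclaim at most markSweepDeadRatio percent of the region.
     * Hypothetical helper; the comparison operator is an assumption.
     */
    static boolean skipCompaction(long liveBytes, long regionBytes, int markSweepDeadRatio) {
        long thresholdBytes = regionBytes * (100 - markSweepDeadRatio) / 100;
        return liveBytes >= thresholdBytes;
    }

    public static void main(String[] args) {
        long regionBytes = 1024 * 1024; // a 1 MiB region, chosen only for the example
        System.out.println(skipCompaction(1_000_000L, regionBytes, 5)); // above the 95% threshold
        System.out.println(skipCompaction(900_000L, regionBytes, 5));   // below the 95% threshold
    }
}
```

With the default MarkSweepDeadRatio of 5, a region is left in place once roughly 95% of it is live, so the most that compaction could have reclaimed there is about 5% of the region.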
-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2760/files
  - new: https://git.openjdk.java.net/jdk/pull/2760/files/1f4ead84..63ab0f9c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2760&range=12
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2760&range=11-12

  Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2760.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2760/head:pull/2760

PR: https://git.openjdk.java.net/jdk/pull/2760

From hshi at openjdk.java.net Fri Apr 2 02:56:27 2021
From: hshi at openjdk.java.net (Hui Shi)
Date: Fri, 2 Apr 2021 02:56:27 GMT
Subject: RFR: 8260267: vmTestbase/gc/gctests/FinalizeLock/FinalizeLock.java fails with fatal error: Mark stack space exhausted

On Tue, 30 Mar 2021 10:58:56 GMT, 王超 wrote:

> The ZGC mark stack will be exhausted when the live object graph is very large, which is due to the pseudo-BFS style of visiting the object graph (a ZGC GC worker pushes children to other stripes, which behaves like BFS, so the maximum mark stack size is O(num_objects)). But other GCs like G1 or Shenandoah use DFS to visit the object graph, and the mark stack usage rarely exceeds 32M, which may be because the maximum mark stack size is related to the depth of the object graph, O(depth_of_object_graph).

src/hotspot/share/gc/z/zMarkStackAllocator.cpp line 181:

> 179:
> 180:   for (;;) {
> 181:     if (nmagazine == 0) {

It's confusing here: _nmagazine can be negative after the decrement. It might help to add a comment on this method.
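
The traversal-order argument in the quoted PR description (BFS-like marking needs a worklist proportional to the number of objects, DFS-like marking one proportional to the graph depth) can be seen on a toy graph. The sketch below is not ZGC code: it walks a complete binary tree with a single worklist and ignores stripes and worker threads entirely. For a general graph a LIFO worklist can hold up to depth times fan-out entries, so the O(depth) figure holds for this shape only up to that factor.

```java
import java.util.ArrayDeque;

final class MarkWorklistSketch {
    /**
     * Visits every node of a complete binary tree of the given depth
     * (root at depth 0) and returns the largest number of pending
     * nodes the worklist held at any time.
     */
    static int peakPending(int depth, boolean fifo) {
        int nodeCount = (1 << (depth + 1)) - 1; // nodes are numbered 1..nodeCount in heap order
        ArrayDeque<Integer> pending = new ArrayDeque<>();
        pending.addLast(1);
        int peak = 1;
        while (!pending.isEmpty()) {
            // FIFO order visits level by level (BFS); LIFO order dives first (DFS).
            int node = fifo ? pending.pollFirst() : pending.pollLast();
            for (int child = 2 * node; child <= 2 * node + 1 && child <= nodeCount; child++) {
                pending.addLast(child);
            }
            peak = Math.max(peak, pending.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("BFS peak: " + peakPending(10, true));
        System.out.println("DFS peak: " + peakPending(10, false));
    }
}
```

For a tree of depth 10 (2047 nodes) the FIFO worklist peaks at the width of the last level, while the LIFO worklist never holds more than one entry per level plus one.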
-------------

PR: https://git.openjdk.java.net/jdk/pull/3262

From github.com+25214855+casparcwang at openjdk.java.net Fri Apr 2 03:02:19 2021
From: github.com+25214855+casparcwang at openjdk.java.net (王超)
Date: Fri, 2 Apr 2021 03:02:19 GMT
Subject: Withdrawn: 8260267: vmTestbase/gc/gctests/FinalizeLock/FinalizeLock.java fails with fatal error: Mark stack space exhausted

On Tue, 30 Mar 2021 10:58:56 GMT, 王超 wrote:

> The ZGC mark stack will be exhausted when the live object graph is very large, which is due to the pseudo-BFS style of visiting the object graph (a ZGC GC worker pushes children to other stripes, which behaves like BFS, so the maximum mark stack size is O(num_objects)). But other GCs like G1 or Shenandoah use DFS to visit the object graph, and the mark stack usage rarely exceeds 32M, which may be because the maximum mark stack size is related to the depth of the object graph, O(depth_of_object_graph).

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3262

From akozlov at azul.com Fri Apr 2 09:02:51 2021
From: akozlov at azul.com (Anton Kozlov)
Date: Fri, 2 Apr 2021 12:02:51 +0300
Subject: [jdk13u-dev] RFR: 8264640: CMS ParScanClosure misses a barrier

Adding hotspot-gc-dev. It would be great to receive comments from GC experts, even though the fix does not make sense for mainline jdk.

Thanks,
Anton

On 4/2/21 11:51 AM, Anton Kozlov wrote:
> Hi, please review an original fix for a GC crash. jdk13u is the latest supported version that still has the buggy code; it was deleted in jdk14 as a part of JEP 363: Remove the Concurrent Mark Sweep (CMS) Garbage Collector. So I'm proposing it here.
>
> The fix is low-risk: on x86-64 it just introduces a compiler barrier to prevent two reads from being reordered, as intended by the surrounding comments. On CPUs with weaker memory models it introduces CPU barriers as well.
>
> -------------
>
> Commit messages:
>  - Add missing barriers
>
> Changes: https://git.openjdk.java.net/jdk13u-dev/pull/165/files
>  Webrev: https://webrevs.openjdk.java.net/?repo=jdk13u-dev&pr=165&range=00
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8264640
>   Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
>   Patch: https://git.openjdk.java.net/jdk13u-dev/pull/165.diff
>   Fetch: git fetch https://git.openjdk.java.net/jdk13u-dev pull/165/head:pull/165
>
> PR: https://git.openjdk.java.net/jdk13u-dev/pull/165
>

From github.com+168222+mgkwill at openjdk.java.net Fri Apr 2 16:38:02 2021
From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams)
Date: Fri, 2 Apr 2021 16:38:02 GMT
Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v27]

> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using large pages in this circumstance, and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k).
>
> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size().
>
> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than the bytes being reserved.

Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits:

 - Merge branch 'master' into update_hlp
 - Rebase on pull/3073

   Signed-off-by: Marcus G K Williams
 - Merge branch 'pull/3073' into update_hlp
 - Thomas review.

   Changed commit_memory_special to return bool to signal if the request succeeded or not.
 - Self review.

   Update helper name to better match commit_memory_special().
 - Marcus review.

   Updated comments.
 - Ivan review

   Renamed helper to commit_memory_special and updated the comments.
 - 8262291: Refactor reserve_memory_special_huge_tlbfs
 - Merge branch 'master' into update_hlp
 - Addressed kstefanj review suggestions

   Signed-off-by: Marcus G K Williams
 - ... and 34 more: https://git.openjdk.java.net/jdk/compare/f60e81bf...f5bfe224

-------------

Changes: https://git.openjdk.java.net/jdk/pull/1153/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=26
  Stats: 282 lines in 4 files changed: 80 ins; 101 del; 101 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1153.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153

PR: https://git.openjdk.java.net/jdk/pull/1153

From github.com+168222+mgkwill at openjdk.java.net Fri Apr 2 16:55:30 2021
From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams)
Date: Fri, 2 Apr 2021 16:55:30 GMT
Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23]
References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com>

On Thu, 18 Mar 2021 12:47:40 GMT, Stefan Johansson wrote:

> > > This would also give us time to think about a question I haven't made up my mind around. What will `LargePageSizeInBytes` mean after this change? Should the JVM use 1g pages (on a system where 2m is the default) even if `LargePageSizeInBytes` is not set?
> >
> >
> > I see two valid scenarios:
>
> Me too.
>
> > a) either use huge pages as best as possible; remove fine-grained control from user hands. So, if we have 1G pages, prefer those over 2M over 4K. Then, we could completely remove LargePageSizeInBytes. There is no need for this switch anymore.
>
> I agree, preferably we can make it so that the upper layers can use something like `page_size_for_region*` and request a certain page size, but fall back to smaller ones.
>
> > b) or keep the current behavior. In that case, UseLargePages means "use at most default huge page size" even if larger pages are available; and LargePageSizeInBytes can be used to override the large page size to use.
>
> So it becomes more like a maximum value, right? Or at least this is how I've thought about this second scenario. On a system with both 2M (the default size) and 1G pages available you would have to set `LargePageSizeInBytes=1g` to use the 1G pages, but the 2M could still be used for smaller mappings.
>
> > (a): It's elegant, and efficiently uses system resources when available. But it's an incompatible change, and the VM being grabby means we could end up using pages meant for a different process.
> > (b): downward compatible. In a sense. With Marcus' change, we break downward compatibility already: where "LargePageSizeInBytes" before meant "use that or nothing", it now means "use that or whatever smaller page sizes you find".
> > I prefer (a), honestly.
>
> I would also prefer (a).

@kstefanj @tstuefe

1. Would (a) [removing fine-grained control of large page [Huge Page] sizes with LargePageSizeInBytes and relying on automatic selection of sizes from those available] require the JEP process?
2. Can we put the LargePageSizeInBytes flag in DIAGNOSTIC or EXPERIMENTAL and set it otherwise to the largest available page size? This way we have (a) as the default, and if users want to change the page size they can use diagnostic or experimental and use (b). This would also lower the amount of change needed.

I'm a bit concerned about removing LargePageSizeInBytes altogether. I don't know if some use cases would break or need more fine-tuned control of page sizes.

I've incorporated https://github.com/openjdk/jdk/pull/3073 into this change, as a good base to build on; this also seemed to be Stefan's optimal path for this change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1153

From dcubed at openjdk.java.net Fri Apr 2 20:48:33 2021
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 2 Apr 2021 20:48:33 GMT
Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java
on win-x64 with ZGC

-------------

Commit messages:
 - 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

Changes: https://git.openjdk.java.net/jdk/pull/3331/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3331&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264662
  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3331.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3331/head:pull/3331

PR: https://git.openjdk.java.net/jdk/pull/3331

From hseigel at openjdk.java.net Fri Apr 2 20:52:17 2021
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 2 Apr 2021 20:52:17 GMT
Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

On Fri, 2 Apr 2021 20:41:07 GMT, Daniel D. Daugherty wrote:

> A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java
> on win-x64 with ZGC

Looks good and trivial.
Thanks, Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3331

From dcubed at openjdk.java.net Fri Apr 2 21:35:20 2021
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 2 Apr 2021 21:35:20 GMT
Subject: RFR: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

On Fri, 2 Apr 2021 20:49:35 GMT, Harold Seigel wrote:

>> A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java
>> on win-x64 with ZGC
>
> Looks good and trivial.
> Thanks, Harold

@hseigel - Thanks for the review!

-------------

PR: https://git.openjdk.java.net/jdk/pull/3331

From dcubed at openjdk.java.net Fri Apr 2 21:35:21 2021
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 2 Apr 2021 21:35:21 GMT
Subject: Integrated: 8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

On Fri, 2 Apr 2021 20:41:07 GMT, Daniel D. Daugherty wrote:

> A trivial fix to ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java
> on win-x64 with ZGC

This pull request has now been integrated.

Changeset: 9c283da1
Author:    Daniel D. Daugherty
URL:       https://git.openjdk.java.net/jdk/commit/9c283da1
Stats:     1 line in 1 file changed: 1 ins; 0 del; 0 mod

8264662: ProblemList vmTestbase/jit/escape/AdaptiveBlocking/AdaptiveBlocking001/AdaptiveBlocking001.java on win-x64 with ZGC

Reviewed-by: hseigel

-------------

PR: https://git.openjdk.java.net/jdk/pull/3331

From yyang at openjdk.java.net Mon Apr 5 07:44:04 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Mon, 5 Apr 2021 07:44:04 GMT
Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1

Trivial fix for JDK-8264682. When G1 is enabled, `Universe::heap()->used()` calls G1Allocator::used_in_alloc_regions(), which checks whether Heap_lock is owned on this thread's behalf.
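
The failure mode described above (a query that checks lock ownership, called from a thread that never took the lock) has a rough Java analogue. The sketch below is only an analogy under stated assumptions: `HEAP_LOCK`, `used()` and `usedWithLock()` are hypothetical stand-ins, not HotSpot names, and it illustrates the ownership check itself rather than the proposed patch or any recommended fix.

```java
import java.util.concurrent.locks.ReentrantLock;

final class HeapLockSketch {
    // Hypothetical stand-in for HotSpot's Heap_lock.
    static final ReentrantLock HEAP_LOCK = new ReentrantLock();
    static long usedBytes = 42;

    /** Stand-in for a used() query that insists the caller already owns the lock. */
    static long used() {
        if (!HEAP_LOCK.isHeldByCurrentThread()) {
            throw new IllegalStateException("caller must hold HEAP_LOCK");
        }
        return usedBytes;
    }

    /** A caller that takes the lock around the query, so the ownership check passes. */
    static long usedWithLock() {
        HEAP_LOCK.lock();
        try {
            return used();
        } finally {
            HEAP_LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(usedWithLock());
        try {
            used(); // no lock held on this thread
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```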
------------- Commit messages: - memprofiling_crash Changes: https://git.openjdk.java.net/jdk/pull/3340/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3340&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264682 Stats: 59 lines in 2 files changed: 59 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3340.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3340/head:pull/3340 PR: https://git.openjdk.java.net/jdk/pull/3340 From kbarrett at openjdk.java.net Mon Apr 5 14:31:24 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 5 Apr 2021 14:31:24 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 07:34:30 GMT, Yi Yang wrote: > Trivial fix for JDK-8264682. No, not trivial at all. Dropping a GC-specific blob like this into shared code is highly questionable. I'm not sure how this issue should be addressed, but I don't think the proposed change is the right approach. ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From github.com+71722661+earthling-amzn at openjdk.java.net Mon Apr 5 17:00:44 2021 From: github.com+71722661+earthling-amzn at openjdk.java.net (earthling-amzn) Date: Mon, 5 Apr 2021 17:00:44 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report Message-ID: This extra space doesn't look like it was intended to improve any alignment of text in the report. 
------------- Commit messages: - Remove extraneous whitespace from phase timings report Changes: https://git.openjdk.java.net/jdk/pull/3342/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3342&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264727 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3342.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3342/head:pull/3342 PR: https://git.openjdk.java.net/jdk/pull/3342 From shade at openjdk.java.net Mon Apr 5 17:00:45 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 5 Apr 2021 17:00:45 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 16:45:43 GMT, earthling-amzn wrote: > This extra space doesn't look like it was intended to improve any alignment of text in the report. Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3342 From shade at openjdk.java.net Mon Apr 5 17:00:45 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 5 Apr 2021 17:00:45 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report In-Reply-To: References:

Message-ID: On Mon, 5 Apr 2021 16:49:47 GMT, Aleksey Shipilev wrote: >> This extra space doesn't look like it was intended to improve any alignment of text in the report. > > Marked as reviewed by shade (Reviewer). Created: https://bugs.openjdk.java.net/browse/JDK-8264727 -- rename this PR to "8264727: Shenandoah: Remove extraneous whitespace from phase timings report" to get it hooked up. ------------- PR: https://git.openjdk.java.net/jdk/pull/3342 From shade at openjdk.java.net Mon Apr 5 17:06:13 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 5 Apr 2021 17:06:13 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 16:45:43 GMT, earthling-amzn wrote: > This extra space doesn't look like it was intended to improve any alignment of text in the report. Looks fine and trivial. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3342 From zgu at openjdk.java.net Mon Apr 5 21:50:49 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 5 Apr 2021 21:50:49 GMT Subject: RFR: 8264718: Shenandoah: enable string deduplication during root scanning Message-ID: Shenandoah used to scan roots at pauses, so it deliberately disabled string deduplication during root scanning to avoid extra pause times. Now that Shenandoah scans roots in the concurrent phase, this is no longer a concern, and we should enable it.
Test: - [x] hotspot_gc_shenandoah ------------- Commit messages: - Merge branch 'master' into enable_dedup_roots - init Changes: https://git.openjdk.java.net/jdk/pull/3348/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3348&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264718 Stats: 69 lines in 5 files changed: 40 ins; 25 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3348.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3348/head:pull/3348 PR: https://git.openjdk.java.net/jdk/pull/3348 From iklam at openjdk.java.net Mon Apr 5 22:02:42 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 5 Apr 2021 22:02:42 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap Message-ID: When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). 
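As a rough illustration of the fragmentation effect described above, the following sketch models the heap as a row of equally sized regions and measures the longest run of free regions. It is a simplified model under stated assumptions, not HotSpot code: `largest_free_run` and `build_heap` are hypothetical helpers, and the pinned archived regions are assumed to form one contiguous run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: true = region is pinned (archived), false = free.
// Returns the longest run of consecutive free regions, which bounds the
// largest contiguous ("humongous") allocation in this simplified model.
std::size_t largest_free_run(const std::vector<bool>& pinned) {
  std::size_t best = 0;
  std::size_t current = 0;
  for (bool p : pinned) {
    current = p ? 0 : current + 1;
    best = std::max(best, current);
  }
  return best;
}

// Hypothetical helper: builds a heap of `total` regions with `archived`
// pinned regions starting at region index `start`.
std::vector<bool> build_heap(std::size_t total, std::size_t start,
                             std::size_t archived) {
  std::vector<bool> pinned(total, false);
  for (std::size_t i = start; i < start + archived && i < total; ++i) {
    pinned[i] = true;
  }
  return pinned;
}
```

In this model, a 16-region heap with two archived regions in the middle (`build_heap(16, 7, 2)`) leaves a largest free run of 7 regions, whereas placing them at the top (`build_heap(16, 14, 2)`) leaves 14, which matches the "half of the optimal value" worst case mentioned above.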
------------- Commit messages: - Added -Xlog:cds=debug in case this test fails again - 8214455: Relocate CDS archived regions to the top of the G1 heap Changes: https://git.openjdk.java.net/jdk/pull/3349/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3349&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8214455 Stats: 143 lines in 8 files changed: 139 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3349.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3349/head:pull/3349 PR: https://git.openjdk.java.net/jdk/pull/3349 From iklam at openjdk.java.net Mon Apr 5 22:48:25 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 5 Apr 2021 22:48:25 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References:

Message-ID: On Mon, 5 Apr 2021 14:29:11 GMT, Kim Barrett wrote: > > Trivial fix for JDK-8264682. > > No, not trivial at all. Dropping a GC-specific blob like this into shared code is highly questionable. I'm not sure how this issue should be addressed, but I don't think the proposed change is the right approach. It looks like the following APIs can be reliably called without holding the Heap_lock. However, I am not sure if they would give the same result. I.e., can you get `Universe::heap()->used()` by computing `Universe::heap()->capacity() - Universe::heap()->unused()`? JVM_ENTRY_NO_ENV(jlong, JVM_TotalMemory(void)) size_t n = Universe::heap()->capacity(); return convert_size_t_to_jlong(n); JVM_END JVM_ENTRY_NO_ENV(jlong, JVM_FreeMemory(void)) size_t n = Universe::heap()->unused(); return convert_size_t_to_jlong(n); JVM_END ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From dholmes at openjdk.java.net Tue Apr 6 02:08:36 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 6 Apr 2021 02:08:36 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 07:34:30 GMT, Yi Yang wrote: > Trivial fix for JDK-8264682. > > `Universe::heap()->used()` calls G1Allocator::used_in_alloc_regions() when G1 enabled, it checks whether Heap_lock was owned on this thread's behalf. Sorry but I have to agree with Kim that this is the wrong fix for this problem. And the test needs adjusting. Thanks, David src/hotspot/share/runtime/memprofiler.cpp line 120: > 118: // used() calls G1Allocator::used_in_alloc_regions() when G1 enabled, > 119: // it checks whether Heap_lock was owned on this thread's behalf. > 120: G1GC_ONLY(MutexLocker ml(Heap_lock);) This seems really out of place - this code should not need to know anything about the specific locking requirements of different GCs. That should be handled inside the appropriate chunk of GC code.
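To make the encapsulation point concrete, here is a minimal sketch in which shared code calls `used()` without knowing which heap needs a lock. Everything in it is hypothetical (`HeapModel`, `SimpleHeapModel`, `LockingHeapModel`, `sample_used` are illustration names, and `std::mutex` stands in for a VM lock); it is not the HotSpot `CollectedHeap` API, and a real fix would also have to cope with callers that already hold the lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a collected-heap interface.
class HeapModel {
 public:
  virtual ~HeapModel() = default;
  virtual std::size_t used() const = 0;
};

// A heap whose used() needs no lock.
class SimpleHeapModel : public HeapModel {
 public:
  explicit SimpleHeapModel(std::size_t used) : _used(used) {}
  std::size_t used() const override { return _used; }

 private:
  std::size_t _used;
};

// A heap whose used() must read two counters consistently. The locking
// requirement stays inside the heap-specific code.
class LockingHeapModel : public HeapModel {
 public:
  LockingHeapModel(std::size_t summary_used, std::size_t alloc_region_used)
      : _summary_used(summary_used), _alloc_region_used(alloc_region_used) {}

  std::size_t used() const override {
    std::lock_guard<std::mutex> guard(_lock);
    return _summary_used + _alloc_region_used;
  }

 private:
  mutable std::mutex _lock;
  std::size_t _summary_used;
  std::size_t _alloc_region_used;
};

// Shared code: no GC-specific macro and no lock taken here.
std::size_t sample_used(const HeapModel& heap) { return heap.used(); }
```

With this shape, a profiler-like caller only ever writes `sample_used(heap)`, regardless of which heap implementation is active.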
test/hotspot/jtreg/runtime/MemProfiler/MemProfilingWithGC.java line 44: > 42: "UseShenandoahGC", > 43: "UseZGC", > 44: "UseEpsilonGC", You have to allow for the fact that these GC's may not have all been built into the JVM under test. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3340 From shade at openjdk.java.net Tue Apr 6 06:40:28 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 6 Apr 2021 06:40:28 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report In-Reply-To: References:

Message-ID: On Mon, 5 Apr 2021 17:02:55 GMT, Aleksey Shipilev wrote: >> This extra space doesn't look like it was intended to improve any alignment of text in the report. > > Looks fine and trivial. You need to resolve the integration blocker before integration: the PR title and bug synopsis are mismatched. You need "Shenandoah: " in the title. ------------- PR: https://git.openjdk.java.net/jdk/pull/3342 From tschatzl at openjdk.java.net Tue Apr 6 09:44:28 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 6 Apr 2021 09:44:28 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 21:48:43 GMT, Ioi Lam wrote: > When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. > > The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). > > Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. > > In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). Lgtm. 
src/hotspot/share/memory/filemap.cpp line 1846: > 1844: _heap_pointers_need_patching = true; > 1845: } else if (header()->heap_end() != CompressedOops::end()) { > 1846: log_info(cds)("CDS heap data need to be relocated to the end of the runtime heap to reduce fragmentation"); s/need/needs test/hotspot/jtreg/runtime/cds/appcds/cacheObject/HeapFragmentationTest.java line 30: > 28: * @bug 8214455 > 29: * @requires vm.cds.archived.java.heap > 30: * @requires (sun.arch.data.model == "64" & os.maxMemory > 4g) additional space before the & ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3349 From ayang at openjdk.java.net Tue Apr 6 09:54:13 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Tue, 6 Apr 2021 09:54:13 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 11:55:59 GMT, Kim Barrett wrote: > Please review this change to OopStorage to support bulk allocation. A new overload for allocate is provided: > size_t allocate(oop** entries, size_t size) > The approach taken is to claim all of the available entries in the first available block, add as many of those entries as will fit in the buffer, and release any remaining entries back to the block. Only the claim part needs to be done while holding the allocation mutex. This is optimized for clients that want more than just a couple of entries, and minimizes time under the lock. The maximum number of entries (the number of entries in a block) is provided as a constant for sizing requests, to avoid the release of unrequested entries. An application that wants more than that, or a specific number that might not be available from the next block, needs to make multiple bulk allocation calls, but that's still going to be much faster than one-at-a-time allocations. > > Testing: > mach5 tier1 > New gtest for bulk allocation.
> I've been using this for a while as part of another in-development change that makes heavy use of this feature. Marked as reviewed by ayang (Committer). src/hotspot/share/gc/shared/oopStorage.cpp line 478: > 476: block = block_for_allocation(); > 477: if (block == NULL) return 0; // Block allocation failed. > 478: // Take exclusive use of this block for allocation. Maybe "Taking all remaining entries, so remove from list."? The "exclusive use" is ensured by the above `MutexLocker`, not `unlink`. src/hotspot/share/gc/shared/oopStorage.hpp line 119: > 117: // Allocates multiple entries, returning them in the ptrs buffer. The > 118: // number of entries that will be allocated is never more than the minimum > 119: // of size and bulk_allocate_limit, but may be less than either. Possibly It took me a few seconds to understand this sentence. Maybe replace it with "postcondition: result <= min(size, bulk_allocate_limit)" below? ------------- PR: https://git.openjdk.java.net/jdk/pull/3264 From tschatzl at openjdk.java.net Tue Apr 6 10:07:26 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 6 Apr 2021 10:07:26 GMT Subject: RFR: 8264027: Refactor "CLEANUP" region printing In-Reply-To: <6NLEJmvE7zRDNprrDS5B3suYSfloxZAdI2zPIPj3yu0=.286cde8e-4429-416c-a62b-2d0f0cf24627@github.com> References: <6NLEJmvE7zRDNprrDS5B3suYSfloxZAdI2zPIPj3yu0=.286cde8e-4429-416c-a62b-2d0f0cf24627@github.com> Message-ID: On Tue, 30 Mar 2021 03:14:41 GMT, Kim Barrett wrote: >> Hi all, >> >> can I have reviews for this small cleanup that refactors "CLEANUP" region status printing? 
>> >> - where managing a `FreeRegionList` is not necessary, inline the call to the method (which is guarded by a check that if logging is not enabled, we do not print anything anyway) >> - for the single remaining case, create a helper method in `G1HRPrinter` >> >> The first item has been done as per the suggestion from @kstefanj [here](https://github.com/openjdk/jdk/pull/3154#pullrequestreview-619061961) ; the alternative would be to keep the free region lists and call `G1HRPrinter::cleanup` on it later, which I actually initially implemented. I could not find a difference performance-wise. >> >> Testing: checked if there is some perf difference on a heap with 50k regions; tier1-3 > > Looks good. Thanks @kimbarrett @albertnetymk for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/3243 From tschatzl at openjdk.java.net Tue Apr 6 10:07:28 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 6 Apr 2021 10:07:28 GMT Subject: Integrated: 8264027: Refactor "CLEANUP" region printing In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 12:43:12 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this small cleanup that refactors "CLEANUP" region status printing? > > - where managing a `FreeRegionList` is not necessary, inline the call to the method (which is guarded by a check that if logging is not enabled, we do not print anything anyway) > - for the single remaining case, create a helper method in `G1HRPrinter` > > The first item has been done as per the suggestion from @kstefanj [here](https://github.com/openjdk/jdk/pull/3154#pullrequestreview-619061961) ; the alternative would be to keep the free region lists and call `G1HRPrinter::cleanup` on it later, which I actually initially implemented. I could not find a difference performance-wise. > > Testing: checked if there is some perf difference on a heap with 50k regions; tier1-3 This pull request has now been integrated. 
Changeset: bf26a255 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/bf26a255 Stats: 71 lines in 4 files changed: 44 ins; 21 del; 6 mod 8264027: Refactor "CLEANUP" region printing Reviewed-by: kbarrett, ayang ------------- PR: https://git.openjdk.java.net/jdk/pull/3243 From tschatzl at openjdk.java.net Tue Apr 6 10:49:32 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 6 Apr 2021 10:49:32 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 11:55:59 GMT, Kim Barrett wrote: > Please review this change to OopStorage to support bulk allocation. A new overload for allocate is provided: > size_t allocate(oop** entries, size_t size) > The approach taken is to claim all of the available entries in the first available block, add as many of those entries as will fit in the buffer, and release any remaining entries back to the block. Only the claim part needs to be done while holding the allocation mutex. This is optimized for clients that want more than just a couple of entries, and minimizes time under the lock. The maximum number of entries (the number of entries in a block) is provided as a constant for sizing requests, to avoid the release of unrequested entries. An application that wants more than that, or a specific number that might not be available from the next block, needs to make multiple bulk allocation calls, but that's still going to be much faster than one-at-a-time allocations. > > Testing: > mach5 tier1 > New gtest for bulk allocation. > I've been using this for a while as part of another in-development change that makes heavy use of this feature. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer).
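The claim-fill-release approach quoted above can be sketched roughly as follows. This is a toy model with a single 64-entry block, not the OopStorage implementation: `BlockModel` and its members are hypothetical names, slot indices stand in for `oop*` entries, and the leftover entries are released under the same mutex for simplicity, which the real code may well do differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical single-block storage; bit i set means slot i is allocated.
class BlockModel {
 public:
  static constexpr std::size_t bulk_allocate_limit = 64;

  // Fills `entries` with up to `size` free slot indices and returns how
  // many were written. The result never exceeds
  // min(size, bulk_allocate_limit), but may be smaller.
  std::size_t allocate(unsigned* entries, std::size_t size) {
    std::uint64_t claimed;
    {
      // Only the claim step runs under the allocation mutex.
      std::lock_guard<std::mutex> guard(_mutex);
      claimed = ~_allocated;            // every currently free slot
      _allocated = ~std::uint64_t{0};   // claim them all at once
    }
    std::size_t count = 0;
    std::uint64_t leftover = 0;
    for (unsigned i = 0; i < 64; ++i) {
      const std::uint64_t bit = std::uint64_t{1} << i;
      if ((claimed & bit) == 0) continue;
      if (count < size) {
        entries[count++] = i;           // fits in the caller's buffer
      } else {
        leftover |= bit;                // claimed but not requested
      }
    }
    if (leftover != 0) {
      // Release the entries that did not fit back to the block.
      std::lock_guard<std::mutex> guard(_mutex);
      _allocated &= ~leftover;
    }
    return count;
  }

  // Number of slots that are currently free.
  std::size_t free_count() const {
    std::lock_guard<std::mutex> guard(_mutex);
    std::size_t n = 0;
    for (unsigned i = 0; i < 64; ++i) {
      if ((_allocated & (std::uint64_t{1} << i)) == 0) ++n;
    }
    return n;
  }

 private:
  mutable std::mutex _mutex;
  std::uint64_t _allocated = 0;
};
```

A caller that wants more entries than one call returns simply calls `allocate` again, as the description suggests for requests larger than a block.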
PR: https://git.openjdk.java.net/jdk/pull/3264 From tschatzl at openjdk.java.net Tue Apr 6 14:04:21 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 6 Apr 2021 14:04:21 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v12] In-Reply-To: References:

Message-ID: On Wed, 31 Mar 2021 09:59:51 GMT, Hamlin Li wrote: >> [Here](https://github.com/tschatzl/jdk/tree/bot-fixes) is a potential fix for this. This needs some discussion on whether one of the alternatives would be better or not; some of them could be: >> * just initialize a dummy BOT for not-compacted young regions >> * exclude young regions from this dead-wood scheme >> In any case this fix should probably go in separately. > >> [Here](https://github.com/tschatzl/jdk/tree/bot-fixes) is a potential fix for this. > > Thanks for the fix. >> In any case this fix should probably go in separately. > > Do you mean we push this pr, then discuss the fix in another thread? If yes, I will file a bug to track the issue and initialize discussion on this bug. This change should go in after PR#3356 which fixes this issue. I think it is a change that is worth pointing out and discussing separately - as you might see from the long description. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From coleenp at openjdk.java.net Tue Apr 6 14:34:27 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 6 Apr 2021 14:34:27 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 02:04:57 GMT, David Holmes wrote: >> Trivial fix for JDK-8264682. >> >> `Universe::heap()->used()` calls G1Allocator::used_in_alloc_regions() when G1 enabled, it checks whether Heap_lock was owned on this thread's behalf. > > Sorry but I have to agree with Kim that this is the wrong fix for this problem. > > And the test needs adjusting. > > Thanks, > David Unless we can imagine a reason why this code is needed, I think the old memprofiler should be removed. ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From iklam at openjdk.java.net Tue Apr 6 17:04:19 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 6 Apr 2021 17:04:19 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap [v2] In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 09:35:01 GMT, Thomas Schatzl wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @tschatzl review -- fix typos and whitespace > > src/hotspot/share/memory/filemap.cpp line 1846: > >> 1844: _heap_pointers_need_patching = true; >> 1845: } else if (header()->heap_end() != CompressedOops::end()) { >> 1846: log_info(cds)("CDS heap data need to be relocated to the end of the runtime heap to reduce fragmentation"); > > s/need/needs Fixed. > test/hotspot/jtreg/runtime/cds/appcds/cacheObject/HeapFragmentationTest.java line 30: > >> 28: * @bug 8214455 >> 29: * @requires vm.cds.archived.java.heap >> 30: * @requires (sun.arch.data.model == "64" & os.maxMemory > 4g) > > additional space before the & Fixed. Thanks for the review. ------------- PR: https://git.openjdk.java.net/jdk/pull/3349 From iklam at openjdk.java.net Tue Apr 6 17:04:18 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 6 Apr 2021 17:04:18 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap [v2] In-Reply-To: References: Message-ID: > When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. > > The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). > > Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. 
> > In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @tschatzl review -- fix typos and whitespace ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3349/files - new: https://git.openjdk.java.net/jdk/pull/3349/files/def7db7b..bedd676d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3349&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3349&range=00-01 Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3349.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3349/head:pull/3349 PR: https://git.openjdk.java.net/jdk/pull/3349 From mli at openjdk.java.net Wed Apr 7 01:17:22 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Wed, 7 Apr 2021 01:17:22 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v12] In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 14:01:15 GMT, Thomas Schatzl wrote: >>> [Here](https://github.com/tschatzl/jdk/tree/bot-fixes) is a potential fix for this. >> >> Thanks for the fix. >>> In any case this fix should probably go in separately. >> >> Do you mean we push this pr, then discuss the fix in another thread? If yes, I will file a bug to track the issue and initialize discussion on this bug. > > This change should go in after PR#3356 which fixes this issue. I think it is a change that is worth pointing out and discussing separately - as you might see from the long description. I see, Thanks Thomas! ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From david.holmes at oracle.com Wed Apr 7 05:21:06 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 7 Apr 2021 15:21:06 +1000 Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References:

Message-ID: On 7/04/2021 12:34 am, Coleen Phillimore wrote: > On Tue, 6 Apr 2021 02:04:57 GMT, David Holmes wrote: > >>> Trivial fix for JDK-8264682. >>> >>> `Universe::heap()->used()` calls G1Allocator::used_in_alloc_regions() when G1 enabled, it checks whether Heap_lock was owned on this thread's behalf. >> >> Sorry but I have to agree with Kim that this is the wrong fix for this problem. >> >> And the test needs adjusting. >> >> Thanks, >> David > > Unless we can imagine a reason why this code is needed, I think the old memprofiler should be removed. Why was it added? I've no idea who might be using this, or for what. But I think removing it is a separate issue. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3340 > From ccheung at openjdk.java.net Wed Apr 7 05:38:25 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 7 Apr 2021 05:38:25 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap [v2] In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 17:04:18 GMT, Ioi Lam wrote: >> When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. >> >> The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). >> >> Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. >> >> In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @tschatzl review -- fix typos and whitespace Looks good. Just one minor nit in the test case. test/hotspot/jtreg/runtime/cds/appcds/cacheObject/HeapFragmentationTest.java line 84: > 82: // Run with CDS. The archived heap regions should be relocated to avoid fragmentation. > 83: TestCommon.run(TestCommon.concat(execArgs, runTimeHeapSize, mainClass, BUFF_SIZE)) > 84: .assertNormalExit(successOutput); The assertNormalExit line should be indented four spaces relative to the previous line. ------------- Marked as reviewed by ccheung (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3349 From github.com+25214855+casparcwang at openjdk.java.net Wed Apr 7 08:03:27 2021 From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=) Date: Wed, 7 Apr 2021 08:03:27 GMT Subject: RFR: 8264718: Shenandoah: enable string deduplication during root scanning In-Reply-To: References: Message-ID: <0XUFNDhmkB4OgUxbtnadAcg9daAjdppc0VlVWJMZVPI=.69dcbac1-bbbd-45ba-97ac-2ad8b5c18d52@github.com> On Mon, 5 Apr 2021 21:42:32 GMT, Zhengyu Gu wrote: > Shenandoah used to scan roots at pauses, so it deliberately disables string deduplication during root scanning to avoid extra pause times. > > Now, Shenandoah scans roots in concurrent phase, it is no longer a concern, we should enable it. > > Test: > - [x] hotspot_gc_shenandoah src/hotspot/share/gc/shenandoah/shenandoahSTWMark.cpp line 134: > 132: ShenandoahInitMarkRootsClosure init_mark(task_queues()->queue(worker_id)); > 133: _root_scanner.roots_do(&init_mark, worker_id); > 134: extra empty line ------------- PR: https://git.openjdk.java.net/jdk/pull/3348 From tschatzl at openjdk.java.net Wed Apr 7 08:34:48 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 7 Apr 2021 08:34:48 GMT Subject: RFR: 8264783: G1 BOT verification should not verify beyond allocation threshold Message-ID: The G1 BOT contains an allocation threshold which basically acts as a "last known valid entry" index for the per-region BOT (which are views on the global BOT table). Currently G1 BOT verification actually "verifies" the BOT within a region past that BOT index. This causes issues with young regions; actually there is already code that prevents their BOT verification. This is perfectly fine, allocations in young regions do not update the BOTs. 
With JDK-8262068/PR #2760 this existing filtering of young regions (such that their BOT is not verified) does not work because there may be young regions that are not compacted (so their BOT still has its last known valid entry at the start of the region) and that are nevertheless labelled as old. This change proposes to not try to verify the BOT beyond the last known valid index (which is arguably not worth doing), which also covers the existing young filtering. There are alternatives that may work in particular for JDK-8262068: a) always compact young regions (which recreates the BOT) b) create a "dummy" BOT that spans the entire part of the region containing live objects. However, I rejected them because option a) takes time and directly counters that optimization for no reason, and option b) makes finding the start of an object within these regions slow (they are not refined when there is at least *some* BOT). Finally, I do not think there is much point in trying to be clever about areas in the BOT that are known to not contain useful values.
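The idea of stopping verification at the last known valid index can be sketched as follows. This is a deliberately simplified model, not G1's actual BOT encoding: `RegionTableModel`, `kInvalid`, and both verify functions are hypothetical, and each entry is assumed to hold only "how many cards back the covering object starts".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for a table entry that was never written.
constexpr int kInvalid = -1;

// Hypothetical per-region view of the table.
struct RegionTableModel {
  std::vector<int> entries;  // entry i: cards back to the object start
  std::size_t threshold;     // index of the first entry not known valid
};

// Verifies only entries below the threshold. Entries at or beyond it are
// ignored, since nothing is known about their contents.
bool verify_up_to_threshold(const RegionTableModel& t) {
  for (std::size_t i = 0; i < t.threshold && i < t.entries.size(); ++i) {
    const int back = t.entries[i];
    if (back == kInvalid || static_cast<std::size_t>(back) > i) {
      return false;
    }
  }
  return true;
}

// For contrast: verifying the whole region regardless of the threshold.
bool verify_whole_region(RegionTableModel t) {
  t.threshold = t.entries.size();
  return verify_up_to_threshold(t);
}
```

In this model, a region whose table was never updated has `threshold == 0`, so the bounded check passes trivially without any special case for young regions, while the whole-region check would report a failure on the unwritten entries.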
Testing: tier1-3 ------------- Commit messages: - Bot fixes for 2760 Changes: https://git.openjdk.java.net/jdk/pull/3356/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3356&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264783 Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3356.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3356/head:pull/3356 PR: https://git.openjdk.java.net/jdk/pull/3356 From mli at openjdk.java.net Wed Apr 7 08:34:49 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Wed, 7 Apr 2021 08:34:49 GMT Subject: RFR: 8264783: G1 BOT verification should not verify beyond allocation threshold In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 13:59:12 GMT, Thomas Schatzl wrote: > The G1 BOT contains an allocation threshold which basically acts as a "last known valid entry" index for the per-region BOT (which are views on the global BOT table). > > Currently G1 BOT verification actually "verifies" the BOT within a region past that BOT index. > > This causes issues with young regions; actually there is already code that prevents their BOT verification. This is perfectly fine, allocations in young regions do not update the BOTs. > > With JDK-8262068/PR #2760 this existing filtering of young regions (such that their BOT is not verified) does not work because there may be young regions that are not compacted (so with a BOT that have that last known valid entry at the start of the region) are still labelled as old. > > This change proposes to not try to verify the BOT beyond the last known valid index (which is arguably not worth doing), which also covers the existing young filtering. 
> > There are alternatives that may work in particular for JDK-8262068: > a) always compact young regions (which recreates the BOT) > b) create a "dummy" BOT that spans the entire part of the region containing live objects > > However they were rejected by me because > option a) takes time and directly counters that optimization for no reason > option b) makes finding the start of an object within these regions slow (they are not refined when there is at least *some* bot) > > and finally I do not think there is much point in trying to be clever about areas in the BOT that are known to not contain useful values. > > Testing: tier1-3 Marked as reviewed by mli (Reviewer). The change looks good to me. Thanks Thomas! ------------- PR: https://git.openjdk.java.net/jdk/pull/3356 From tschatzl at openjdk.java.net Wed Apr 7 08:38:51 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 7 Apr 2021 08:38:51 GMT Subject: RFR: 8264788: Make SequentialSubTasksDone use-once Message-ID: Hi all, can I have reviews for this change that simplifies code ahead? SequentialSubTasksDone has some machinery to reset itself automatically after all tasks were claimed. This is not used at all and can/should be removed. 
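For illustration, a use-once task claimer without any reset machinery could look roughly like this. It is a sketch using `std::atomic` rather than HotSpot's `Atomic` class, and `UseOnceTaskClaimer` is a hypothetical name, not the actual `SequentialSubTasksDone` code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical use-once claimer: tasks 0..num_tasks-1 are handed out
// exactly once, and the object is never reset afterwards.
class UseOnceTaskClaimer {
 public:
  explicit UseOnceTaskClaimer(unsigned num_tasks)
      : _num_tasks(num_tasks), _num_claimed(0) {}

  // Stores the claimed task index in `t` and returns true, or returns
  // false once every task has been claimed.
  bool try_claim_task(unsigned& t) {
    t = _num_claimed.load();
    while (t < _num_tasks) {
      // On failure compare_exchange_weak reloads `t` with the current
      // counter value, so the loop retries with fresh data.
      if (_num_claimed.compare_exchange_weak(t, t + 1)) {
        return true;
      }
    }
    return false;
  }

  bool all_tasks_claimed() const {
    return _num_claimed.load() >= _num_tasks;
  }

 private:
  const unsigned _num_tasks;
  std::atomic<unsigned> _num_claimed;
};
```

Because the counter never moves past `_num_tasks` in this variant, a destructor-style check such as `all_tasks_claimed()` stays a simple comparison.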
Test: tier1-3 ------------- Commit messages: - Initial commit Changes: https://git.openjdk.java.net/jdk/pull/3372/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3372&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264788 Stats: 86 lines in 5 files changed: 2 ins; 67 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/3372.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3372/head:pull/3372 PR: https://git.openjdk.java.net/jdk/pull/3372 From tschatzl at openjdk.java.net Wed Apr 7 08:54:29 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 7 Apr 2021 08:54:29 GMT Subject: RFR: 8264788: Make SequentialSubTasksDone use-once In-Reply-To: References: Message-ID: On Wed, 7 Apr 2021 08:32:11 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that simplifies code ahead? > > SequentialSubTasksDone has some machinery to reset itself automatically after all tasks were claimed. > > This is not used at all and can/should be removed. > > Test: tier1-3 Note that it is possible to remove the CAS loop in `SequentialSubTasksDone` with an atomic add. I wasn't sure to do this, but if somebody thinks it is a good idea (and probably faster in many implementations) similar to the following: bool SequentialSubTasksDone::try_claim_task(uint& t) { if (t < _num_tasks) { t = Atomic::add(&_num_claimed, 1) - 1; } return t < _num_tasks; } (And slightly fixing that assert in the destructor) ------------- PR: https://git.openjdk.java.net/jdk/pull/3372 From yyang at openjdk.java.net Wed Apr 7 09:36:29 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 7 Apr 2021 09:36:29 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 14:31:00 GMT, Coleen Phillimore wrote: >> Sorry but I have to agree with Kim that this is the wrong fix for this problem. >> >> And the test needs adjusting. >> >> Thanks, >> David > > Unless we can imagine a reason why this code is needed, I think the old memprofiler should be removed. I'd like to propose a new PR to remove the memprofiler if there are no strong objections. ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From ayang at openjdk.java.net Wed Apr 7 09:39:37 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 7 Apr 2021 09:39:37 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13] In-Reply-To: References:

Message-ID: <47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> On Thu, 1 Apr 2021 14:43:42 GMT, Hamlin Li wrote: >> Summary >> ----------- >> >> Improve G1 Full GC by skipping compaction for regions with high survival ratio. >> >> Background >> ----------- >> >> There are 4 steps in full gc of G1 GC. >> - mark live objects >> - prepare forwardee >> - adjust pointers >> - compact >> >> When full gc occurs, there may be a very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there is little space to save but many objects to copy. >> >> Description >> ----------- >> >> We enhance the full gc implementation for the above situation through the following steps: >> - accumulate live bytes of every hr in mark phase; (already done by JDK-8263495) >> - skip adding regions with high survival ratio, and set the region with high survival ratio as pinned in _region_attr_table during prepare phase; >> - nothing special is done in adjust phase, regions with high survival ratio are skipped because of pin setting in the above step; >> - nothing special is done in compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to compaction set in the prepare phase; >> >> VM options related >> ----------- >> >> - MarkSweepDeadRatio: we reuse this existing vm option to indicate the high survival ratio threshold (100-MarkSweepDeadRatio) in G1. >> - default value of MarkSweepDeadRatio: 5 >> >> Test >> ----------- >> >> - specjbb2015: no regression >> - dacapo: (Attachment is the dacapo h2 full gc pause.) >> - 95% of full gc pauses: 10%-19% improvement. >> - 5% of full gc pauses: 1.2% improvement. >> - 0.1% of full gc pauses: -6.16% improvement.
>> >> $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > minor code improvement. Changes requested by ayang (Committer). src/hotspot/share/gc/g1/g1FullCollector.cpp line 238: > 236: if (hr->is_closed_archive()) { > 237: _region_attr_table.set_closed_archive(hr->hrm_index()); > 238: } else if (hr->is_pinned()) { Changing the condition to `hr->is_pinned() || force_pinned` is enough for this method, right? src/hotspot/share/gc/g1/g1FullCollector.hpp line 101: > 99: G1CMBitMap* mark_bitmap(); > 100: ReferenceProcessor* reference_processor(); > 101: size_t live_words(uint region_index) { return _live_stats[region_index]._live_words; } Could you add a range check assertion for `region_index`? The extra spaces can be removed; no need to align with previous methods. src/hotspot/share/gc/g1/heapRegion.hpp line 174: > 172: void reset_compacted_after_full_gc(); > 173: // Update pinned heap region (not compacted) to be consistent after Full GC. > 174: void reset_not_compacted_after_full_gc(); Now that this version uses the existing "pinned" mechanism to skip high-live-ratio regions, this method can retain its original name/implementation, right? ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From tschatzl at openjdk.java.net Wed Apr 7 10:32:21 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 7 Apr 2021 10:32:21 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13] In-Reply-To: <47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> References:

<47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> Message-ID: On Wed, 7 Apr 2021 09:33:48 GMT, Albert Mingkun Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> minor code improvement. > > src/hotspot/share/gc/g1/heapRegion.hpp line 174: > >> 172: void reset_compacted_after_full_gc(); >> 173: // Update pinned heap region (not compacted) to be consistent after Full GC. >> 174: void reset_not_compacted_after_full_gc(); > > Now that this version uses the existing "pinned" mechanism to skip high-live-ratio regions, this method can retain its original name/implementation, right? That is true, full gc reuses the mechanism originally implemented for pinned regions. However "pinned" is something very specific in G1 context so I think it is better to use a different, more generic name. The regions that are not compacted (and not yet pinned) are not really temporarily pinned. There has been an earlier discussion (not sure if here in this PR) to actually rename the use of "pinned" in G1 full gc to use "not compacted" too, resulting in a CR to rename this (in [JDK-8264423](https://bugs.openjdk.java.net/browse/JDK-8264423)). However I was going to wait for this change to go in before sending out a PR. Also this "not_compacted" name better matches the "compacted" method name above. So I would prefer to keep this as is. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From tschatzl at openjdk.java.net Wed Apr 7 10:37:19 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 7 Apr 2021 10:37:19 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13] In-Reply-To: References:

Message-ID: On Thu, 1 Apr 2021 14:43:42 GMT, Hamlin Li wrote: >> Summary >> ----------- >> >> Improve G1 Full GC by skipping compaction for regions with high survival ratio. >> >> Background >> ----------- >> >> There are 4 steps in full gc of G1 GC. >> - mark live objects >> - prepare forwardee >> - adjust pointers >> - compact >> >> When full gc occurs, there may be a very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there is little space to save but many objects to copy. >> >> Description >> ----------- >> >> We enhance the full gc implementation for the above situation through the following steps: >> - accumulate live bytes of every hr in mark phase; (already done by JDK-8263495) >> - skip adding regions with high survival ratio, and set the region with high survival ratio as pinned in _region_attr_table during prepare phase; >> - nothing special is done in adjust phase, regions with high survival ratio are skipped because of pin setting in the above step; >> - nothing special is done in compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to compaction set in the prepare phase; >> >> VM options related >> ----------- >> >> - MarkSweepDeadRatio: we reuse this existing vm option to indicate the high survival ratio threshold (100-MarkSweepDeadRatio) in G1. >> - default value of MarkSweepDeadRatio: 5 >> >> Test >> ----------- >> >> - specjbb2015: no regression >> - dacapo: (Attachment is the dacapo h2 full gc pause.) >> - 95% of full gc pauses: 10%-19% improvement. >> - 5% of full gc pauses: 1.2% improvement. >> - 0.1% of full gc pauses: -6.16% improvement.
>> >> $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > minor code improvement. src/hotspot/share/gc/g1/g1FullCollector.cpp line 229: > 227: > 228: void G1FullCollector::update_attribute_table(HeapRegion* hr) { > 229: if (hr->is_free()) { Another item that has been noted in a recent discussion with @albertnetymk is that with this change "Free" regions are also marked as `normal` in the table. It would be better to keep them as "Invalid". I.e. something like (incorporating @albertnetymk's other suggestion):

    if (hr->is_free()) {
      return;
    } else if (hr->is_closed_archive(...)) {
      [...]
    } else if (hr->is_pinned() || force_pinned) {
      [...]
    } else {
      [...]
    }

There is no real difference as "Free" regions should never be referenced anywhere and the code should assert elsewhere. It's still nice to also have "Free" regions as `Invalid` in that table though. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From ayang at openjdk.java.net Wed Apr 7 11:49:56 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 7 Apr 2021 11:49:56 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13] In-Reply-To: References:

<47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> Message-ID: On Wed, 7 Apr 2021 10:29:26 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/heapRegion.hpp line 174: >> >>> 172: void reset_compacted_after_full_gc(); >>> 173: // Update pinned heap region (not compacted) to be consistent after Full GC. >>> 174: void reset_not_compacted_after_full_gc(); >> >> Now that this version uses the existing "pinned" mechanism to skip high-live-ratio regions, this method can retain its original name/implementation, right? > > That is true, full gc reuses the mechanism originally implemented for pinned regions. However "pinned" is something very specific in G1 context so I think it is better to use a different, more generic name. The regions that are not compacted (and not yet pinned) are not really temporarily pinned. > > There has been an earlier discussion (not sure if here in this PR) to actually rename the use of "pinned" in G1 full gc to use "not compacted" too, resulting in a CR to rename this (in [JDK-8264423](https://bugs.openjdk.java.net/browse/JDK-8264423)). > > However I was going to wait for this change to go in before sending out a PR. > > Also this "not_compacted" name better matches the "compacted" method name above. > > So I would prefer to keep this as is. I see; agree. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From zgu at openjdk.java.net Wed Apr 7 12:11:18 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 7 Apr 2021 12:11:18 GMT Subject: RFR: 8264718: Shenandoah: enable string deduplication during root scanning [v2] In-Reply-To: References: Message-ID: > Shenandoah used to scan roots at pauses, so it deliberately disables string deduplication during root scanning to avoid extra pause times. > > Now, Shenandoah scans roots in concurrent phase, it is no longer a concern, we should enable it. 
> > Test: > - [x] hotspot_gc_shenandoah Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Removed excessive empty lines ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3348/files - new: https://git.openjdk.java.net/jdk/pull/3348/files/45344ee4..d3d63b38 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3348&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3348&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3348.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3348/head:pull/3348 PR: https://git.openjdk.java.net/jdk/pull/3348 From coleenp at openjdk.java.net Wed Apr 7 13:04:32 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 7 Apr 2021 13:04:32 GMT Subject: RFR: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References:

Message-ID: On Wed, 7 Apr 2021 09:33:39 GMT, Yi Yang wrote: >> Unless we can imagine a reason why this code is needed, I think the old memprofiler should be removed. > > I'd like to propose a new PR to remove the memprofiler if there are no strong objections. Sure, open a new issue if you want, and link this issue to it. The "feature" was added here: ^As 00018/00000/00000 ^Ad D 1.1 98/04/14 13:18:32 renes 1 0 ^Ac date and time created 98/04/14 13:18:32 by renes ^Ae ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From shade at openjdk.java.net Wed Apr 7 16:28:36 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 7 Apr 2021 16:28:36 GMT Subject: RFR: 8264718: Shenandoah: enable string deduplication during root scanning [v2] In-Reply-To: References:

Message-ID: On Wed, 7 Apr 2021 12:11:18 GMT, Zhengyu Gu wrote: >> Shenandoah used to scan roots at pauses, so it deliberately disables string deduplication during root scanning to avoid extra pause times. >> >> Now, Shenandoah scans roots in concurrent phase, it is no longer a concern, we should enable it. >> >> Test: >> - [x] hotspot_gc_shenandoah > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Removed excessive empty lines Looks fine, with a minor nit. Fix it before push. src/hotspot/share/gc/shenandoah/shenandoahSTWMark.cpp line 132: > 130: void ShenandoahSTWMark::mark_roots(uint worker_id) { > 131: if (ShenandoahStringDedup::is_enabled()) { > 132: ShenandoahInitMarkRootsClosure init_mark(task_queues()->queue(worker_id)); Excess space here and on the other branch, between `` and `init_mark`. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3348 From zgu at openjdk.java.net Wed Apr 7 17:25:09 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 7 Apr 2021 17:25:09 GMT Subject: RFR: 8264718: Shenandoah: enable string deduplication during root scanning [v3] In-Reply-To: References: Message-ID: > Shenandoah used to scan roots at pauses, so it deliberately disables string deduplication during root scanning to avoid extra pause times. > > Now, Shenandoah scans roots in concurrent phase, it is no longer a concern, we should enable it. 
> > Test: > - [x] hotspot_gc_shenandoah Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Removed excess spaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3348/files - new: https://git.openjdk.java.net/jdk/pull/3348/files/d3d63b38..33862ac2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3348&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3348&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3348.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3348/head:pull/3348 PR: https://git.openjdk.java.net/jdk/pull/3348 From manc at google.com Wed Apr 7 18:19:27 2021 From: manc at google.com (Man Cao) Date: Wed, 7 Apr 2021 11:19:27 -0700 Subject: Work-in-progress: 8236485: Epoch synchronization protocol for G1 concurrent refinement In-Reply-To: References:

<87blr3vc4t.fsf@oldenburg2.str.redhat.com> Message-ID: Has anyone got a chance to take a look? I also implemented the fence removal in interpreter and C1/C2. The patch is quite small, so perhaps it is better to merge them into one pull request: https://github.com/caoman/jdk/tree/8226731fenceRemoval -Man On Tue, Mar 30, 2021 at 7:43 PM Man Cao wrote: > Hi all, > > I finally managed to allocate more time to make progress on this, and > resolved most issues since the last discussion. > I've updated the description in > https://bugs.openjdk.java.net/browse/JDK-8236485, and the current > prototype is the HEAD commit at > https://github.com/caoman/jdk/tree/g1EpochSync. > Notable changes include: > - The protocol uses async handshake from JDK-8238761 > to resolve the > blocking issue from normal handshake. > - In order to support async refinement due to async handshake, added > support for _deferred global queue to G1DirtyCardQueueSet. Buffers rarely > get enqueued to _deferred at run-time. > - The async handshake only executes for a subset of threads. > > I have a couple of questions: > > 1. Code review and patch size. > Should I start a pull request for this change, so it is easier to give > feedback? > > What is the recommended approach to deal with large changes? Currently the > patch is about 1200 lines, without changing the write barrier itself > (JDK-8226731). > Since pushing only this patch will add some overhead to refinement, but > not bring any performance improvement from removing the write barrier's > fence, do you recommend I also implement the write barrier change in the > same patch? > Keeping the epoch sync patch separate from the write barrier patch has > some benefit for testing, in case the epoch patch introduces any bugs. > Currently the new code is mostly guarded by a flag > -XX:+G1TestEpochSyncInConcRefinement, and it will be removed after the > write barrier change. 
It could be used as an emergency flag to work around > bugs, instead of backing out of the entire change. We probably cannot have > such a flag if we bundle the changes in one patch (it's too ugly to have a > flag in interpreter and compilers). > > 2. Checking if a remote thread is in _thread_in_Java state. > eosterlund@ pointed out it was incorrect to do JavaThread::thread_state() > == _thread_in_Java. > I looked into thread state transition, and revised it to also compare with > _thread_in_native_trans and _thread_in_vm_trans. > I think it is correct now for the purpose of epoch synchronization. I.e., > it never "misses" a remote thread that is actually in in_Java state. > A detailed comment is here: > https://github.com/caoman/jdk/blob/2047ecefceb074e80d73e0d521d64a220fdc5779/src/hotspot/share/gc/g1/g1EpochSynchronizer.cpp#L67-L90 > . > *Erik, could you take a look and decide if it is correct? If it is still > incorrect, could you advise a proper way to do this?* > > 3. Native write barrier (G1BarrierSet::write_ref_field_post). > The epoch synchronization protocol does not synchronize with threads in > _thread_in_native or _thread_in_vm state. It is much slower if we > synchronize with such threads. > Moreover, there are non-Java threads (e.g. concurrent mark worker) that > could execute the native write barrier. > As a result, it's probably best to keep the StoreLoad fence in the native > write barrier. > The final write post-barrier for JDK-8226731 would be:
> Given:
>   x.a = q
> and
>   p = @x.a
>
> For Interpreter/C1/C2:
>   if (p and q in same regions or q == NULL) -> exit
>   if (card(p) == Dirty) -> exit
>   card(p) = Dirty;
>   enqueue(card(p))
>
> For the native barrier:
>   if (p and q in same regions or q == NULL) -> exit
>   StoreLoad;
>   if (card(p) == Dirty) -> exit
>   card(p) = Dirty;
>   enqueue(card(p))
>
> Does the above look good? Do we need to micro-benchmark the potential > overhead added to the native barrier? Or are typical macro-benchmarks > sufficient?
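The two barrier variants quoted above can be modeled in a few lines of standalone C++. This is only a toy sketch: `ToyCardTable`, the card and region sizes, and the single shared `dirty_queue` are assumptions made for illustration and are not HotSpot's card table or dirty card queue. The `native` parameter selects the variant that keeps the StoreLoad fence; the sketch does not model the epoch synchronization that the proposal relies on when the fence is dropped.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: names and sizes are assumptions, not HotSpot's.
struct ToyCardTable {
  static constexpr unsigned kCardShift   = 9;     // assumed 512-byte cards
  static constexpr unsigned kRegionShift = 20;    // assumed 1 MiB regions
  static constexpr size_t   kNumCards    = 4096;  // covers a 2 MiB toy heap
  static constexpr uint8_t  kDirty       = 1;     // 0 means clean

  std::array<std::atomic<uint8_t>, kNumCards> cards{};  // all clean initially
  std::vector<size_t> dirty_queue;  // single-threaded stand-in for a per-thread queue

  // p: address of the updated field, q: the reference value stored into it.
  // Returns true if this call dirtied and enqueued the card covering p.
  bool post_barrier(uintptr_t p, uintptr_t q, bool native) {
    // Filter: null store, or p and q in the same region.
    if (q == 0 || ((p ^ q) >> kRegionShift) == 0) return false;
    // Only the native variant keeps the StoreLoad fence.
    if (native) std::atomic_thread_fence(std::memory_order_seq_cst);
    const size_t idx = p >> kCardShift;
    assert(idx < kNumCards);
    if (cards[idx].load(std::memory_order_relaxed) == kDirty) return false;
    cards[idx].store(kDirty, std::memory_order_relaxed);
    dirty_queue.push_back(idx);
    return true;
  }
};
```

With these assumed sizes, a store at address 0x100 of a reference into the second region dirties card 0 once; repeating the store, storing within one region, or storing null all exit early.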
> > 4. Performance benchmarking. > I did some preliminary benchmarking with DaCapo and BigRamTester, without > changing the write barrier. This is to measure the overhead added by the > epoch synchronization protocol. > Under the default JVM setting, I didn't see any performance difference. > When I tuned the JVM to refine cards aggressively, there was > still no difference in BigRamTester (probably because it only has 2 > threads). Some DaCapo benchmarks saw 10-15% more CPU usage due to doing > more work in the refinement threads, and 2-5% total throughput regression > for these benchmarks. > The aggressive refinement flags are "-XX:-G1UseAdaptiveConcRefinement > -XX:G1UpdateBufferSize=4 -XX:G1ConcRefinementGreenZone=0 > -XX:G1ConcRefinementYellowZone=1". > > I wonder how important we should treat the aggressive refinement case. > Regression in this case is likely unavoidable, so how much regression is > tolerable? > Also, does anyone know a better benchmark to test refinement with default > JVM settings? Ideally it (1) has many mutator threads; (2) triggers > concurrent refinement frequently; (3) runs with a sizable Xmx (e.g., 2GiB > or above). > > -Man > > > On Thu, Jan 16, 2020 at 4:06 AM Florian Weimer wrote: > >> * Man Cao: >> >> > We had an offline discussion on this. To keep the community in the loop, >> > here is what we discussed. >> > >> > a. Using Linux membarrier syscall or equivalent on other OSes seems a >> > cleaner solution than thread-local handshake (TLH). But we need to have >> a >> > backup mechanism for OSes and older Linuxes that do not have such a >> > syscall. >> >> Can you do with a membarrier call that doesn't require registration? >> >> The usual fallback for membarrier is sending a special signal to all >> threads, and make sure that they have run code in a signal handler >> (possibly using a CPU barrier there). But of course this is rather >> slow. 
>> >> membarrier has seen some backporting activity, but as far as I can see, >> that hasn't been consistent across architectures. >> >> Thanks, >> Florian >> >> From kbarrett at openjdk.java.net Wed Apr 7 19:31:15 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 7 Apr 2021 19:31:15 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation [v2] In-Reply-To: References: Message-ID: > Please review this change to OopStorage to support bulk allocation. A new overload for allocate is provided: > size_t allocate(oop** entries, size_t size) > The approach taken is to claim all of the available entries in the first available block, add as many of those entries as will fit in the buffer, and release any remaining entries back to the block. Only the claim part needs to be done while holding the allocation mutex. This is optimized for clients that want more than just a couple entries, and minimizes time under the lock. The maximum number of entries (the number of entries in a block) is provided as a constant for sizing requests, to avoid the release of unrequested entries. An application that wants more than that, or a specific number that might not be available from the next block, needs to make multiple bulk allocation calls, but that's still going to be much faster than one-at-a-time allocations. > > Testing: > mach5 tier1 > New gtest for bulk allocation. > I've been using this for a while as part of another in-development change that makes heavy use of this feature.
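The claim-fill-release approach described above can be sketched in standalone C++ as follows. This is a minimal model under stated assumptions, not the OopStorage implementation: `ToyStorage`, the `Entry` alias (standing in for `oop`), the single 64-entry block, and the bitmask bookkeeping are all hypothetical. Only the name `bulk_allocate_limit` and the shape of the `allocate` overload are taken from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

using Entry = void*;  // stand-in for HotSpot's oop (assumption)

// Toy storage with a single block; bit i of _allocated set => slot i in use.
class ToyStorage {
public:
  static constexpr size_t bulk_allocate_limit = 64;  // entries per block (assumed)

  // Fills ptrs with up to min(size, bulk_allocate_limit) entries; returns the count.
  size_t allocate(Entry** ptrs, size_t size) {
    uint64_t claimed;
    {
      // Only the claim happens under the allocation mutex.
      std::lock_guard<std::mutex> lock(_allocation_mutex);
      claimed = ~_allocated.fetch_or(~uint64_t{0});  // take every free slot
    }
    size_t n = 0;
    for (size_t i = 0; i < bulk_allocate_limit && n < size; ++i) {
      const uint64_t bit = uint64_t{1} << i;
      if (claimed & bit) {
        ptrs[n++] = &_slots[i];
        claimed &= ~bit;
      }
    }
    // Hand back whatever was claimed but not requested.
    if (claimed != 0) _allocated.fetch_and(~claimed);
    return n;
  }

  void release(Entry* e) {
    const size_t i = static_cast<size_t>(e - _slots);
    assert(i < bulk_allocate_limit);
    _allocated.fetch_and(~(uint64_t{1} << i));
  }

  size_t in_use() const {
    size_t n = 0;
    for (uint64_t m = _allocated.load(); m != 0; m &= m - 1) ++n;
    return n;
  }

private:
  std::mutex _allocation_mutex;
  std::atomic<uint64_t> _allocated{0};
  Entry _slots[bulk_allocate_limit] = {};
};
```

In this model a request for 10 entries briefly claims all 64 and then returns 54, which is the "release of unrequested entries" that sizing requests to `bulk_allocate_limit` avoids.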
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: improve comments per ayang ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3264/files - new: https://git.openjdk.java.net/jdk/pull/3264/files/3ac5fec7..a0c7f91c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3264&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3264&range=00-01 Stats: 8 lines in 2 files changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3264.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3264/head:pull/3264 PR: https://git.openjdk.java.net/jdk/pull/3264 From kbarrett at openjdk.java.net Wed Apr 7 19:31:16 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 7 Apr 2021 19:31:16 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation [v2] In-Reply-To: References:

Message-ID: <5fhtiPtFJqo7Js27D3sAdSl6leSgQIyt2ZfpurN3q_E=.9bbf6a12-69d3-43ab-afdb-ab3ac82c713c@github.com> On Tue, 6 Apr 2021 09:46:52 GMT, Albert Mingkun Yang wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> improve comments per ayang > > src/hotspot/share/gc/shared/oopStorage.cpp line 478: > >> 476: block = block_for_allocation(); >> 477: if (block == NULL) return 0; // Block allocation failed. >> 478: // Take exclusive use of this block for allocation. > > Maybe "Taking all remaining entries, so remove from list."? The "exclusive use" is ensured by the above `MutexLocker`, not `unlink`. done > src/hotspot/share/gc/shared/oopStorage.hpp line 119: > >> 117: // Allocates multiple entries, returning them in the ptrs buffer. The >> 118: // number of entries that will be allocated is never more than the minimum >> 119: // of size and bulk_allocate_limit, but may be less than either. Possibly > > It took me a few seconds to understand this sentence. Maybe replace it with "postcondition: result <= min(size, bulk_allocate_limit)" below? done ------------- PR: https://git.openjdk.java.net/jdk/pull/3264 From iklam at openjdk.java.net Wed Apr 7 19:46:44 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 7 Apr 2021 19:46:44 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap [v3] In-Reply-To: References: Message-ID: > When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. > > The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). 
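The effect on the largest contiguous allocation can be checked with a small calculation. The helper below is a hypothetical illustration, not HotSpot code: it treats the heap as a row of equally sized regions with one pinned run of archived regions, and returns the longest run of free regions on either side of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Longest run of contiguous free regions in a heap of heap_regions regions
// when pinned_count regions starting at index pinned_start are pinned.
// Hypothetical helper for illustration only.
size_t largest_free_run(size_t heap_regions, size_t pinned_start, size_t pinned_count) {
  assert(pinned_start + pinned_count <= heap_regions);
  const size_t below = pinned_start;
  const size_t above = heap_regions - pinned_start - pinned_count;
  return std::max(below, above);
}
```

For an assumed heap of 1024 regions with 4 archived regions, mapping them at index 510 leaves at most 510 contiguous free regions, while placing them at the top (index 1020) leaves 1020, which matches the "half of the optimal value" worst case described above.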
> > Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. > > In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8214455-relocated-archived-regions-to-heap-top - fixed indentation (review comment by @calvinccheung) - @tschatzl review -- fix typos and whitespace - Added -Xlog:cds=debug in case this test fails again - 8214455: Relocate CDS archived regions to the top of the G1 heap ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3349/files - new: https://git.openjdk.java.net/jdk/pull/3349/files/bedd676d..43ee9680 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3349&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3349&range=01-02 Stats: 4956 lines in 137 files changed: 2977 ins; 1596 del; 383 mod Patch: https://git.openjdk.java.net/jdk/pull/3349.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3349/head:pull/3349 PR: https://git.openjdk.java.net/jdk/pull/3349 From kbarrett at openjdk.java.net Wed Apr 7 19:47:34 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 7 Apr 2021 19:47:34 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation [v3] In-Reply-To: References: Message-ID: > Please review this change to OopStorage to support bulk allocation. 
A new overload for allocate is provided: > size_t allocate(oop** entries, size_t size) > The approach taken is to claim all of the available entries in the first available block, add as many of those entries as will fit in the buffer, and release any remaining entries back to the block. Only the claim part needs to be done while holding the allocation mutex. This is optimized for clients that want more than just a couple entries, and minimizes time under the lock. The maximum number of entries (the number of entries in a block) is provided as a constant for sizing requests, to avoid the release of unrequested entries. An application that wants more than that, or a specific number that might not be available from the next block, needs to make multiple bulk allocation calls, but that's still going to be much faster than one-at-a-time allocations. > > Testing: > mach5 tier1 > New gtest for bulk allocation. > I've been using this for a while as part of another in-development change that makes heavy use of this feature. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: - Merge branch 'master' into bulk_alloc - improve comments per ayang - oopstorage bulk allocation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3264/files - new: https://git.openjdk.java.net/jdk/pull/3264/files/a0c7f91c..6643db84 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3264&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3264&range=01-02 Stats: 26753 lines in 510 files changed: 20294 ins; 3517 del; 2942 mod Patch: https://git.openjdk.java.net/jdk/pull/3264.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3264/head:pull/3264 PR: https://git.openjdk.java.net/jdk/pull/3264 From kbarrett at openjdk.java.net Wed Apr 7 19:47:37 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 7 Apr 2021 19:47:37 GMT Subject: RFR: 8264424: Support OopStorage bulk allocation [v3] In-Reply-To: References:

Message-ID: On Tue, 6 Apr 2021 09:51:47 GMT, Albert Mingkun Yang wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into bulk_alloc >> - improve comments per ayang >> - oopstorage bulk allocation > > Marked as reviewed by ayang (Committer). Thanks for the reviews @albertnetymk and @tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/3264 From kbarrett at openjdk.java.net Wed Apr 7 19:47:38 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 7 Apr 2021 19:47:38 GMT Subject: Integrated: 8264424: Support OopStorage bulk allocation In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 11:55:59 GMT, Kim Barrett wrote: > Please review this change to OopStorage to support bulk allocation. A new overload for allocate is provided: > size_t allocate(oop** entries, size_t size) > The approach taken is to claim all of the available entries in the first available block, add as many of those entries as will fit in the buffer, and release any remaining entries back to the block. Only the claim part needs to be done while holding the allocation mutex. This is optimized for clients that want more than just a couple entries, and minimizes time under the lock. The maximum number of entries (the number of entries in a block) is provided as a constant for sizing requests, to avoid the release of unrequested entries. An application that wants more than that, or a specific number that might not be available from the next block, needs to make multiple bulk allocation calls, but that's still going to be much faster than one-at-a-time allocations. > > Testing: > mach5 tier1 > New gtest for bulk allocation. > I've been using this for a while as part of another in-development change that makes heavy use of this feature.
This pull request has now been integrated. Changeset: 22b20f8e Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/22b20f8e Stats: 132 lines in 4 files changed: 118 ins; 1 del; 13 mod 8264424: Support OopStorage bulk allocation Reviewed-by: ayang, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/3264 From iklam at openjdk.java.net Thu Apr 8 00:44:14 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 8 Apr 2021 00:44:14 GMT Subject: RFR: 8214455: Relocate CDS archived regions to the top of the G1 heap [v3] In-Reply-To: References:

Message-ID: <7FUdIE0NHT29sCEmGSXe0FYD_99Xq0HpO6JP5uLoJpA=.68ec144f-4d34-48c0-a6d2-89760d2e7617@github.com> On Tue, 6 Apr 2021 09:41:25 GMT, Thomas Schatzl wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into 8214455-relocated-archived-regions-to-heap-top >> - fixed indentation (review comment by @calvinccheung) >> - @tschatzl review -- fix typos and whitespace >> - Added -Xlog:cds=debug in case this test fails again >> - 8214455: Relocate CDS archived regions to the top of the G1 heap > > Lgtm. Thanks @tschatzl and @calvinccheung for the review. ------------- PR: https://git.openjdk.java.net/jdk/pull/3349 From iklam at openjdk.java.net Thu Apr 8 00:49:20 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 8 Apr 2021 00:49:20 GMT Subject: Integrated: 8214455: Relocate CDS archived regions to the top of the G1 heap In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 21:48:43 GMT, Ioi Lam wrote: > When the runtime heap and dump time heap have different sizes, it's possible that the archived heap regions are mapped to the middle of the runtime heap. Because the archived heap regions are pinned, this could reduce the size of the largest "humongous" array allocation. In the worst case, the maximum allocatable array length may be half of the optimal value. > > The fix is to relocate the archived regions to the top of the G1 heap (currently archived regions are supported only by G1 on 64-bit JVM). > > Note that usually the top of the G1 heap is placed just below the 32GB boundary. As a result, the archived heap regions are at the same location between run time and dump time, so no relocation is necessary. 
> > In mach5 testing, we occasionally run into this problem (see https://bugs.openjdk.java.net/browse/JDK-8239089), probably because we fail to reserve the heap below 32GB due to address space layout randomization (ASLR). This pull request has now been integrated. Changeset: 78d1164c Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/78d1164c Stats: 147 lines in 8 files changed: 139 ins; 0 del; 8 mod 8214455: Relocate CDS archived regions to the top of the G1 heap Reviewed-by: tschatzl, ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/3349 From ayang at openjdk.java.net Thu Apr 8 09:05:14 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 8 Apr 2021 09:05:14 GMT Subject: RFR: 8264788: Make SequentialSubTasksDone use-once In-Reply-To: References: Message-ID: <0QDQXiibSb7nKsN3Cjf1dsIkX4tlr6Wc7nhKfpS4DlE=.5e8189c8-3926-4061-a148-aba01bf23bea@github.com> On Wed, 7 Apr 2021 08:32:11 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that simplifies code ahead? > > SequentialSubTasksDone has some machinery to reset itself automatically after all tasks were claimed. > > This is not used at all and can/should be removed. > > Test: tier1-3 Looking at how `SequentialSubTasksDone` is used, I don't really see the benefit of having `assert(_num_claimed == _num_tasks)` in the destructor. If we drop that assertion, `Atomic::add` is more readable. I prefer `Atomic::add` with the assertion removed, but this version is fine as well. ------------- Marked as reviewed by ayang (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/3372 From mli at openjdk.java.net Thu Apr 8 10:06:54 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Thu, 8 Apr 2021 10:06:54 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References: Message-ID: <2wLngv3n2wALz1ZbNgSTNaMKMYKTnh6ISlWEdNzBa9o=.f0996e4c-3bd5-42b3-9cc0-0ec8a600f1dc@github.com> > Summary > ----------- > > Improve G1 Full GC by skipping compaction for regions with high survival ratio. > > Background > ----------- > > There are 4 steps in full gc of G1 GC. > - mark live objects > - prepare forwardee > - adjust pointers > - compact > > When full gc occurs, there may be a very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there is little space to save but many objects to copy. > > Description > ----------- > > We enhance the full gc implementation for the above situation through the following steps: > - accumulate live bytes of every hr in mark phase; (already done by JDK-8263495) > - skip adding regions with high survival ratio, and set the region with high survival ratio as pinned in _region_attr_table during prepare phase; > - nothing special is done in adjust phase, regions with high survival ratio are skipped because of pin setting in the above step; > - nothing special is done in compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to compaction set in the prepare phase; > > VM options related > ----------- > > - MarkSweepDeadRatio: we reuse this existing vm option to indicate the high survival ratio threshold (100-MarkSweepDeadRatio) in G1. > - default value of MarkSweepDeadRatio: 5 > > Test > ----------- > > - specjbb2015: no regression > - dacapo: (Attachment is the dacapo h2 full gc pause.) > - 95% of full gc pauses: 10%-19% improvement. > - 5% of full gc pauses: 1.2% improvement.
> - 0.1% of full gc pauses: -6.16% improvement. > > $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2 Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: add sanity check; refine code in update_attribute_table. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2760/files - new: https://git.openjdk.java.net/jdk/pull/2760/files/63ab0f9c..5cca26b5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2760&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2760&range=12-13 Stats: 12 lines in 3 files changed: 5 ins; 2 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/2760.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2760/head:pull/2760 PR: https://git.openjdk.java.net/jdk/pull/2760 From mli at openjdk.java.net Thu Apr 8 10:06:58 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Thu, 8 Apr 2021 10:06:58 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References:

Message-ID: On Wed, 7 Apr 2021 10:33:32 GMT, Thomas Schatzl wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> add sanity check; refine code in update_attribute_table. > > src/hotspot/share/gc/g1/g1FullCollector.cpp line 229: > >> 227: >> 228: void G1FullCollector::update_attribute_table(HeapRegion* hr) { >> 229: if (hr->is_free()) { > > Another item that has been noted in a recent discussion with @albertnetymk is that with this change "Free" regions are also marked as `normal` in the table. It would be better to keep them as "Invalid". > > I.e. something like (incorporating @albertnetymk other suggestion): > > if (hr->is_free()) { > return; > } else if (hr->is_closed_archive(...) { > [...] > } else if (hr->is_pinned() || force_pinned) { > [...] > } else { > [...] > } > > There is no real difference as "Free" regions should never be referenced anywhere and the code should assert elsewhere. It's still nice to also have "Free" regions as `Invalid` in that table though. not exactly. please check previous discussion at https://github.com/openjdk/jdk/pull/2760#discussion_r602144952. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From mli at openjdk.java.net Thu Apr 8 10:07:01 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Thu, 8 Apr 2021 10:07:01 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v13] In-Reply-To: <47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> References:

<47rveNWYV7e6_mdxd3EaiTdsxo1SEjtYwmVviLxVo80=.17d8b0ce-4afd-418e-b2ea-e6099a3e7199@github.com> Message-ID: On Wed, 7 Apr 2021 09:35:19 GMT, Albert Mingkun Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> minor code improvement. > > src/hotspot/share/gc/g1/g1FullCollector.cpp line 238: > >> 236: if (hr->is_closed_archive()) { >> 237: _region_attr_table.set_closed_archive(hr->hrm_index()); >> 238: } else if (hr->is_pinned()) { > > Changing the condition to `hr->is_pinned() || force_pinned` is enough for this method, right? not exactly. please check previous discussion at https://github.com/openjdk/jdk/pull/2760#discussion_r602144952. I will merge the 2 conditions as you suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From tschatzl at openjdk.java.net Thu Apr 8 11:01:25 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 8 Apr 2021 11:01:25 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References:

Message-ID: On Thu, 8 Apr 2021 10:02:49 GMT, Hamlin Li wrote: >> src/hotspot/share/gc/g1/g1FullCollector.cpp line 229: >> >>> 227: >>> 228: void G1FullCollector::update_attribute_table(HeapRegion* hr) { >>> 229: if (hr->is_free()) { >> >> Another item that has been noted in a recent discussion with @albertnetymk is that with this change "Free" regions are also marked as `normal` in the table. It would be better to keep them as "Invalid". >> >> I.e. something like (incorporating @albertnetymk other suggestion): >> >> if (hr->is_free()) { >> return; >> } else if (hr->is_closed_archive(...) { >> [...] >> } else if (hr->is_pinned() || force_pinned) { >> [...] >> } else { >> [...] >> } >> >> There is no real difference as "Free" regions should never be referenced anywhere and the code should assert elsewhere. It's still nice to also have "Free" regions as `Invalid` in that table though. > > not exactly. please check previous discussion at https://github.com/openjdk/jdk/pull/2760#discussion_r602144952. Some more investigation and discussion about the BOT handling showed that we need to update the BOT for these Survivor-turned-to-Old regions after all. The reason is that contrary to what I thought, while BOT can handle queries for object start addresses above the "last known valid entry" mentioned in PR#3356, it only does so slowly, and while updating the BOT itself, not updating that "last known valid entry". So every time it queries for an object start in such regions, it starts walking from the bottom of that region. See the call chain `G1BlockOffsetTablePart::block_start` -> `forward_to_block_containing_addr` -> `forward_to_block_containing_addr_slow` where the call to `alloc_block_work` in `g1BlockOffsetTable.cpp:236` only updates local boundary and index (not `_next_offset_threshold` and `_next_offset_index`). This is a problem for young gcs as this behavior will make them slower than expected. 
Since it is impossible to make the updates to both `_next_offset_threshold` and `_next_offset_index` atomic (and that is another issue anyway), I would prefer to penalize full gc (or at least keep old behavior) for that. Could you add code that walks such full young regions and does the `cross_threshold` thing? It certainly does not need to actually compact these regions. Thanks, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From zgu at openjdk.java.net Thu Apr 8 12:13:25 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 8 Apr 2021 12:13:25 GMT Subject: Integrated: 8264718: Shenandoah: enable string deduplication during root scanning In-Reply-To: References: Message-ID: <_5JKrTptBZMdek4mggI4ovb80P9BdSBcGmMtZDXozek=.0366e58b-1d22-4bb5-86b0-afb390e4a88a@github.com> On Mon, 5 Apr 2021 21:42:32 GMT, Zhengyu Gu wrote: > Shenandoah used to scan roots at pauses, so it deliberately disables string deduplication during root scanning to avoid extra pause times. > > Now that Shenandoah scans roots in the concurrent phase, this is no longer a concern, so we should enable it. > > Test: > - [x] hotspot_gc_shenandoah This pull request has now been integrated. Changeset: 3aec2d96 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/3aec2d96 Stats: 69 lines in 5 files changed: 39 ins; 26 del; 4 mod 8264718: Shenandoah: enable string deduplication during root scanning Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/3348 From mli at openjdk.java.net Thu Apr 8 13:47:29 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Thu, 8 Apr 2021 13:47:29 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References:

Message-ID: <3qu7jf8dJ_jTlNcs_gzjXE_k-ylkQTnVveXoMBUqhAA=.3e22c73a-98cd-4f46-a663-a7cc5bacd276@github.com> On Thu, 8 Apr 2021 10:58:04 GMT, Thomas Schatzl wrote: >> not exactly. please check previous discussion at https://github.com/openjdk/jdk/pull/2760#discussion_r602144952. > > Some more investigation and discussion about the BOT handling showed that we need to update the BOT for these Survivor-turned-to-Old regions after all. > > The reason is that contrary to what I thought, while BOT can handle queries for object start addresses above the "last known valid entry" (materialized in `_next_offset_threshold` and `_next_offset_index` ) mentioned in PR#3356, it only does so slowly, and while updating the BOT itself, not updating that "last known valid entry". > So every time it queries for an object start in such regions, it starts walking from the bottom of that region. > > See the call chain `G1BlockOffsetTablePart::block_start` -> `forward_to_block_containing_addr` -> `forward_to_block_containing_addr_slow` where the call to `alloc_block_work` in `g1BlockOffsetTable.cpp:236` only updates local boundary and index (not `_next_offset_threshold` and `_next_offset_index`). > > This is a problem for young gcs as this behavior will make them slower than expected. Since it is impossible to make the updates to both `_next_offset_threshold` and `_next_offset_index` atomic, and another issue anyway; feel free to file an issue) I would prefer to penalize full gc (that is, keep old behavior) for that. > > The alternative would be to just `alloc_block` the whole region (so that `next_offset_index` and friend are at the top), but I think given the rarity of this case and full gc in general it is better to do the extra work in the full gc. > > Could you add code that walks such full young regions and does the `cross_threshold` thing? This additional code certainly does not need to actually compact these regions. > > Thanks, > Thomas Thanks for discussion! 
I had the same concern (BOTs of these Survivor-turned-to-Old regions contains no useful info, so might make it very slow when finding block start in these regions), but I was not sure if there is some "lazy" mechanism to fill valid BOTs info for these regions when they are accessed subsequently. I just not have time to do further investigation, now I got the answer from you. Sure I will add code to fill valid BOTs info for these young regions by the end of full gc. I'm not sure if it's OK for me to do this BOTs filling action in a separate issue? As we already had such a long discussion, I think it's might be better for us to initialize another discussion for this specific follow-up issue. Please kindly let me know your thoughts. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From tschatzl at openjdk.java.net Thu Apr 8 15:05:42 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 8 Apr 2021 15:05:42 GMT Subject: RFR: 8264788: Make SequentialSubTasksDone use-once [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that simplifies code ahead? > > SequentialSubTasksDone has some machinery to reset itself automatically after all tasks were claimed. > > This is not used at all and can/should be removed. 
> > Test: tier1-3 Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3372/files - new: https://git.openjdk.java.net/jdk/pull/3372/files/53930970..919ff6e7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3372&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3372&range=00-01 Stats: 11 lines in 2 files changed: 1 ins; 4 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3372.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3372/head:pull/3372 PR: https://git.openjdk.java.net/jdk/pull/3372 From tschatzl at openjdk.java.net Thu Apr 8 15:05:42 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 8 Apr 2021 15:05:42 GMT Subject: RFR: 8264788: Make SequentialSubTasksDone use-once [v2] In-Reply-To: <0QDQXiibSb7nKsN3Cjf1dsIkX4tlr6Wc7nhKfpS4DlE=.5e8189c8-3926-4061-a148-aba01bf23bea@github.com> References: <0QDQXiibSb7nKsN3Cjf1dsIkX4tlr6Wc7nhKfpS4DlE=.5e8189c8-3926-4061-a148-aba01bf23bea@github.com> Message-ID: On Thu, 8 Apr 2021 09:02:20 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review > > Looking at how `SequentialSubTasksDone` is used, I don't really see the benefit of having `assert(_num_claimed == _num_tasks)` in the destructor. If we drop that assertion, `Atomic::add` is more readable. > > I prefer `Atomic::add` with the assertion removed, but this version is fine as well. I changed to the `Atomic::add` but kept the assert (modified). 
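For readers following the thread, here is a minimal sketch of a use-once task claimer built on a single atomic add, with a relaxed destructor check. The class name `UseOnceTaskClaimer`, its members, and the `>=` form of the assert are assumptions made for illustration; this is not the code from PR 3372, and it uses `std::atomic` rather than HotSpot's `Atomic::add`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a use-once sequential task claimer.
// Names and members are illustrative, not the HotSpot implementation.
class UseOnceTaskClaimer {
  const unsigned _num_tasks;               // total number of tasks to hand out
  std::atomic<unsigned> _num_claimed{0};   // number of claim attempts so far

public:
  explicit UseOnceTaskClaimer(unsigned num_tasks) : _num_tasks(num_tasks) {}

  ~UseOnceTaskClaimer() {
    // Assumed relaxed check: every task was handed out. Failed claim
    // attempts also bump the counter, so it may exceed _num_tasks.
    assert(_num_claimed.load() >= _num_tasks && "unclaimed tasks remain");
  }

  // Claims the next task id; returns false once all tasks are taken.
  // There is no reset: the object is meant to be used exactly once.
  bool try_claim_task(unsigned& task_id) {
    task_id = _num_claimed.fetch_add(1);
    return task_id < _num_tasks;
  }
};
```

Each worker would call `try_claim_task` in a loop until it returns false. Because the counter only grows, no self-reset machinery is needed, which is the simplification the change is after.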
------------- PR: https://git.openjdk.java.net/jdk/pull/3372 From yyang at openjdk.java.net Fri Apr 9 07:18:24 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 9 Apr 2021 07:18:24 GMT Subject: Withdrawn: 8264682: MemProfiling does not own Heap_lock when using G1 In-Reply-To: References: Message-ID: <22s7OOLgMBFF1tkEbH8vNgr7fuOFcPjNJNjn-U7t5CI=.ab0ee64e-1e2e-4d9a-b734-451d61f71f6a@github.com> On Mon, 5 Apr 2021 07:34:30 GMT, Yi Yang wrote: > Trivial fix for JDK-8264682. > > `Universe::heap()->used()` calls G1Allocator::used_in_alloc_regions() when G1 enabled, it checks whether Heap_lock was owned on this thread's behalf. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/3340 From tschatzl at openjdk.java.net Fri Apr 9 10:53:20 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 9 Apr 2021 10:53:20 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: <3qu7jf8dJ_jTlNcs_gzjXE_k-ylkQTnVveXoMBUqhAA=.3e22c73a-98cd-4f46-a663-a7cc5bacd276@github.com> References:

<3qu7jf8dJ_jTlNcs_gzjXE_k-ylkQTnVveXoMBUqhAA=.3e22c73a-98cd-4f46-a663-a7cc5bacd276@github.com> Message-ID: On Thu, 8 Apr 2021 13:44:35 GMT, Hamlin Li wrote: >> Some more investigation and discussion about the BOT handling showed that we need to update the BOT for these Survivor-turned-to-Old regions after all. >> >> The reason is that contrary to what I thought, while BOT can handle queries for object start addresses above the "last known valid entry" (materialized in `_next_offset_threshold` and `_next_offset_index` ) mentioned in PR#3356, it only does so slowly, and while updating the BOT itself, not updating that "last known valid entry". >> So every time it queries for an object start in such regions, it starts walking from the bottom of that region. >> >> See the call chain `G1BlockOffsetTablePart::block_start` -> `forward_to_block_containing_addr` -> `forward_to_block_containing_addr_slow` where the call to `alloc_block_work` in `g1BlockOffsetTable.cpp:236` only updates local boundary and index (not `_next_offset_threshold` and `_next_offset_index`). >> >> This is a problem for young gcs as this behavior will make them slower than expected. Since it is impossible to make the updates to both `_next_offset_threshold` and `_next_offset_index` atomic, and another issue anyway; feel free to file an issue) I would prefer to penalize full gc (that is, keep old behavior) for that. >> >> The alternative would be to just `alloc_block` the whole region (so that `next_offset_index` and friend are at the top), but I think given the rarity of this case and full gc in general it is better to do the extra work in the full gc. >> >> Could you add code that walks such full young regions and does the `cross_threshold` thing? This additional code certainly does not need to actually compact these regions. >> >> Thanks, >> Thomas > > Thanks for discussion! 
I had the same concern (BOTs of these Survivor-turned-to-Old regions contains no useful info, so might make it very slow when finding block start in these regions), but I was not sure if there is some "lazy" mechanism to fill valid BOTs info for these regions when they are accessed subsequently. I just not have time to do further investigation, now I got the answer from you. > > Sure I will add code to fill valid BOTs info for these young regions by the end of full gc. > I'm not sure if it's OK for me to do this BOTs filling action in a separate issue? As we already had such a long discussion, I think it's might be better for us to initialize another discussion for this specific follow-up issue. Please kindly let me know your thoughts. It is fine with me to fix this separately. There is some lazy mechanism to fill this BOT, but it does not work in this case (which is the gist of what I said above). ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From akozlov at azul.com Fri Apr 9 11:35:34 2021 From: akozlov at azul.com (Anton Kozlov) Date: Fri, 9 Apr 2021 14:35:34 +0300 Subject: [jdk13u-dev] RFR: 8264640: CMS ParScanClosure misses a barrier In-Reply-To: <1512B9D3-78B4-4AB1-8B4C-D2BAA6344EFC@azul.com> References:

<1512B9D3-78B4-4AB1-8B4C-D2BAA6344EFC@azul.com> Message-ID: John, thank you for the review and the comment! Thanks, Anton On 4/8/21 9:33 PM, John Cuthbertson wrote: > Hi Anton, > > This looks good to me. I think I'm still a reviewer for the jdk-updates project. > > For the benefit of everyone else... > > We were seeing this as a crash when obtaining the size of an object to be copied. The klass was observed to be transiently NULL. We found that the object, reached through another reference path, had already been copied and the from-space oop placed on the task queue for subsequent reference field scanning. The task queue, however, had overflowed and the from-space oop was placed on the shared overflow queue where objects are chained together through their klass field. If the reads are ordered as they are in the code then everything is OK as per the comment at line 105 (in ParScanClosure::do_oop_work) but we found that gcc had reordered the reads in the non-compressed oops case. So the mark word is read and the object is observed to be not forwarded (yet). Then, via another reference path, the object is copied, forwarded, and placed on the overflow task queue, overwriting the from-space object's klass. Then in the original path the klass is read and observed to be NULL or the next overflow entry, leading to the crash. When the from-space oop is dequeued, its klass is restored, which is what was observed in the core file. > > Using worker thread local queues, -XX:+ParGCUseLocalOverflow, seems to work around the problem. > > Thanks, > > John Cuthbertson > >> On Apr 2, 2021, at 2:02 AM, Anton Kozlov wrote: >> >> Adding hotspot-gc-dev. It will be great to receive comments from GC experts, even if the fix does not make sense for mainline jdk. >> >> Thanks, >> Anton >> >> On 4/2/21 11:51 AM, Anton Kozlov wrote: >>> Hi, please review an original fix for a GC crash.
The jdk13u is the latest supported version that still has buggy code, it was deleted in jdk14 as a part of JEP 363: Remove the Concurrent Mark Sweep (CMS) Garbage Collector. So I'm proposing it here. >>> The fix is low-risk, on x86-64 it just introduces a compiler barrier to prevent two reads to be reordered as intended by surrounding comments. On CPUs with weaker memory models it introduces CPU barriers as well. >>> ------------- >>> Commit messages: >>> - Add missing barriers >>> Changes: https://git.openjdk.java.net/jdk13u-dev/pull/165/files >>> Webrev: https://webrevs.openjdk.java.net/?repo=jdk13u-dev&pr=165&range=00 >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8264640 >>> Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod >>> Patch: https://git.openjdk.java.net/jdk13u-dev/pull/165.diff >>> Fetch: git fetch https://git.openjdk.java.net/jdk13u-dev pull/165/head:pull/165 >>> PR: https://git.openjdk.java.net/jdk13u-dev/pull/165 > From shade at openjdk.java.net Fri Apr 9 12:01:24 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 9 Apr 2021 12:01:24 GMT Subject: RFR: 8264727: Remove extraneous whitespace from phase timings report In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 16:45:43 GMT, earthling-amzn wrote: > This extra space doesn't look like it was intended to improve any alignment of text in the report. @earthling-amzn Still want to integrate this? You need to title this PR correctly for bots to catch it up. ------------- PR: https://git.openjdk.java.net/jdk/pull/3342 From tschatzl at openjdk.java.net Fri Apr 9 12:29:28 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 9 Apr 2021 12:29:28 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References:

<3qu7jf8dJ_jTlNcs_gzjXE_k-ylkQTnVveXoMBUqhAA=.3e22c73a-98cd-4f46-a663-a7cc5bacd276@github.com> Message-ID: On Fri, 9 Apr 2021 10:49:11 GMT, Thomas Schatzl wrote: >> Thanks for discussion! I had the same concern (BOTs of these Survivor-turned-to-Old regions contains no useful info, so might make it very slow when finding block start in these regions), but I was not sure if there is some "lazy" mechanism to fill valid BOTs info for these regions when they are accessed subsequently. I just not have time to do further investigation, now I got the answer from you. >> >> Sure I will add code to fill valid BOTs info for these young regions by the end of full gc. >> I'm not sure if it's OK for me to do this BOTs filling action in a separate issue? As we already had such a long discussion, I think it's might be better for us to initialize another discussion for this specific follow-up issue. Please kindly let me know your thoughts. > > It is fine with me to fix this separately. > > There is some lazy mechanism to fill this BOT, but it does not work in this case (which is the gist of what I said above). Note that if that were fixed, I think PR #3356 would not be needed - because then young region BOTs have the expected contents :) Your call. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From mli at openjdk.java.net Fri Apr 9 12:48:20 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Fri, 9 Apr 2021 12:48:20 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: References:

<3qu7jf8dJ_jTlNcs_gzjXE_k-ylkQTnVveXoMBUqhAA=.3e22c73a-98cd-4f46-a663-a7cc5bacd276@github.com>

Message-ID: On Fri, 9 Apr 2021 12:26:35 GMT, Thomas Schatzl wrote: >> It is fine with me to fix this separately. >> >> There is some lazy mechanism to fill this BOT, but it does not work in this case (which is the gist of what I said above). > > Note that if that were fixed, I think PR #3356 would not be needed - because then young region BOTs have the expected contents :) Your call. Thanks Thomas! I'm not sure whether PR #3356 is still necessary, but I will add code to fill BOTs for "young" regions as a follow-up fix/enhancement of this issue. I have just created a bug to track it: https://bugs.openjdk.java.net/browse/JDK-8264987. After this PR is approved and integrated, I will open the PR for JDK-8264987. ------------- PR: https://git.openjdk.java.net/jdk/pull/2760 From github.com+71722661+earthling-amzn at openjdk.java.net Fri Apr 9 17:10:27 2021 From: github.com+71722661+earthling-amzn at openjdk.java.net (earthling-amzn) Date: Fri, 9 Apr 2021 17:10:27 GMT Subject: Integrated: 8264727: Shenandoah: Remove extraneous whitespace from phase timings report In-Reply-To: References: Message-ID: On Mon, 5 Apr 2021 16:45:43 GMT, earthling-amzn wrote: > This extra space doesn't look like it was intended to improve any alignment of text in the report. This pull request has now been integrated.
Changeset: ec31b3a1 Author: William Kemper Committer: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/ec31b3a1 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8264727: Shenandoah: Remove extraneous whitespace from phase timings report Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/3342 From zgu at openjdk.java.net Fri Apr 9 21:20:45 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 9 Apr 2021 21:20:45 GMT Subject: RFR: 8265012: Shenandoah: Backout JDK-8264718 Message-ID: <2dkdGG9JW1wdsKKluHOS33VAIne3WdE7xxUgIlfSQ2c=.65c62947-47cf-466c-b773-046afcff5fa6@github.com> It turns out that enqueuing string deduplication candidates during concurrent root scanning may result in lock rank inversion between the stack watermark lock and the string dedup queue lock, if the scanning is triggered by stack watermark and the dedup buffer happens to be full. Backout JDK-8264718 for now, will retry after Kim's string deduplication refactoring. Test: - [x] hotspot_gc_shenandoah ------------- Commit messages: - JDK-8265012 Changes: https://git.openjdk.java.net/jdk/pull/3423/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3423&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265012 Stats: 69 lines in 5 files changed: 26 ins; 39 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3423.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3423/head:pull/3423 PR: https://git.openjdk.java.net/jdk/pull/3423 From shade at openjdk.java.net Sat Apr 10 05:54:21 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Sat, 10 Apr 2021 05:54:21 GMT Subject: RFR: 8265012: Shenandoah: Backout JDK-8264718 In-Reply-To: <2dkdGG9JW1wdsKKluHOS33VAIne3WdE7xxUgIlfSQ2c=.65c62947-47cf-466c-b773-046afcff5fa6@github.com> References: <2dkdGG9JW1wdsKKluHOS33VAIne3WdE7xxUgIlfSQ2c=.65c62947-47cf-466c-b773-046afcff5fa6@github.com> Message-ID: On Fri, 9 Apr 2021 21:14:22 GMT, Zhengyu Gu wrote: > It turns out that enqueuing string
deduplication candidates during concurrent root scanning may result lock rank inversion between stack watermark lock and string dedup queue lock, if the scanning is triggered by stack watermark and dedup buffer happens to be full. > > Backout JDK-8264718 for now, will retry after Kim's string deduplication refactoring. > > Test: > - [x] hotspot_gc_shenandoah Looks fine. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3423 From ayang at openjdk.java.net Mon Apr 12 09:01:56 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 12 Apr 2021 09:01:56 GMT Subject: RFR: 8264783: G1 BOT verification should not verify beyond allocation threshold In-Reply-To: References: Message-ID: On Tue, 6 Apr 2021 13:59:12 GMT, Thomas Schatzl wrote: > The G1 BOT contains an allocation threshold which basically acts as a "last known valid entry" index for the per-region BOT (which are views on the global BOT table). > > Currently G1 BOT verification actually "verifies" the BOT within a region past that BOT index. > > This causes issues with young regions; actually there is already code that prevents their BOT verification. This is perfectly fine, allocations in young regions do not update the BOTs. > > With JDK-8262068/PR #2760 this existing filtering of young regions (such that their BOT is not verified) does not work because there may be young regions that are not compacted (so with a BOT that have that last known valid entry at the start of the region) are still labelled as old. > > This change proposes to not try to verify the BOT beyond the last known valid index (which is arguably not worth doing), which also covers the existing young filtering. 
> > There are alternatives that may work in particular for JDK-8262068: > a) always compact young regions (which recreates the BOT) > b) create a "dummy" BOT that spans the entire part of the region containing live objects > > However they were rejected by me because > option a) takes time and directly counters that optimization for no reason > option b) makes finding the start of an object within these regions slow (they are not refined when there is at least *some* bot) > > and finally I do not think there is much point in trying to be clever about areas in the BOT that are known to not contain useful values. > > Testing: tier1-3 Marked as reviewed by ayang (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3356 From ayang at openjdk.java.net Mon Apr 12 09:07:00 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 12 Apr 2021 09:07:00 GMT Subject: RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v14] In-Reply-To: <2wLngv3n2wALz1ZbNgSTNaMKMYKTnh6ISlWEdNzBa9o=.f0996e4c-3bd5-42b3-9cc0-0ec8a600f1dc@github.com> References: <2wLngv3n2wALz1ZbNgSTNaMKMYKTnh6ISlWEdNzBa9o=.f0996e4c-3bd5-42b3-9cc0-0ec8a600f1dc@github.com> Message-ID: On Thu, 8 Apr 2021 10:06:54 GMT, Hamlin Li wrote: >> Summary >> ----------- >> >> Improve G1 Full GC by skipping compaction for regions with high survival ratio. >> >> Background >> ----------- >> >> There are 4 steps in full gc of G1 GC. >> - mark live objects >> - prepare forwardee >> - adjust pointers >> - compact >> >> When full gc occurs, there may be a very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there is little space to save but many objects to copy.
>>
>> Description
>> -----------
>>
>> We enhance the full gc implementation for the above situation through following steps:
>> - accumulate live bytes of every hr in mark phase; (already done by JDK-8263495)
>> - skip adding regions with high survival ratio, and set the region with high survival ratio as pinned in _region_attr_table during prepare phase;
>> - nothing special is done in adjust phase, regions with high survival ratio are skipped because of pin setting in the above step;
>> - nothing special is done in compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to compaction set in the prepare phase;
>>
>> VM options related
>> -----------
>>
>> - MarkSweepDeadRatio: we reuse this existing vm option to indicate the high survival ratio threshold (100-MarkSweepDeadRatio) in G1.
>> - default value of MarkSweepDeadRatio: 5
>>
>> Test
>> -----------
>>
>> - specjbb2015: no regression
>> - dacapo: (Attachment is the dacapo h2 full gc pause.)
>>   - 95% of full gc pauses: 10%-19% improvement.
>>   - 5% of full gc pauses: 1.2% improvement.
>>   - 0.1% of full gc pauses: -6.16% improvement.
>>
>> $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> add sanity check; refine code in update_attribute_table.

Marked as reviewed by ayang (Committer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2760

From tschatzl at openjdk.java.net Mon Apr 12 09:17:46 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Mon, 12 Apr 2021 09:17:46 GMT
Subject: RFR: 8264783: G1 BOT verification should not verify beyond allocation threshold
In-Reply-To:
References:

Message-ID:

On Wed, 7 Apr 2021 01:16:05 GMT, Hamlin Li wrote:

>> The G1 BOT contains an allocation threshold which basically acts as a "last known valid entry" index for the per-region BOT (which are views on the global BOT table).
>>
>> Currently G1 BOT verification actually "verifies" the BOT within a region past that BOT index.
>>
>> This causes issues with young regions; actually there is already code that prevents their BOT verification. This is perfectly fine, allocations in young regions do not update the BOTs.
>>
>> With JDK-8262068/PR #2760 this existing filtering of young regions (such that their BOT is not verified) does not work because there may be young regions that are not compacted (so with a BOT that has that last known valid entry at the start of the region) are still labelled as old.
>>
>> This change proposes to not try to verify the BOT beyond the last known valid index (which is arguably not worth doing), which also covers the existing young filtering.
>>
>> There are alternatives that may work in particular for JDK-8262068:
>> a) always compact young regions (which recreates the BOT)
>> b) create a "dummy" BOT that spans the entire part of the region containing live objects
>>
>> However they were rejected by me because
>> option a) takes time and directly counters that optimization for no reason
>> option b) makes finding the start of an object within these regions slow (they are not refined when there is at least *some* bot)
>>
>> and finally I do not think there is much point in trying to be clever about areas in the BOT that are known to not contain useful values.
>>
>> Testing: tier1-3
>
> The change looks good to me.
> Thanks Thomas!

Thanks @Hamlin-Li @albertnetymk for your reviews.
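
[Editorial illustration, not part of the original message.] The idea in the quoted description, verifying BOT entries only up to the last known valid index, can be sketched with a toy model. This is not HotSpot code: the class `BotVerifySketch`, the method `verify`, and the array-based representation of the table are all hypothetical and chosen only to show the bounded loop.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; it does not mirror HotSpot's actual BOT data structures. */
public class BotVerifySketch {

    /**
     * Compares table entries against expected values, but only for indices
     * below 'threshold' (the hypothetical "last known valid entry" index).
     * Entries at or beyond the threshold are never inspected.
     *
     * @return the indices below the threshold whose entries do not match
     */
    static List<Integer> verify(int[] botEntries, int[] expected, int threshold) {
        List<Integer> mismatches = new ArrayList<>();
        int limit = Math.min(threshold, botEntries.length);
        for (int i = 0; i < limit; i++) {
            if (botEntries[i] != expected[i]) {
                mismatches.add(i);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int[] expected = {0, 0, 2, 2, 4, 4};
        // Entries from index 3 onward were never updated (assumed stale).
        int[] bot = {0, 0, 2, -1, -1, -1};

        // Bounded by the threshold: the stale tail is ignored.
        System.out.println(verify(bot, expected, 3)); // prints []

        // Verifying past the threshold reports the stale entries.
        System.out.println(verify(bot, expected, 6)); // prints [3, 4, 5]
    }
}
```

In this model a region whose table was never filled in simply has a threshold of 0, so nothing is checked, which is how a bounded check can subsume a separate "skip young regions" filter.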

-------------

PR: https://git.openjdk.java.net/jdk/pull/3356

From tschatzl at openjdk.java.net Mon Apr 12 09:17:46 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Mon, 12 Apr 2021 09:17:46 GMT
Subject: Integrated: 8264783: G1 BOT verification should not verify beyond allocation threshold
In-Reply-To:
References:
Message-ID:

On Tue, 6 Apr 2021 13:59:12 GMT, Thomas Schatzl wrote:

> The G1 BOT contains an allocation threshold which basically acts as a "last known valid entry" index for the per-region BOT (which are views on the global BOT table).
>
> Currently G1 BOT verification actually "verifies" the BOT within a region past that BOT index.
>
> This causes issues with young regions; actually there is already code that prevents their BOT verification. This is perfectly fine, allocations in young regions do not update the BOTs.
>
> With JDK-8262068/PR #2760 this existing filtering of young regions (such that their BOT is not verified) does not work because there may be young regions that are not compacted (so with a BOT that has that last known valid entry at the start of the region) are still labelled as old.
>
> This change proposes to not try to verify the BOT beyond the last known valid index (which is arguably not worth doing), which also covers the existing young filtering.
>
> There are alternatives that may work in particular for JDK-8262068:
> a) always compact young regions (which recreates the BOT)
> b) create a "dummy" BOT that spans the entire part of the region containing live objects
>
> However they were rejected by me because
> option a) takes time and directly counters that optimization for no reason
> option b) makes finding the start of an object within these regions slow (they are not refined when there is at least *some* bot)
>
> and finally I do not think there is much point in trying to be clever about areas in the BOT that are known to not contain useful values.
>
> Testing: tier1-3

This pull request has now been integrated.

Changeset: e604320b
Author: Thomas Schatzl
URL: https://git.openjdk.java.net/jdk/commit/e604320b
Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod

8264783: G1 BOT verification should not verify beyond allocation threshold

Reviewed-by: mli, ayang

-------------

PR: https://git.openjdk.java.net/jdk/pull/3356

From sjohanss at openjdk.java.net Mon Apr 12 10:18:40 2021
From: sjohanss at openjdk.java.net (Stefan Johansson)
Date: Mon, 12 Apr 2021 10:18:40 GMT
Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23]
In-Reply-To:
References:

<_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com>

Message-ID: