From mli at openjdk.org Sun Oct 1 10:28:36 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:28:36 GMT Subject: RFR: 8317317: G1: Make TestG1RemSetFlags use createTestJvm In-Reply-To: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> References: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> Message-ID: On Fri, 29 Sep 2023 14:51:23 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15989#pullrequestreview-1651878135 From mli at openjdk.org Sun Oct 1 10:28:36 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:28:36 GMT Subject: RFR: 8317218: G1: Make TestG1HeapRegionSize use createTestJvm In-Reply-To: References: Message-ID: On Thu, 28 Sep 2023 09:07:04 GMT, Leo Korinth wrote: > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC'` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseZGC'` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:G1HeapRegionSize=80000'` > TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS='` -> PASS 1 Marked as reviewed by mli (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15959#pullrequestreview-1651878187 From mli at openjdk.org Sun Oct 1 10:28:43 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:28:43 GMT Subject: RFR: 8317188: G1: Make TestG1ConcRefinementThreads use createTestJvm In-Reply-To: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> References: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> Message-ID: On Thu, 28 Sep 2023 08:34:31 GMT, Leo Korinth wrote: > Testing after changes: > > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC' ` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseZGC' ` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:G1ConcRefinementThreads=42' ` -> TOTAL 0 Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15958#pullrequestreview-1651878205 From mli at openjdk.org Sun Oct 1 10:28:46 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:28:46 GMT Subject: RFR: 8317042: G1: Make TestG1ConcMarkStepDurationMillis use createTestJvm In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 16:37:54 GMT, Leo Korinth wrote: > Use createTestJvm. > > Tested with: `-XX:+UseG1GC` (pass 1), `-XX:+UseZGC` (total 0), `-XX:G1ConcMarkStepDurationMillis=23` (total 0) and no options (pass 1) Marked as reviewed by mli (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15948#pullrequestreview-1651878221 From mli at openjdk.org Sun Oct 1 10:28:53 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:28:53 GMT Subject: RFR: 8317316: G1: Make TestG1PercentageOptions use createTestJvm In-Reply-To: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> References: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> Message-ID: On Fri, 29 Sep 2023 14:24:07 GMT, Leo Korinth wrote: > Done basic testing, will do more testing before pushing (together with other tests) Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15987#pullrequestreview-1651878165 From mli at openjdk.org Sun Oct 1 10:43:33 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:43:33 GMT Subject: RFR: 8316973: GC: Make TestDisableDefaultGC use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 17:24:22 GMT, Leo Korinth wrote: > There seems there is no need to strip external vm flags. The `@requires vm.gc=="null"` will handle the case when we iterate different GCs and we will then just skip the test. > > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC'` -> total 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java` -> success > > started tier 1-5 Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15931#pullrequestreview-1651880228 From mli at openjdk.org Sun Oct 1 10:44:43 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:44:43 GMT Subject: RFR: 8316410: GC: Make TestCompressedClassFlags use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 16:50:26 GMT, Leo Korinth wrote: > Use createTestJvm. 
> > Tested with: > > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseParallelGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC -XX:+ZGenerational' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseShenandoahGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g -XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseCompressedClassPointers' ==> TOTAL:0 Marked as reviewed by 
mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15929#pullrequestreview-1651880326 From mli at openjdk.org Sun Oct 1 10:49:38 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 1 Oct 2023 10:49:38 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test In-Reply-To: References: Message-ID: <6VUav0RO4LriO26hZ_OrTxnDDw0Ri2_Gx1rnMeGgzCg=.687a8fb2-dace-40cf-b01c-56e4d4b5e591@github.com> On Mon, 18 Sep 2023 13:24:59 GMT, Soumadipta Roy wrote: > Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. Below are some of the observations from before any change, partial parallelization and full parallelization: > > * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** > * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** > * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** > * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** > * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** > > Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. Marked as reviewed by mli (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15788#pullrequestreview-1651881178 From lmesnik at openjdk.org Sun Oct 1 20:52:39 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sun, 1 Oct 2023 20:52:39 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 13:24:59 GMT, Soumadipta Roy wrote: > Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. Below are some of the observations from before any change, partial parallelization and full parallelization: > > * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** > * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** > * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** > * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** > * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** > > Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. Thanks for doing this! You could add a meaningful test id if you want. ------------- Marked as reviewed by lmesnik (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15788#pullrequestreview-1651961743 From lmesnik at openjdk.org Sun Oct 1 21:08:35 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sun, 1 Oct 2023 21:08:35 GMT Subject: RFR: 8316973: GC: Make TestDisableDefaultGC use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 17:24:22 GMT, Leo Korinth wrote: > There seems there is no need to strip external vm flags. The `@requires vm.gc=="null"` will handle the case when we iterate different GCs and we will then just skip the test. > > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC'` -> total 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java` -> success > > started tier 1-5 Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15931#pullrequestreview-1651963605 From tschatzl at openjdk.org Mon Oct 2 08:09:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Oct 2023 08:09:50 GMT Subject: RFR: 8315503: G1: Code root scan causes long GC pauses due to imbalanced iteration [v4] In-Reply-To: References: Message-ID: On Mon, 25 Sep 2023 17:26:56 GMT, Ivan Walulya wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' into 8315503-code-root-scan-imbalance >> - iwalulya review - more (gtest) cleanup >> - iwalulya review >> - initial version that seems to work >> >> Contains kludge to avoid modification of currently scanned code root set. >> Ought to be fixed differently. >> >> Contains debug code in table scanners of CodeRootSet/CardSet to find out problems with table growing >> >> Hashcode hack for code root set, using copy&paste ZHash >> >> Shrink table after clean >> >> Bulk removal of nmethods from code root sets after class unloading. 
From Ivan. >> >> Cleanup, resize after bulk delete, hashcode verification > > Still LGTM! Thanks @walulyai @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/15811#issuecomment-1742516392 From tschatzl at openjdk.org Mon Oct 2 08:33:16 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Oct 2023 08:33:16 GMT Subject: Integrated: 8315503: G1: Code root scan causes long GC pauses due to imbalanced iteration In-Reply-To: References: Message-ID: On Tue, 19 Sep 2023 08:04:23 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that modifies the code root (remembered) set to use the CHT as internal representation. > > This removes lots of locking (inhibiting throughput), provides automatic balancing for the code root scan phase, and (parallel) bulk unregistering of nmethods during code cache unloading, improving performance of various pauses that deal with code root sets. > > With a stress test that frequently loads and unloads 6000 classes and associated methods from them we could previously see the following issues: > > During collection pauses: > > [4179,965s][gc,phases ] GC(273) Evacuate Collection Set: 812,18ms > [..] > [4179,965s][gc,phases ] GC(273) Code Root Scan (ms): Min: 0,00, Avg: 59,03, Max: 775,12, Diff: 775,12, Sum: 944,44, Workers: 16 > [...] > [4179,965s][gc,phases ] GC(273) Termination (ms): Min: 0,03, Avg: 643,90, Max: 690,96, Diff: 690,93, Sum: 10302,47, Workers: 16 > > > Code root scan now reduces to ~22ms max on average in this case. > > We have recently seen some imbalances in code root scan and long Remark pauses (thankfully not to that extreme) in other real-world applications too: > > [2466.979s][gc,phases ] GC(131) Code Root Scan (ms): Min: 0.0, Avg: 5.7, Max: 46.4, Diff: 46.4, Sum: 57.0, Workers: 10 > > > Some random comment: > * the mutex for the CHT had to be decreased in priority by one to not conflict with `CodeCache_lock`. This does not seem to be detrimental otherwise. 
At the same time, I had to move the locks at `nosafepoint-3` to `nosafepoint-4` as well to keep previous ordering. All mutexes with uses of `nosafepoint` as their rank seem to be good now. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 795e5dcc Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/795e5dcc856491031b87a1f2a942681a582673ab Stats: 382 lines in 13 files changed: 218 ins; 114 del; 50 mod 8315503: G1: Code root scan causes long GC pauses due to imbalanced iteration Co-authored-by: Ivan Walulya Reviewed-by: iwalulya, ayang ------------- PR: https://git.openjdk.org/jdk/pull/15811 From shade at openjdk.org Mon Oct 2 08:41:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Oct 2023 08:41:35 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 12:48:12 GMT, Zhengyu Gu wrote: > During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. > > Test: > hotspot_gc_shenandoah (fastdebug and release on MacOSX) This looks okay, but I think you can "just" add `OMC::cleanup_old_entries` in `VM_ShenandoahReferenceOperation` destructor, without introducing the intermediary? ------------- PR Review: https://git.openjdk.org/jdk/pull/15921#pullrequestreview-1652325437 From lkorinth at openjdk.org Mon Oct 2 10:05:36 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 2 Oct 2023 10:05:36 GMT Subject: RFR: 8317343: GC: Make TestHeapFreeRatio use createTestJvm Message-ID: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> This fix is implicitly dependent on https://github.com/openjdk/jdk/pull/15986/files for `@requires opt.x` support. 
Initial testing passes, but I will do more (tier) testing before pushing. ------------- Commit messages: - 8317343: GC: Make TestHeapFreeRatio use createTestJvm Changes: https://git.openjdk.org/jdk/pull/16007/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16007&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317343 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16007.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16007/head:pull/16007 PR: https://git.openjdk.org/jdk/pull/16007 From duke at openjdk.org Mon Oct 2 10:35:34 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Mon, 2 Oct 2023 10:35:34 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test In-Reply-To: References: Message-ID: On Sun, 1 Oct 2023 20:49:39 GMT, Leonid Mesnik wrote: > Thanks for doing this! You could add a meaningful test id if you want. @lmesnik I will push another commit with test ids ------------- PR Comment: https://git.openjdk.org/jdk/pull/15788#issuecomment-1742774439 From duke at openjdk.org Mon Oct 2 11:55:09 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Mon, 2 Oct 2023 11:55:09 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test [v2] In-Reply-To: References: Message-ID: > Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. 
Below are some of the observations from before any change, partial parallelization and full parallelization: > > * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** > * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** > * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** > * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** > * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** > > Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. Soumadipta Roy has updated the pull request incrementally with one additional commit since the last revision: Cosmetic fixes Fixing summary positioning along with some spacing and spelling issues. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15788/files - new: https://git.openjdk.org/jdk/pull/15788/files/004bda6c..8c73db56 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15788&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15788&range=00-01 Stats: 8 lines in 1 file changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/15788.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15788/head:pull/15788 PR: https://git.openjdk.org/jdk/pull/15788 From duke at openjdk.org Mon Oct 2 11:55:10 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Mon, 2 Oct 2023 11:55:10 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test [v2] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 10:33:00 GMT, Soumadipta Roy wrote: > Thanks for doing this! You could add a meaningful test id if you want. 
@lmesnik I tried to add some test ids locally; however, the naming doesn't sound or look that great for people to concur unanimously, so I am keeping the default. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15788#issuecomment-1742869496 From lkorinth at openjdk.org Mon Oct 2 11:55:33 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 2 Oct 2023 11:55:33 GMT Subject: RFR: 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm Message-ID: Also add parallel flag for first invocation (this is a bug). Initial testing ok, but will run tiers before pushing. ------------- Commit messages: - 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm Changes: https://git.openjdk.org/jdk/pull/16009/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16009&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317347 Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16009.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16009/head:pull/16009 PR: https://git.openjdk.org/jdk/pull/16009 From shade at openjdk.org Mon Oct 2 12:00:12 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Oct 2023 12:00:12 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test [v2] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 11:55:09 GMT, Soumadipta Roy wrote: >> Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. 
Below are some of the observations from before any change, partial parallelization and full parallelization: >> >> * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** >> * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** >> * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** >> * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** >> * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** >> >> Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. > > Soumadipta Roy has updated the pull request incrementally with one additional commit since the last revision: > > Cosmetic fixes > > Fixing summary positioning along with some spacing and spelling issues. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15788#pullrequestreview-1652598758 From tschatzl at openjdk.org Mon Oct 2 12:35:24 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Oct 2023 12:35:24 GMT Subject: RFR: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test [v2] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 11:55:09 GMT, Soumadipta Roy wrote: >> Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. 
Below are some of the observations from before any change, partial parallelization and full parallelization: >> >> * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** >> * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** >> * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** >> * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** >> * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** >> >> Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. > > Soumadipta Roy has updated the pull request incrementally with one additional commit since the last revision: > > Cosmetic fixes > > Fixing summary positioning along with some spacing and spelling issues. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15788#pullrequestreview-1652648456 From zgu at openjdk.org Mon Oct 2 13:37:05 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 13:37:05 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v2] In-Reply-To: References: Message-ID: > During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. > > Test: > hotspot_gc_shenandoah (fastdebug and release on MacOSX) Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into JDK-8316929 - cleanup - 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15921/files - new: https://git.openjdk.org/jdk/pull/15921/files/92dc4f4e..b84d2cad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=00-01 Stats: 7940 lines in 283 files changed: 5780 ins; 1041 del; 1119 mod Patch: https://git.openjdk.org/jdk/pull/15921.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15921/head:pull/15921 PR: https://git.openjdk.org/jdk/pull/15921 From zgu at openjdk.org Mon Oct 2 13:41:56 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 13:41:56 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v3] In-Reply-To: References: Message-ID: > During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. 
> > Test: > hotspot_gc_shenandoah (fastdebug and release on MacOSX) Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Fix order ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15921/files - new: https://git.openjdk.org/jdk/pull/15921/files/b84d2cad..c1e9cf27 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15921.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15921/head:pull/15921 PR: https://git.openjdk.org/jdk/pull/15921 From zgu at openjdk.org Mon Oct 2 13:48:58 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 13:48:58 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v4] In-Reply-To: References: Message-ID: > During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. 
> > Test: > hotspot_gc_shenandoah (fastdebug and release on MacOSX) Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: More cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15921/files - new: https://git.openjdk.org/jdk/pull/15921/files/c1e9cf27..4f41dc14 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15921&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15921.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15921/head:pull/15921 PR: https://git.openjdk.org/jdk/pull/15921 From zgu at openjdk.org Mon Oct 2 13:51:13 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 13:51:13 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v3] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 13:41:56 GMT, Zhengyu Gu wrote: >> During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. >> >> Test: >> hotspot_gc_shenandoah (fastdebug and release on MacOSX) > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Fix order > This looks okay, but I think you can "just" add `OMC::cleanup_old_entries` in `VM_ShenandoahReferenceOperation` destructor, without introducing the intermediary? > This looks okay, but I think you can "just" add `OMC::cleanup_old_entries` in `VM_ShenandoahReferenceOperation` destructor, without introducing the intermediary? Place the call in `doit_epilogue()` for the symmetry with other [GCs](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcVMOperations.cpp#L135). 
But you are right, we don't need another VM operation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15921#issuecomment-1743052006 From shade at openjdk.org Mon Oct 2 14:32:14 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Oct 2023 14:32:14 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v4] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 13:48:58 GMT, Zhengyu Gu wrote: >> During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. >> >> Test: >> hotspot_gc_shenandoah (fastdebug and release on MacOSX) > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > More cleanup Looks fine, thanks! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15921#pullrequestreview-1652875148 From lkorinth at openjdk.org Mon Oct 2 15:17:28 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 2 Oct 2023 15:17:28 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm Message-ID: In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` Minimal testing completed, will run tier testing before pushing. 
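For readers following the `compareTo` change described for 8317358: the `Comparable` contract only specifies the sign of the result, so a check written as `== 1` can miss a "greater than" result that a `> 0` check catches. The following is a minimal sketch under that assumption, not code from TestMaxNewSize; the names `CompareSketch` and `isLarger` are hypothetical and chosen only for illustration.

```java
// Illustrative sketch only; CompareSketch and isLarger are hypothetical names,
// not code taken from TestMaxNewSize.
final class CompareSketch {

    // Returns true when 'actual' is strictly greater than 'limit'.
    static boolean isLarger(long actual, long limit) {
        // Long.valueOf replaces the deprecated Long(long) constructor.
        Long boxedActual = Long.valueOf(actual);
        Long boxedLimit = Long.valueOf(limit);
        // Only the sign of compareTo is specified, so test "> 0", not "== 1".
        return boxedActual.compareTo(boxedLimit) > 0;
    }

    public static void main(String[] args) {
        System.out.println(isLarger(5L, 3L));   // true
        System.out.println(isLarger(3L, 5L));   // false
        // String.compareTo returns a character difference here, not 1,
        // which is why "== 1" is fragile as a general idiom.
        System.out.println("c".compareTo("a")); // 2
    }
}
```

The `String` case is included only to show a `compareTo` implementation whose positive result is not 1.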
------------- Commit messages: - 8317358: G1: Make TestMaxNewSize use createTestJvm Changes: https://git.openjdk.org/jdk/pull/16012/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16012&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317358 Stats: 30 lines in 1 file changed: 0 ins; 23 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16012.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16012/head:pull/16012 PR: https://git.openjdk.org/jdk/pull/16012 From duke at openjdk.org Mon Oct 2 15:20:31 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Mon, 2 Oct 2023 15:20:31 GMT Subject: Integrated: 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 13:24:59 GMT, Soumadipta Roy wrote: > Looking at tier4 tests, gc/stress tests take lots of time per test. For example, TestStressRSetCoarsening takes about 33 minutes to run in fastdebug mode on a macOS ARM CPU. This limits effective parallelism of tier4 testing on large machines. We can parallelize its `@run` configs to improve effective parallelism for tier4. Below are some of the observations from before any change, partial parallelization and full parallelization: > > * before_release : **585.70s user 23.80s system 106% cpu 9:32.16 total** > * before_fastdebug : **2033.77s user 30.04s system 105% cpu 32:43.10 total** > * fully-parallelized_fastdebug : **2246.94s user 36.97s system 135% cpu 28:07.24 total** > * fully-parallelized_release : **463.52s user 31.54s system 234% cpu 3:31.19 total** > * partially-parallelized_release : **461.15s user 20.88s system 257% cpu 3:06.91 total** > > Even though partial parallelization shows better results, it has an anomaly 33%-50% of the time, where the run results are the same as before_release. I have run each of the above combinations multiple times to establish consistency, and fully parallel gives us the most benefit in terms of total time without deviating too much on user and system times. 
This pull request has now been integrated. Changeset: a564d436 Author: Soumadipta Roy Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/a564d436c722f14041231158f21c4ad3a2f6a3a5 Stats: 63 lines in 1 file changed: 55 ins; 1 del; 7 mod 8315692: Parallelize gc/stress/TestStressRSetCoarsening.java test Reviewed-by: shade, mli, lmesnik, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15788 From wkemper at openjdk.org Mon Oct 2 20:18:31 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 2 Oct 2023 20:18:31 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v3] In-Reply-To: <2bXcUWi2pXLKmpiPTNIT2_xu2fGBwPHB5YUMInN-OFo=.92656fc9-7549-41fa-bb9d-445cee3db5f8@github.com> References: <2bXcUWi2pXLKmpiPTNIT2_xu2fGBwPHB5YUMInN-OFo=.92656fc9-7549-41fa-bb9d-445cee3db5f8@github.com> Message-ID: On Fri, 29 Sep 2023 16:28:30 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: >> >> - Merge remote-tracking branch 'openjdk/master' into shenandoah-oome-redux >> - Extend exemption for EATests that rely on timely OOME to Shenandoah >> - Improve comment, increase default for no progress threshold >> - Allocator should not reset bad progress count >> - Allocator should not reset bad progress count >> - Fix 32-bit build error >> - Do not use atomics in header >> - Signal gc threshold exceeded when appropriate > > test/hotspot/jtreg/TEST.groups line 612: > >> 610: gtest/NMTGtests.java >> 611: >> 612: hotspot_oome = \ > > Adding new test groups require additional attention. I think this should be removed. 
> > Note that during debugging/testing, you can run the same by: > > > $ make test TEST="runtime/reflect/ReflectOutOfMemoryError.java gc/InfiniteList.java runtime/ClassInitErrors/TestOutOfMemoryDuringInit.java" I'll revert this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1343104448 From zgu at openjdk.org Mon Oct 2 20:56:33 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 20:56:33 GMT Subject: RFR: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries [v4] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 14:29:11 GMT, Aleksey Shipilev wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> More cleanup > > Looks fine, thanks! Thanks, @shipilev ------------- PR Comment: https://git.openjdk.org/jdk/pull/15921#issuecomment-1743693867 From zgu at openjdk.org Mon Oct 2 20:56:38 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 2 Oct 2023 20:56:38 GMT Subject: Integrated: 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 12:48:12 GMT, Zhengyu Gu wrote: > During STW root scan, interpreted frame's oop map may be cached. But due to limited cache size (32 entries per instance class), entries may be evicted to old entries list due to collision, should be cleanup in VM_Operation's doit_epilogue(), or risk leaking memory. > > Test: > hotspot_gc_shenandoah (fastdebug and release on MacOSX) This pull request has now been integrated. 
Changeset: e25121d1 Author: Zhengyu Gu URL: https://git.openjdk.org/jdk/commit/e25121d1d908bd74e7a5914d85284ab322bed1a3 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8316929: Shenandoah: Shenandoah degenerated GC and full GC need to cleanup old OopMapCache entries Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/15921 From wkemper at openjdk.org Mon Oct 2 21:31:13 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 2 Oct 2023 21:31:13 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v4] In-Reply-To: References: Message-ID: > Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3). William Kemper has updated the pull request incrementally with two additional commits since the last revision: - Merge check for no-progress into retry allocation block - Revert change to TEST.groups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15852/files - new: https://git.openjdk.org/jdk/pull/15852/files/cd4989e7..1971467f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=02-03 Stats: 33 lines in 2 files changed: 10 ins; 14 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/15852.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15852/head:pull/15852 PR: https://git.openjdk.org/jdk/pull/15852 From wkemper at openjdk.org Mon Oct 2 21:31:19 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 2 Oct 2023 21:31:19 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v3] In-Reply-To: <2bXcUWi2pXLKmpiPTNIT2_xu2fGBwPHB5YUMInN-OFo=.92656fc9-7549-41fa-bb9d-445cee3db5f8@github.com> References: 
<2bXcUWi2pXLKmpiPTNIT2_xu2fGBwPHB5YUMInN-OFo=.92656fc9-7549-41fa-bb9d-445cee3db5f8@github.com> Message-ID: On Fri, 29 Sep 2023 16:40:22 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: >> >> - Merge remote-tracking branch 'openjdk/master' into shenandoah-oome-redux >> - Extend exemption for EATests that rely on timely OOME to Shenandoah >> - Improve comment, increase default for no progress threshold >> - Allocator should not reset bad progress count >> - Allocator should not reset bad progress count >> - Fix 32-bit build error >> - Do not use atomics in header >> - Signal gc threshold exceeded when appropriate > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 877: > >> 875: } >> 876: >> 877: if (result == nullptr && !req.is_lab_alloc() && get_gc_no_progress_count() > ShenandoahNoProgressThreshold) { > > Can this be moved to the block that already does allocation-after-gc retries? That block already exits with `nullptr` (implies delivering OOME), and it already calls `handle_alloc_failure` (thus triggering GC). We "only" need it to specialize for `is_lab_alloc` and `ShenandoahNoProgressThreshold`? > > This PR changes that block anyway... Yes, I've made this change as you suggest, but I'm not sure it improves readability. Note that when the `ShenandoahNoProgressThreshold` is exceeded, we do _not_ retry the allocation or wait for another gc cycle to complete. 
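To make the control flow described above easier to follow, here is a toy model in Java. It is only a sketch under stated assumptions and not the HotSpot code (which is C++ in `shenandoahHeap.cpp`): the class `NoProgressSketch`, its fields, and the "GC never frees anything" behaviour are invented for illustration, and the LAB-allocation special case is left out. It shows one reading of the reply: once the no-progress count exceeds the threshold, the loop exits without another GC cycle or allocation retry.

```java
import java.util.function.Supplier;

public class NoProgressSketch {
    // Mirrors the default of 3 stated for ShenandoahNoProgressThreshold.
    public static final int NO_PROGRESS_THRESHOLD = 3;

    public int noProgressCount = 0; // stand-in for get_gc_no_progress_count()
    public int gcCycles = 0;        // how many toy GC cycles were triggered

    // Toy GC (assumption): never frees memory, so each cycle is "no progress".
    public void handleAllocFailure() {
        gcCycles++;
        noProgressCount++;
    }

    // Returns null once the threshold is exceeded; a real VM would turn that
    // into an OutOfMemoryError for the requesting thread.
    public Object allocateWithRetry(Supplier<Object> tryAllocate) {
        Object result = tryAllocate.get();
        while (result == null && noProgressCount <= NO_PROGRESS_THRESHOLD) {
            handleAllocFailure();       // trigger a GC cycle
            result = tryAllocate.get(); // retry after the cycle
        }
        // Threshold exceeded (or allocation succeeded): no further GC or retry.
        return result;
    }

    public static void main(String[] args) {
        NoProgressSketch heap = new NoProgressSketch();
        Object result = heap.allocateWithRetry(() -> null); // allocation always fails
        System.out.println("result=" + result + ", gcCycles=" + heap.gcCycles);
    }
}
```

A second thread calling `allocateWithRetry` on the same instance afterwards would fail immediately, since the count is already above the threshold.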
> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 902: > >> 900: ResourceMark rm; >> 901: log_debug(gc, alloc)("Thread: %s, Result: " PTR_FORMAT ", Shared: %s, Size: " SIZE_FORMAT ", Original: " SIZE_FORMAT ", Latest: " SIZE_FORMAT, >> 902: Thread::current()->name(), p2i(result), BOOL_TO_STR(!req.is_lab_alloc()), req.size(), original_count, get_gc_no_progress_count()); > > There is `type_string()` that can be used instead of `is_lab_alloc()` here. TIL - thank you. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1343174921 PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1343175156 From ayang at openjdk.org Tue Oct 3 08:30:05 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 3 Oct 2023 08:30:05 GMT Subject: RFR: 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder Message-ID: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> Simply relocating code to a more appropriate folder. `ClearNoncleanCardWrapper` is moved to cpp as well. One can use something like `git diff --color-moved=dimmed_zebra` to better review this PR.
------------- Commit messages: - s1-dirty-card-closure Changes: https://git.openjdk.org/jdk/pull/16025/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16025&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317354 Stats: 362 lines in 4 files changed: 181 ins; 181 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16025.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16025/head:pull/16025 PR: https://git.openjdk.org/jdk/pull/16025 From tschatzl at openjdk.org Tue Oct 3 10:20:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:20:32 GMT Subject: RFR: 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder In-Reply-To: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> References: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> Message-ID: On Tue, 3 Oct 2023 08:12:42 GMT, Albert Mingkun Yang wrote: > Simple relocating code to more appropriate folder. `ClearNoncleanCardWrapper` is moved to cpp as well. > > One can sth like `git diff --color-moved=dimmed_zebra` to better review this PR. Lgtm ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16025#pullrequestreview-1654854908 From tschatzl at openjdk.org Tue Oct 3 10:22:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:22:32 GMT Subject: RFR: 8317042: G1: Make TestG1ConcMarkStepDurationMillis use createTestJvm In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 16:37:54 GMT, Leo Korinth wrote: > Use createTestJvm. > > Tested with: `-XX+UseG1GC` (pass 1), `-XX+UseZGC` (total 0), `-XX:G1ConcMarkStepDurationMillis=23` (total 0) and no options (pass 1) Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15948#pullrequestreview-1654858311 From tschatzl at openjdk.org Tue Oct 3 10:23:33 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:23:33 GMT Subject: RFR: 8317188: G1: Make TestG1ConcRefinementThreads use createTestJvm In-Reply-To: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> References: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> Message-ID: On Thu, 28 Sep 2023 08:34:31 GMT, Leo Korinth wrote: > Testing after changes: > > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC' ` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseZGC' ` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:G1ConcRefinementThreads=42' ` -> TOTAL 0 Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15958#pullrequestreview-1654860006 From tschatzl at openjdk.org Tue Oct 3 10:24:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:24:35 GMT Subject: RFR: 8317218: G1: Make TestG1HeapRegionSize use createTestJvm In-Reply-To: References: Message-ID: On Thu, 28 Sep 2023 09:07:04 GMT, Leo Korinth wrote: > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC'` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseZGC'` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:G1HeapRegionSize=80000'` > TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS='` -> PASS 1 Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15959#pullrequestreview-1654861258 From tschatzl at openjdk.org Tue Oct 3 10:25:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:25:32 GMT Subject: RFR: 8317316: G1: Make TestG1PercentageOptions use createTestJvm In-Reply-To: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> References: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> Message-ID: On Fri, 29 Sep 2023 14:24:07 GMT, Leo Korinth wrote: > Done basic testing, will do more testing before pushing (together with other tests) Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15987#pullrequestreview-1654863391 From tschatzl at openjdk.org Tue Oct 3 10:28:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:28:35 GMT Subject: RFR: 8317343: GC: Make TestHeapFreeRatio use createTestJvm In-Reply-To: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> References: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> Message-ID: On Mon, 2 Oct 2023 09:57:55 GMT, Leo Korinth wrote: > This fix is implicitly dependent on https://github.com/openjdk/jdk/pull/15986/files for `@requires opt.x` support. Initial testing passes, but I will do more (tier) testing before pushing. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16007#pullrequestreview-1654868543 From tschatzl at openjdk.org Tue Oct 3 10:28:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 3 Oct 2023 10:28:35 GMT Subject: RFR: 8317317: G1: Make TestG1RemSetFlags use createTestJvm In-Reply-To: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> References: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> Message-ID: On Fri, 29 Sep 2023 14:51:23 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15989#pullrequestreview-1654867432 From duke at openjdk.org Tue Oct 3 14:12:12 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Tue, 3 Oct 2023 14:12:12 GMT Subject: RFR: 8316608: Enable parallelism in vmTestbase/gc/vector tests Message-ID: The commit includes changes to unblock parallelism for more `hotspot:tier4` tests in `vmTestbase/gc/vector`.
Below are the before and after run comparisons: * before_fastdebug: **3480.71s user 83.97s system 830% cpu 7:09.41 total** * before_release: **1369.61s user 147.03s system 371% cpu 6:48.63 total** * after_fastdebug: **2214.52s user 63.19s system 2374% cpu 1:35.94 total** * after_release: **1130.28s user 110.97s system 2478% cpu 50.089 total** ------------- Commit messages: - 8316608: Enable parallelism in vmTestbase/gc/vector tests Changes: https://git.openjdk.org/jdk/pull/16028/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16028&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8316608 Stats: 299 lines in 13 files changed: 0 ins; 299 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16028.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16028/head:pull/16028 PR: https://git.openjdk.org/jdk/pull/16028 From dcubed at openjdk.org Tue Oct 3 18:28:01 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 3 Oct 2023 18:28:01 GMT Subject: RFR: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp Message-ID: Trivial fixes to ProblemList noisy tests in the JDK22 CI; [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms ------------- Commit messages: - 8317449: ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms - 8317448: ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp - 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp Changes: 
https://git.openjdk.org/jdk/pull/16031/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16031&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317446 Stats: 5 lines in 2 files changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16031.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16031/head:pull/16031 PR: https://git.openjdk.org/jdk/pull/16031 From thartmann at openjdk.org Tue Oct 3 18:33:38 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Oct 2023 18:33:38 GMT Subject: RFR: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 17:53:27 GMT, Daniel D. Daugherty wrote: > Trivial fixes to ProblemList noisy tests in the JDK22 CI; > > [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp > [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp > [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms Looks good and trivial. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16031#pullrequestreview-1655866995 From dcubed at openjdk.org Tue Oct 3 19:18:50 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Tue, 3 Oct 2023 19:18:50 GMT Subject: RFR: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 18:30:44 GMT, Tobias Hartmann wrote: >> Trivial fixes to ProblemList noisy tests in the JDK22 CI; >> >> [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp >> [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp >> [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms > > Looks good and trivial. @TobiHartmann - Thanks for the fast review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16031#issuecomment-1745573937 From dcubed at openjdk.org Tue Oct 3 19:22:11 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 3 Oct 2023 19:22:11 GMT Subject: Integrated: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 17:53:27 GMT, Daniel D. Daugherty wrote: > Trivial fixes to ProblemList noisy tests in the JDK22 CI; > > [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp > [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp > [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms This pull request has now been integrated. Changeset: 8ff10a0d Author: Daniel D. 
Daugherty URL: https://git.openjdk.org/jdk/commit/8ff10a0d3520fbeae9fe7aac4226d65b93ec79f8 Stats: 5 lines in 2 files changed: 5 ins; 0 del; 0 mod 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp 8317448: ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp 8317449: ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms Reviewed-by: thartmann ------------- PR: https://git.openjdk.org/jdk/pull/16031 From duke at openjdk.org Tue Oct 3 21:37:46 2023 From: duke at openjdk.org (Diego Pino Navarro) Date: Tue, 3 Oct 2023 21:37:46 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v10] In-Reply-To: <2meSHY2dSGLyLNXPTeSwOBbsCh6Ki0itMEfh2x-GhUM=.a715c087-a300-452f-b406-04168843b363@github.com> References: <2meSHY2dSGLyLNXPTeSwOBbsCh6Ki0itMEfh2x-GhUM=.a715c087-a300-452f-b406-04168843b363@github.com> Message-ID: On Mon, 12 Jun 2023 13:32:52 GMT, Ivan Walulya wrote: >> Please review this change which fixes the thread starvation problem during allocation for G1. >> >> The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. >> >> Starvation with an active GCLocker happens as below: >> >> 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. >> 2. GCLocker induced GC executes and frees some memory. >> 3. Thread A does not get any of that memory, but other threads also waiting for memory. >> 4. Goto 1 until the gclocker retry count has been reached. >> >> In this change, we take the general approach to solving starvation problems with announcement tables (request queues). 
On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. >> >> Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. >> >> Testing: Tier 1-7 > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > clean up keep alive+1 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14077#issuecomment-1682616848 From shade at openjdk.org Wed Oct 4 08:02:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Oct 2023 08:02:35 GMT Subject: RFR: 8316608: Enable parallelism in vmTestbase/gc/vector tests In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 14:02:09 GMT, Soumadipta Roy wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4` tests. in `vmTestbase/gc/vector` tests. > > Below are the before and after run comparisons: > > # Fastdebug > before: 3480.71s user 83.97s system 830% cpu 7:09.41 total > after: 2214.52s user 63.19s system 2374% cpu 1:35.94 total > > # Release > before: 1369.61s user 147.03s system 371% cpu 6:48.63 total > after: 1130.28s user 110.97s system 2478% cpu 50.089 total I think this is good, but @tschatzl and @lmesnik need to be looped in for visibility. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16028#pullrequestreview-1656903673 From aph at openjdk.org Wed Oct 4 08:31:47 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 4 Oct 2023 08:31:47 GMT Subject: RFR: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 17:53:27 GMT, Daniel D. Daugherty wrote: > Trivial fixes to ProblemList noisy tests in the JDK22 CI; > > [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp > [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp > [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms What is this all about? I can't see any link to a root cause. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16031#issuecomment-1746387857 From thartmann at openjdk.org Wed Oct 4 11:30:47 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 4 Oct 2023 11:30:47 GMT Subject: RFR: 8317446: ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 17:53:27 GMT, Daniel D. 
Daugherty wrote: > Trivial fixes to ProblemList noisy tests in the JDK22 CI; > > [JDK-8317446](https://bugs.openjdk.org/browse/JDK-8317446) ProblemList gc/arguments/TestNewSizeFlags.java on macosx-aarch64 in Xcomp > [JDK-8317448](https://bugs.openjdk.org/browse/JDK-8317448) ProblemList compiler/interpreter/TestVerifyStackAfterDeopt.java on macosx-aarch64 in Xcomp > [JDK-8317449](https://bugs.openjdk.org/browse/JDK-8317449) ProblemList serviceability/jvmti/stress/StackTrace/NotSuspended/GetStackTraceNotSuspendedStressTest.java on several platforms This change is simply problem listing multiple tests. As described in the [OpenJDK Developers' Guide](https://openjdk.org/guide/#problemlisting-jtreg-tests), the problem listing issues are sub-tasks of the actual (still open) issues and are linked by Skara in the "Issues" section of this PR. The corresponding JBS issues are also listed in the problem list entries themselves. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16031#issuecomment-1746682420 From ayang at openjdk.org Wed Oct 4 11:52:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Oct 2023 11:52:41 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v12] In-Reply-To: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> References: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> Message-ID: On Thu, 28 Sep 2023 07:41:18 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems.
>> >> The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix.
I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The prize for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Remove stripe size adaptations and cache potentially expensive start array queries Performed additional performance testing on the latest revision of this pull request and https://github.com/openjdk/jdk/compare/master...albertnetymk:jdk:pgc-precise-obj-arr?expand=1 (made sure they were on top of the same master commit) Couldn't identify significant differences when running micro benchmarks from the JBS ticket with different gc-threads; implemented various tweaks but the distinction between the two approaches remains mostly marginal. Both methods exhibit substantial improvements over the master, as demonstrated earlier. No performance difference observed in pjbb2005 between the master, this pull request, and shadow-card-table. (I had difficulty in running `timefold`, so I asked Thomas for help about it.) The cost of malloc + memset for the shadow-card-table is ~0.26ms per 1G of old-gen (each card being 512 bytes) (raw data: 0.553169 ms for 4395946 cards). Since the shadow-card-table approach doesn't result in any noticeable regression, offers better scalability for large-array-objects, and comes with the lowest implementation complexity, I am inclined to settle for the shadow-card-table approach for now and explore more sophisticated optimizations later on. What are others' thoughts on this direction? 
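As a quick sanity check of the "~0.26ms per 1G" figure above, the raw data can be converted by hand. This is a small sketch only; it assumes "1G" means 1 GiB (2^30 bytes) and that the cost scales linearly with the number of cards, and the class name `ShadowCardTableCost` is made up for illustration.

```java
public class ShadowCardTableCost {
    // From the mail: each card covers 512 bytes of old-gen.
    public static final long CARD_SIZE_BYTES = 512;
    // Assumption: "1G" in the mail is read as 1 GiB.
    public static final double BYTES_PER_GIB = 1024.0 * 1024 * 1024;

    // Old-gen size covered by the given number of cards, in GiB.
    public static double coveredGiB(long cards) {
        return cards * CARD_SIZE_BYTES / BYTES_PER_GIB;
    }

    // Measured malloc + memset time normalized to one GiB of old-gen.
    public static double msPerGiB(double measuredMs, long cards) {
        return measuredMs / coveredGiB(cards);
    }

    public static void main(String[] args) {
        long cards = 4_395_946L; // raw data quoted in the mail
        double ms = 0.553169;    // raw data quoted in the mail
        System.out.printf("covered: %.3f GiB, cost: %.3f ms/GiB%n",
                coveredGiB(cards), msPerGiB(ms, cards));
    }
}
```

4395946 cards of 512 bytes cover roughly 2.1 GiB, so 0.553169 ms works out to about 0.26 ms per GiB, which agrees with the number quoted.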
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1746714371 From rrich at openjdk.org Wed Oct 4 11:59:39 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 4 Oct 2023 11:59:39 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v12] In-Reply-To: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> References: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> Message-ID: On Thu, 28 Sep 2023 07:41:18 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. 
Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Remove stripe size adaptations and cache potentially expensive start array queries We had a public holiday yesterday. I'm still working on the version without shadow card table. It is more complex, and finding the issues is time-consuming but I'd like to finish the work on it. 
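For readers unfamiliar with the scoring used in the quoted description (geometric mean of the last 5 young collection durations), here is a minimal sketch of that calculation. It is not taken from `BigArrayInOldGenRR.java`; the class name, method name, and sample durations are hypothetical and chosen only so the result is easy to check by hand.

```java
import java.util.List;

public class YoungGcScore {
    // Geometric mean of the last n durations, computed via logarithms
    // so that long pause lists cannot overflow a running product.
    static double geometricMeanOfLast(List<Double> durationsMs, int n) {
        if (n <= 0 || durationsMs.size() < n) {
            throw new IllegalArgumentException("need at least " + n + " durations");
        }
        double logSum = 0.0;
        for (double d : durationsMs.subList(durationsMs.size() - n, durationsMs.size())) {
            logSum += Math.log(d);
        }
        return Math.exp(logSum / n);
    }

    public static void main(String[] args) {
        // Made-up pause times in ms; the first (warm-up) value is ignored by the score.
        List<Double> pauses = List.of(100.0, 1.0, 2.0, 4.0, 8.0, 16.0);
        System.out.println(geometricMeanOfLast(pauses, 5));
    }
}
```

A geometric mean dampens the effect of a single outlier pause compared with an arithmetic mean, which is presumably why it was chosen for a lower-is-better score.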
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1746726305 From iwalulya at openjdk.org Wed Oct 4 14:12:39 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 4 Oct 2023 14:12:39 GMT Subject: RFR: 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder In-Reply-To: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> References: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> Message-ID: <-ucg1BAbRw9VQhMzuOSmPARKhb45VOi6rv2dV2_aYK8=.7ff5b862-84d7-42f3-8b4f-cf5dc49c1ce5@github.com> On Tue, 3 Oct 2023 08:12:42 GMT, Albert Mingkun Yang wrote: > Simply relocating code to a more appropriate folder. `ClearNoncleanCardWrapper` is moved to cpp as well. > > One can use something like `git diff --color-moved=dimmed_zebra` to better review this PR. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16025#pullrequestreview-1657635747 From ayang at openjdk.org Wed Oct 4 14:18:55 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Oct 2023 14:18:55 GMT Subject: RFR: 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder In-Reply-To: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> References: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> Message-ID: On Tue, 3 Oct 2023 08:12:42 GMT, Albert Mingkun Yang wrote: > Simply relocating code to a more appropriate folder. `ClearNoncleanCardWrapper` is moved to cpp as well. > > One can use something like `git diff --color-moved=dimmed_zebra` to better review this PR. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16025#issuecomment-1746966388 From ayang at openjdk.org Wed Oct 4 14:18:56 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Oct 2023 14:18:56 GMT Subject: Integrated: 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder In-Reply-To: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> References: <49vr8f73ylqcYr7iaxBziG_cTPkZjATyCiCHwRxSoFg=.ac721666-4eb0-45a7-b312-b8f2d90c8f91@github.com> Message-ID: On Tue, 3 Oct 2023 08:12:42 GMT, Albert Mingkun Yang wrote: > Simply relocating code to a more appropriate folder. `ClearNoncleanCardWrapper` is moved to cpp as well. > > One can use something like `git diff --color-moved=dimmed_zebra` to better review this PR. This pull request has now been integrated. Changeset: 4195246f Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/4195246fba721934f2b2c0525b1d5b2fe4b08122 Stats: 362 lines in 4 files changed: 181 ins; 181 del; 0 mod 8317354: Serial: Move DirtyCardToOopClosure to gc/serial folder Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16025 From jwaters at openjdk.org Thu Oct 5 10:12:26 2023 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 5 Oct 2023 10:12:26 GMT Subject: RFR: 8315880: change LockingMode default from LM_LEGACY to LM_LIGHTWEIGHT In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 20:47:13 GMT, Daniel D. Daugherty wrote: > Change the default of LockingMode to LM_LIGHTWEIGHT from LM_LEGACY. > > This fix has been tested with 3 Mach5 Tier[1-8] runs and a 4th is in process. I think this may have broken JDK-8288293? Is there any compiler specific implementation involved with the new lightweight locking? 
Just a quick question, I can figure out the rest on my own ------------- PR Comment: https://git.openjdk.org/jdk/pull/15797#issuecomment-1748553071 From rrich at openjdk.org Thu Oct 5 10:22:45 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 5 Oct 2023 10:22:45 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v13] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. 
The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... 
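The stripe rule from the description above (each worker scans the part of a large array in its stripe, except that the array's tail is scanned by the owner of the previous stripe) can be illustrated roughly as follows. This is a simplified sketch, not HotSpot code: `StripeSplitSketch`, `Chunk`, and `split` are hypothetical names, offsets are plain word indices, and "cover at least 3 stripes" is interpreted here as "touches at least 3 stripes", which is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class StripeSplitSketch {
    // Half-open range [from, to) of array words scanned by the owner of one stripe.
    record Chunk(long stripe, long from, long to) {}

    static List<Chunk> split(long arrayStart, long arrayEnd, long stripeWords) {
        long first = arrayStart / stripeWords;
        long last = (arrayEnd - 1) / stripeWords;
        if (last - first < 2) {
            // Assumed reading of the "at least 3 stripes" requirement.
            throw new IllegalArgumentException("array must touch at least 3 stripes");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (long s = first; s < last; s++) {
            long from = Math.max(arrayStart, s * stripeWords);
            // The owner of the second-to-last stripe also scans the tail
            // that reaches into the last stripe.
            long to = (s == last - 1) ? arrayEnd : (s + 1) * stripeWords;
            chunks.add(new Chunk(s, from, to));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Hypothetical array occupying words [150, 460) with 100-word stripes.
        for (Chunk c : split(150, 460, 100)) {
            System.out.println(c);
        }
    }
}
```

In this example the owner of the last touched stripe scans nothing of the array, mirroring the statement that the existing implementation skips objects crossing into a stripe.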
Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: - Split work strictly at stripe boundaries - Reset to master ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/50737dda..817b164c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=11-12 Stats: 505 lines in 8 files changed: 197 ins; 272 del; 36 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 5 10:23:13 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 5 Oct 2023 10:23:13 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v12] In-Reply-To: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> References: <50VtEqmFaxK4NnbXqU54rQW8R1YrGDa6HukQOuniupE=.5a5365f1-546a-4c48-a763-9248346c6593@github.com> Message-ID: On Thu, 28 Sep 2023 07:41:18 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. 
>> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongues young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc theads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The prize for parallelization is paid ... 
> > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Remove stripe size adaptations and cache potentially expensive start array queries With the last version the work is strictly limited to stripes. * Work partitioning happens at stripe boundaries splitting also non-array objects if necessary. * A worker thread accesses only card table entries corresponding to the stripe it's working on. * Object arrays are scanned precisely. * Non-object arrays are not scanned precisely. * The implementation is inspired by Albert's work (especially the `ObjStartCache` which prevents `ObjectStartArray` performance issues with very large arrays). It does not duplicate the card table though. Instead it copies imprecise card marks from (non-array) object start to the first card of stripes reached (see `PSCardTable::pre_scavenge`). * `ObjStartCache` is a class because I need to pass it by value to `find_first_clean_card`. I haven't seen a significant difference between this version, Albert's work and the baseline running the card_scan* tests (except for the improvement over the baseline of course). The new test [`card_scan_big_instances.java`](https://bugs.openjdk.org/secure/attachment/106702/card_scan_big_instances.java) fills the old generation with large (32K) non-array instances. 
It shows the costs of copying imprecise card marks or the complete card table: Baseline -------- $ H=64g; T=16 ; jdk-baseline/bin/java -Xms${H} -Xmx${H} -XX:+UseParallelGC -XX:ParallelGCThreads=$T -Xlog:gc=trace card_scan_big_instances 24 [0.005s][info][gc] Using Parallel BIG_INSTANCE_SIZE_BYTES:32768 (32K) bigInstancesCount:786432 [6.439s][trace][gc] GC(0) PSYoung generation size at maximum: 22369280K [6.439s][info ][gc] GC(0) Pause Young (Allocation Failure) 16384M->14095M(62805M) 2965.766ms ### System.gc [10.871s][trace][gc] GC(1) PSYoung generation size at maximum: 22369280K [10.871s][info ][gc] GC(1) Pause Young (System.gc()) 24926M->24600M(62805M) 2785.957ms [11.835s][info ][gc] GC(2) Pause Full (System.gc()) 24600M->24600M(62805M) 963.882ms ### System.gc done [14.442s][trace][gc] GC(3) PSYoung generation size at maximum: 22369280K [14.442s][info ][gc] GC(3) Pause Young (Allocation Failure) 40984M->24600M(62805M) 28.970ms [16.967s][trace][gc] GC(4) PSYoung generation size at maximum: 22369280K [16.967s][info ][gc] GC(4) Pause Young (Allocation Failure) 40984M->24600M(62805M) 30.074ms [19.490s][trace][gc] GC(5) PSYoung generation size at maximum: 22369280K [19.490s][info ][gc] GC(5) Pause Young (Allocation Failure) 40984M->24600M(62805M) 30.643ms Albert's first draft -------------------- $ H=64g; T=16 ; jdk-alb/bin/java -Xms${H} -Xmx${H} -XX:+UseParallelGC -XX:ParallelGCThreads=$T -Xlog:gc=trace card_scan_big_instances 24 [0.005s][info][gc] Using Parallel BIG_INSTANCE_SIZE_BYTES:32768 (32K) bigInstancesCount:786432 [6.410s][trace][gc] GC(0) PSYoung generation size at maximum: 22369280K [6.410s][info ][gc] GC(0) Pause Young (Allocation Failure) 16384M->14095M(62805M) 2957.403ms ### System.gc [10.734s][trace][gc] GC(1) PSYoung generation size at maximum: 22369280K [10.734s][info ][gc] GC(1) Pause Young (System.gc()) 24926M->24600M(62805M) 2656.551ms [11.713s][info ][gc] GC(2) Pause Full (System.gc()) 24600M->24600M(62805M) 978.694ms ### System.gc done 
[14.587s][trace][gc] GC(3) PSYoung generation size at maximum: 22369280K [14.587s][info ][gc] GC(3) Pause Young (Allocation Failure) 40984M->24600M(62805M) 74.870ms [17.132s][trace][gc] GC(4) PSYoung generation size at maximum: 22369280K [17.132s][info ][gc] GC(4) Pause Young (Allocation Failure) 40984M->24600M(62805M) 74.031ms [19.678s][trace][gc] GC(5) PSYoung generation size at maximum: 22369280K [19.678s][info ][gc] GC(5) Pause Young (Allocation Failure) 40984M->24600M(62805M) 71.357ms New --- $ H=64g; T=16 ; jdk-new/bin/java -Xms${H} -Xmx${H} -XX:+UseParallelGC -XX:ParallelGCThreads=$T -Xlog:gc=trace card_scan_big_instances 24 [0.006s][info][gc] Using Parallel BIG_INSTANCE_SIZE_BYTES:32768 (32K) bigInstancesCount:786432 [5.643s][trace][gc] GC(0) PSYoung generation size at maximum: 22369280K [5.643s][info ][gc] GC(0) Pause Young (Allocation Failure) 16384M->14095M(62805M) 2196.615ms ### System.gc [8.875s][trace][gc] GC(1) PSYoung generation size at maximum: 22369280K [8.875s][info ][gc] GC(1) Pause Young (System.gc()) 24926M->24601M(62805M) 1432.776ms [10.303s][info ][gc] GC(2) Pause Full (System.gc()) 24601M->24600M(62805M) 1428.142ms ### System.gc done [13.464s][trace][gc] GC(3) PSYoung generation size at maximum: 22369280K [13.464s][info ][gc] GC(3) Pause Young (Allocation Failure) 40984M->24600M(62805M) 106.833ms [15.929s][trace][gc] GC(4) PSYoung generation size at maximum: 22369280K [15.929s][info ][gc] GC(4) Pause Young (Allocation Failure) 40984M->24600M(62805M) 103.752ms [18.397s][trace][gc] GC(5) PSYoung generation size at maximum: 22369280K [18.397s][info ][gc] GC(5) Pause Young (Allocation Failure) 40984M->24600M(62805M) 106.153ms ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1748600398 From rkennke at openjdk.org Thu Oct 5 10:23:27 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 5 Oct 2023 10:23:27 GMT Subject: RFR: 8315880: change LockingMode default from LM_LEGACY to LM_LIGHTWEIGHT In-Reply-To: 
References: Message-ID: <3pcyxBEm99WQE_cOsMPRu5sqFS9altVDvks4BqRw3nY=.37f3a180-f800-4b2c-bf4e-9e4d40e5e898@github.com> On Thu, 5 Oct 2023 10:09:33 GMT, Julian Waters wrote: > I think this may have broken JDK-8288293 ? > > Is there any compiler specific implementation involved with the new lightweight locking? Just a quick question, I can figure out the rest on my own Not that I am aware of. AFAIK, it works fine with various versions of GCC on Linux, Xcode on MacOS and VS on Windows. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15797#issuecomment-1748605692 From tschatzl at openjdk.org Thu Oct 5 10:35:10 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 5 Oct 2023 10:35:10 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: <_pmezp7zYPvMSNwaAHR71gZV3tnGp8vIu7LYYEqeCeM=.f8602dc1-ed70-4d17-9d32-4a36efc17c9b@github.com> On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. Good. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16012#pullrequestreview-1659504330 From tschatzl at openjdk.org Thu Oct 5 10:37:11 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 5 Oct 2023 10:37:11 GMT Subject: RFR: 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm In-Reply-To: References: Message-ID: <28X1pwXMd9BwfV7BaJ2xvspvvykSb95UTkw8lDUMSUA=.d8ccbcac-7642-459c-9331-5dbc99987c18@github.com> On Mon, 2 Oct 2023 11:48:18 GMT, Leo Korinth wrote: > Also add parallel flag for first invocation (this is a bug). Initial testing ok, but will run tiers before pushing. Seems good. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16009#pullrequestreview-1659507211 From tschatzl at openjdk.org Thu Oct 5 10:42:10 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 5 Oct 2023 10:42:10 GMT Subject: RFR: 8317318: Serial: Change GenCollectedHeap to SerialHeap in whitebox [v2] In-Reply-To: References: Message-ID: On Sat, 30 Sep 2023 17:27:42 GMT, Albert Mingkun Yang wrote: >> Use more precise type for Serial GC. I also added a `ShouldNotReachHere()` there, because using serial-heap when serial-gc is not used seems problematic. > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > s1-prims lgtm ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15988#pullrequestreview-1659517772 From ayang at openjdk.org Thu Oct 5 12:37:10 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 5 Oct 2023 12:37:10 GMT Subject: RFR: 8317592: Serial: Remove Space::toContiguousSpace Message-ID: Simply removing an unnecessary abstraction after using a more precise type. ------------- Commit messages: - s1-remove-to-contiguous Changes: https://git.openjdk.org/jdk/pull/16054/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16054&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317592 Stats: 15 lines in 2 files changed: 0 ins; 11 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16054.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16054/head:pull/16054 PR: https://git.openjdk.org/jdk/pull/16054 From ayang at openjdk.org Thu Oct 5 12:43:40 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 5 Oct 2023 12:43:40 GMT Subject: RFR: 8317594: G1: Refactor find_empty_from_idx_reverse Message-ID: Simple range boundary value adjustment to remove some redundant operations inside/around this method. 
------------- Commit messages: - g1-hr-manager Changes: https://git.openjdk.org/jdk/pull/16055/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16055&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317594 Stats: 22 lines in 2 files changed: 5 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/16055.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16055/head:pull/16055 PR: https://git.openjdk.org/jdk/pull/16055 From tschatzl at openjdk.org Thu Oct 5 14:24:31 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 5 Oct 2023 14:24:31 GMT Subject: RFR: 8317594: G1: Refactor find_empty_from_idx_reverse In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:34:12 GMT, Albert Mingkun Yang wrote: > Simple range boundary value adjustment to remove some redundant operations inside/around this method. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16055#pullrequestreview-1659929400 From tschatzl at openjdk.org Thu Oct 5 14:25:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 5 Oct 2023 14:25:00 GMT Subject: RFR: 8317592: Serial: Remove Space::toContiguousSpace In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:28:23 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary abstraction after using more precise type. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16054#pullrequestreview-1659931734 From rrich at openjdk.org Thu Oct 5 15:04:54 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 5 Oct 2023 15:04:54 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v13] In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 10:22:45 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. 
This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. 
Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: > > - Split work strictly at stripe boundaries > - Reset to master The difference to the baseline in the `card_scan_big_instances.java` test is < 5ms when the card mark copying to the stripes is done in parallel. It would be possible to improve this further: once a thread has completed its part of the copying it could begin scanning its stripes. Not sure if it's worth it. Testing: langtools:tier1 TEST_VM_OPTS="-XX:+UseParallelGC" hotspot:tier1 
> > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongues young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc theads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The prize for parallelization is paid without actually doing it. Also ParallelGC will use at lea... 
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Parallel copying of imprecise marks to stripes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/817b164c..22fe8496 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=12-13 Stats: 61 lines in 3 files changed: 32 ins; 23 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From dcubed at openjdk.org Thu Oct 5 19:59:11 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 5 Oct 2023 19:59:11 GMT Subject: RFR: 8315880: change LockingMode default from LM_LEGACY to LM_LIGHTWEIGHT In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 20:47:13 GMT, Daniel D. Daugherty wrote: > Change the default of LockingMode to LM_LIGHTWEIGHT from LM_LEGACY. > > This fix has been tested with 3 Mach5 Tier[1-8] runs and a 4th is in process. That would be unexpected breakage indeed... ------------- PR Comment: https://git.openjdk.org/jdk/pull/15797#issuecomment-1749557286 From jwaters at openjdk.org Fri Oct 6 05:06:12 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 6 Oct 2023 05:06:12 GMT Subject: RFR: 8315880: change LockingMode default from LM_LEGACY to LM_LIGHTWEIGHT In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 19:56:00 GMT, Daniel D. Daugherty wrote: > That would be unexpected breakage indeed... Haha, indeed. Since this was just a default flag change, it's likely the entire thing was broken on my end to begin with, just that it never manifested since I never used the lightweight mode > > I think this may have broken JDK-8288293 ? > > Is there any compiler specific implementation involved with the new lightweight locking? 
Just a quick question, I can figure out the rest on my own > > Not that I am aware of. AFAIK, it works fine with various versions of GCC on Linux, Xcode on MacOS and VS on Windows. Not with gcc on Windows, unfortunately. Looks like I have some work cut out for me. Thanks for the replies ------------- PR Comment: https://git.openjdk.org/jdk/pull/15797#issuecomment-1749988003 From rrich at openjdk.org Fri Oct 6 11:17:12 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 6 Oct 2023 11:17:12 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v15] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
> > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
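For readers following the thread, the stripe rule quoted in the description (a worker scans the part of a large array in its own stripe, and a tail reaching into the next stripe stays with the owner of the previous stripe) can be sketched roughly as below. This is only an illustration under assumptions: `Chunk` and `split_array_by_stripe` are hypothetical names, positions are counted in cards, and neither the real `PSCardTable` code nor the 3-stripe minimum is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not the PSCardTable implementation.
// A chunk is the part [begin, end) of a large array, in cards, that the
// owner of stripe `stripe` would scan.
struct Chunk {
  std::size_t stripe;
  std::size_t begin;
  std::size_t end;
};

// Split the card range [arr_begin, arr_end) of one large array along stripe
// boundaries. If the array ends inside a stripe (not on a boundary), that
// tail is handed to the owner of the previous stripe, mirroring the rule
// that objects crossing into a stripe are skipped by that stripe's owner.
std::vector<Chunk> split_array_by_stripe(std::size_t arr_begin,
                                         std::size_t arr_end,
                                         std::size_t stripe_size) {
  assert(stripe_size > 0 && arr_begin < arr_end);
  std::vector<Chunk> chunks;
  const std::size_t first = arr_begin / stripe_size;
  const std::size_t last = (arr_end - 1) / stripe_size;
  for (std::size_t s = first; s <= last; ++s) {
    const std::size_t b = std::max(arr_begin, s * stripe_size);
    const std::size_t e = std::min(arr_end, (s + 1) * stripe_size);
    chunks.push_back({s, b, e});
  }
  // Merge a partial tail into the chunk of the previous stripe.
  if (chunks.size() >= 2 && arr_end % stripe_size != 0) {
    chunks.pop_back();
    chunks.back().end = arr_end;
  }
  return chunks;
}
```

With a stripe size of 128 cards, an array covering cards [100, 400) would be split into three chunks under these assumptions, and the 16-card tail in the fourth stripe would be scanned together with the third stripe's part.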
Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: - Missed acquire semantics - Overlap scavenge with pre-scavenge ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/22fe8496..d845e650 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=13-14 Stats: 114 lines in 3 files changed: 80 ins; 26 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 6 11:17:25 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 6 Oct 2023 11:17:25 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v14] In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 15:24:30 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
>> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Parallel copying of imprecise marks to stripes I think it would be possible to combine the two approaches: we would have a read only copy of the card table only for the current stripe.
This would reduce the required extra memory to just `num_cards_in_stripe * active_workers` (128 * active_workers) bytes. The readonly copy could be a local variable (on stack) in `scavenge_contents_parallel`. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1750252198 From coleenp at openjdk.org Fri Oct 6 11:51:46 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Oct 2023 11:51:46 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 17:19:35 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that fixes lock ranking after recent changes to the code root set, now using a CHT. > > The issue came up because the lock rank of the CHT lock has been larger than the rank of the Servicethread_lock where it is possible that code roots can be added. > > The suggested solution is to fix up the lock rankings to work; actually this PR contains two variants: > 1) one that statically sets the lock ranks of the CHT lock (and the ThreadSMR_lock that can be used during CHT operation) to something smaller than Servicethread_lock. > 2) one that allows setting of the CHT lock rank via parameter as well (the last commit changed the code to variant 1). > > The other lock ranking changes to Metaspace_lock and ContinuationRelativize_lock are simply undos of the respective changes in [JDK-8315503](https://bugs.openjdk.org/browse/JDK-8315503). > > Testing: tier1-8 for variant 2), tier 1-7 for variant 1) > > Thanks, > Thomas Variant 1 seems ok. Uses of the CHT shouldn't take locks, so having a low lock ranking for CHT lock seems like it'll be fine (I can't find where it takes the ThreadsSMRDelete_lock). If any of this breaks, we can try approach #2 next. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16062#pullrequestreview-1661696681 From iwalulya at openjdk.org Fri Oct 6 11:58:44 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 6 Oct 2023 11:58:44 GMT Subject: RFR: 8317343: GC: Make TestHeapFreeRatio use createTestJvm In-Reply-To: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> References: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> Message-ID: On Mon, 2 Oct 2023 09:57:55 GMT, Leo Korinth wrote: > This fix is implicitly dependent on https://github.com/openjdk/jdk/pull/15986/files for `@requires opt.x` support. Initial testing passes, but I will do more (tier) testing before pushing. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16007#pullrequestreview-1661706197 From iwalulya at openjdk.org Fri Oct 6 11:58:47 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 6 Oct 2023 11:58:47 GMT Subject: RFR: 8317318: Serial: Change GenCollectedHeap to SerialHeap in whitebox [v2] In-Reply-To: References: Message-ID: On Sat, 30 Sep 2023 17:27:42 GMT, Albert Mingkun Yang wrote: >> Use more precise type for Serial GC. I also added a ` ShouldNotReachHere()` there, because using serial-heap when serial-gc is not used seems problematic. > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > s1-prims Marked as reviewed by iwalulya (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15988#pullrequestreview-1661705568 From iwalulya at openjdk.org Fri Oct 6 12:00:18 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 6 Oct 2023 12:00:18 GMT Subject: RFR: 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 11:48:18 GMT, Leo Korinth wrote: > Also add parallel flag for first invocation (this is a bug). Initial testing ok, but will run tiers before pushing. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16009#pullrequestreview-1661707800 From iwalulya at openjdk.org Fri Oct 6 12:01:57 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 6 Oct 2023 12:01:57 GMT Subject: RFR: 8317592: Serial: Remove Space::toContiguousSpace In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:28:23 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary abstraction after using more precise type. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16054#pullrequestreview-1661710477 From ayang at openjdk.org Fri Oct 6 12:18:07 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:18:07 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v15] In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 11:17:12 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. 
>> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix.
I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: > > - Missed acquire semantics > - Overlap scavenge with pre-scavenge I find pre-processing card-table removes much complexity in determining which (part of) obj belongs to current stripe. However, synchronizing with actual scavenging introduces some complexity. The fact that `find_first_clean_card` copies the cached-obj-start is easy to miss and hard to reason about IMO. > we would have a read only copy of the card table only for the current stripe. It would still require pre-processing card-table, right? Otherwise, I don't see how one can work around the "interference" across stripes. Maybe this can simplify the impl of `find_first_clean_card`. I am not too concerned about the regression observed for "large (32K) non-array instances", because that pattern is not common in Java and the pause-time is still reasonable (<100ms). The long-term optimization (or the redemption of the extra-mem-requirement) I have in mind is to use 1 bit (instead of 1 byte) for a card -- Parallel requires only a boolean info for a particular card. One can even pre-alloc two card-tables now that each card-table is 1/8 of its original size, to avoid calling malloc inside young-gc-pause. My preference is some simple code without much regression. Ofc, this is quite subjective.
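To make the "1 bit instead of 1 byte per card" idea above concrete, here is a minimal sketch of a bit-packed card table. It is an illustration only: `BitCardTable` and its methods are hypothetical names, not HotSpot's `CardTable` API, and concurrency is ignored entirely (dirtying a single bit in a shared word would need an atomic update in a real write barrier).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration, not HotSpot code: one bit per card,
// where a set bit means "dirty" and a cleared bit means "clean".
class BitCardTable {
 public:
  explicit BitCardTable(std::size_t num_cards)
      : _num_cards(num_cards), _words((num_cards + 63) / 64, 0) {}

  void mark_dirty(std::size_t card) {
    assert(card < _num_cards);
    _words[card / 64] |= (std::uint64_t{1} << (card % 64));
  }

  void mark_clean(std::size_t card) {
    assert(card < _num_cards);
    _words[card / 64] &= ~(std::uint64_t{1} << (card % 64));
  }

  bool is_dirty(std::size_t card) const {
    assert(card < _num_cards);
    return ((_words[card / 64] >> (card % 64)) & 1u) != 0;
  }

  // Bytes of backing storage; a byte-per-card table would need num_cards bytes.
  std::size_t size_in_bytes() const {
    return _words.size() * sizeof(std::uint64_t);
  }

 private:
  std::size_t _num_cards;
  std::vector<std::uint64_t> _words;
};
```

For 1024 cards this layout needs 128 bytes instead of 1024, which is the 1/8 factor mentioned in the comment; two such tables together would still be a quarter of one byte-per-card table.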
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1750541087 From ayang at openjdk.org Fri Oct 6 12:19:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:19:57 GMT Subject: RFR: 8317592: Serial: Remove Space::toContiguousSpace In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:28:23 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary abstraction after using more precise type. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16054#issuecomment-1750542055 From ayang at openjdk.org Fri Oct 6 12:19:58 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:19:58 GMT Subject: Integrated: 8317592: Serial: Remove Space::toContiguousSpace In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:28:23 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary abstraction after using more precise type. This pull request has now been integrated. Changeset: 691db5df Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/691db5df73a48cf7d78cb6b5f5085a3219baca50 Stats: 15 lines in 2 files changed: 0 ins; 11 del; 4 mod 8317592: Serial: Remove Space::toContiguousSpace Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16054 From ayang at openjdk.org Fri Oct 6 12:20:49 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:20:49 GMT Subject: RFR: 8317318: Serial: Change GenCollectedHeap to SerialHeap in whitebox [v2] In-Reply-To: References: Message-ID: On Sat, 30 Sep 2023 17:27:42 GMT, Albert Mingkun Yang wrote: >> Use more precise type for Serial GC. I also added a ` ShouldNotReachHere()` there, because using serial-heap when serial-gc is not used seems problematic. > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: > > s1-prims Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15988#issuecomment-1750541862 From ayang at openjdk.org Fri Oct 6 12:20:50 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:20:50 GMT Subject: Integrated: 8317318: Serial: Change GenCollectedHeap to SerialHeap in whitebox In-Reply-To: References: Message-ID: On Fri, 29 Sep 2023 14:47:19 GMT, Albert Mingkun Yang wrote: > Use more precise type for Serial GC. I also added a ` ShouldNotReachHere()` there, because using serial-heap when serial-gc is not used seems problematic. This pull request has now been integrated. Changeset: b3cc0c84 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/b3cc0c84316dd59f406a6fa23fcaf3d029910843 Stats: 11 lines in 1 file changed: 8 ins; 1 del; 2 mod 8317318: Serial: Change GenCollectedHeap to SerialHeap in whitebox Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/15988 From ayang at openjdk.org Fri Oct 6 12:33:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:33:45 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 17:19:35 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that fixes lock ranking after recent changes to the code root set, now using a CHT. > > The issue came up because the lock rank of the CHT lock has been larger than the rank of the Servicethread_lock where it is possible that code roots can be added. > > The suggested solution is to fix up the lock rankings to work; actually this PR contains two variants: > 1) one that statically sets the lock ranks of the CHT lock (and the ThreadSMR_lock that can be used during CHT operation) to something smaller than Servicethread_lock. 
> 2) one that allows setting of the CHT lock rank via parameter as well (the last commit changed the code to variant 1). > > The other lock ranking changes to Metaspace_lock and ContinuationRelativize_lock are simply undos of the respective changes in [JDK-8315503](https://bugs.openjdk.org/browse/JDK-8315503). > > Testing: tier1-8 for variant 2), tier 1-7 for variant 1) > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16062#pullrequestreview-1661761085 From ayang at openjdk.org Fri Oct 6 12:38:39 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 6 Oct 2023 12:38:39 GMT Subject: RFR: 8317675: Serial: Move gc/shared/generation to serial folder Message-ID: Simple moving files/renamings. ------------- Commit messages: - s1-generation Changes: https://git.openjdk.org/jdk/pull/16072/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16072&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317675 Stats: 20 lines in 12 files changed: 8 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16072.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16072/head:pull/16072 PR: https://git.openjdk.org/jdk/pull/16072 From tschatzl at openjdk.org Fri Oct 6 12:51:08 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 6 Oct 2023 12:51:08 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 11:49:21 GMT, Coleen Phillimore wrote: > Variant 1 seems ok. Uses of the CHT shouldn't take locks, so having a low lock ranking for CHT lock seems like it'll be fine (I can't find where it takes the ThreadsSMRDelete_lock). If any of this breaks, we can try approach #2 next. 
Thread lists synchronization in `GlobalCounter::write_synchronize()` uses the `ThreadsSMRDelete_lock` via `JavaThreadIteratorWithHandle`->`ThreadsListHandle`->`SafeThreadsListPtr` in the destructor ... ------------- PR Comment: https://git.openjdk.org/jdk/pull/16062#issuecomment-1750614008 From zgu at openjdk.org Fri Oct 6 13:37:00 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Fri, 6 Oct 2023 13:37:00 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs Message-ID: Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. GCs use a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. This patch is intended to enable `OopMapCache` for concurrent GCs. Test: tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32. ------------- Commit messages: - Fix merge conflicts - Merge - cleanup - Merge branch 'master' into JDK-8317466 - Cleanup - Merge branch 'master' into oopmapcache_for_concurrent_root_scan - v2 - v1 - v0 - 8317240: Promptly free OopMapEntry after fail to insert the entry to OopMapCache Changes: https://git.openjdk.org/jdk/pull/16074/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16074&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317466 Stats: 56 lines in 10 files changed: 20 ins; 15 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/16074.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16074/head:pull/16074 PR: https://git.openjdk.org/jdk/pull/16074 From lmesnik at openjdk.org Fri Oct 6 16:19:08 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 6 Oct 2023 16:19:08 GMT Subject: RFR: 8316608: Enable parallelism in vmTestbase/gc/vector tests In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 14:02:09 GMT, Soumadipta Roy wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4` tests in `vmTestbase/gc/vector` tests.
> > Below are the before and after run comparisons: > > # Fastdebug > before: 3480.71s user 83.97s system 830% cpu 7:09.41 total > after: 2214.52s user 63.19s system 2374% cpu 1:35.94 total > > # Release > before: 1369.61s user 147.03s system 371% cpu 6:48.63 total > after: 1130.28s user 110.97s system 2478% cpu 50.089 total testing didn't show any issues ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16028#pullrequestreview-1662232566 From iwalulya at openjdk.org Sat Oct 7 08:58:32 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Sat, 7 Oct 2023 08:58:32 GMT Subject: RFR: 8317594: G1: Refactor find_empty_from_idx_reverse In-Reply-To: References: Message-ID: <7HiwT6WwWBRZTatZKunKqaUp0emZLdZiqp2LcWN5r-0=.08e945b6-bdcf-40ef-8405-1a270085e634@github.com> On Thu, 5 Oct 2023 12:34:12 GMT, Albert Mingkun Yang wrote: > Simple range boundary value adjustment to remove some redundant operations inside/around this method. LGTM! Nit: Variable renaming `cur` to `i` results in more modifications than necessary, don't see the motivation for such renaming. ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16055#pullrequestreview-1663026170 From mli at openjdk.org Sat Oct 7 20:12:02 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 7 Oct 2023 20:12:02 GMT Subject: RFR: 8317675: Serial: Move gc/shared/generation to serial folder In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 12:30:14 GMT, Albert Mingkun Yang wrote: > Simple moving files/renamings. LGTM ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16072#pullrequestreview-1663104787 From mli at openjdk.org Sat Oct 7 20:13:14 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 7 Oct 2023 20:13:14 GMT Subject: RFR: 8314259: G1: Fix -Wconversion warnings of double to uint in G1CardSetConfiguration In-Reply-To: <9cfdAUfFRfTMkU4ztmJ5VcVQVARPi44PYGkRR_8gFCY=.b9064371-9d80-43af-85ae-d1ca88e1c776@github.com> References: <9cfdAUfFRfTMkU4ztmJ5VcVQVARPi44PYGkRR_8gFCY=.b9064371-9d80-43af-85ae-d1ca88e1c776@github.com> Message-ID: On Tue, 15 Aug 2023 08:34:44 GMT, Albert Mingkun Yang wrote: > Adding explicit double to int conversion. Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15284#pullrequestreview-1663104870 From tschatzl at openjdk.org Mon Oct 9 08:28:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 9 Oct 2023 08:28:32 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 11:49:21 GMT, Coleen Phillimore wrote: >> Hi all, >> >> please review this change that fixes lock ranking after recent changes to the code root set, now using a CHT. >> >> The issue came up because the lock rank of the CHT lock has been larger than the rank of the Servicethread_lock where it is possible that code roots can be added. >> >> The suggested solution is to fix up the lock rankings to work; actually this PR contains two variants: >> 1) one that statically sets the lock ranks of the CHT lock (and the ThreadSMR_lock that can be used during CHT operation) to something smaller than Servicethread_lock. >> 2) one that allows setting of the CHT lock rank via parameter as well (the last commit changed the code to variant 1). 
>> >> The other lock ranking changes to Metaspace_lock and ContinuationRelativize_lock are simply undos of the respective changes in [JDK-8315503](https://bugs.openjdk.org/browse/JDK-8315503). >> >> Testing: tier1-8 for variant 2), tier 1-7 for variant 1) >> >> Thanks, >> Thomas > > Variant 1 seems ok. Uses of the CHT shouldn't take locks, so having a low lock ranking for CHT lock seems like it'll be fine (I can't find where it takes the ThreadsSMRDelete_lock). If any of this breaks, we can try approach #2 next. Thanks @coleenp @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/16062#issuecomment-1752548969 From tschatzl at openjdk.org Mon Oct 9 08:31:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 9 Oct 2023 08:31:41 GMT Subject: Integrated: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 17:19:35 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that fixes lock ranking after recent changes to the code root set, now using a CHT. > > The issue came up because the lock rank of the CHT lock has been larger than the rank of the Servicethread_lock where it is possible that code roots can be added. > > The suggested solution is to fix up the lock rankings to work; actually this PR contains two variants: > 1) one that statically sets the lock ranks of the CHT lock (and the ThreadSMR_lock that can be used during CHT operation) to something smaller than Servicethread_lock. > 2) one that allows setting of the CHT lock rank via parameter as well (the last commit changed the code to variant 1). > > The other lock ranking changes to Metaspace_lock and ContinuationRelativize_lock are simply undos of the respective changes in [JDK-8315503](https://bugs.openjdk.org/browse/JDK-8315503). 
> > Testing: tier1-8 for variant 2), tier 1-7 for variant 1) > > Thanks, > Thomas This pull request has now been integrated. Changeset: 0cf1a558 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/0cf1a558bacf18d9fc41e43fb5e9eba39dc51f2e Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 Reviewed-by: coleenp, ayang ------------- PR: https://git.openjdk.org/jdk/pull/16062 From lkorinth at openjdk.org Mon Oct 9 09:17:44 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 9 Oct 2023 09:17:44 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm In-Reply-To: References: Message-ID: <49kT79BxjQG9RM9Y96eyATKoBZRRqPuO0F1wR7SsFoY=.2418dfbf-fd86-4c5e-bb87-a9ab994e8088@github.com> On Fri, 29 Sep 2023 13:44:26 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. I have tested this with tier1-5 on x86. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15986#issuecomment-1752585262 From ayang at openjdk.org Mon Oct 9 10:42:43 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 9 Oct 2023 10:42:43 GMT Subject: RFR: 8317594: G1: Refactor find_empty_from_idx_reverse In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:34:12 GMT, Albert Mingkun Yang wrote: > Simple range boundary value adjustment to remove some redundant operations inside/around this method. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16055#issuecomment-1752761563 From ayang at openjdk.org Mon Oct 9 10:42:44 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 9 Oct 2023 10:42:44 GMT Subject: Integrated: 8317594: G1: Refactor find_empty_from_idx_reverse In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 12:34:12 GMT, Albert Mingkun Yang wrote: > Simple range boundary value adjustment to remove some redundant operations inside/around this method. 
This pull request has now been integrated. Changeset: a57ae7e7 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/a57ae7e7d4c84b012e4a3533f316c4e7e6f99bb7 Stats: 22 lines in 2 files changed: 5 ins; 0 del; 17 mod 8317594: G1: Refactor find_empty_from_idx_reverse Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16055 From sjohanss at openjdk.org Mon Oct 9 10:59:33 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Mon, 9 Oct 2023 10:59:33 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. From what I can tell this test now fails when run with, for example, `-Xmn1g`. So I'm not sure if it really is a good idea to change this test to use `createTestJvm`. We could of course add this option to the requires list, but there are more options that can cause failures as well. ------------- Changes requested by sjohanss (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16012#pullrequestreview-1664196844 From coleenp at openjdk.org Mon Oct 9 14:09:02 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Oct 2023 14:09:02 GMT Subject: RFR: 8317730: Change byte_size to return size_t In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 12:57:34 GMT, Albert Mingkun Yang wrote: > Simple signature update to `byte_size` to match expectation from callers. This looks good. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16100#pullrequestreview-1664533639 From iwalulya at openjdk.org Mon Oct 9 15:22:55 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 9 Oct 2023 15:22:55 GMT Subject: RFR: 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping Message-ID: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> Hi all, Please review this small change to return the maximum allowed TLAB size from `unsafe_max_tlab_alloc` in case the current region does not have enough free space to allocate MinTLABSize. We assume that the next TLAB allocation will happen in a new region. Testing: Tier 1-3. Thanks ------------- Commit messages: - initial Changes: https://git.openjdk.org/jdk/pull/16102/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16102&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8170817 Stats: 8 lines in 1 file changed: 5 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16102.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16102/head:pull/16102 PR: https://git.openjdk.org/jdk/pull/16102 From rrich at openjdk.org Mon Oct 9 15:34:05 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 9 Oct 2023 15:34:05 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v15] In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 12:15:40 GMT, Albert Mingkun Yang wrote: > I find pre-processing card-table removes much complexity in determining which (part of) obj belongs to current stripe. However, synchronizing with actual scavenging introduce some complexity. The complexity for synchronization is not too bad though. Also it only comes from overlapping card table preprocessing with scavenging. I think this could be removed again without losing performance. > The fact that `find_first_clean_card` copies the cached-obj-start is easy to miss Yes, it is easy to miss.
I thought it was a minor detail anyway. > and hard to reason IMO. It could be passed by reference if the query in `process_range` would be pulled up before the `find_first_clean_card` call. Let me know if you think that was better. > > we would have a read only copy of the card table only for the current stripe. > > It would still require pre-processing card-table, right? Otherwise, I don't see how one can work around the "interference" across stripes. Maybe this can simplify the impl of `find_first_clean_card`. That's correct. The implementation should be straight forward. I think I'll experiment with it. > > I am not too concerned about the regression observed for "large (32K) non-array instances", because that pattern is not common in java and the pause-time is still reasonable (<100ms). Agreed. > The long-term optimization (or the redemption of the extra-mem-requirement) I have in mind is to use 1 bit (instead of 1 byte) for a card -- Parallel requires only a boolean info for a particular card. One can even pre-alloc two card-tables now that each card-table is 1/8 of its original size, to avoid calling malloc inside young-gc-pause. > > My preference is some simple code without much regression. Ofc, this is quite subjective. Sure. My first preference would be that the change can be backported. We were discussing internally if the increased memory consumption could be an issue. Since environments that are sensitive to this either configure serial or g1 we thought it could be ok. At least from our point of view. 
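The 1-bit-per-card idea quoted above can be illustrated with a small sketch. This is not HotSpot code: the class `BitCardTable`, its methods, and the 512-byte card size are assumptions chosen only to show why a bit-based table needs roughly 1/8 of the memory of a byte-based one.

```java
import java.util.BitSet;

// Hypothetical illustration of a card table that stores one bit per card.
// Names and the card size are assumptions, not taken from the PR.
class BitCardTable {
    static final int CARD_SHIFT = 9; // assumed 512-byte cards

    private final BitSet dirty;
    private final long numCards;

    BitCardTable(long heapBytes) {
        // Round up so a partially covered last card still gets an entry.
        numCards = (heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        dirty = new BitSet(Math.toIntExact(numCards));
    }

    // Mark the card covering the given heap offset as dirty.
    void markDirty(long heapOffset) {
        dirty.set(Math.toIntExact(heapOffset >>> CARD_SHIFT));
    }

    boolean isDirty(long heapOffset) {
        return dirty.get(Math.toIntExact(heapOffset >>> CARD_SHIFT));
    }

    // Index of the first dirty card at or after fromCard, or -1 if none.
    int nextDirtyCard(int fromCard) {
        return dirty.nextSetBit(fromCard);
    }

    // Memory needed with one byte per card.
    long byteTableBytes() {
        return numCards;
    }

    // Memory needed with one bit per card.
    long bitTableBytes() {
        return (numCards + 7) / 8;
    }
}
```

With a table this small, keeping a second pre-allocated copy costs a quarter of one byte-based table, which is the trade-off the comment above alludes to.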
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1753232330 From rrich at openjdk.org Mon Oct 9 15:42:02 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 9 Oct 2023 15:42:02 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v16] In-Reply-To: References: Message-ID: <7MCfxKnwPdEgJ_bTJ6T-WGBaiUTm3v_zNuED3OdbNP0=.beb49d29-ad0e-40ab-a0f8-0fff5373dbc4@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. 
The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: - find_first_clean_card: return end_card if final object extends beyond it.
- Cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/d845e650..272ab97b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=14-15 Stats: 6 lines in 1 file changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From wkemper at openjdk.org Mon Oct 9 16:55:37 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 9 Oct 2023 16:55:37 GMT Subject: RFR: 8317535 Shenandoah: Remove unused code Message-ID: Tested with `hotspot_gc_shenandoah`, `specjbb`, `specjvm`, `dacapo`, `extremem` and `heapothesys`. ------------- Commit messages: - Merge upstream - Remove unused code and other minor fixes Changes: https://git.openjdk.org/jdk/pull/16104/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16104&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317535 Stats: 185 lines in 20 files changed: 0 ins; 179 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16104.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16104/head:pull/16104 PR: https://git.openjdk.org/jdk/pull/16104 From kbarrett at openjdk.org Mon Oct 9 18:41:01 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 9 Oct 2023 18:41:01 GMT Subject: RFR: 8317730: Change byte_size to return size_t In-Reply-To: References: Message-ID: <2Lc8rkVmxtf8GLdNC2tKWRnoBin5TixdrtBKNiWHm6U=.ad5dceb1-5155-44f1-929a-d1fd4fc45335@github.com> On Mon, 9 Oct 2023 12:57:34 GMT, Albert Mingkun Yang wrote: > Simple signature update to `byte_size` to match expectation from callers. Looks good. I checked all the uses, and they all are dealing with size_t. Thanks for spotting and fixing. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16100#pullrequestreview-1665088255 From duke at openjdk.org Mon Oct 9 18:59:08 2023 From: duke at openjdk.org (Soumadipta Roy) Date: Mon, 9 Oct 2023 18:59:08 GMT Subject: Integrated: 8316608: Enable parallelism in vmTestbase/gc/vector tests In-Reply-To: References: Message-ID: On Tue, 3 Oct 2023 14:02:09 GMT, Soumadipta Roy wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4` tests. in `vmTestbase/gc/vector` tests. > > Below are the before and after run comparisons: > > # Fastdebug > before: 3480.71s user 83.97s system 830% cpu 7:09.41 total > after: 2214.52s user 63.19s system 2374% cpu 1:35.94 total > > # Release > before: 1369.61s user 147.03s system 371% cpu 6:48.63 total > after: 1130.28s user 110.97s system 2478% cpu 50.089 total This pull request has now been integrated. Changeset: f61499c7 Author: Soumadipta Roy Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/f61499c73fe03e2e3680d7f58a84183364c5c5ac Stats: 299 lines in 13 files changed: 0 ins; 299 del; 0 mod 8316608: Enable parallelism in vmTestbase/gc/vector tests Reviewed-by: shade, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/16028 From shade at openjdk.org Mon Oct 9 21:03:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 9 Oct 2023 21:03:23 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC Message-ID: See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. 
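As a rough illustration of what the issue title describes, the sketch below contrasts a periodic-GC check based on the time since any collection with one based on the time since the last whole-heap collection. It assumes, going only by the issue title, that the existing check uses the former. The class `PeriodicGcCheck` and all of its methods are hypothetical and do not mirror G1's code.

```java
// Hypothetical model of the two ways to decide whether a periodic GC is due.
// Timestamps and the interval are in milliseconds.
class PeriodicGcCheck {
    private long lastAnyGcMs;
    private long lastWholeHeapGcMs;

    // A young collection only refreshes the "any GC" timestamp.
    void recordYoungGc(long nowMs) {
        lastAnyGcMs = nowMs;
    }

    // A whole-heap collection refreshes both timestamps.
    void recordWholeHeapGc(long nowMs) {
        lastAnyGcMs = nowMs;
        lastWholeHeapGcMs = nowMs;
    }

    // Assumed existing behavior: any recent GC postpones the periodic GC.
    boolean dueSinceAnyGc(long nowMs, long intervalMs) {
        return nowMs - lastAnyGcMs >= intervalMs;
    }

    // Proposed behavior: only a recent whole-heap GC postpones it.
    boolean dueSinceWholeHeapGc(long nowMs, long intervalMs) {
        return nowMs - lastWholeHeapGcMs >= intervalMs;
    }
}
```

In this model an application that triggers an occasional young collection never becomes "due" under the first check, while the second check still fires once the interval has elapsed since the last whole-heap collection.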
Additional testing: - [ ] Linux x86_64 fastdebug `tier1 tier2 tier3` ------------- Commit messages: - Keep the cast - Fix Changes: https://git.openjdk.org/jdk/pull/16107/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16107&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317755 Stats: 161 lines in 2 files changed: 158 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16107.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16107/head:pull/16107 PR: https://git.openjdk.org/jdk/pull/16107 From ayang at openjdk.org Tue Oct 10 09:45:13 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Oct 2023 09:45:13 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v15] In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 15:31:22 GMT, Richard Reingruber wrote: > Also it only comes from overlapping card table preprocessing with scavenging. I think this could be removed again without loosing performance. That complexity is uncalled for if its benefit is marginal. > It could be passed by reference if the query in process_range would be pulled up before the find_first_clean_card call. > The implementation should be straight forward. I think I'll experiment with it. Could it be updated to not query object-start? That would remove much complexity inside that method. Additionally, I wonder if the scanning-dirty-chunk iteration can be simplified a bit: the num of calls to `scan_obj_with_limit` seems excessive and it's not obvious whether it's intended or not that `continue` skips `drain_stacks_cond_depth`). If so, dirtying-first-card-inside-a-stripe probably strikes the best balance btw complexity and performance/mem-overhead for now. Otherwise, I prefer shadow-card-table for its simplicity and the mem-overhead issue can be addressed later on. 
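The shadow-card-table variant mentioned above can be pictured as follows: a worker first snapshots the card entries of its own stripe and then derives the dirty ranges from that private copy, so later changes to the shared table by other workers do not affect what it scans. This is only a sketch under assumptions. The class `StripeShadowScan`, the `CLEAN`/`DIRTY` encoding, and the run representation are hypothetical and are not the encoding or code used in `psCardTable.cpp`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: derive dirty card runs of one stripe from a private snapshot.
class StripeShadowScan {
    static final byte CLEAN = 0; // illustrative encoding only
    static final byte DIRTY = 1;

    // Returns [start, end) card index ranges that were dirty inside
    // [stripeStart, stripeEnd) at the time of the snapshot.
    static List<int[]> dirtyRuns(byte[] cardTable, int stripeStart, int stripeEnd) {
        // Private copy of the stripe's entries; the shared table is not read again.
        byte[] shadow = Arrays.copyOfRange(cardTable, stripeStart, stripeEnd);
        List<int[]> runs = new ArrayList<>();
        int i = 0;
        while (i < shadow.length) {
            if (shadow[i] == CLEAN) {
                i++;
                continue;
            }
            int runStart = i;
            while (i < shadow.length && shadow[i] != CLEAN) {
                i++;
            }
            // Translate stripe-relative indices back to table indices.
            runs.add(new int[] { stripeStart + runStart, stripeStart + i });
        }
        return runs;
    }
}
```

The cost of this simplicity is the extra memory for the copy, which is the mem-overhead concern raised in the message above.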
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1754833838 From tschatzl at openjdk.org Tue Oct 10 11:20:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Oct 2023 11:20:55 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: <7S39vb4eh2w9761PKaTUG5er4IYi_ZqOOWR4cYukZB4=.3f6c8780-5b96-4ce6-92d4-56d2aa03a12b@github.com> On Mon, 9 Oct 2023 20:46:44 GMT, Aleksey Shipilev wrote: > See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` Initial comments when looking at it: From what I understand from the CR description, the VM is not completely idle but has GCs now and then which do not (or almost do not) increase old gen occupancy. So the problem seems to be that the existing policy for the periodic gcs does not detect idleness (and the request to give back memory) in this case. The JEP mentions the `G1PeriodicGCSystemLoadThreshold` option to improve upon that. Maybe use of it is an option in this case? This is probably a significant behavioral change which I think is unintended: if the user enabled execution of a full gc instead of a concurrent marking, then instead of chugging along if there is at least some activity, there would be regular full gcs now regardless of "being idle" (depending on how you see idle; very long periods of only young gcs can be induced by e.g. an overly large heap). I do not think this is expected, even if the user asked for full gcs for periodic gcs. Because this topic about periodic gcs has come up a few times, the RFE does not describe the actual situation you are in, so it would be interesting to have more information. For example, would it be acceptable to have some kind of interaction with the VM to set e.g. `SoftMaxHeapSize` at idle?
It is kind of hard to cover all "idle" situations and this has been part of the discussion in the original JEP, and this change seems to try to fix this with the imo fairly crude hammer of "guarantee a whole heap analysis" regularly instead. Which is not necessarily incompatible with the current mechanism (i.e. an either-or situation). There may be other aspects of this suggestion. Formal things: This is a significant behavioral change which needs a CSR imo. The referenced JEP exactly defines the expected behavior to listen to young gcs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1755080550 From tschatzl at openjdk.org Tue Oct 10 11:21:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Oct 2023 11:21:00 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: <7S39vb4eh2w9761PKaTUG5er4IYi_ZqOOWR4cYukZB4=.3f6c8780-5b96-4ce6-92d4-56d2aa03a12b@github.com> References: <7S39vb4eh2w9761PKaTUG5er4IYi_ZqOOWR4cYukZB4=.3f6c8780-5b96-4ce6-92d4-56d2aa03a12b@github.com> Message-ID: On Tue, 10 Oct 2023 11:09:11 GMT, Thomas Schatzl wrote: > Formal things: This is a significant behavioral change which needs a CSR imo. The referenced JEP exactly defines the expected behavior to listen to young gcs. Note that this requirement can be removed again if there are significant changes, so probably don't start writing yet... however the current change as is would need one imo. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1755091876 From ayang at openjdk.org Tue Oct 10 11:55:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Oct 2023 11:55:27 GMT Subject: RFR: 8317730: Change byte_size to return size_t In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 12:57:34 GMT, Albert Mingkun Yang wrote: > Simple signature update to `byte_size` to match expectation from callers. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16100#issuecomment-1755200730 From ayang at openjdk.org Tue Oct 10 12:00:02 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Oct 2023 12:00:02 GMT Subject: Integrated: 8317730: Change byte_size to return size_t In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 12:57:34 GMT, Albert Mingkun Yang wrote: > Simple signature update to `byte_size` to match expectation from callers. This pull request has now been integrated. Changeset: fb4098ff Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/fb4098ff1a7cca5ec42600f9ab753681961bb1ad Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8317730: Change byte_size to return size_t Reviewed-by: coleenp, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/16100 From ayang at openjdk.org Tue Oct 10 12:02:58 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Oct 2023 12:02:58 GMT Subject: RFR: 8317797: G1: Remove unimplemented predict_will_fit Message-ID: Trivial removing dead code. ------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/16116/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16116&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317797 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16116.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16116/head:pull/16116 PR: https://git.openjdk.org/jdk/pull/16116 From tschatzl at openjdk.org Tue Oct 10 12:48:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Oct 2023 12:48:50 GMT Subject: RFR: 8317797: G1: Remove unimplemented predict_will_fit In-Reply-To: References: Message-ID: On Tue, 10 Oct 2023 11:53:29 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16116#pullrequestreview-1667576382 From ayang at openjdk.org Wed Oct 11 09:25:24 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 09:25:24 GMT Subject: RFR: 8317797: G1: Remove unimplemented predict_will_fit In-Reply-To: References: Message-ID: On Tue, 10 Oct 2023 11:53:29 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16116#issuecomment-1757239181 From ayang at openjdk.org Wed Oct 11 09:25:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 09:25:25 GMT Subject: Integrated: 8317797: G1: Remove unimplemented predict_will_fit In-Reply-To: References: Message-ID: On Tue, 10 Oct 2023 11:53:29 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 731fb4ee Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/731fb4eea21ab67d90970d7c6107fb0a4fbee9ec Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod 8317797: G1: Remove unimplemented predict_will_fit Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16116 From rrich at openjdk.org Wed Oct 11 09:36:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 09:36:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: References: Message-ID: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. 
> > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix.
I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Shadow table per stripe ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/272ab97b..8b544d84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=15-16 Stats: 273 lines in 2 files changed: 110 ins; 124 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Wed Oct 11 09:36:52 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 09:36:52 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v15] In-Reply-To: References: Message-ID: On Tue, 10 Oct 2023 09:42:00 GMT, Albert Mingkun Yang wrote: > > Also it only comes from overlapping card table preprocessing with scavenging. I think this could be removed again without losing performance. > > That complexity is uncalled for if its benefit is marginal. I'll remove it. > > It could be passed by reference if the query in process_range would be pulled up before the find_first_clean_card call. > > The implementation should be straightforward. I think I'll experiment with it. > > Could it be updated to not query object-start? That would remove much complexity inside that method. > > Additionally, I wonder if the scanning-dirty-chunk iteration can be simplified a bit [...] Probably. I thought it would be a good idea (for performance and clarity) to structure the processing 1. objects reaching in, 2.
objects contained in, 3. objects reaching out of the dirty chunk. I found now that it's neither necessary for performance nor is it helping to better understand the code. I'll push a new version that's supposed to look very much like yours, except it does the card table preprocessing and keeps a shadow copy of the card table entries corresponding to the current stripe on stack (so not malloc'ed). I think it would be a good base for further enhancements you have on your mind but also good to be backported. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1757252721 From rrich at openjdk.org Wed Oct 11 09:36:54 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 09:36:54 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v16] In-Reply-To: <7MCfxKnwPdEgJ_bTJ6T-WGBaiUTm3v_zNuED3OdbNP0=.beb49d29-ad0e-40ab-a0f8-0fff5373dbc4@github.com> References: <7MCfxKnwPdEgJ_bTJ6T-WGBaiUTm3v_zNuED3OdbNP0=.beb49d29-ad0e-40ab-a0f8-0fff5373dbc4@github.com> Message-ID: <8os7ovPJW5cmNTfhaqsH-oiyaNj2K21U8LT_JPzV_-Q=.9bc2818b-d6b8-4914-a27c-0b6851a572ee@github.com> On Mon, 9 Oct 2023 15:42:02 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. 
>> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ...
> > Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision: > > - find_first_clean_card: return end_card if final object extends beyond it. > - Cleanup Tested https://github.com/openjdk/jdk/pull/14846/commits/8b544d84da282c9f0f86d3f275c6688baac91da8 with langtools:tier1 TEST_VM_OPTS="-XX:+UseParallelGC" jdk:tier1 TEST_VM_OPTS="-XX:+UseParallelGC" hotspot:tier1 card_scan* tests ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1757255124 From ayang at openjdk.org Wed Oct 11 11:15:17 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 11:15:17 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> Message-ID: <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> On Wed, 11 Oct 2023 09:36:50 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. 
This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Shadow table per stripe > I think it would be a good base for further enhancements you have on your mind but also good to be backported. Agree.
Card-scanning logic is quite clear now. src/hotspot/share/gc/parallel/psCardTable.cpp line 228: > 226: scan_obj_with_limit(pm, obj.obj, addr_l, end); > 227: } > 228: } I wonder if the else branch can be replaced by: if (obj.addr < i_addr && obj.addr > start) { // already-scanned } else { scan_obj_with_limit(pm, obj.obj, addr_l, end); } ------------- PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1670870614 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1354760039 From tschatzl at openjdk.org Wed Oct 11 12:01:08 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 11 Oct 2023 12:01:08 GMT Subject: RFR: 8317675: Serial: Move gc/shared/generation to serial folder In-Reply-To: References: Message-ID: <-8kcTDDHu1mlJT54ScjdqCKifwpgtU0a3liS65_EkSo=.13a106ad-1936-4310-81ec-7dc4160204c3@github.com> On Fri, 6 Oct 2023 12:30:14 GMT, Albert Mingkun Yang wrote: > Simple moving files/renamings. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16072#pullrequestreview-1671002516 From tschatzl at openjdk.org Wed Oct 11 12:02:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 11 Oct 2023 12:02:18 GMT Subject: RFR: 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping In-Reply-To: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> References: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> Message-ID: On Mon, 9 Oct 2023 15:14:42 GMT, Ivan Walulya wrote: > Hi all, > > Please review this small change to return the maximum allowed TLAB size from `unsafe_max_tlab_alloc` in case the current region does not have enough free space to allocate MinTLABSize. We assume that the next TLAB allocation will happen in a new region. > > Testing: Tier 1-3. > > Thanks Lgtm. Thanks for the documentation ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16102#pullrequestreview-1671003439 From ayang at openjdk.org Wed Oct 11 14:32:03 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 14:32:03 GMT Subject: RFR: 8317675: Serial: Move gc/shared/generation to serial folder In-Reply-To: References: Message-ID: <8sD8WrfCjM0fbk--8aO0122sWY_5UVmiugbZXenqHNI=.cc059b8b-431d-4f54-bff8-e919f3505ac6@github.com> On Fri, 6 Oct 2023 12:30:14 GMT, Albert Mingkun Yang wrote: > Simple moving files/renamings. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16072#issuecomment-1757818459 From ayang at openjdk.org Wed Oct 11 14:37:10 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 14:37:10 GMT Subject: Integrated: 8317675: Serial: Move gc/shared/generation to serial folder In-Reply-To: References: Message-ID: <4r8D0F5qesa7eDzVMiGaiQ7P-e2_fK_b_RJ4Z45EC4M=.2bb67236-38c5-4fac-a44c-64ee9e68bb05@github.com> On Fri, 6 Oct 2023 12:30:14 GMT, Albert Mingkun Yang wrote: > Simple moving files/renamings. This pull request has now been integrated. 
Changeset: 8a9c4d52 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/8a9c4d5266bd40962e388ca666a9879fa317e5f5 Stats: 20 lines in 12 files changed: 8 ins; 8 del; 4 mod 8317675: Serial: Move gc/shared/generation to serial folder Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16072 From rrich at openjdk.org Wed Oct 11 15:02:32 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 15:02:32 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> Message-ID: On Wed, 11 Oct 2023 11:09:48 GMT, Albert Mingkun Yang wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> Shadow table per stripe > > src/hotspot/share/gc/parallel/psCardTable.cpp line 228: > >> 226: scan_obj_with_limit(pm, obj.obj, addr_l, end); >> 227: } >> 228: } > > I wonder if the else branch can be replaced by: > > > if (obj.addr < i_addr && obj.addr > start) { > // already-scanned > } else { > scan_obj_with_limit(pm, obj.obj, addr_l, end); > } You mean to replace L220-L227, right? There we know `obj.addr >= addr_l`. `obj.addr < i_addr` cannot be true then IMO. It would overlap with an objArray if `i_addr` was set there. If `i_addr` was set at the end of a non-objArray then objects starting before `i_addr` cannot start in or after `addr_l`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1355164575 From ayang at openjdk.org Wed Oct 11 15:33:03 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 15:33:03 GMT Subject: RFR: 8317963: Serial: Remove unused GenerationIsInReservedClosure Message-ID: Trivial removing dead code. ------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/16153/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16153&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317963 Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16153.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16153/head:pull/16153 PR: https://git.openjdk.org/jdk/pull/16153 From kbarrett at openjdk.org Wed Oct 11 15:45:45 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 11 Oct 2023 15:45:45 GMT Subject: RFR: 8317963: Serial: Remove unused GenerationIsInReservedClosure In-Reply-To: References: Message-ID: <3OmPOGzSoNMQEDvWbkNxJGwyrCOj2h-XyPun53tonsA=.09006e73-3026-4714-8ea4-ffa0a5427d9a@github.com> On Wed, 11 Oct 2023 15:22:03 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Looks good, and trivial. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16153#pullrequestreview-1671665616 From ayang at openjdk.org Wed Oct 11 16:26:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Oct 2023 16:26:27 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> Message-ID: On Wed, 11 Oct 2023 14:59:29 GMT, Richard Reingruber wrote: >> src/hotspot/share/gc/parallel/psCardTable.cpp line 228: >> >>> 226: scan_obj_with_limit(pm, obj.obj, addr_l, end); >>> 227: } >>> 228: } >> >> I wonder if the else branch can be replaced by: >> >> >> if (obj.addr < i_addr && obj.addr > start) { >> // already-scanned >> } else { >> scan_obj_with_limit(pm, obj.obj, addr_l, end); >> } > > You mean to replace L220-L227, right? There we know `obj.addr >= addr_l`. `obj.addr < i_addr` cannot be true then IMO. It would overlap with an objArray if `i_addr` was set there. If `i_addr` was set at the end of a non-objArray then objects starting before `i_addr` cannot start in or after `addr_l`. No, I mean L212 to L228. "Comment on lines +212 to +228" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1355177469 From rrich at openjdk.org Wed Oct 11 16:56:34 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 16:56:34 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> Message-ID: On Wed, 11 Oct 2023 15:06:22 GMT, Albert Mingkun Yang wrote: >> You mean to replace L220-L227, right? 
There we know `obj.addr >= addr_l`. `obj.addr < i_addr` cannot be true then IMO. It would overlap with an objArray if `i_addr` was set there. If `i_addr` was set at the end of a non-objArray then objects starting before `i_addr` cannot start in or after `addr_l`. > > No, I mean L212 to L228. "Comment on lines +212 to +228" Yes indeed. Looks correct to me. Nice simplification! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1355375376 From rrich at openjdk.org Wed Oct 11 17:04:09 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 11 Oct 2023 17:04:09 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> Message-ID: <7Yql4JbKg57I3Vi3BoKoyD8QleWQ0wPh_yBh6bNjhj4=.31078396-6df1-4439-9504-d84b62442f13@github.com> On Wed, 11 Oct 2023 16:53:13 GMT, Richard Reingruber wrote: >> No, I mean L212 to L228. "Comment on lines +212 to +228" > > Yes indeed. Looks correct to me. Nice simplification! > No, I mean L212 to L228. "Comment on lines +212 to +228" I looked at this at first but thought it is obviously wrong if `obj` is precisely marked after a previous young collection. But if it is precisely marked it is of course correct scan from the beginning of the dirty chunk (addr_l) to the obj/stripe end. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1355385596 From zgu at openjdk.org Wed Oct 11 18:50:14 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 11 Oct 2023 18:50:14 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: > Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. 
> > GCs use a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. > > This patch is intended to enable `OopMapCache` for concurrent GCs. > > Test: > tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32. Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Cleanup old oop map cache entry after class redefinition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16074/files - new: https://git.openjdk.org/jdk/pull/16074/files/11936030..015d4fb3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16074&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16074&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16074.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16074/head:pull/16074 PR: https://git.openjdk.org/jdk/pull/16074 From duke at openjdk.org Wed Oct 11 18:50:17 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Wed, 11 Oct 2023 18:50:17 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: On Wed, 11 Oct 2023 18:44:04 GMT, Zhengyu Gu wrote: >> Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. >> >> GCs use a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. >> >> This patch is intended to enable `OopMapCache` for concurrent GCs. >> >> Test: >> tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32.
> > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup old oop map cache entry after class redefinition src/hotspot/share/interpreter/oopMapCache.cpp line 506: > 504: > 505: if (Atomic::cmpxchg(&_array[i], entry, (OopMapCacheEntry*)nullptr, memory_order_relaxed) == entry) { > 506: enqueue_for_cleanup(entry); OopMapCache::flush_obsolete_entries() is called from VM_RedefineClasses::doit(). Enqueuing OopMapCacheEntries for cleanup means that we would need call OopMapCache::cleanup_old_entries() in VM_RedefineClasses::doit_epilogue() ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1355360462 From zgu at openjdk.org Wed Oct 11 18:50:18 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 11 Oct 2023 18:50:18 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: <0YGIJj0crvW3iZ_fd5MlNJ8UxWPECo9XgVppni70cjQ=.391f488a-077a-47ee-9cc4-9e47bceeec65@github.com> On Wed, 11 Oct 2023 16:40:57 GMT, Leela Mohan Venati wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup old oop map cache entry after class redefinition > > src/hotspot/share/interpreter/oopMapCache.cpp line 506: > >> 504: >> 505: if (Atomic::cmpxchg(&_array[i], entry, (OopMapCacheEntry*)nullptr, memory_order_relaxed) == entry) { >> 506: enqueue_for_cleanup(entry); > > OopMapCache::flush_obsolete_entries() is called from VM_RedefineClasses::doit(). Enqueuing OopMapCacheEntries for cleanup means that we would need call OopMapCache::cleanup_old_entries() in VM_RedefineClasses::doit_epilogue() Good catch. Fixed. 
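The pattern discussed above (clear a cache slot with a compare-and-exchange, enqueue the old entry, and free it later from a separate cleanup call) can be sketched roughly as follows. This is a minimal toy model and not HotSpot code: `ToyEntry`, `ToyCache`, `put`, `get` and the `obsolete` flag are hypothetical names invented for illustration, and `std::atomic` stands in for HotSpot's `Atomic::cmpxchg`. Only the method names `flush_obsolete_entries`, `enqueue_for_cleanup` and `cleanup_old_entries` mirror the ones mentioned in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a cache entry; not the real OopMapCacheEntry.
struct ToyEntry {
  bool obsolete = false;     // assumption: marks entries of redefined methods
  ToyEntry* next = nullptr;  // link used only while on the cleanup list
};

// Hypothetical toy cache illustrating "unlink now, free later".
class ToyCache {
 public:
  static constexpr std::size_t kSize = 4;

  ToyCache() : slots_{} {}
  ~ToyCache() {
    cleanup_old_entries();
    for (auto& slot : slots_) delete slot.load();
  }

  // Install an entry; a displaced entry is deferred, not freed in place.
  void put(std::size_t i, ToyEntry* e) {
    ToyEntry* old = slots_[i].exchange(e);
    if (old != nullptr) enqueue_for_cleanup(old);
  }

  ToyEntry* get(std::size_t i) const { return slots_[i].load(); }

  // Unlink obsolete entries. They are only enqueued here, so somebody
  // must call cleanup_old_entries() afterwards or they are never freed.
  void flush_obsolete_entries() {
    for (auto& slot : slots_) {
      ToyEntry* entry = slot.load(std::memory_order_acquire);
      if (entry == nullptr || !entry->obsolete) continue;
      ToyEntry* expected = entry;
      if (slot.compare_exchange_strong(expected, nullptr,
                                       std::memory_order_relaxed)) {
        enqueue_for_cleanup(entry);
      }
    }
  }

  // Free everything that was enqueued; returns the number of freed entries.
  std::size_t cleanup_old_entries() {
    ToyEntry* e = old_entries_.exchange(nullptr, std::memory_order_acquire);
    std::size_t freed = 0;
    while (e != nullptr) {
      ToyEntry* next = e->next;
      delete e;
      e = next;
      ++freed;
    }
    return freed;
  }

 private:
  // Lock-free push onto the list of entries awaiting deletion.
  void enqueue_for_cleanup(ToyEntry* e) {
    ToyEntry* head = old_entries_.load(std::memory_order_relaxed);
    do {
      e->next = head;
    } while (!old_entries_.compare_exchange_weak(
        head, e, std::memory_order_release, std::memory_order_relaxed));
  }

  std::atomic<ToyEntry*> slots_[kSize];
  std::atomic<ToyEntry*> old_entries_{nullptr};
};
```

In this toy, calling `flush_obsolete_entries()` without a later `cleanup_old_entries()` leaves the unlinked entries parked on the list, which is the gap the review comment points at for `VM_RedefineClasses::doit_epilogue()`.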
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1355594222 From tschatzl at openjdk.org Wed Oct 11 18:55:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 11 Oct 2023 18:55:29 GMT Subject: RFR: 8317963: Serial: Remove unused GenerationIsInReservedClosure In-Reply-To: References: Message-ID: <7iMuh3uyV84Hc22BgIWC9r3zUJplRKKuqTZeuW98HGM=.d74b7dc6-aaf3-4dbb-aa2e-cc937d4ac19a@github.com> On Wed, 11 Oct 2023 15:22:03 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16153#pullrequestreview-1672283006 From ayang at openjdk.org Thu Oct 12 08:55:13 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 12 Oct 2023 08:55:13 GMT Subject: RFR: 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping In-Reply-To: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> References: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> Message-ID: On Mon, 9 Oct 2023 15:14:42 GMT, Ivan Walulya wrote: > Hi all, > > Please review this small change to return the maximum allowed TLAB size from `unsafe_max_tlab_alloc` in case the current region does not have enough free space to allocate MinTLABSize. We assume that the next TLAB allocation will happen in a new region. > > Testing: Tier 1-3. > > Thanks Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16102#pullrequestreview-1673642541 From ayang at openjdk.org Thu Oct 12 08:59:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 12 Oct 2023 08:59:25 GMT Subject: RFR: 8317963: Serial: Remove unused GenerationIsInReservedClosure In-Reply-To: References: Message-ID: On Wed, 11 Oct 2023 15:22:03 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16153#issuecomment-1759201846 From ayang at openjdk.org Thu Oct 12 08:59:26 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 12 Oct 2023 08:59:26 GMT Subject: Integrated: 8317963: Serial: Remove unused GenerationIsInReservedClosure In-Reply-To: References: Message-ID: On Wed, 11 Oct 2023 15:22:03 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 77dc8911 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/77dc89115e2a8de5fc600874d82cd3a75cd3b4fb Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod 8317963: Serial: Remove unused GenerationIsInReservedClosure Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16153 From rrich at openjdk.org Thu Oct 12 09:15:18 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 09:15:18 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: <7Yql4JbKg57I3Vi3BoKoyD8QleWQ0wPh_yBh6bNjhj4=.31078396-6df1-4439-9504-d84b62442f13@github.com> References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> <7Yql4JbKg57I3Vi3BoKoyD8QleWQ0wPh_yBh6bNjhj4=.31078396-6df1-4439-9504-d84b62442f13@github.com> Message-ID: On Wed, 11 Oct 2023 17:01:05 GMT, Richard Reingruber wrote: >> Yes indeed. Looks correct to me. Nice simplification! > >> No, I mean L212 to L228. "Comment on lines +212 to +228" > > I looked at this at first but thought it is obviously wrong if `obj` is precisely marked after a previous young collection. But if it is precisely marked it is of course correct scan from the beginning of the dirty chunk (addr_l) to the obj/stripe end. 
I get `assert(should_scavenge(p, true)) failed: revisiting object?` with test/jdk/java/lang/Thread/virtual/stress/Skynet.java when the change is applied. It reproduces well but not 100%. Need to look into it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356529361 From ayang at openjdk.org Thu Oct 12 09:15:31 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 12 Oct 2023 09:15:31 GMT Subject: RFR: 8317994: Serial: Use SerialHeap in generation Message-ID: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Simple renaming and removing of redundant macro guards. ------------- Commit messages: - s1-generation Changes: https://git.openjdk.org/jdk/pull/16162/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16162&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317994 Stats: 17 lines in 2 files changed: 1 ins; 8 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/16162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16162/head:pull/16162 PR: https://git.openjdk.org/jdk/pull/16162 From tschatzl at openjdk.org Thu Oct 12 10:01:13 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 12 Oct 2023 10:01:13 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 20:46:44 GMT, Aleksey Shipilev wrote: > See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` We discussed this issue at Oracle internally a bit, with the following thoughts: We are still not sure about the use case you want to cover with that change, and probably the option `G1PeriodicGCInterval` is not named very well - the functionality you are modifying has always been intended to compact the heap for when the VM is almost completely idle. 
Given our current understanding of the issue, this does not seem to be the case here. Instead the intention of this change is probably something more like general cleanup (potentially including heap compaction) that is currently only performed after a whole heap analysis. Similar issues were reported in the past (like https://mail.openjdk.org/pipermail/hotspot-gc-dev/2020-March/028857.html), but nobody acted on it yet. So the problems we see here are: * the `G1PeriodicGCInterval` option naming is a bit misleading. It should be more specific, containing "Idle" or something in it. * there is no real periodic whole heap and cleanup marking functionality. What we think would be most appropriate for solving your problem (if the suggestion in an earlier post does not apply) currently would be * rename `G1PeriodicGCInterval` to something else and start the deprecation/removal process (temporarily aliasing the old flag to the new one) * add the new functionality (periodic concurrent whole heap analysis) using a new flag Also note that just for heap compaction there is this CR https://bugs.openjdk.org/browse/JDK-8238687 that I've been working for a long time on and off which imo only needs testing and cleanup (hah!). If you want to look into this feel free to contact me. 
Hth, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1759303317 From iwalulya at openjdk.org Thu Oct 12 10:29:16 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 12 Oct 2023 10:29:16 GMT Subject: RFR: 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping In-Reply-To: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> References: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> Message-ID: On Mon, 9 Oct 2023 15:14:42 GMT, Ivan Walulya wrote: > Hi all, > > Please review this small change to return the maximum allowed TLAB size from `unsafe_max_tlab_alloc` in case the current region does not have enough free space to allocate MinTLABSize. We assume that the next TLAB allocation will happen in a new region. > > Testing: Tier 1-3. > > Thanks Thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16102#issuecomment-1759343117 From rrich at openjdk.org Thu Oct 12 10:32:13 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 10:32:13 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v18] In-Reply-To: References: Message-ID: <_zdWXMwfI--pEBEBq1iiZ2tZm64aU6-b0T4Qg-cndrc=.5d97336e-b7e8-4867-b447-7ca3c89e4fb5@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. 
> > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
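The work-sharing rule in the description (each worker scans the part of a large array inside its own stripe, and the tail reaching into the last stripe goes to the owner of the previous stripe) could be modelled roughly as below. This is a simplified sketch under assumptions, not the code in `PSCardTable`: `Piece`, `split_large_array`, word-index addresses, and one owner per stripe are all hypothetical simplifications, and the toy treats the whole part in the last stripe as the tail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one unit of scanning work: the stripe
// whose owner scans it, and the half-open word range [from, to).
struct Piece {
  std::size_t owner_stripe;
  std::size_t from;
  std::size_t to;
};

// Split the array [arr_begin, arr_end) along stripe boundaries.
// Assumption of this toy: stripe k covers [k * stripe_words, (k + 1) * stripe_words).
std::vector<Piece> split_large_array(std::size_t arr_begin,
                                     std::size_t arr_end,
                                     std::size_t stripe_words) {
  assert(stripe_words > 0 && arr_begin < arr_end);
  std::vector<Piece> pieces;
  const std::size_t first = arr_begin / stripe_words;
  const std::size_t last = (arr_end - 1) / stripe_words;
  for (std::size_t s = first; s <= last; ++s) {
    const std::size_t from = std::max(arr_begin, s * stripe_words);
    const std::size_t to = std::min(arr_end, (s + 1) * stripe_words);
    const bool tail_reaching_in = (s == last) && (s != first);
    if (tail_reaching_in) {
      // The tail is scanned by the owner of the previous stripe,
      // so extend that owner's piece instead of creating a new one.
      pieces.back().to = to;
    } else {
      pieces.push_back({s, from, to});
    }
  }
  return pieces;
}
```

For an array covering words 50 to 350 with 100-word stripes, this toy yields three pieces owned by stripes 0, 1 and 2, with the last one extended to cover the tail in stripe 3.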
Richard Reingruber has updated the pull request incrementally with four additional commits since the last revision: - Make sure to scan obj reaching in just once - Simplification suggested by Albert - Don't overlap card table processing with scavenging for simplicity - Cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/8b544d84..c9e040f5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=16-17 Stats: 83 lines in 2 files changed: 6 ins; 63 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From iwalulya at openjdk.org Thu Oct 12 10:32:27 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 12 Oct 2023 10:32:27 GMT Subject: Integrated: 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping In-Reply-To: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> References: <3_DGucK_oOxNiifa0k42NxgdbGz3MVkArCdFJsGNqMs=.fb5a3f0f-bf8e-4310-a878-66e55e6e3c36@github.com> Message-ID: On Mon, 9 Oct 2023 15:14:42 GMT, Ivan Walulya wrote: > Hi all, > > Please review this small change to return the maximum allowed TLAB size from `unsafe_max_tlab_alloc` in case the current region does not have enough free space to allocate MinTLABSize. We assume that the next TLAB allocation will happen in a new region. > > Testing: Tier 1-3. > > Thanks This pull request has now been integrated. 
Changeset: 4c79e7d5 Author: Ivan Walulya URL: https://git.openjdk.org/jdk/commit/4c79e7d59caec01b4d2bdae2f7d25f1dd24ffbf6 Stats: 8 lines in 1 file changed: 5 ins; 2 del; 1 mod 8170817: G1: Returning MinTLABSize from unsafe_max_tlab_alloc causes TLAB flapping Reviewed-by: tschatzl, ayang ------------- PR: https://git.openjdk.org/jdk/pull/16102 From rrich at openjdk.org Thu Oct 12 10:38:19 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 10:38:19 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v17] In-Reply-To: References: <7Wf_JP97N60baqhknTPclz-KiF8cBARxqTD8KtVfD0w=.631369a7-e665-406a-8e10-4942893dc578@github.com> <_jwOMcqQDRIjTc-CLlvGk42ySeBHAHa5OP1KxYiFWu8=.7735fa6f-4afb-45fa-b594-4ccba269be79@github.com> <7Yql4JbKg57I3Vi3BoKoyD8QleWQ0wPh_yBh6bNjhj4=.31078396-6df1-4439-9504-d84b62442f13@github.com> Message-ID: On Thu, 12 Oct 2023 09:12:47 GMT, Richard Reingruber wrote: >>> No, I mean L212 to L228. "Comment on lines +212 to +228" >> >> I looked at this at first but thought it is obviously wrong if `obj` is precisely marked after a previous young collection. But if it is precisely marked it is of course correct scan from the beginning of the dirty chunk (addr_l) to the obj/stripe end. > > I get `assert(should_scavenge(p, true)) failed: revisiting object?` with test/jdk/java/lang/Thread/virtual/stress/Skynet.java when the change is applied. > > It reproduces well but not 100%. Need to look into it. `obj` can be scanned twice if it reaches into the stripe, because `obj.addr > start` will still be false the second time. The fix is to check `i_addr > start`.
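To make the double-scan argument concrete, here is a small toy model of the two checks. It is only a sketch under assumptions and not the code in psCardTable.cpp: `ToyObj` and `count_scans` are hypothetical, addresses are plain integers, `i_addr` is assumed to start at `start` and to advance to the end of the object once it has been scanned, and every dirty chunk is assumed to be covered by the same object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object occupying the half-open range [addr, end).
struct ToyObj {
  std::size_t addr;
  std::size_t end;
};

// Count how often `obj` would be scanned while visiting the given dirty
// chunks of a stripe that begins at `start`.
// use_fixed_check == false models the check `obj.addr > start`;
// use_fixed_check == true models the check `i_addr > start`.
int count_scans(const ToyObj& obj,
                std::size_t start,
                const std::vector<std::size_t>& dirty_chunk_starts,
                bool use_fixed_check) {
  std::size_t i_addr = start;  // nothing scanned in this stripe yet
  int scans = 0;
  for (std::size_t chunk : dirty_chunk_starts) {
    (void)chunk;  // the toy only counts visits; the chunk address is unused
    const bool progressed =
        use_fixed_check ? (i_addr > start) : (obj.addr > start);
    const bool already_scanned = (obj.addr < i_addr) && progressed;
    if (!already_scanned) {
      ++scans;
      i_addr = obj.end;  // everything up to the object's end is now done
    }
  }
  return scans;
}
```

With an object that starts before the stripe (for example `addr` 90 and `start` 100) and two dirty chunks, the first variant counts two scans and the second counts one. For an object starting inside the stripe both variants count one.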
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356624126 From rrich at openjdk.org Thu Oct 12 10:53:14 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 10:53:14 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v19] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. 
The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Re-cleanup (was accidentally reverted) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/c9e040f5..443f4826 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=17-18 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From ayang at openjdk.org Thu Oct 12 11:10:34 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 12 Oct 2023 11:10:34 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v19] In-Reply-To: References: Message-ID: On Thu, 12 Oct 2023 10:53:14 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
>> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Re-cleanup (was accidentally reverted) Could you merge master? I think the patch is in a good state. Best to do some final perf/correctness testing. I have only some minor comments/suggestions. Very subjective, so up to you.
src/hotspot/share/gc/parallel/psCardTable.cpp line 156: > 154: > 155: // Helper struct to keep the following code compact. > 156: struct Obj { The scan-dirty-chunk code below is already quite compact; this struct-def is bulky in contrast. Maybe using plain variables inside the while-loop is sufficient. Not sure either is definitely superior though. src/hotspot/share/gc/parallel/psCardTable.cpp line 165: > 163: is_obj_array(obj->is_objArray()), > 164: end_addr(addr + obj->size()) {} > 165: void next() { Maybe `move_to_next` so that it's clear that it's an action, not an accessor (noun)? src/hotspot/share/gc/parallel/psCardTable.cpp line 307: > 305: HeapWord* start_addr; > 306: HeapWord* end_addr; > 307: DEBUG_ONLY(HeapWord* _prev_query); This debug-field clutters the real logic, IMO. src/hotspot/share/gc/parallel/psCardTable.hpp line 52: > 50: #ifdef ASSERT > 51: , _table_end((const CardValue*)(uintptr_t(_table) + (align_up(stripe.byte_size(), _card_size) >> _card_shift))) > 52: #endif Not sure about its usefulness; the logic in the caller is super clear while this assertion logic obstructs the flow. src/hotspot/share/gc/parallel/psCardTable.hpp line 82: > 80: const CardValue* const end) { > 81: for (const CardValue* i = start; i < end; ++i) { > 82: if (!is_clean(i)) { Better to use `is_dirty` to match the method name. ------------- Marked as reviewed by ayang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1673870236 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356657815 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356650924 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356652217 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356649524 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1356646834 From tschatzl at openjdk.org Thu Oct 12 13:02:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 12 Oct 2023 13:02:36 GMT Subject: RFR: 8317994: Serial: Use SerialHeap in generation In-Reply-To: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> References: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Message-ID: On Thu, 12 Oct 2023 09:06:35 GMT, Albert Mingkun Yang wrote: > Simple renaming and removing of redundant macro guards. lgtm ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16162#pullrequestreview-1673974275 From tschatzl at openjdk.org Thu Oct 12 13:06:30 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 12 Oct 2023 13:06:30 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. Reconsidering, I now tend to agree with @kstefanj that heap size tests are too sensitive to various options (too many to list them all) to be worth the potential gain from running them with random options. 
Maybe some more control about the flags like suggested in some other PR would make my concerns go away. ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16012#pullrequestreview-1673994502 From rrich at openjdk.org Thu Oct 12 14:32:18 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 14:32:18 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v20] In-Reply-To: References: Message-ID: <-KHu3mjJltdBchaw6Xb1l3Dr8Ri4DqF5cg_F-XHzP1k=.eb2ea3ec-3e08-450d-85c1-c0729b20d0c4@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. 
Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: - Merge branch 'master' - Re-cleanup (was accidentally reverted) - Make sure to scan obj reaching in just once - Simplification suggested by Albert - Don't overlap card table processing with scavenging for simplicity - Cleanup - Shadow table per stripe - find_first_clean_card: return end_card if final object extends beyond it. - Cleanup - Missed acquire semantics - ... 
and 22 more: https://git.openjdk.org/jdk/compare/880f96d2...381e001b ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/443f4826..381e001b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=18-19 Stats: 99446 lines in 2718 files changed: 41295 ins; 19417 del; 38734 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 12 15:58:21 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 15:58:21 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v20] In-Reply-To: <-KHu3mjJltdBchaw6Xb1l3Dr8Ri4DqF5cg_F-XHzP1k=.eb2ea3ec-3e08-450d-85c1-c0729b20d0c4@github.com> References: <-KHu3mjJltdBchaw6Xb1l3Dr8Ri4DqF5cg_F-XHzP1k=.eb2ea3ec-3e08-450d-85c1-c0729b20d0c4@github.com> Message-ID: On Thu, 12 Oct 2023 14:32:18 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. 
This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision: > > - Merge branch 'master' > - Re-cleanup (was accidentally reverted) > - Make sure to scan obj reaching in just once > - Simplification suggested by Albert > - Don't overlap card table processing with scavenging for simplicity > - Cleanup > - Shadow table per stripe > - find_first_clean_card: return end_card if final object extends beyond it. > - Cleanup > - Missed acquire semantics > - ... and 22 more: https://git.openjdk.org/jdk/compare/1fdb3ab3...381e001b JavadocHelperTest.java fails with parallel gc after merging master. I have opened https://bugs.openjdk.org/browse/JDK-8318025 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1759902498 From rrich at openjdk.org Thu Oct 12 16:24:17 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 16:24:17 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
> > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... 
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Feedback Albert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/381e001b..d12e96e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=19-20 Stats: 47 lines in 2 files changed: 4 ins; 28 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 12 16:24:24 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 12 Oct 2023 16:24:24 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v19] In-Reply-To: References: Message-ID: On Thu, 12 Oct 2023 11:06:16 GMT, Albert Mingkun Yang wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-cleanup (was accidentally reverted) > > src/hotspot/share/gc/parallel/psCardTable.cpp line 156: > >> 154: >> 155: // Helper struct to keep the following code compact. >> 156: struct Obj { > > The scan-dirty-chunk code below is already quite compact; this struct-def is bulky in contrast. Maybe using plain variables inside the while-loop is sufficient. Not sure either is definitely superior though. Ok. Done. > src/hotspot/share/gc/parallel/psCardTable.cpp line 165: > >> 163: is_obj_array(obj->is_objArray()), >> 164: end_addr(addr + obj->size()) {} >> 165: void next() { > > Maybe `move_to_next` so that it's clear that it's an action, not an accessor (noun)? I removed the helper struct. 
> src/hotspot/share/gc/parallel/psCardTable.cpp line 307: > >> 305: HeapWord* start_addr; >> 306: HeapWord* end_addr; >> 307: DEBUG_ONLY(HeapWord* _prev_query); > > This debug-field clutters the real logic, IMO. I've removed it. > src/hotspot/share/gc/parallel/psCardTable.hpp line 52: > >> 50: #ifdef ASSERT >> 51: , _table_end((const CardValue*)(uintptr_t(_table) + (align_up(stripe.byte_size(), _card_size) >> _card_shift))) >> 52: #endif > > Not sure about its usefulness; the logic in the caller is super clear while this assertion logic obstructs the flow. I've reduced the code for the bounds check. > src/hotspot/share/gc/parallel/psCardTable.hpp line 82: > >> 80: const CardValue* const end) { >> 81: for (const CardValue* i = start; i < end; ++i) { >> 82: if (!is_clean(i)) { > > Better to use `is_dirty` to match the method name. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1357067138 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1357065828 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1357065987 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1357064658 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1357063772 From sangheki at openjdk.org Thu Oct 12 17:28:06 2023 From: sangheki at openjdk.org (Sangheon Kim) Date: Thu, 12 Oct 2023 17:28:06 GMT Subject: RFR: 8317994: Serial: Use SerialHeap in generation In-Reply-To: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> References: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Message-ID: On Thu, 12 Oct 2023 09:06:35 GMT, Albert Mingkun Yang wrote: > Simple renaming and removing of redundant macro guards. Marked as reviewed by sangheki (Reviewer). Looks good. 
------------- PR Review: https://git.openjdk.org/jdk/pull/16162#pullrequestreview-1674754360 PR Comment: https://git.openjdk.org/jdk/pull/16162#issuecomment-1760045462 From lkorinth at openjdk.org Fri Oct 13 08:49:47 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 08:49:47 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm In-Reply-To: References: Message-ID: On Fri, 29 Sep 2023 13:44:26 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. Added the `@key flag-sensitive` after conversation with @kstefanj ------------- PR Comment: https://git.openjdk.org/jdk/pull/15986#issuecomment-1761145761 From lkorinth at openjdk.org Fri Oct 13 08:49:47 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 08:49:47 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: References: Message-ID: > Minor testing done, I will test more later with other fixes. Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: add @key flag-sensitive ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15986/files - new: https://git.openjdk.org/jdk/pull/15986/files/c85a18bb..e0dde6f1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=00-01 Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15986/head:pull/15986 PR: https://git.openjdk.org/jdk/pull/15986 From sjohanss at openjdk.org Fri Oct 13 12:06:24 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 13 Oct 2023 12:06:24 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: References: Message-ID: On Fri, 13 Oct 2023 08:49:47 GMT, Leo Korinth wrote: >> Minor testing done, I will test more later with other 
fixes. > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > add @key flag-sensitive Some small comments but I'm ok with the change as is. I'm not sure how we should handle additional flags we know fail, but aren't part of the CI testing. For example: `G1HeapRegionSize=8m` or larger is not good for `TestG1HeapSizeFlags.java`. So feel free to add this if you think that is the correct way to go. test/hotspot/jtreg/gc/arguments/TestG1HeapSizeFlags.java line 29: > 27: * @test TestG1HeapSizeFlags > 28: * @bug 8006088 > 29: * @requires vm.gc.G1 & vm.opt.x.Xmx == null & vm.opt.x.Xms == null & vm.opt.MinHeapSize == null & vm.opt.MaxHeapSize == null & vm.opt.InitialHeapSize == null Please move the requires below the @key to be consistent with the other tests. test/jtreg-ext/requires/VMProps.java line 143: > 141: map.put("vm.flagless", this::isFlagless); > 142: map.put("jdk.foreign.linker", this::jdkForeignLinker); > 143: map.putAll(xOptFlags()); // -Xmx4g -> @requires vm.opt.x.Xmx == "4g" ) Does it have to be `vm.opt.x.*` or could we just add those to `vm.opt` and use for example: `vm.opt.Xmx == null`? Or do we have -X and -XX: options that collide? test/jtreg-ext/requires/VMProps.java line 728: > 726: return allFlags() > 727: .filter(s -> s.startsWith("-X") && !s.startsWith("-XX:") && !s.equals("-X")) > 728: .map(s -> s.replaceFirst("-*", "")) Is there a need to replace "-*", should you be able to just do: ```suggestion .map(s -> s.replaceFirst("-", "")) ------------- Marked as reviewed by sjohanss (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15986#pullrequestreview-1676158052 PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358057744 PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358061437 PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358091024 From lkorinth at openjdk.org Fri Oct 13 12:26:46 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 12:26:46 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: References: Message-ID: On Fri, 13 Oct 2023 10:01:50 GMT, Stefan Johansson wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> add @key flag-sensitive > > test/hotspot/jtreg/gc/arguments/TestG1HeapSizeFlags.java line 29: > >> 27: * @test TestG1HeapSizeFlags >> 28: * @bug 8006088 >> 29: * @requires vm.gc.G1 & vm.opt.x.Xmx == null & vm.opt.x.Xms == null & vm.opt.MinHeapSize == null & vm.opt.MaxHeapSize == null & vm.opt.InitialHeapSize == null > > Please move the requires below the @key to be consistent with the other tests. will do. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358183735 From lkorinth at openjdk.org Fri Oct 13 12:31:56 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 12:31:56 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: References: Message-ID: <86mYy5bPRV8rF4odI1T4iks8h0Bvxovzz6RljqV1goE=.add2b821-e318-44ae-85c0-2dce25b30312@github.com> On Fri, 13 Oct 2023 10:05:31 GMT, Stefan Johansson wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> add @key flag-sensitive > > test/jtreg-ext/requires/VMProps.java line 143: > >> 141: map.put("vm.flagless", this::isFlagless); >> 142: map.put("jdk.foreign.linker", this::jdkForeignLinker); >> 143: map.putAll(xOptFlags()); // -Xmx4g -> @requires vm.opt.x.Xmx == "4g" ) > > Does it have to be `vm.opt.x.*` or could we just add those to `vm.opt` and use for example: `vm.opt.Xmx == null`? Or do we have -X and -XX: options that collide? I know of no colliding options ATM. They could be colliding in the future (especially if we chose to parse extra options not starting with a `-X`). Also, I think it is incredibly confusing to reuse the `vm.opt` namespace that is used by vanilla JTREG by our extension code in VMProps. I would object to put both in the `vm.opt` namespace. > test/jtreg-ext/requires/VMProps.java line 728: > >> 726: return allFlags() >> 727: .filter(s -> s.startsWith("-X") && !s.startsWith("-XX:") && !s.equals("-X")) >> 728: .map(s -> s.replaceFirst("-*", "")) > > Is there a need to replace "-*", should you be able to just do: > ```suggestion > .map(s -> s.replaceFirst("-", "")) I will remove it, it is needed if I want to parse extra options starting with two dashes, but I chose to keep it simple and just parse extra options starting with `-X`. 
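The exchange above about `replaceFirst("-*", "")` versus `replaceFirst("-", "")` can be illustrated with a small stand-alone sketch. It is not the code in `VMProps.java`: the class name `XOptFlagsSketch`, the `KNOWN_PREFIXES` list and the way a value is split from the option name are assumptions made for illustration only. Only the filter and the `replaceFirst` calls are taken from the quoted diff.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class XOptFlagsSketch {
    // Hypothetical list of -X options whose value is glued to the name (-Xmx4g).
    // The real VMProps code may split name and value differently.
    private static final List<String> KNOWN_PREFIXES = List.of("Xmx", "Xms", "Xmn", "Xss");

    // Maps e.g. "-Xmx4g" to the property "vm.opt.x.Xmx" with value "4g".
    static Map<String, String> xOptFlags(List<String> flags) {
        Map<String, String> props = new LinkedHashMap<>();
        flags.stream()
             // Same filter as in the quoted diff: keep -X options, skip -XX: and a bare -X.
             .filter(s -> s.startsWith("-X") && !s.startsWith("-XX:") && !s.equals("-X"))
             // Every surviving string starts with exactly "-X", so removing one dash is enough.
             .map(s -> s.replaceFirst("-", ""))
             .forEach(s -> {
                 for (String prefix : KNOWN_PREFIXES) {
                     if (s.startsWith(prefix)) {
                         props.put("vm.opt.x." + prefix, s.substring(prefix.length()));
                         return; // leaves the lambda for this flag only
                     }
                 }
                 // Assumption: options without a recognised value are recorded as "true".
                 props.put("vm.opt.x." + s, "true");
             });
        return props;
    }

    public static void main(String[] args) {
        System.out.println(xOptFlags(List.of("-Xmx4g", "-XX:+UseG1GC", "-Xint", "-X")));
        // The regex "-*" strips all leading dashes, "-" strips only the first one.
        System.out.println("--add-opens".replaceFirst("-*", ""));
        System.out.println("--add-opens".replaceFirst("-", ""));
    }
}
```

The two patterns only differ for strings that start with more than one dash, and the filter never lets such strings through, which is why the simpler pattern is sufficient as long as only options starting with `-X` are parsed.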
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358193037 PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358195607 From sjohanss at openjdk.org Fri Oct 13 13:28:30 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 13 Oct 2023 13:28:30 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: <86mYy5bPRV8rF4odI1T4iks8h0Bvxovzz6RljqV1goE=.add2b821-e318-44ae-85c0-2dce25b30312@github.com> References: <86mYy5bPRV8rF4odI1T4iks8h0Bvxovzz6RljqV1goE=.add2b821-e318-44ae-85c0-2dce25b30312@github.com> Message-ID: <_HRwIANVepf9Cl_pXEdLInhh-9TZYernjhU3jeFjolA=.89a561f6-97d5-446c-8f58-1ef4de313324@github.com> On Fri, 13 Oct 2023 12:26:47 GMT, Leo Korinth wrote: >> test/jtreg-ext/requires/VMProps.java line 143: >> >>> 141: map.put("vm.flagless", this::isFlagless); >>> 142: map.put("jdk.foreign.linker", this::jdkForeignLinker); >>> 143: map.putAll(xOptFlags()); // -Xmx4g -> @requires vm.opt.x.Xmx == "4g" ) >> >> Does it have to be `vm.opt.x.*` or could we just add those to `vm.opt` and use for example: `vm.opt.Xmx == null`? Or do we have -X and -XX: options that collide? > > I know of no colliding options ATM. They could be colliding in the future (especially if we chose to parse extra options not starting with a `-X`). Also, I think it is incredibly confusing to reuse the `vm.opt` namespace that is used by vanilla JTREG by our extension code in VMProps. I would object to put both in the `vm.opt` namespace. I agree it is a bit ugly and the point of this being an extension is good. An alternative would be to use `vm.xopt.Xmx` to even less clobber the `vm.opt` namespace, but I guess we already do by adding the `vm.opt.final` details. I'm ok with the current way, just wanted to raise possible alternatives. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358264639 From lkorinth at openjdk.org Fri Oct 13 13:46:33 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 13:46:33 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v3] In-Reply-To: References: Message-ID: > Minor testing done, I will test more later with other fixes. Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: fix improvments requested by Stefan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15986/files - new: https://git.openjdk.org/jdk/pull/15986/files/e0dde6f1..f0aaa67f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=01-02 Stats: 3 lines in 2 files changed: 1 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15986/head:pull/15986 PR: https://git.openjdk.org/jdk/pull/15986 From lkorinth at openjdk.org Fri Oct 13 13:48:08 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 13:48:08 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: <_HRwIANVepf9Cl_pXEdLInhh-9TZYernjhU3jeFjolA=.89a561f6-97d5-446c-8f58-1ef4de313324@github.com> References: <86mYy5bPRV8rF4odI1T4iks8h0Bvxovzz6RljqV1goE=.add2b821-e318-44ae-85c0-2dce25b30312@github.com> <_HRwIANVepf9Cl_pXEdLInhh-9TZYernjhU3jeFjolA=.89a561f6-97d5-446c-8f58-1ef4de313324@github.com> Message-ID: <0Tls6vBsVPztIKIE6ejfURZ10Xk5Xqvz8T682KTfL6Q=.d1a656ab-7bd0-416e-a2b4-58aaf2cdfdd7@github.com> On Fri, 13 Oct 2023 13:25:39 GMT, Stefan Johansson wrote: >> I know of no colliding options ATM. They could be colliding in the future (especially if we chose to parse extra options not starting with a `-X`). 
Also, I think it is incredibly confusing to reuse the `vm.opt` namespace that is used by vanilla JTREG by our extension code in VMProps. I would object to put both in the `vm.opt` namespace. > > I agree it is a bit ugly and the point of this being an extension is good. An alternative would be to use `vm.xopt.Xmx` to even less clobber the `vm.opt` namespace, but I guess we already do by adding the `vm.opt.final` details. > > I'm ok with the current way, just wanted to raise possible alternatives. `vm.xopt` was my first choice, but it works badly because of other reasons, so lets keep it as it is now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358282079 From lkorinth at openjdk.org Fri Oct 13 14:08:23 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 14:08:23 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v2] In-Reply-To: <0Tls6vBsVPztIKIE6ejfURZ10Xk5Xqvz8T682KTfL6Q=.d1a656ab-7bd0-416e-a2b4-58aaf2cdfdd7@github.com> References: <86mYy5bPRV8rF4odI1T4iks8h0Bvxovzz6RljqV1goE=.add2b821-e318-44ae-85c0-2dce25b30312@github.com> <_HRwIANVepf9Cl_pXEdLInhh-9TZYernjhU3jeFjolA=.89a561f6-97d5-446c-8f58-1ef4de313324@github.com> <0Tls6vBsVPztIKIE6ejfURZ10Xk5Xqvz8T682KTfL6Q=.d1a656ab-7bd0-416e-a2b4-58aaf2cdfdd7@github.com> Message-ID: On Fri, 13 Oct 2023 13:39:48 GMT, Leo Korinth wrote: >> I agree it is a bit ugly and the point of this being an extension is good. An alternative would be to use `vm.xopt.Xmx` to even less clobber the `vm.opt` namespace, but I guess we already do by adding the `vm.opt.final` details. >> >> I'm ok with the current way, just wanted to raise possible alternatives. > > `vm.xopt` was my first choice, but it works badly because of other reasons, so lets keep it as it is now. The reason is that we need to use a namespace that starts with `vm.opt`, because that string is *special* in JTREG. 
If the property starts with that prefix, there is no need to register all possible values manually --- something that would be hard to maintain. However, reusing parts of namespaces have prior art in for example `vm.gc.*` so although maybe not optimal, it should not be problematic. It follows standard practice. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1358301087 From ayang at openjdk.org Fri Oct 13 14:30:13 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 13 Oct 2023 14:30:13 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: References: Message-ID: <88WMBggE7v-4BZphBaKYrlmJ7uyu_fW7_4KKZU1ZSkg=.8594feb7-8961-41e5-8965-8bfa1ef22e73@github.com> On Thu, 12 Oct 2023 16:24:17 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
>> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Feedback Albert Thank you for the revision. Did some local testing; no issue found. src/hotspot/share/gc/parallel/psCardTable.hpp line 94: > 92: > 93: // Pre-scavenge support. > 94: // The pre-scavenge phase can overlap with scavenging. Is this obsolete? ------------- Marked as reviewed by ayang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1676623345 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1358340559 From lkorinth at openjdk.org Fri Oct 13 14:31:25 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 14:31:25 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. I will rebase this on 8317228 and use flag-sensitive (when 8317228 has been integrated). May that be enough? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1761616918 From lkorinth at openjdk.org Fri Oct 13 14:32:56 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 14:32:56 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v4] In-Reply-To: References: Message-ID: > Minor testing done, I will test more later with other fixes. 
Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: add flag-sensitive key to TEST.ROOT ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15986/files - new: https://git.openjdk.org/jdk/pull/15986/files/f0aaa67f..a5b87914 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15986/head:pull/15986 PR: https://git.openjdk.org/jdk/pull/15986 From lkorinth at openjdk.org Fri Oct 13 14:36:30 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 13 Oct 2023 14:36:30 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. I will also take a look at the other comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1761624698 From rrich at openjdk.org Fri Oct 13 14:39:48 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 13 Oct 2023 14:39:48 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v22] In-Reply-To: References: Message-ID: <1_KC1smUe4FUpFs3u9GRAYc_RGtxMKFyc4SwnZWqJUk=.55658bac-4a3b-4e57-bdad-dbf2ec9b9170@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. 
This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well.
Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Remove obsolete comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/d12e96e2..607f0c22 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=20-21 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 13 14:39:52 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 13 Oct 2023 14:39:52 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: <88WMBggE7v-4BZphBaKYrlmJ7uyu_fW7_4KKZU1ZSkg=.8594feb7-8961-41e5-8965-8bfa1ef22e73@github.com> References: <88WMBggE7v-4BZphBaKYrlmJ7uyu_fW7_4KKZU1ZSkg=.8594feb7-8961-41e5-8965-8bfa1ef22e73@github.com> Message-ID: On Fri, 13 Oct 2023 14:25:34 GMT, Albert Mingkun Yang wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback Albert > > src/hotspot/share/gc/parallel/psCardTable.hpp line 94: > >> 92: >> 93: // Pre-scavenge support. >> 94: // The pre-scavenge phase can overlap with scavenging. > > Is this obsolete? Oh sure. It's obsolete now.
I removed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1358347257 From rrich at openjdk.org Fri Oct 13 14:41:48 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 13 Oct 2023 14:41:48 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: References: Message-ID: <1s9_eK30_SkOiLxFIRRv5w_JEbmEz93C3zsZpNaYK0Q=.9c91e63a-9122-4e81-82b3-93104d9444a2@github.com> On Thu, 12 Oct 2023 16:24:17 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. 
In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Feedback Albert Should https://bugs.openjdk.org/browse/JDK-8309960 be reverted? Better in a follow-up I guess.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1761632602 From ayang at openjdk.org Fri Oct 13 14:51:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 13 Oct 2023 14:51:37 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: <1s9_eK30_SkOiLxFIRRv5w_JEbmEz93C3zsZpNaYK0Q=.9c91e63a-9122-4e81-82b3-93104d9444a2@github.com> References: <1s9_eK30_SkOiLxFIRRv5w_JEbmEz93C3zsZpNaYK0Q=.9c91e63a-9122-4e81-82b3-93104d9444a2@github.com> Message-ID: <-tM40FGW10TWooaxzhFrtU7Xx9sQga4nhvZdUqDRLnQ=.dc86d37b-3502-43b4-8818-6ed13454d31b@github.com> On Fri, 13 Oct 2023 14:39:16 GMT, Richard Reingruber wrote: > Should https://bugs.openjdk.org/browse/JDK-8309960 be reverted? Better in a follow-up I guess. Better in its own PR, IMO. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1761646606 From rkennke at openjdk.org Fri Oct 13 19:35:05 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 13 Oct 2023 19:35:05 GMT Subject: RFR: 8317535 Shenandoah: Remove unused code In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 16:47:32 GMT, William Kemper wrote: > Tested with `hotspot_gc_shenandoah`, `specjbb`, `specjvm`, `dacapo`, `extremem` and `heapothesys`. Nice cleanup! Thank you! I'm really quite baffled by some of it, but OTOH, in all cases it looks obviously ok if compiler doesn't complain. *shrugs* ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16104#pullrequestreview-1677215067 From ysr at openjdk.org Fri Oct 13 21:40:41 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 13 Oct 2023 21:40:41 GMT Subject: RFR: 8317535 Shenandoah: Remove unused code In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 16:47:32 GMT, William Kemper wrote: > Tested with `hotspot_gc_shenandoah`, `specjbb`, `specjvm`, `dacapo`, `extremem` and `heapothesys`. 
Nice clean up indeed; thank you! I am in awe of how you found these (discarded) needles in the proverbial haystack. ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16104#pullrequestreview-1677399323 From wkemper at openjdk.org Fri Oct 13 21:54:54 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 13 Oct 2023 21:54:54 GMT Subject: RFR: 8317535 Shenandoah: Remove unused code In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 16:47:32 GMT, William Kemper wrote: > Tested with `hotspot_gc_shenandoah`, `specjbb`, `specjvm`, `dacapo`, `extremem` and `heapothesys`. The power of the IDE! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16104#issuecomment-1762282997 From wkemper at openjdk.org Fri Oct 13 21:59:06 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 13 Oct 2023 21:59:06 GMT Subject: Integrated: 8317535 Shenandoah: Remove unused code In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 16:47:32 GMT, William Kemper wrote: > Tested with `hotspot_gc_shenandoah`, `specjbb`, `specjvm`, `dacapo`, `extremem` and `heapothesys`. This pull request has now been integrated. Changeset: e942f368 Author: William Kemper Committer: Y. 
Srinivas Ramakrishna URL: https://git.openjdk.org/jdk/commit/e942f368c370e059c654e33408940a987013a5c7 Stats: 185 lines in 20 files changed: 0 ins; 179 del; 6 mod 8317535: Shenandoah: Remove unused code Reviewed-by: rkennke, ysr ------------- PR: https://git.openjdk.org/jdk/pull/16104 From mli at openjdk.org Sun Oct 15 15:10:07 2023 From: mli at openjdk.org (Hamlin Li) Date: Sun, 15 Oct 2023 15:10:07 GMT Subject: RFR: 8317994: Serial: Use SerialHeap in generation In-Reply-To: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> References: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Message-ID: On Thu, 12 Oct 2023 09:06:35 GMT, Albert Mingkun Yang wrote: > Simple renaming and removing of redundant macro guards. LGTM. ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16162#pullrequestreview-1678809156 From ayang at openjdk.org Mon Oct 16 10:31:16 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 16 Oct 2023 10:31:16 GMT Subject: RFR: 8317994: Serial: Use SerialHeap in generation In-Reply-To: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> References: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Message-ID: On Thu, 12 Oct 2023 09:06:35 GMT, Albert Mingkun Yang wrote: > Simple renaming and removing of redundant macro guards. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16162#issuecomment-1764119279 From ayang at openjdk.org Mon Oct 16 10:31:17 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 16 Oct 2023 10:31:17 GMT Subject: Integrated: 8317994: Serial: Use SerialHeap in generation In-Reply-To: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> References: <1wl6pmkFt6Ur1KHbP_S5Ygf3oTUFA3c3eLKsChBupkY=.7fe10a5e-4578-4e2b-848b-0028dd4179d6@github.com> Message-ID: On Thu, 12 Oct 2023 09:06:35 GMT, Albert Mingkun Yang wrote: > Simple renaming and removing of redundant macro guards. This pull request has now been integrated. Changeset: a27fc7ef Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/a27fc7efd4d77bc3509294688cb7804bbc5f1e9c Stats: 17 lines in 2 files changed: 1 ins; 8 del; 8 mod 8317994: Serial: Use SerialHeap in generation Reviewed-by: tschatzl, sangheki, mli ------------- PR: https://git.openjdk.org/jdk/pull/16162 From ayang at openjdk.org Mon Oct 16 10:32:21 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 16 Oct 2023 10:32:21 GMT Subject: RFR: 8318155: Remove unnecessary virtual specifier in Space Message-ID: Simple (or even trivial) removing some unnecessary "virtual" marks. 
------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/16200/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16200&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318155 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16200.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16200/head:pull/16200 PR: https://git.openjdk.org/jdk/pull/16200 From lkorinth at openjdk.org Mon Oct 16 10:34:19 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 16 Oct 2023 10:34:19 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v5] In-Reply-To: References: Message-ID: > Minor testing done, I will test more later with other fixes. Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: add explanation: test is sensitive to certain flags and might fail when flags are passed using -vmoptions and -javaoptions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15986/files - new: https://git.openjdk.org/jdk/pull/15986/files/a5b87914..8491a679 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15986&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15986/head:pull/15986 PR: https://git.openjdk.org/jdk/pull/15986 From sjohanss at openjdk.org Mon Oct 16 10:34:22 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Mon, 16 Oct 2023 10:34:22 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v4] In-Reply-To: References: Message-ID: On Fri, 13 Oct 2023 14:32:56 GMT, Leo Korinth wrote: >> Minor testing done, I will test more later with other fixes. 
> > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > add flag-sensitive key to TEST.ROOT test/hotspot/jtreg/TEST.ROOT line 35: > 33: # randomness: test uses randomness, test cases differ from run to run > 34: # cgroups: test uses cgroups > 35: keys=stress headful intermittent randomness cgroups flag-sensitive Maybe add a comment above about the key, similar to the others. Something like: # flag-sensitive: test is sensitive to certain flags and might fail when flags are passed in ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1360411062 From lkorinth at openjdk.org Mon Oct 16 10:34:22 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 16 Oct 2023 10:34:22 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v4] In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 09:50:59 GMT, Stefan Johansson wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> add flag-sensitive key to TEST.ROOT > > test/hotspot/jtreg/TEST.ROOT line 35: > >> 33: # randomness: test uses randomness, test cases differ from run to run >> 34: # cgroups: test uses cgroups >> 35: keys=stress headful intermittent randomness cgroups flag-sensitive > > Maybe add a comment above about the key, similar to the others. 
Something like: > > # flag-sensitive: test is sensitive to certain flags and might fail when flags are passed in Thanks, I reworded it just slightly ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15986#discussion_r1360426624 From lkorinth at openjdk.org Mon Oct 16 11:52:49 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 16 Oct 2023 11:52:49 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. I want to just give a few clarifications: `isRunningG1(String[] args)` that I removed was not really useful from what I understand; the resulting heap region size would always be 32M (ergonomic max) or less, resulting in a successful test run; however, when we propagate flags it *will* make a difference. I could add that code back, and I could add `@requires` lines for `-Xmn` etc. I think what would be best though is to ignore all possible breaking flags and just add `flag-sensitive`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1764301136 From ayang at openjdk.org Mon Oct 16 13:31:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 16 Oct 2023 13:31:41 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 12:56:27 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`.
Reasons: > > * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) > * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). > * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. > * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. > > Testing: gha > > Thanks, > Thomas src/hotspot/share/gc/parallel/psParallelCompact.cpp line 2068: > 2066: > 2067: // Release unloaded nmethods's memory. > 2068: CodeCache::flush_unlinked_nmethods(); The fact that recycling resources for nmethods involves two steps, unlink and free-unlinked, seems an insignificant impl detail in this caller context. Can that be hidden away from the public API, e.g. `do_unloading` perform these two steps directly? Is there a reason why `flush_unlinked_nmethods` is outside the scope? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16011#discussion_r1360667446 From tschatzl at openjdk.org Mon Oct 16 13:43:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 16 Oct 2023 13:43:03 GMT Subject: RFR: 8318155: Remove unnecessary virtual specifier in Space In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 10:12:10 GMT, Albert Mingkun Yang wrote: > Simple (or even trivial) removing some unnecessary "virtual" marks. lgtm ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16200#pullrequestreview-1680017554 From rrich at openjdk.org Mon Oct 16 14:39:17 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 16 Oct 2023 14:39:17 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v22] In-Reply-To: <1_KC1smUe4FUpFs3u9GRAYc_RGtxMKFyc4SwnZWqJUk=.55658bac-4a3b-4e57-bdad-dbf2ec9b9170@github.com> References: <1_KC1smUe4FUpFs3u9GRAYc_RGtxMKFyc4SwnZWqJUk=.55658bac-4a3b-4e57-bdad-dbf2ec9b9170@github.com> Message-ID: On Fri, 13 Oct 2023 14:39:48 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. 
Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Remove obsolete comment I've done some jbb2005 benchmarking. The way I did it is not compliant but I thought it would be ok to compare baseline with this pr: * 5 jbb2005 runs with baseline and another 5 with this pr. * 4 iterations per run, each with 8 warehouses. * Warm-up: first 3 iterations. * Result: gc pauses from the last iteration. I don't see a significant difference between baseline and this pr, either in gc times or in benchmark throughput. [jbb2005_gc_times.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/jbb2005_gc_times.pdf) is a summary (generated from [jbb2005_gc_times.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/jbb2005_gc_times.ods)).
The complete output from the runs with baseline is given in [jbb2005_5_runs_pr_14846_baseline.log.gz](https://cr.openjdk.org/~rrich/webrevs/8310031/jbb2005_5_runs_pr_14846_baseline.log.gz). The complete output from the runs with this pr is given in [jbb2005_5_runs_pr_14846.log.gz](https://cr.openjdk.org/~rrich/webrevs/8310031/jbb2005_5_runs_pr_14846.log.gz). ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1764620634 From tschatzl at openjdk.org Mon Oct 16 14:40:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 16 Oct 2023 14:40:49 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 13:28:03 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`. Reasons: >> >> * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) >> * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). >> * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. >> * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. >> >> Testing: gha >> >> Thanks, >> Thomas > > src/hotspot/share/gc/parallel/psParallelCompact.cpp line 2068: > >> 2066: >> 2067: // Release unloaded nmethods's memory. >> 2068: CodeCache::flush_unlinked_nmethods(); > > The fact that recycling resources for nmethods involves two steps, unlink and free-unlinked, seems an insignificant impl detail in this caller context. 
Can that be hidden away from the public API, e.g. `do_unloading` perform these two steps directly? > > Is there a reason why `flush_unlinked_nmethods` is outside the scope? It is maybe an insignificant detail in the context of stw collectors, but for gcs doing concurrent class/code unloading there needs to be a handshake/memory synchronization between unlinking and freeing, so these two operations need to be split. I.e. what they do at a high level is: unlink classes; unlink code cache; unlink other stuff; handshake; free unlinked classes; free unlinked code; free other stuff. Similarly I would like to split G1 class unloading into a STW part (unlinking stuff) and make all the freeing concurrent to avoid the additional (mostly) implementation overhead. Or at least start with that and then see if it is worth making everything concurrent. (The unlinking can be trimmed down further afaics). Back to this CR: as the description states I think the current code wrongly hides this free-unlinked procedure in that `UnloadingScope` (there is a reason it is only used for the STW collectors), unnecessarily making the observed behavior (of one of the most time-consuming parts of class/code unloading) surprising. Putting it on this level also allows more straightforward logging. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16011#discussion_r1360764085 From zgu at openjdk.org Mon Oct 16 17:12:30 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 16 Oct 2023 17:12:30 GMT Subject: RFR: 8316955: Unnecessary memory barrier for reading OopMapCacheEntry from OopMapCache In-Reply-To: References: Message-ID: <8exKVR4RuyHThXmIUD9ng9pUlkM-mtwbHOmdei2X6Yc=.917e8bfc-a0e0-4d96-a779-8a11c0b6a4ff@github.com> On Wed, 27 Sep 2023 13:22:43 GMT, Zhengyu Gu wrote: > OopMapCacheEntry is installed by CAS with memory_order_conservative; there is no need for a memory barrier on the read side.
> > Test: > hotspot_gc on MacOSX (fastdebug and release) This path is not hot and, based on @dholmes's comment in the PR, it is not worth the risk. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15944#issuecomment-1764915587 From zgu at openjdk.org Mon Oct 16 17:12:31 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 16 Oct 2023 17:12:31 GMT Subject: Withdrawn: 8316955: Unnecessary memory barrier for reading OopMapCacheEntry from OopMapCache In-Reply-To: References: Message-ID: <3fGjEVP2qtT1Me_0mvRBiYCQ3mObyFUNrmYy40913po=.b78cfa2c-7cf3-4737-bda4-acdf75ad88be@github.com> On Wed, 27 Sep 2023 13:22:43 GMT, Zhengyu Gu wrote: > OopMapCacheEntry is installed by CAS with memory_order_conservative; there is no need for a memory barrier on the read side. > > Test: > hotspot_gc on MacOSX (fastdebug and release) This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/15944 From shade at openjdk.org Mon Oct 16 17:45:31 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 16 Oct 2023 17:45:31 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v4] In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 21:31:13 GMT, William Kemper wrote: >> Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3).
> > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Merge check for no-progress into retry allocation block > - Revert change to TEST.groups src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 887: > 885: size_t original_count = get_gc_no_progress_count() + 1; > 886: while (result == nullptr && original_count > get_gc_no_progress_count()) { > 887: if (!req.is_lab_alloc() && get_gc_no_progress_count() > ShenandoahNoProgressThreshold) { I am confused by this logic. So we come in here. Let's assume we come in with good conditions, `no_progress_count` (`npc`) is `0`. So `original_count` is `1`. We proceed to the loop. `original_count > npc` is true, so we enter. Suppose we finish the cycle without progress. `npc` is now `1`. We circle back to the loop. `original_count > npc` now fails, and we exit. At which point does the `npc > ShenandoahNoProgressThreshold` check come into the picture? I think the new code got `gc_count` confused with `no_progress_count`? test/jdk/com/sun/jdi/EATests.java line 274: > 272: public final boolean DeoptimizeObjectsALot; > 273: public final boolean DoEscapeAnalysis; > 274: public final boolean ConcurrentGCIsSelected; Let's do explicit `ZGCIsSelected` and `ShenandoahIsSelected`. Unfortunately, "Concurrent GC" is fairly ambiguous. 
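The walkthrough of the `shenandoahHeap.cpp` loop above can be replayed with a small standalone model. This is only a sketch under stated assumptions: `simulate_retry_loop` and `RetryOutcome` are hypothetical names, the request is assumed not to be a LAB allocation, the allocation is assumed to keep failing, and every cycle is assumed to make no progress. It is not the HotSpot code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the quoted retry loop; none of these names exist in HotSpot.
struct RetryOutcome {
  std::size_t cycles_run;      // GC cycles executed inside the loop
  bool threshold_check_fired;  // whether `npc > threshold` was ever true
};

// initial_npc models get_gc_no_progress_count() on entry,
// threshold models ShenandoahNoProgressThreshold.
RetryOutcome simulate_retry_loop(std::size_t initial_npc, std::size_t threshold) {
  std::size_t npc = initial_npc;
  const std::size_t original_count = npc + 1;
  assert(original_count > npc);  // the model does not handle counter overflow

  RetryOutcome outcome{0, false};
  // Assumption: result stays nullptr, so only the counter comparison matters.
  while (original_count > npc) {
    if (npc > threshold) {
      outcome.threshold_check_fired = true;
      break;
    }
    ++npc;                 // assumed: the cycle finishes without progress
    ++outcome.cycles_run;
  }
  return outcome;
}
```

Under these assumptions the loop body runs at most once per call: entering with `npc == 0` and a threshold of 3 executes one cycle and never trips the threshold check, which only fires when `npc` already exceeds the threshold on entry.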
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1361045071 PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1361023475 From duke at openjdk.org Mon Oct 16 20:12:25 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Mon, 16 Oct 2023 20:12:25 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> On Wed, 11 Oct 2023 18:50:14 GMT, Zhengyu Gu wrote: >> Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. >> >> GCs use a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. >> >> This patch is intended to enable `OopMapCache` for concurrent GCs. >> >> Test: >> tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32. > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup old oop map cache entry after class redefinition Changes requested by leelamv at github.com (no known OpenJDK username). src/hotspot/share/gc/shenandoah/shenandoahVMOperations.cpp line 64: > 62: OopMapCache::cleanup_old_entries(); > 63: } > 64: Do you think `VM_ShenandoahFinalMarkStartEvac` walks the stack roots? If yes, I recommend adding `OopMapCache::cleanup_old_entries()` in `VM_ShenandoahOperation::doit_epilogue()`. This would make the change simple and also revert the change in this [PR](https://github.com/openjdk/jdk/pull/15921) src/hotspot/share/oops/method.cpp line 311: > 309: void Method::mask_for(int bci, InterpreterOopMap* mask) { > 310: methodHandle h_this(Thread::current(), this); > 311: method_holder()->mask_for(h_this, bci, mask); Removing this condition allows all threads, including Java threads, to use/mutate the oopMapCache. 
For example, Java threads call [JVM_CallStackWalk](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/jvm.cpp#L586), which walks the stack and calls locals() and expressions [here](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/stackwalk.cpp#L345), which access the oopMapCache. ------------- PR Review: https://git.openjdk.org/jdk/pull/16074#pullrequestreview-1680858067 PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1361210493 PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1361202988 From sjohanss at openjdk.org Tue Oct 17 07:31:24 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 17 Oct 2023 07:31:24 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. I'm ok trying that out, we could always add the additional `@requires` later on if this becomes problematic. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1765826156 From ayang at openjdk.org Tue Oct 17 07:34:22 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 07:34:22 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 14:37:04 GMT, Thomas Schatzl wrote: > It is maybe an insignificant detail in the context of stw collectors Then, could Serial and Parallel use APIs that don't expose these details? For instance, move `flush_unlinked_nmethods` inside `CodeCache::do_unloading`, as it is used only by those collectors. 
Why is `flush_unlinked_nmethods` outside of `UnloadingScope`? This newly-introduced scope in the caller context seems extremely out of place, IMO. > Putting it on this level also allows more straightforward logging. I don't get this. Can't see any log-print logic inside `flush_unlinked_nmethods`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16011#discussion_r1361632099 From ayang at openjdk.org Tue Oct 17 08:56:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 08:56:25 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v5] In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 10:34:19 GMT, Leo Korinth wrote: >> Minor testing done, I will test more later with other fixes. > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > add explanation: test is sensitive to certain flags and might fail when flags are passed using -vmoptions and -javaoptions Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15986#pullrequestreview-1681755096 From tschatzl at openjdk.org Tue Oct 17 09:21:22 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 17 Oct 2023 09:21:22 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Tue, 17 Oct 2023 07:31:51 GMT, Albert Mingkun Yang wrote: >Then, could Serial and Parallel use APIs that don't expose these details? There is no such API. This change does not intend to expose such a new API too. Just moving the various phases of code/class unloading to the same level in the source code as apparent to me to keep surprises low (and simplify logging to be able to _see_ problems in the first place. Then we can fix them in subsequent PRs). Also I would like to keep the existing structure of class/code unloading for all collectors for uniformity (e.g. 
separate unlinking from free-unlinking - maybe call it "purging" in the future?) - so that they will have the same structure and print the same logging messages in the same order in the future (at least for the STW collectors including G1). Note that other collectors have exactly the same phases, and unlinking is separate from purging everywhere else too. They just re-implement some phases using their own code (doing some renames in the process). Having both the same structure/phases and same (more comprehensive) logging output for more collectors will make troubleshooting much easier. There is no renaming in this change, e.g. of `CodeCache::do_unloading()` (="unlinking"), as this method does not do the complete unloading either, as indicated by having this external `flush_unlinked_nmethods` (="purging"). This is not the change to do this imo. (Iirc `CodeCache::do_unloading()` still does some purging in it anyway, but let's not digress too much here). > For instance, move flush_unlinked_nmethods inside CodeCache::do_unloading, as it is used only by those collectors. It could, but then for (future) logging you would have to pass `GCTraceTimer`, as the common way to do timing in gc code, into `do_unloading()`, which is some compiler code. Not sure if this is what we want as it is imo awkward. It is imho better, cleaner and sufficient to have timing outside at least initially. I.e. I would at this point prefer the style of `{ /* Phase x */ GCTraceTimer x("do phase X"); do_phase_x(); }` as how to time can be heavily collector dependent too. I would at the end of all that refactoring see if there is some useful coalescing of the methods into helpers that can be done. Maybe these scopes should be put into separate helper methods, I haven't decided yet what's best. >Why is flush_unlinked_nmethods outside of UnloadingScope? This newly-introduced scope in the caller context seems extremely out of place, IMO. I believe most of your questions stem from the naming. 
Please, do not be too hung up on names. The description of this PR already mentioned that `UnloadingScope` is a misnomer (and misnamed code is common in HotSpot code, or responsibilities changed over the course of years without fixing up the naming). So in addition to this class, be aware that there are lots of misnamed and not uniformly named methods in class/code unloading code too. `UnloadingScope` controls *unlinking* behavior by setting some globals and does not control the whole unloading process (i.e. unlinking + purging). Additionally `UnloadingScope` actually did _not_ really contain `flush_unlinked_nmethods` earlier either. It has been the last call in the destructor of the `UnloadingScope`, _outside_, after the unlinking behavior has been reset. This is even more strange if you think about it. So from my understanding of the code this scope object ought to enclose only the unlinking part of the unloading (i.e. the decision of what to unlink, that is `CodeCache::do_unloading()`) before. This change at most exposes the existing (as you say ugly - I agree) structure. Which isn't a bad thing to me to have exposed. It's not in scope of this change to fix this imo because that would mix it with other unrelated changes. (`UnloadingScope` should be called something different, maybe after this discussion something like `UnlinkingScope` or similar? Idk.) >> Putting it on this level also allows more straightforward logging. >I don't get this. Can't see any log-print logic inside flush_unlinked_nmethods. ... in the future. https://bugs.openjdk.org/browse/JDK-8315504 intends to add more timing/logging for every phase. There is currently no plan for comprehensive logging inside the various phases of the class/code unloading (at least not to the level of selected methods I did in recent investigations). 
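The structure argued for here (a scope object that only sets and resets the unlinking behavior, an explicit purge step outside of it, and one timer scope per phase) can be sketched in standalone C++. All names below are hypothetical stand-ins chosen for illustration: `PhaseMarker` only records phase boundaries and is not `GCTraceTimer`, and `UnlinkingScope`, `unlink_code`, `purge_unlinked_code` and `unload_code` are not HotSpot APIs.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global state standing in for the behavior the scope toggles.
std::vector<std::string> g_events;
bool g_unlinking_behavior_active = false;

// Stand-in for a per-phase timer: records when a phase begins and ends.
class PhaseMarker {
  std::string _name;
public:
  explicit PhaseMarker(std::string name) : _name(std::move(name)) {
    g_events.push_back("begin " + _name);
  }
  ~PhaseMarker() { g_events.push_back("end " + _name); }
};

// Scope that only sets and resets the unlinking behavior.
// Assumption of this sketch: the destructor does no purging work.
class UnlinkingScope {
public:
  UnlinkingScope() {
    assert(!g_unlinking_behavior_active);
    g_unlinking_behavior_active = true;
  }
  ~UnlinkingScope() { g_unlinking_behavior_active = false; }
};

void unlink_code() {
  assert(g_unlinking_behavior_active);   // must run inside the scope
  g_events.push_back("unlink");
}

void purge_unlinked_code() {
  assert(!g_unlinking_behavior_active);  // runs after the behavior was reset
  g_events.push_back("purge");
}

// Both phases sit at the same level in the caller, each with its own marker.
void unload_code() {
  {
    PhaseMarker marker("unlink phase");
    UnlinkingScope scope;
    unlink_code();
  }
  {
    PhaseMarker marker("purge phase");
    purge_unlinked_code();
  }
}
```

In this sketch the purge step is an ordinary call in the caller rather than hidden work in a destructor, so its cost is attributed to its own phase marker.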
What I intend to do is revisiting the phases in the future and move around work to better reflect the unlinking/purging split, and also link them together with a real `UnloadingScope` covering the whole unloading process (not to be mixed up with the current `UnloadingScope`) to allow gc-specific replacements/optimizations of the phases. Obviously there can be helper methods that simplify things for various gcs. This PR isn't this change though. Hth, Thomas ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16011#discussion_r1361795034 From lkorinth at openjdk.org Tue Oct 17 09:24:33 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:24:33 GMT Subject: RFR: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm [v5] In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 10:34:19 GMT, Leo Korinth wrote: >> Minor testing done, I will test more later with other fixes. > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > add explanation: test is sensitive to certain flags and might fail when flags are passed using -vmoptions and -javaoptions Thanks Stefan and Albert!!! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15986#issuecomment-1766018271 From lkorinth at openjdk.org Tue Oct 17 09:24:33 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:24:33 GMT Subject: Integrated: 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm In-Reply-To: References: Message-ID: On Fri, 29 Sep 2023 13:44:26 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. This pull request has now been integrated. 
Changeset: 7ca0ae94 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/7ca0ae94159ac0fd2df23ee1a1e8cf626ce31048 Stats: 56 lines in 6 files changed: 40 ins; 6 del; 10 mod 8317228: GC: Make TestXXXHeapSizeFlags use createTestJvm Reviewed-by: sjohanss, ayang ------------- PR: https://git.openjdk.org/jdk/pull/15986 From lkorinth at openjdk.org Tue Oct 17 09:35:39 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:35:39 GMT Subject: RFR: 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm In-Reply-To: References: Message-ID: <2h_2RTxni17ohVaRHCYmkEVnv1hXWCAuj4DHWYw-v7M=.5c85a382-092b-4fce-9774-158faa24827c@github.com> On Mon, 2 Oct 2023 11:48:18 GMT, Leo Korinth wrote: > Also add parallel flag for first invocation (this is a bug). Initial testing ok, but will run tiers before pushing. Thanks Ivan and Thomas!!! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16009#issuecomment-1766033054 From lkorinth at openjdk.org Tue Oct 17 09:35:39 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:35:39 GMT Subject: Integrated: 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 11:48:18 GMT, Leo Korinth wrote: > Also add parallel flag for first invocation (this is a bug). Initial testing ok, but will run tiers before pushing. This pull request has now been integrated. 
Changeset: 6ee6171e Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/6ee6171e8124ae8ce4f60c2582c2fe2cae6fc3db Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod 8317347: Parallel: Make TestInitialTenuringThreshold use createTestJvm Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16009 From lkorinth at openjdk.org Tue Oct 17 09:36:27 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:36:27 GMT Subject: RFR: 8317343: GC: Make TestHeapFreeRatio use createTestJvm In-Reply-To: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> References: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> Message-ID: <0a6i6R8ePr-nkrs4SFaj4aSm6DSyC1_1bmrzAOCLptA=.99f2b5fb-9fc5-40ef-a1bd-3e01c3078c59@github.com> On Mon, 2 Oct 2023 09:57:55 GMT, Leo Korinth wrote: > This fix is implicitly dependent on https://github.com/openjdk/jdk/pull/15986/files for `@requires opt.x` support. Initial testing passes, but I will do more (tier) testing before pushing. Thanks Ivan and Thomas!!! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16007#issuecomment-1766034639 From lkorinth at openjdk.org Tue Oct 17 09:36:28 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:36:28 GMT Subject: Integrated: 8317343: GC: Make TestHeapFreeRatio use createTestJvm In-Reply-To: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> References: <90LqPWddFASKjNrWHZVoKgqHHAgrUBg0FUFnqtMJDCw=.012774fa-388e-4ef2-af4f-059c1a9ec41b@github.com> Message-ID: On Mon, 2 Oct 2023 09:57:55 GMT, Leo Korinth wrote: > This fix is implicitly dependent on https://github.com/openjdk/jdk/pull/15986/files for `@requires opt.x` support. Initial testing passes, but I will do more (tier) testing before pushing. This pull request has now been integrated. 
Changeset: c64bd3d6 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/c64bd3d6715304accd9a1e3266edd9d3d2353273 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod 8317343: GC: Make TestHeapFreeRatio use createTestJvm Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16007 From lkorinth at openjdk.org Tue Oct 17 09:44:22 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:44:22 GMT Subject: RFR: 8317317: G1: Make TestG1RemSetFlags use createTestJvm In-Reply-To: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> References: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> Message-ID: On Fri, 29 Sep 2023 14:51:23 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. Thanks Hamlin and Thomas!!! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15989#issuecomment-1766046722 From lkorinth at openjdk.org Tue Oct 17 09:44:22 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 09:44:22 GMT Subject: Integrated: 8317317: G1: Make TestG1RemSetFlags use createTestJvm In-Reply-To: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> References: <9idI83bH5iOe-X793ARiN2Xi4SSOxpzPseaSLqlKSzM=.6be2e8ca-d17e-4e75-84e1-9436ac5a6438@github.com> Message-ID: On Fri, 29 Sep 2023 14:51:23 GMT, Leo Korinth wrote: > Minor testing done, I will test more later with other fixes. This pull request has now been integrated. 
Changeset: 5bd10521 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/5bd10521eb5e51e76b20e955addd45f76abba6f7 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8317317: G1: Make TestG1RemSetFlags use createTestJvm Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15989 From mli at openjdk.org Tue Oct 17 10:36:13 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 17 Oct 2023 10:36:13 GMT Subject: RFR: 8318155: Remove unnecessary virtual specifier in Space In-Reply-To: References: Message-ID: <7OMGKwonlPDN7hLc4FctlirhmMZbmugnDmzaimsmWYk=.d97c6e37-c0a4-46c2-8f3a-ebb91638890b@github.com> On Mon, 16 Oct 2023 10:12:10 GMT, Albert Mingkun Yang wrote: > Simple (or even trivial) removing some unnecessary "virtual" marks. LGTM. ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16200#pullrequestreview-1681900899 From lkorinth at openjdk.org Tue Oct 17 11:53:54 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:53:54 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm [v2] In-Reply-To: References: Message-ID: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. Leo Korinth has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - @key flag-sensitive - Merge branch '_master_jdk' into _8317358 - 8317358: G1: Make TestMaxNewSize use createTestJvm ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16012/files - new: https://git.openjdk.org/jdk/pull/16012/files/ca9f1d8c..202fc48e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16012&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16012&range=00-01 Stats: 39284 lines in 1299 files changed: 24215 ins; 7280 del; 7789 mod Patch: https://git.openjdk.org/jdk/pull/16012.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16012/head:pull/16012 PR: https://git.openjdk.org/jdk/pull/16012 From lkorinth at openjdk.org Tue Oct 17 11:53:55 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:53:55 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. I have added `@key flag-sensitive` ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1766256848 From ayang at openjdk.org Tue Oct 17 11:54:53 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 11:54:53 GMT Subject: RFR: 8318155: Remove unnecessary virtual specifier in Space In-Reply-To: References: Message-ID: <_7M0jqaAp7nbA502gsRbA8iP3c430SWkhdrkCVQ3aVs=.29887c11-e589-4fb0-a131-379097231fc4@github.com> On Mon, 16 Oct 2023 10:12:10 GMT, Albert Mingkun Yang wrote: > Simple (or even trivial) removing some unnecessary "virtual" marks. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16200#issuecomment-1766261273 From ayang at openjdk.org Tue Oct 17 11:54:53 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 11:54:53 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 12:56:27 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`. Reasons: > > * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) > * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). > * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. > * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. > > Testing: gha > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16011#pullrequestreview-1682122861 From ayang at openjdk.org Tue Oct 17 11:54:55 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 11:54:55 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Tue, 17 Oct 2023 09:19:05 GMT, Thomas Schatzl wrote: > UnloadingScope should be called something different, maybe after this discussion something like UnlinkingScope or similar? OK, the new-scope starts to make sense if it's named `UnlinkingScope`. 
In my mind, the concept of `UnloadingScope` includes unlinking + purging; the implementation supports my understanding, as I believe the destructor still belongs to the obj-on-stack. However, I now realize that this interpretation could be subjective. > This change only at most exposes the existing (as you say ugly - I agree) structure. OK, that's all I wanna convey. Maybe this is an inevitable transient state. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16011#discussion_r1361991674 From lkorinth at openjdk.org Tue Oct 17 11:57:25 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:57:25 GMT Subject: RFR: 8317218: G1: Make TestG1HeapRegionSize use createTestJvm In-Reply-To: References: Message-ID: On Thu, 28 Sep 2023 09:07:04 GMT, Leo Korinth wrote: > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC'` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseZGC'` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:G1HeapRegionSize=80000'` > TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS='` -> PASS 1 Thanks Hamlin and Thomas! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15959#issuecomment-1766264530 From lkorinth at openjdk.org Tue Oct 17 11:58:13 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:58:13 GMT Subject: RFR: 8317188: G1: Make TestG1ConcRefinementThreads use createTestJvm In-Reply-To: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> References: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> Message-ID: On Thu, 28 Sep 2023 08:34:31 GMT, Leo Korinth wrote: > Testing after changes: > > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC' ` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseZGC' ` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:G1ConcRefinementThreads=42' ` -> TOTAL 0 Thanks Hamlin and Thomas! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15958#issuecomment-1766265551 From lkorinth at openjdk.org Tue Oct 17 11:59:10 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:59:10 GMT Subject: RFR: 8317316: G1: Make TestG1PercentageOptions use createTestJvm In-Reply-To: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> References: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> Message-ID: On Fri, 29 Sep 2023 14:24:07 GMT, Leo Korinth wrote: > Done basic testing, will do more testing before pushing (together with other tests) Thanks Hamlin and Thomas! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15987#issuecomment-1766262902 From lkorinth at openjdk.org Tue Oct 17 11:59:13 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:59:13 GMT Subject: RFR: 8317042: G1: Make TestG1ConcMarkStepDurationMillis use createTestJvm In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 16:37:54 GMT, Leo Korinth wrote: > Use createTestJvm. > > Tested with: `-XX:+UseG1GC` (pass 1), `-XX:+UseZGC` (total 0), `-XX:G1ConcMarkStepDurationMillis=23` (total 0) and no options (pass 1) Thanks Hamlin and Thomas! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15948#issuecomment-1766267051 From lkorinth at openjdk.org Tue Oct 17 11:59:10 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 11:59:10 GMT Subject: Integrated: 8317316: G1: Make TestG1PercentageOptions use createTestJvm In-Reply-To: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> References: <0C-HGCwaWhhH9Lo6IhAoSoOAFu7O-7oMJ0MtAizZxCU=.a67d9a2b-1fee-433c-8f23-670892f7b4e8@github.com> Message-ID: On Fri, 29 Sep 2023 14:24:07 GMT, Leo Korinth wrote: > Done basic testing, will do more testing before pushing (together with other tests) This pull request has now been integrated. Changeset: d8cd6058 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/d8cd60588aef6abcbfedbe3262d9a094c9bbcb8c Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod 8317316: G1: Make TestG1PercentageOptions use createTestJvm Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15987 From ayang at openjdk.org Tue Oct 17 11:59:14 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 11:59:14 GMT Subject: Integrated: 8318155: Remove unnecessary virtual specifier in Space In-Reply-To: References: Message-ID: On Mon, 16 Oct 2023 10:12:10 GMT, Albert Mingkun Yang wrote: > Simple (or even trivial) removing some unnecessary "virtual" marks. 
This pull request has now been integrated. Changeset: 8f79d889 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/8f79d889609b634282af1129559500c80505353a Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod 8318155: Remove unnecessary virtual specifier in Space Reviewed-by: tschatzl, mli ------------- PR: https://git.openjdk.org/jdk/pull/16200 From lkorinth at openjdk.org Tue Oct 17 12:01:11 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:01:11 GMT Subject: RFR: 8316973: GC: Make TestDisableDefaultGC use createTestJvm In-Reply-To: References: Message-ID: <_4S911tkSbX9Qb2R9KVdgqMOMtaCmfo5sHJFqmjaUSM=.b8a5b363-bd82-4264-8ba7-12c92495ed64@github.com> On Tue, 26 Sep 2023 17:24:22 GMT, Leo Korinth wrote: > There seems there is no need to strip external vm flags. The `@requires vm.gc=="null"` will handle the case when we iterate different GCs and we will then just skip the test. > > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC'` -> total 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java` -> success > > started tier 1-5 Thanks Albert, Hamlin and Leonid! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15931#issuecomment-1766269611 From lkorinth at openjdk.org Tue Oct 17 12:02:14 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:02:14 GMT Subject: Integrated: 8317218: G1: Make TestG1HeapRegionSize use createTestJvm In-Reply-To: References: Message-ID: On Thu, 28 Sep 2023 09:07:04 GMT, Leo Korinth wrote: > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC'` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:+UseZGC'` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS=-XX:G1HeapRegionSize=80000'` > TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1HeapRegionSize.java JTREG='JAVA_OPTIONS='` -> PASS 1 This pull request has now been integrated. Changeset: 75b37e6d Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/75b37e6d7ec285f1a954f9d5b16bf9e6b642f2fc Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8317218: G1: Make TestG1HeapRegionSize use createTestJvm Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15959 From lkorinth at openjdk.org Tue Oct 17 12:03:08 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:03:08 GMT Subject: Integrated: 8317042: G1: Make TestG1ConcMarkStepDurationMillis use createTestJvm In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 16:37:54 GMT, Leo Korinth wrote: > Use createTestJvm. > > Tested with: `-XX:+UseG1GC` (pass 1), `-XX:+UseZGC` (total 0), `-XX:G1ConcMarkStepDurationMillis=23` (total 0) and no options (pass 1) This pull request has now been integrated. 
Changeset: 7e39e664 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/7e39e664cf6d4658b0aa03f9b5162cf7de40de28 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8317042: G1: Make TestG1ConcMarkStepDurationMillis use createTestJvm Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15948 From lkorinth at openjdk.org Tue Oct 17 12:04:03 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:04:03 GMT Subject: Integrated: 8317188: G1: Make TestG1ConcRefinementThreads use createTestJvm In-Reply-To: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> References: <0DdQPK9T0TaJ9GKOZpnPkf_sbMWj7KRNSOnZCCdx8gw=.bce252ce-ea85-4eb6-b7b1-87f9be203ff7@github.com> Message-ID: On Thu, 28 Sep 2023 08:34:31 GMT, Leo Korinth wrote: > Testing after changes: > > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseG1GC' ` -> PASS 1 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:+UseZGC' ` -> TOTAL 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestG1ConcRefinementThreads.java JTREG='JAVA_OPTIONS=-XX:G1ConcRefinementThreads=42' ` -> TOTAL 0 This pull request has now been integrated. 
Changeset: a949824e Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/a949824e98a8872645f292c9cc9ed2fe1cccadce Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8317188: G1: Make TestG1ConcRefinementThreads use createTestJvm Reviewed-by: mli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/15958 From lkorinth at openjdk.org Tue Oct 17 12:04:52 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:04:52 GMT Subject: Integrated: 8316973: GC: Make TestDisableDefaultGC use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 17:24:22 GMT, Leo Korinth wrote: > There seems to be no need to strip external VM flags. The `@requires vm.gc=="null"` will handle the case when we iterate over different GCs, and we will then just skip the test. > > Tested with: > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC'` -> total 0 > `make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestDisableDefaultGC.java` -> success > > started tier 1-5 This pull request has now been integrated. Changeset: 5f4be8ce Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/5f4be8cea980b3c2e8e5fb2067dc64b62fa0245c Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod 8316973: GC: Make TestDisableDefaultGC use createTestJvm Reviewed-by: ayang, mli, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/15931 From lkorinth at openjdk.org Tue Oct 17 12:05:44 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:05:44 GMT Subject: RFR: 8316410: GC: Make TestCompressedClassFlags use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 16:50:26 GMT, Leo Korinth wrote: > Use createTestJvm.
> > Tested with: > > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseParallelGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC -XX:+ZGenerational' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseShenandoahGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g -XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseCompressedClassPointers' ==> TOTAL:0 Thanks Albert and 
Hamlin! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15929#issuecomment-1766272667 From lkorinth at openjdk.org Tue Oct 17 12:05:44 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 17 Oct 2023 12:05:44 GMT Subject: Integrated: 8316410: GC: Make TestCompressedClassFlags use createTestJvm In-Reply-To: References: Message-ID: On Tue, 26 Sep 2023 16:50:26 GMT, Leo Korinth wrote: > Use createTestJvm. > > Tested with: > > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseParallelGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC -XX:+ZGenerational' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseShenandoahGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC' ==> TOTAL:1 PASS:1 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g -XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java 
JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:CompressedClassSpaceSize=1g' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:-UseCompressedClassPointers' ==> TOTAL:0 > make run-test TEST=open/test/hotspot/jtreg/gc/arguments/TestCompressedClassFlags.java JTREG='RETAIN=all;VERBOSE=all;JAVA_OPTIONS=-XX:+UseCompressedClassPointers' ==> TOTAL:0 This pull request has now been integrated. Changeset: e649c563 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/e649c563242a876a20007470c9412311ffa2a568 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod 8316410: GC: Make TestCompressedClassFlags use createTestJvm Reviewed-by: ayang, mli ------------- PR: https://git.openjdk.org/jdk/pull/15929 From sjohanss at openjdk.org Tue Oct 17 12:06:30 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 17 Oct 2023 12:06:30 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm [v2] In-Reply-To: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> References: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> Message-ID: <75hcHnvCffAOZfpwMmuepUxR3Vi2qAXFb83lY4EV9Zw=.5bd1342e-dc6a-443e-8c07-af080c22eb9e@github.com> On Tue, 17 Oct 2023 11:53:54 GMT, Leo Korinth wrote: >> In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. >> >> Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` >> >> Minimal testing completed, will run tier testing before pushing. > > Leo Korinth has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - @key flag-sensitive > - Merge branch '_master_jdk' into _8317358 > - 8317358: G1: Make TestMaxNewSize use createTestJvm Marked as reviewed by sjohanss (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16012#pullrequestreview-1682144154 From ayang at openjdk.org Tue Oct 17 12:07:23 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 17 Oct 2023 12:07:23 GMT Subject: RFR: 8318296: Move Space::initialize to ContiguousSpace Message-ID: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> Simple refactoring of moving methods to subclass. (`clear` is also moved, as it is called inside `initialize`.) ------------- Commit messages: - merge-subklass-space Changes: https://git.openjdk.org/jdk/pull/16219/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16219&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318296 Stats: 51 lines in 2 files changed: 17 ins; 29 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/16219.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16219/head:pull/16219 PR: https://git.openjdk.org/jdk/pull/16219 From tschatzl at openjdk.org Tue Oct 17 15:31:21 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 17 Oct 2023 15:31:21 GMT Subject: RFR: 8318296: Move Space::initialize to ContiguousSpace In-Reply-To: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> References: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> Message-ID: On Tue, 17 Oct 2023 11:57:59 GMT, Albert Mingkun Yang wrote: > Simple refactoring of moving methods to subclass. (`clear` is also moved, as it is called inside `initialize`.) Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16219#pullrequestreview-1682680414 From wkemper at openjdk.org Tue Oct 17 18:19:08 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 17 Oct 2023 18:19:08 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v5] In-Reply-To: References: Message-ID: > Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3). William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: - Make explicit field for Shenandoah - Restore original retry logic, pull gc overhead check back over retry - Merge branch 'openjdk-master' into shenandoah-oome-redux - Merge check for no-progress into retry allocation block - Revert change to TEST.groups - Merge remote-tracking branch 'openjdk/master' into shenandoah-oome-redux - Extend exemption for EATests that rely on timely OOME to Shenandoah - Improve comment, increase default for no progress threshold - Allocator should not reset bad progress count - Allocator should not reset bad progress count - ... 
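As a rough illustration of the counting behavior described above, the following Java sketch models a counter that gives up after a configurable number of unproductive full GC cycles. It is not the HotSpot implementation (the actual change lives in Shenandoah's C++ code); the class `NoProgressTracker`, its methods, and the boolean "made progress" result are assumptions introduced here purely for illustration.

```java
// Hypothetical model of a "no progress" threshold. All names are
// illustrative and do not correspond to HotSpot code.
final class NoProgressTracker {
    private final int threshold;          // e.g. 3, as in the description above
    private int fullGcsWithoutProgress;   // consecutive unproductive full GCs

    NoProgressTracker(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Record the outcome of one full GC cycle. A productive cycle resets
    // the counter; an unproductive one increments it.
    void recordFullGc(boolean madeProgress) {
        fullGcsWithoutProgress = madeProgress ? 0 : fullGcsWithoutProgress + 1;
    }

    boolean thresholdExceeded() {
        return fullGcsWithoutProgress >= threshold;
    }

    // Called before retrying a failed allocation: fail fast instead of
    // running yet another full GC that is unlikely to help.
    void checkBeforeRetry() {
        if (thresholdExceeded()) {
            throw new OutOfMemoryError("GC threshold exceeded (sketch)");
        }
    }
}
```

In this sketch only the GC outcome changes the counter; a retrying allocator merely reads it, which is one plausible reading of the commit note "Allocator should not reset bad progress count".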
and 3 more: https://git.openjdk.org/jdk/compare/b9836bee...868af376 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15852/files - new: https://git.openjdk.org/jdk/pull/15852/files/1971467f..868af376 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=03-04 Stats: 32231 lines in 1017 files changed: 18757 ins; 6567 del; 6907 mod Patch: https://git.openjdk.org/jdk/pull/15852.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15852/head:pull/15852 PR: https://git.openjdk.org/jdk/pull/15852 From wkemper at openjdk.org Tue Oct 17 18:36:41 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 17 Oct 2023 18:36:41 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v6] In-Reply-To: References: Message-ID: > Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3). 
William Kemper has updated the pull request incrementally with two additional commits since the last revision: - Remove unnecessary nesting and other differences - Update formatting ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15852/files - new: https://git.openjdk.org/jdk/pull/15852/files/868af376..4154370c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=04-05 Stats: 56 lines in 2 files changed: 17 ins; 16 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/15852.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15852/head:pull/15852 PR: https://git.openjdk.org/jdk/pull/15852 From shade at openjdk.org Tue Oct 17 19:26:45 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 17 Oct 2023 19:26:45 GMT Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v6] In-Reply-To: References: Message-ID: On Tue, 17 Oct 2023 18:36:41 GMT, William Kemper wrote: >> Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3). > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Remove unnecessary nesting and other differences > - Update formatting All right, this looks fine. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15852#pullrequestreview-1683198268 From tschatzl at openjdk.org Wed Oct 18 10:02:39 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 18 Oct 2023 10:02:39 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm [v2] In-Reply-To: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> References: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> Message-ID: On Tue, 17 Oct 2023 11:53:54 GMT, Leo Korinth wrote: >> In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. >> >> Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` >> >> Minimal testing completed, will run tier testing before pushing. > > Leo Korinth has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - @key flag-sensitive > - Merge branch '_master_jdk' into _8317358 > - 8317358: G1: Make TestMaxNewSize use createTestJvm Similar to @kstefanj, I am a bit doubtful about having a test that needs constraints on heap size run with any random flags. This can be seen in the very long list of @requires constraints. But I am fine with doing this experiment and seeing whether it turns out to be annoying enough to reconsider. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16012#pullrequestreview-1684635185 From shade at openjdk.org Wed Oct 18 14:27:26 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 18 Oct 2023 14:27:26 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC [v2] In-Reply-To: References: Message-ID: > See the description in the bug.
Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into JDK-8317755-g1-periodic-whole - Optional flag - Merge branch 'master' into JDK-8317755-g1-periodic-whole - Keep the cast - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16107/files - new: https://git.openjdk.org/jdk/pull/16107/files/d6791146..4025fd02 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16107&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16107&range=00-01 Stats: 32385 lines in 1033 files changed: 18888 ins; 6573 del; 6924 mod Patch: https://git.openjdk.org/jdk/pull/16107.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16107/head:pull/16107 PR: https://git.openjdk.org/jdk/pull/16107 From shade at openjdk.org Wed Oct 18 14:27:28 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 18 Oct 2023 14:27:28 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: On Mon, 9 Oct 2023 20:46:44 GMT, Aleksey Shipilev wrote: > See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` Thanks for looking at it! Re-reading JEP 346: "Promptly Return Unused Committed Memory from G1"... In the scenario we are seeing, we do have lots of unused committed memory that would not be reclaimed promptly until concurrent cycle executes.
The need for that cleanup is in worst case driven by heap connectivity changes that do not readily reflect in other observable heap metrics. A poster example would be a cache sitting perfectly in "oldgen", eventually dropping the entries, producing a huge garbage patch. We would only discover this after whole heap marking. In some cases, the tuning based on occupancy (e.g. soft max heap size, heap reserve, etc.) would help if we promote enough stuff to actually trigger the concurrent cycle. But if we keep churning very efficient young collections, we would get there very slowly. Therefore, I'd argue the current behavior is against _the spirit_ of the JEP 346, even though _the letter_ says that we track "any" GC for periodic GC. There is no explicit mention why young GCs should actually be treated as recent GCs, even though, with the benefit of hindsight, they throw away promptness guarantees. Aside: Shenandoah periodic GC does not make any claims it would run only in idle phases; albeit the story is simpler without young GCs. But Generational Shenandoah would follow the same route: the periodic whole heap GC would start periodically regardless of young collections running in between or not. The current use of "idle" is also awkward in JEP 346. If user enables `G1PeriodicGCInterval` without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. I guess we can think about "idle" as "GC is idle", but then arguably not figuring out the whole heap situation _for hours_ can be described as "GC is idle". I think the much larger point of the JEP is to reclaim memory promptly, which in turn requires whole heap GC. Looks like JEP somewhat painted itself in the corner by considering all GCs, including young. I doubt that users would mind if we change the behavior of `G1PeriodicGCInterval` like this: the option is explicitly opt-in, the configurations I see in prod are running with huge intervals, etc.
So we are asking for a relatively rare concurrent GC even when the application is doing young GCs. But I agree that departing from the current behavior might still have undesired consequences, for which we need to plan an escape route. There is also a need to have this in JDK 17u and 21u. So as a compromise, I am leaning towards introducing a flag that defines whether the periodic GC checks for the last whole heap collection or not. The default can still be the current behavior, but the behavior we want here is also easily achievable. This is implemented as a new commit. (I realize this still requires a CSR, being a product flag and all.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1768580421 From ayang at openjdk.org Wed Oct 18 16:22:00 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 18 Oct 2023 16:22:00 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC [v2] In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 14:27:26 GMT, Aleksey Shipilev wrote: >> See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into JDK-8317755-g1-periodic-whole > - Optional flag > - Merge branch 'master' into JDK-8317755-g1-periodic-whole > - Keep the cast > - Fix I believe that having periodic Garbage Collections (whether they be young, concurrent, full or mixed) may be something that user applications need in certain scenarios. However, the current JVM lacks a way for Java to invoke a specific kind of GC cycle.
If the WhiteBox API (or something equivalent) were available in Java, a user application could easily implement periodic GCs, along with any custom tweaks, e.g. the behavior of master -- short-circuiting periodic concurrent GC. That way, one can avoid confusions around `The current use of "idle" is also awkward in JEP 346...`. Just my 2 cents. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1768889026 From tschatzl at openjdk.org Thu Oct 19 07:59:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 07:59:29 GMT Subject: RFR: 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking Message-ID: Hi all, please review this change to fix a lock rank issue after [JDK-8317440](https://bugs.openjdk.org/browse/JDK-8317440). The `JfrStacktrace_lock` rank needs to be adjusted because it may be used when gathering CHT statistics in generational ZGC (afaiu this operation may trigger a relocation which may trigger page allocation which may trigger a JFR event which may trigger JFR trace writing). The `JfrStacktrace_lock` code seems to be a lock only guarding I/O too, so moving it to the lowest rank seems safe after code inspection, but also tier1-7 testing and around 100 runs of the failed application did not make it fail (reproduction is like 2/100 runs without this change).
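To make the idea of lock-rank checking concrete, here is a minimal Java sketch. It assumes a simplified rule, namely that a thread may only acquire a lock whose rank is strictly lower than the rank of every lock it already holds; HotSpot's real checks are C++ code in debug builds and have more cases. The class `RankedLock` and the rank numbers used with it are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical rank-checked lock; a simplified model, not HotSpot's Mutex.
final class RankedLock {
    // Locks currently held by the calling thread, most recent first.
    private static final ThreadLocal<Deque<RankedLock>> HELD =
            ThreadLocal.withInitial(ArrayDeque::new);

    private final String name;
    private final int rank;
    private final ReentrantLock lock = new ReentrantLock();

    RankedLock(String name, int rank) {
        this.name = name;
        this.rank = rank;
    }

    void lock() {
        // Assumed rule: the new lock must rank strictly below all held locks.
        for (RankedLock held : HELD.get()) {
            if (rank >= held.rank) {
                throw new IllegalStateException("rank violation: acquiring "
                        + name + " (rank " + rank + ") while holding "
                        + held.name + " (rank " + held.rank + ")");
            }
        }
        lock.lock();
        HELD.get().push(this);
    }

    void unlock() {
        if (!HELD.get().remove(this)) {
            throw new IllegalStateException(name + " is not held by this thread");
        }
        lock.unlock();
    }
}
```

Under this model, giving a lock the lowest rank means it can be taken while any higher-ranked lock is held, which is why lowering the rank of a lock that only guards I/O can resolve an ordering assertion of the kind described above.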
Testing: tier1-7, ~100 runs of the failing application with no error Hth, Thomas ------------- Commit messages: - change jfrstacktrace_lock rank Changes: https://git.openjdk.org/jdk/pull/16242/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16242&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318109 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16242.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16242/head:pull/16242 PR: https://git.openjdk.org/jdk/pull/16242 From tschatzl at openjdk.org Thu Oct 19 08:11:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 08:11:18 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 14:24:44 GMT, Aleksey Shipilev wrote: >> See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` > > Thanks for looking at it! > > Re-reading JEP 346: "Promptly Return Unused Committed Memory from G1"... > > In the scenario we are seeing, we do have lots of unused committed memory that would not be reclaimed promptly until concurrent cycle executes. The need for that cleanup is in worst case driven by heap connectivity changes that do not readily reflect in other observable heap metrics. A poster example would be a cache sitting perfectly in "oldgen", eventually dropping the entries, producing a huge garbage patch. We would only discover this after whole heap marking. In some cases, the tuning based on occupancy (e.g. soft max heap size, heap reserve, etc.) would help if we promote enough stuff to actually trigger the concurrent cycle. But if we keep churning very efficient young collections, we would get there very slowly.
> > Therefore, I'd argue the current behavior is against _the spirit_ of the JEP 346, even though _the letter_ says that we track "any" GC for periodic GC. There is no explicit mention why young GCs should actually be treated as recent GCs, even though, with the benefit of hindsight, they throw away promptness guarantees. Aside: Shenandoah periodic GC does not make any claims it would run only in idle phases; albeit the story is simpler without young GCs. But Generational Shenandoah would follow the same route: the periodic whole heap GC would start periodically regardless of young collections running in between or not. > > The current use of "idle" is also awkward in JEP 346. If user enables `G1PeriodicGCInterval` without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. I guess we can think about "idle" as "GC is idle", but then arguably not figuring out the whole heap situation _for hours_ can be described as "GC is idle". I think the much larger point of the JEP is to reclaim memory promptly, which in turn requires whole heap GC. Looks like JEP somewhat painted itself in the corner by considering all GCs, including young. > > I doubt that users would mind if we change the behavior of `G1PeriodicGCInterval` like this: the option is explicitly opt-in, the configurations I see in prod are running with huge intervals, etc. So we are asking for a relatively rare concurrent GC even when application is doing young GCs. But I agree that departing from the current behavior might still have undesired consequences, for which we need to plan the escape route. There is also a need to ... @shipilev : I have not made up my mind about the other parts of your proposal, but: > The current use of "idle" is also awkward in JEP 346. If user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight.
This is the reason for the `G1PeriodicGCSystemLoadThreshold` option and is handled by the feature/JEP. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1770283385 From tschatzl at openjdk.org Thu Oct 19 08:11:17 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 08:11:17 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC [v2] In-Reply-To: References: Message-ID: <5cJAc8Tppr8ogDpWztriAhnfUw6D0INhhmn1A2hxSSA=.9c4bf7e1-11c4-4907-9365-baa16187c6f9@github.com> On Wed, 18 Oct 2023 16:19:08 GMT, Albert Mingkun Yang wrote: > I believe that having periodic Garbage Collections (whether they be young, concurrent, full or mixed) may be something that user applications need in certain scenarios. However, the current JVM lacks a way for Java to invoke a specific kind of GC cycle. If the WhiteBox API (or something equivalent) were available in Java, a user application could easily implement periodic GCs, The user can already do this to some degree with `System.gc()` and `-XX:+ExplicitGCInvokesConcurrent`. I think the point @shipilev (and @gctony) make is that there are resources (not only Java heap memory) that ought to be cleaned up "in time" to keep resource usage and performance "optimal". The end user has even less insight into this than the VM - being "idle" is only a convenient time (for the user) to do that, so making the end user do random stuff does not make anyone happy. This is corroborated by my experience working with end users actually doing that (i.e. forcing a whole heap analysis externally). The solution was to tell them to not do that because they were spamming these to an extent very detrimental to performance - because they did not know better (it did not help that at that time a forced concurrent cycle interrupted an ongoing one, so it happened that effectively they went into a full gc sometimes).
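For reference, the application-side approach mentioned above (`System.gc()` on a timer) might look like the following minimal Java sketch. The class `PeriodicGcRequester` is hypothetical and only uses standard `java.util.concurrent` APIs. As the message itself warns, requesting collections blindly from the application can be very detrimental to performance, so this is shown to clarify the discussion, not as a recommendation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper that asks the JVM for a collection at a fixed delay.
final class PeriodicGcRequester implements AutoCloseable {
    // Daemon thread, so the requester never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "periodic-gc-requester");
                t.setDaemon(true);
                return t;
            });

    // Default: request a collection through System.gc().
    PeriodicGcRequester(long interval, TimeUnit unit) {
        this(interval, unit, System::gc);
    }

    // The request action is injectable so the scheduling can be exercised
    // without actually triggering collections.
    PeriodicGcRequester(long interval, TimeUnit unit, Runnable gcRequest) {
        scheduler.scheduleWithFixedDelay(gcRequest, interval, interval, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Whether `System.gc()` starts a concurrent cycle or a full collection depends on the collector and on flags such as `-XX:+ExplicitGCInvokesConcurrent`; the sketch has no knowledge of recent GC activity or system load, which is exactly the insight the VM-side periodic GC has and the application lacks.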
Both ZGC and Shenandoah have this regular cleanup (and CMS also provided the option) based on time. Not the best solution, but workable. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1770279786 From rrich at openjdk.org Thu Oct 19 09:16:04 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 09:16:04 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v23] In-Reply-To: References: Message-ID: <1HXBowkNZo1iyNgOVA6qZYcyz0alZ8r5FBMQ3FqAvTE=.8a95691e-f02d-486a-96c3-88d5ec836228@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. 
Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 36 additional commits since the last revision: - Use better name: _preprocessing_active_workers - Merge branch 'master' - Remove obsolete comment - Feedback Albert - Merge branch 'master' - Re-cleanup (was accidentally reverted) - Make sure to scan obj reaching in just once - Simplification suggested by Albert - Don't overlap card table processing with scavenging for simplicity - Cleanup - ...
and 26 more: https://git.openjdk.org/jdk/compare/71431ab6...f7965512 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/607f0c22..f7965512 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=21-22 Stats: 14751 lines in 659 files changed: 9415 ins; 2652 del; 2684 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 19 09:22:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 09:22:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v11] In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 14:13:23 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with five additional commits since the last revision: >> >> - Eliminate special case for scanning the large array end >> - First card of large array should be cleared if dirty >> - Do all large array scanning in separate method >> - Limit stripe size to 1m with at least 8 threads >> - Small clean-ups > > Hi, > >> > I experimented with the aforementioned read-only card table idea a bit and here is the draft: >> > https://github.com/openjdk/jdk/compare/master...albertnetymk:jdk:pgc-precise-obj-arr?expand=1 >> >> This looks very nice! The code is a lot easier to follow than the baseline and this pr. >> >> With your draft I found out too that the regressions with just 2 threads come from the remaining `object_start` calls. Larger stripes mean fewer of them. The caching used in your draft is surely better. >> >> So by default 1 card table byte per 512b card is needed. The shadow card table will require 2M per gigabyte used old generation. I guess that's affordable.
>> >> Would you think that your solution can be backported? > > I had a brief look at @albertnetymk's suggestion, a few comments: > > * it uses another card table - while "just" another 0.2% of the heap, we should try to avoid such regressions. G1 also does not need another card table... maybe some more effort should be put into optimizing that one away. > * obviously allocating and freeing during the pause is suboptimal wrt to pause time so the prototype should be improved in that regard :) > * the copying will stay (if there is a second card table), I would be interested in pause time changes for more throughput'y applications (jbb2005, timefold/optaplanner https://timefold.ai/blog/2023/java-21-performance) > * anything can be backported, but the question is whether the individual maintainers of these versions are going to. It does have a good case though which may make it easier to convince maintainers. > > Hth, > Thomas I don't intend to make any further changes or tests. So the pr would be ready for another look at it, @tschatzl , if you want. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1770407514 From lkorinth at openjdk.org Thu Oct 19 09:26:45 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 19 Oct 2023 09:26:45 GMT Subject: RFR: 8317358: G1: Make TestMaxNewSize use createTestJvm [v2] In-Reply-To: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> References: <_gLpPcem1tKtwWI7WyirwfrSUfQhyCVjuuf30ShQCoQ=.578fa9f0-3ea8-4ee2-a9a7-dce53addf0ad@github.com> Message-ID: <4txUaWNhUteYCQDDZsVWfN1Q8-wFeQgHaDBi5EeGgzI=.44ecaccc-de7f-4706-9107-d48ff626ca55@github.com> On Tue, 17 Oct 2023 11:53:54 GMT, Leo Korinth wrote: >> In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. 
>> >> Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` >> >> Minimal testing completed, will run tier testing before pushing. > > Leo Korinth has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - @key flag-sensitive > - Merge branch '_master_jdk' into _8317358 > - 8317358: G1: Make TestMaxNewSize use createTestJvm Thanks Thomas and Stefan! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16012#issuecomment-1770414016 From lkorinth at openjdk.org Thu Oct 19 09:29:44 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 19 Oct 2023 09:29:44 GMT Subject: Integrated: 8317358: G1: Make TestMaxNewSize use createTestJvm In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 15:09:49 GMT, Leo Korinth wrote: > In addition remove deprecated `Long(long)` constructor and rewrite `compareTo` to use `> 0` instead of `== 1`. > > Also remove unused `isRunningG1(String[] args)` and `checkIncompatibleNewSize(String[] flags)` > > Minimal testing completed, will run tier testing before pushing. This pull request has now been integrated. 
Changeset: 1a098356 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/1a098356dd3a157b12c2b5c527e61c8a628bdb2d Stats: 33 lines in 1 file changed: 3 ins; 23 del; 7 mod 8317358: G1: Make TestMaxNewSize use createTestJvm Reviewed-by: tschatzl, sjohanss ------------- PR: https://git.openjdk.org/jdk/pull/16012 From ayang at openjdk.org Thu Oct 19 12:00:54 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 19 Oct 2023 12:00:54 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v30] In-Reply-To: References: Message-ID: On Fri, 13 Oct 2023 01:38:13 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Add call to publish in parallel gc and update counter names Since it introduces new APIs in `CollectedHeap`... ------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1770750667 From ayang at openjdk.org Thu Oct 19 12:47:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 19 Oct 2023 12:47:57 GMT Subject: RFR: 8318510: Serial: Remove TenuredGeneration::block_size Message-ID: Simple one-line change inside `complete_loaded_archive_space`, then `block_size` becomes unused. 
------------- Commit messages: - s1-block-size Changes: https://git.openjdk.org/jdk/pull/16266/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16266&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318510 Stats: 27 lines in 6 files changed: 0 ins; 26 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16266.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16266/head:pull/16266 PR: https://git.openjdk.org/jdk/pull/16266 From tschatzl at openjdk.org Thu Oct 19 13:46:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 13:46:55 GMT Subject: RFR: 8318510: Serial: Remove TenuredGeneration::block_size In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 12:41:53 GMT, Albert Mingkun Yang wrote: > Simple one-line change inside `complete_loaded_archive_space`, then `block_size` becomes unused. lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16266#pullrequestreview-1687782306 From rrich at openjdk.org Thu Oct 19 14:49:22 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 14:49:22 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. 
> > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: preprocess_card_table_parallel should be private ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/f7965512..26f06361 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=22-23 Stats: 16 lines in 1 file changed: 8 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From tschatzl at openjdk.org Thu Oct 19 15:25:11 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 15:25:11 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 14:49:22 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
>> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > preprocess_card_table_parallel should be private Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/parallel/psCardTable.cpp line 146: > 144: // to all stripes (if any) they extend to.
> 145: // A copy of card table entries corresponding to the stripe called "shadow" table > 146: // is used to separate card reading, clearing and redirtying. It looks like this documentation (at least the last part starting with the second sentence) should be placed to `scavenge_contents_parallel` because _this_ method does not do preprocessing (but `scavenge_contents_parallel`). (And the first sentence should be in the definitions in the hpp file) Generally I have a feeling that the documentation is scattered across too many places and that imo makes things harder to understand than necessary. I.e. I would really prefer some larger block summarizing how things work with at most small details added to the methods. The interface (.hpp) file has no documentation whatsoever, although I would assume that this would be the entry point trying to understand what is happening here. src/hotspot/share/gc/parallel/psCardTable.cpp line 157: > 155: StripeShadowTable sct(this, MemRegion(start, end)); > 156: > 157: // end might not be card-aligned Suggestion: // end might not be card-aligned. src/hotspot/share/gc/parallel/psCardTable.cpp line 171: > 169: } > 170: > 171: // Located a non-empty dirty chunk [dirty_l, dirty_r) Suggestion: // Located a non-empty dirty chunk [dirty_l, dirty_r). src/hotspot/share/gc/parallel/psCardTable.cpp line 175: > 173: HeapWord* addr_r = MIN2(sct.addr_for(dirty_r), end); > 174: > 175: // Scan objects overlapping [addr_l, addr_r) limited to [start, end) Suggestion: // Scan objects overlapping [addr_l, addr_r) limited to [start, end). src/hotspot/share/gc/parallel/psCardTable.cpp line 186: > 184: > 185: if (is_obj_array) { > 186: // precise-marked Suggestion: // Always scan obj arrays precisely (they are always marked precisely) to avoid unnecessary work. src/hotspot/share/gc/parallel/psCardTable.cpp line 190: > 188: } else { > 189: if (obj_addr < i_addr && i_addr > start) { > 190: // already-scanned Suggestion: // Already scanned this object. 
Has been one that spans multiple dirty chunks. The second condition makes sure // that we always scan the (non-Array) object reaching into this stripe. src/hotspot/share/gc/parallel/psCardTable.cpp line 201: > 199: } > 200: > 201: // move to next obj inside this dirty chunk Suggestion: // Move to next obj inside this dirty chunk. src/hotspot/share/gc/parallel/psCardTable.cpp line 205: > 203: } > 204: > 205: // Finished a dirty chunk Suggestion: // Finished a dirty chunk. src/hotspot/share/gc/parallel/psCardTable.cpp line 210: > 208: } > 209: > 210: // Propagate imprecise card marks from object start to the stripes an object extends to. Suggestion: // Propagate imprecise card marks from object start to all stripes an object extends to this thread is assigned to. (I saw that this is actually duplicated from the .hpp file. Better to improve the one in the .hpp file and remove this one) src/hotspot/share/gc/parallel/psCardTable.cpp line 217: > 215: uint stripe_index, > 216: uint n_stripes) { > 217: const uint active_workers = n_stripes; Suggestion: `active_workers` is unused. src/hotspot/share/gc/parallel/psCardTable.cpp line 221: > 219: CardValue* cur_card = byte_for(old_gen_bottom) + stripe_index * num_cards_in_stripe; > 220: CardValue* const end_card = byte_for(old_gen_top - 1) + 1; > 221: HeapWord* signaled_goal = nullptr; Unused too. Suggestion: src/hotspot/share/gc/parallel/psCardTable.cpp line 235: > 233: } > 234: } > 235: } I think this code becomes more clear if the nested-ifs are replaced by negation and `continue`. I also added some additional comments giving reasons for the conditions. Suggestion: for (CardValue* cur_card = byte_for(old_gen_bottom) + stripe_index * num_cards_in_stripe; // this may be left outside, your call, it is a bit long. 
cur_card < end_card; cur_card += num_cards_in_slice) { HeapWord* stripe_addr = addr_for(cur_card); if (is_dirty(cur_card)) { // The first card of this stripe is already dirty, no need to see if the reaching-in object is a potentially imprecisely marked non-array object. continue; } HeapWord* first_obj_addr = object_start(stripe_addr); if (first_obj_addr == stripe_addr) { // (random comment) can't be > I think // No object reaching into this stripe. continue; } oop first_obj = cast_to_oop(first_obj_addr); if (!first_obj->is_array() && is_dirty(byte_for(first_obj_addr))) { // Found a non-array object reaching into the stripe assigned to this thread that has potentially been marked imprecisely. // Mark first card of stripe dirty so that this thread will process it later. *cur_card = dirty_card_val(); } } src/hotspot/share/gc/parallel/psCardTable.cpp line 242: > 240: SpinYield spin_yield; > 241: while (Atomic::load_acquire(&_preprocessing_active_workers) > 0) { > 242: spin_yield.wait(); I would prefer to have the synchronization as part of `scavenge_contents_parallel`; i.e. the logic there being: Prepare Scavenge; Synchronize; Scavenge. Here the synchronization feels out of place and surprising for a method that nowhere indicates that it is doing anything other than preprocessing the table. src/hotspot/share/gc/parallel/psCardTable.cpp line 309: > 307: > 308: const size_t stripe_size_in_words = num_cards_in_stripe * _card_size_in_words; > 309: const size_t slice_size_in_words = stripe_size_in_words * n_stripes; Reiterating this, these two initializations seem to be related to the "Scavenge" phase of this method and should be placed there. src/hotspot/share/gc/parallel/psCardTable.cpp line 311: > 309: const size_t slice_size_in_words = stripe_size_in_words * n_stripes; > 310: > 311: // Prepare scavenge Suggestion: // Prepare scavenge.
src/hotspot/share/gc/parallel/psCardTable.cpp line 315: > 313: > 314: // Reset cached object > 315: cached_obj = {nullptr, old_gen_bottom}; Suggestion: // Prepare for actual scavenge. const size_t stripe_size_in_words = num_cards_in_stripe * _card_size_in_words; const size_t slice_size_in_words = stripe_size_in_words * n_stripes; cached_obj = {nullptr, old_gen_bottom}; (I do not feel that "Reset cached object" adds a lot) src/hotspot/share/gc/parallel/psCardTable.cpp line 319: > 317: // Scavenge > 318: HeapWord* cur_addr = old_gen_bottom + stripe_index * stripe_size_in_words; > 319: for (/* empty */; cur_addr < old_gen_top; cur_addr += slice_size_in_words) { Suggestion: for (HeapWord* cur_addr = old_gen_bottom + stripe_index * stripe_size_in_words; cur_addr < old_gen_top; cur_addr += slice_size_in_words) { feels better to me than that `/*empty*/` marker, but both are fine. src/hotspot/share/gc/parallel/psCardTable.hpp line 35: > 33: class PSPromotionManager; > 34: > 35: class PSCardTable: public CardTable { I am aware that `PSCardTable`'s function is not only scavenging support, but I would prefer documentation of how this works here (or with the `scavenge_contents_parallel` method, at "Scavenge Support" comment) src/hotspot/share/gc/parallel/psCardTable.hpp line 36: > 34: > 35: class PSCardTable: public CardTable { > 36: private: Suggestion: Unnecessary. src/hotspot/share/gc/parallel/psCardTable.hpp line 44: > 42: const CardValue* _table_base; > 43: > 44: public: Suggestion: public: Should align with `class`. src/hotspot/share/gc/parallel/psCardTable.hpp line 45: > 43: > 44: public: > 45: StripeShadowTable(PSCardTable* pst, MemRegion stripe) : Fwiw, this is the only place I can find in this change where a memory range is passed as `MemRegion`. All other places pass the start/end pointers directly. Nothing to do here, but maybe for uniformity stick to one or the other.
(I somewhat prefer using `MemRegion`, but not a strong opinion at all) src/hotspot/share/gc/parallel/psCardTable.hpp line 47: > 45: StripeShadowTable(PSCardTable* pst, MemRegion stripe) : > 46: _table_base(_table - (uintptr_t(stripe.start()) >> _card_shift)) { > 47: // Old gen top is not card aligned. Suggestion: // Old gen top may not be card aligned. src/hotspot/share/gc/parallel/psCardTable.hpp line 49: > 47: // Old gen top is not card aligned. > 48: size_t copy_length = align_up(stripe.byte_size(), _card_size) >> _card_shift; > 49: size_t clear_length = align_down(stripe.byte_size(), _card_size) >> _card_shift; Can you explain why `align_down` is needed here? I remember some reason why this needs to be the case at least for the old code, and @albertnetymk also explained it to me recently, but just now I can't figure it out (and it may not be required any more). Please add a comment, this is not obvious. src/hotspot/share/gc/parallel/psCardTable.hpp line 51: > 49: size_t clear_length = align_down(stripe.byte_size(), _card_size) >> _card_shift; > 50: memcpy(_table, pst->byte_for(stripe.start()), copy_length); > 51: memset(pst->byte_for(stripe.start()), clean_card_val(), clear_length); `pst->byte_for(stripe.start())` could be extracted out. src/hotspot/share/gc/parallel/psPromotionManager.inline.hpp line 135: > 133: > 134: inline void PSPromotionManager::push_contents_bounded(oop obj, HeapWord* left, HeapWord* right) { > 135: if (!obj->klass()->is_typeArray_klass()) { I would probably put this check into `scan_obj_with_limit()` to also avoid the unnecessary prefetch and initialization of `PSPushContentsClosure`. `scan_obj_with_limit` seems to be the only caller anyway. 
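For readers following the stripe and slice terminology used in the comments above, here is a minimal standalone sketch of the iteration pattern in the quoted `for` loop. It is not HotSpot code: `stripe_starts` is a hypothetical helper, and plain word offsets relative to the old generation bottom stand in for `HeapWord*` addresses. It only illustrates that a worker with a given `stripe_index` visits one stripe per slice, where a slice is `n_stripes` consecutive stripes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HotSpot): returns the start offsets, in
// words relative to the old generation bottom, of the stripes that the worker
// with the given stripe_index would visit.
std::vector<std::size_t> stripe_starts(std::size_t old_gen_words,
                                       std::size_t stripe_size_in_words,
                                       unsigned stripe_index,
                                       unsigned n_stripes) {
  assert(stripe_size_in_words > 0 && n_stripes > 0);
  // A slice is one round of n_stripes consecutive stripes.
  const std::size_t slice_size_in_words = stripe_size_in_words * n_stripes;
  std::vector<std::size_t> starts;
  // Mirrors the shape of the quoted loop: begin at this worker's stripe in
  // the first slice, then jump a whole slice at a time.
  for (std::size_t cur = stripe_index * stripe_size_in_words;
       cur < old_gen_words;
       cur += slice_size_in_words) {
    starts.push_back(cur);
  }
  return starts;
}
```

With these assumed numbers, `stripe_starts(1000, 100, 1, 4)` visits the stripes starting at word offsets 100, 500 and 900, so consecutive stripes go to different workers and every worker touches each part of the old generation.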
------------- PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1687805994 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365643777 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365704196 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365704534 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365704876 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365708507 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365709874 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365716122 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365716661 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365675777 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365596378 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365595698 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365684229 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365656451 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365658417 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365668045 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365667245 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365662064 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365719821 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365604379 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365606118 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365697976 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365692703 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365608483 PR Review Comment: 
https://git.openjdk.org/jdk/pull/14846#discussion_r1365692173 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365724310 From tschatzl at openjdk.org Thu Oct 19 15:25:20 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 15:25:20 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v23] In-Reply-To: <1HXBowkNZo1iyNgOVA6qZYcyz0alZ8r5FBMQ3FqAvTE=.8a95691e-f02d-486a-96c3-88d5ec836228@github.com> References: <1HXBowkNZo1iyNgOVA6qZYcyz0alZ8r5FBMQ3FqAvTE=.8a95691e-f02d-486a-96c3-88d5ec836228@github.com> Message-ID: On Thu, 19 Oct 2023 09:16:04 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
>> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 36 additional commits since the last revision: > > - Use better name: _preprocessing_active_workers > - Merge branch 'master' > - Remove obsolete comment > - Feedback Albert > - Merge branch 'master' > - Re-cleanup (was accidentally reverted) > - Make sure to scan obj reaching in just once > - Simplification suggested by Albert > - Don't overlap card table processing with scavenging for simplicity > - Cleanup > - ... and 26 more: https://git.openjdk.org/jdk/compare/8e97b7e4...f7965512 src/hotspot/share/gc/parallel/psCardTable.hpp line 91: > 89: return end; > 90: } > 91: }; Could these implementations be moved into the .cpp file? They are only ever referenced from there and should be inlined anyway, so as not to clog the interface/hpp file too much. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365581799 From tschatzl at openjdk.org Thu Oct 19 15:39:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 15:39:32 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates Message-ID: Hi all, please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` into a single call to `clear()` with appropriate parameters. Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region.
Testing: gha, tier1-5 almost done Thanks, Thomas ------------- Commit messages: - Remove obsolete/incorrect asserts - initial version Changes: https://git.openjdk.org/jdk/pull/16270/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16270&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318507 Stats: 10 lines in 1 file changed: 0 ins; 7 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16270.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16270/head:pull/16270 PR: https://git.openjdk.org/jdk/pull/16270 From ayang at openjdk.org Thu Oct 19 16:04:31 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 19 Oct 2023 16:04:31 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 14:07:21 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> preprocess_card_table_parallel should be private > > src/hotspot/share/gc/parallel/psCardTable.hpp line 49: > >> 47: // Old gen top is not card aligned. >> 48: size_t copy_length = align_up(stripe.byte_size(), _card_size) >> _card_shift; >> 49: size_t clear_length = align_down(stripe.byte_size(), _card_size) >> _card_shift; > > Can you explain why `align_down` is needed here? I remember some reason why this needs to be the case at least for the old code, and @albertnetymk also explained it to me recently, but just now I can't figure it out (and it may not be required any more). Please add a comment, this is not obvious. Since old-gen-top before scavenging might not be card-aligned, it's unsafe to clear it; hence the conservative (align-down) calculation. However, the right shift will do implicit align-down as well, so it is probably not needed. Better be explicit, I was thinking. Either is fine, I guess. 
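The point about the right shift already rounding down can be checked with a small standalone sketch. It is not the HotSpot code: `kCardShift`, `kCardSize` and the simplified `align_up`/`align_down` below are stand-ins that assume a 512-byte card (shift of 9) and power-of-two alignment; `copy_length`, `clear_length` and `clear_length_shift_only` are hypothetical names mirroring the two expressions in the quoted constructor.

```cpp
#include <cassert>
#include <cstddef>

// Assumed for illustration only: 512-byte cards, i.e. a card shift of 9.
constexpr std::size_t kCardShift = 9;
constexpr std::size_t kCardSize  = std::size_t{1} << kCardShift;

// Simplified stand-ins for align_down/align_up; 'a' must be a power of two.
constexpr std::size_t align_down(std::size_t v, std::size_t a) {
  return v & ~(a - 1);
}
constexpr std::size_t align_up(std::size_t v, std::size_t a) {
  return (v + a - 1) & ~(a - 1);
}

// Cards to copy into the shadow table: includes a trailing partial card.
constexpr std::size_t copy_length(std::size_t stripe_bytes) {
  return align_up(stripe_bytes, kCardSize) >> kCardShift;
}
// Cards to clear in the real table: excludes a trailing partial card.
constexpr std::size_t clear_length(std::size_t stripe_bytes) {
  return align_down(stripe_bytes, kCardSize) >> kCardShift;
}
// The same count without the explicit align_down: the shift truncates.
constexpr std::size_t clear_length_shift_only(std::size_t stripe_bytes) {
  return stripe_bytes >> kCardShift;
}

// A 1300-byte stripe covers two full cards plus a partial third one.
static_assert(copy_length(1300) == 3, "partial card is copied");
static_assert(clear_length(1300) == 2, "partial card is not cleared");
static_assert(clear_length_shift_only(1300) == clear_length(1300),
              "the shift alone already rounds down");
```

Under these assumptions the explicit `align_down` does not change the result; it only documents the intent that a partially covered last card must not be cleared, which matches the "better be explicit" reasoning above.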
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365791289 From duke at openjdk.org Thu Oct 19 16:14:20 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Thu, 19 Oct 2023 16:14:20 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> References: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> Message-ID: On Mon, 16 Oct 2023 20:03:44 GMT, Leela Mohan Venati wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup old oop map cache entry after class redefinition > > src/hotspot/share/gc/shenandoah/shenandoahVMOperations.cpp line 64: > >> 62: OopMapCache::cleanup_old_entries(); >> 63: } >> 64: > > Do you think, VM_ShenandoahFinalMarkStartEvac walks the stack roots. If yes, i recommend adding OopMapCache::cleanup_old_entries() in VM_ShenandoahOperation::doit_epilogue(). And this would make the change simple and also revert the change in this [PR](https://github.com/openjdk/jdk/pull/15921) I stand corrected. My question is still relevant >> Do you think, VM_ShenandoahFinalMarkStartEvac walks the stack roots. My recommendation is incorrect. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1365804097 From rrich at openjdk.org Thu Oct 19 16:34:53 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 16:34:53 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v25] In-Reply-To: References: Message-ID: <_8M6U5h8p7PTuQ6XL113JmjeiSYG3dHiF2pNe0D36qo=.c9297e9b-d638-4ec2-993f-6a90e375934c@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. 
This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in Gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well.
Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Small cleanup changes suggested by Thomas. Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/26f06361..7843a023 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=23-24 Stats: 11 lines in 2 files changed: 0 ins; 3 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 19 16:34:55 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 16:34:55 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 14:41:37 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> preprocess_card_table_parallel should be private > > src/hotspot/share/gc/parallel/psCardTable.cpp line 315: > >> 313: >> 314: // Reset cached object >> 315: cached_obj = {nullptr, old_gen_bottom}; > > Suggestion: > > // Prepare for actual scavenge.
> const size_t stripe_size_in_words = num_cards_in_stripe * _card_size_in_words; > const size_t slice_size_in_words = stripe_size_in_words * n_stripes; > > cached_obj = {nullptr, old_gen_bottom}; > > > (I do not feel that "Reset cached object" adds a lot) The reset is needed because the cache requires that queries are monotonic. There was an assertion checking monotonicity. Albert thought it wouldn't be needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365819544 From rrich at openjdk.org Thu Oct 19 16:42:27 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 16:42:27 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v26] In-Reply-To: References: Message-ID: <_2a21qNJwmnjwMP7EREBeCgrZvNOgx6ScN55rstyOUM=.e6d83a4f-dcb0-431b-8275-f95b3940db8c@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
> > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: More small changes Thomas suggested (line-breaks needed) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/7843a023..bd853c4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=24-25 Stats: 6 lines in 1 file changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From iwalulya at openjdk.org Thu Oct 19 17:41:53 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Oct 2023 17:41:53 GMT Subject: RFR: 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 10:28:46 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change to fix a lock rank issue after [JDK-8317440](https://bugs.openjdk.org/browse/JDK-8317440). The `JfrStacktrace_lock` rank needs to be adjusted because it may be used when gathering CHT statistics in generational ZGC (afaiu this operation may trigger a relocation which may trigger page allocation which may trigger a JFR event which may trigger JFR trace writing). > > The `JfrStacktrace_lock` code seems to be a lock only guarding I/O too, so moving it to the lowest rank seems safe after code inspection, but also tier1-7 testing and around 100 runs of the failed application did not make it fail (reproduction is like 2/100 runs without this change). > > Testing: tier1-7, ~100 runs of the failing application with no error > > Hth, > Thomas Marked as reviewed by iwalulya (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16242#pullrequestreview-1688262899 From iwalulya at openjdk.org Thu Oct 19 17:42:16 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Oct 2023 17:42:16 GMT Subject: RFR: 8318296: Move Space::initialize to ContiguousSpace In-Reply-To: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> References: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> Message-ID: On Tue, 17 Oct 2023 11:57:59 GMT, Albert Mingkun Yang wrote: > Simple refactoring of moving methods to subclass. (`clear` is also moved, as it is called inside `initialize`.) Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16219#pullrequestreview-1688256860 From shade at openjdk.org Thu Oct 19 17:43:42 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 19 Oct 2023 17:43:42 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 14:24:44 GMT, Aleksey Shipilev wrote: >> See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` > > Thanks for looking at it! > > Re-reading JEP 346: "Promptly Return Unused Committed Memory from G1"... > > In the scenario we are seeing, we do have lots of unused committed memory that would not be reclaimed promptly until concurrent cycle executes. The need for that cleanup is in worst case driven by heap connectivity changes that do not readily reflect in other observable heap metrics. A poster example would be a cache sitting perfectly in "oldgen", eventually dropping the entries, producing a huge garbage patch. We would only discover this after whole heap marking. In some cases, the tuning based on occupancy (e.g.
soft max heap size, heap reserve, etc.) would help if we promote enough stuff to actually trigger the concurrent cycle. But if we keep churning very efficient young collections, we would get there very slowly. > > Therefore, I'd argue the current behavior is against _the spirit_ of the JEP 346, even though _the letter_ says that we track "any" GC for periodic GC. There is no explicit mention why young GCs should actually be treated as recent GCs, even though, with the benefit of hindsight, they throw away promptness guarantees. Aside: Shenandoah periodic GC does not make any claims it would run only in idle phases; albeit the story is simpler without young GCs. But Generational Shenandoah would follow the same route: the periodic whole heap GC would start periodically regardless of young collections running in between or not. > > The current use of "idle" is also awkward in JEP 346. If user enables `G1PeriodicGCInterval` without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. I guess we can think about "idle" as "GC is idle", but then arguably not figuring out the whole heap situation _for hours_ can be described as "GC is idle". I think the much larger point of the JEP is to reclaim memory promptly, which in turn requires whole heap GC. Looks like JEP somewhat painted itself in the corner by considering all GCs, including young. > > I doubt that users would mind if we change the behavior of `G1PeriodicGCInterval` like this: the option is explicitly opt-in, the configurations I see in prod are running with huge intervals, etc. So we are asking for a relatively rare concurrent GC even when application is doing young GCs. But I agree that departing from the current behavior might still have undesired consequences, for which we need to plan the escape route. There is also a need to ...
> @shipilev : I have not made up my mind about the other parts of your proposal, but: > > > The current use of "idle" is also awkward in JEP 346. If user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. > > This is the reason for the `G1PeriodicGCSystemLoadThreshold` option and is handled by the feature/JEP. Yes, that is why I said "without doing anything else". With that example, I wanted to point out that the definition of "idle" is already quite murky even with current JEP, where we have an additional option to tell if "idle" includes the actual system load. In this view, having another option that would tell if "idle" includes young GCs fits well, I think. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1771405208 From shade at openjdk.org Thu Oct 19 17:43:47 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 19 Oct 2023 17:43:47 GMT Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC [v2] In-Reply-To: <5cJAc8Tppr8ogDpWztriAhnfUw6D0INhhmn1A2hxSSA=.9c4bf7e1-11c4-4907-9365-baa16187c6f9@github.com> References: <5cJAc8Tppr8ogDpWztriAhnfUw6D0INhhmn1A2hxSSA=.9c4bf7e1-11c4-4907-9365-baa16187c6f9@github.com> Message-ID: <6PMLrz1zvgLREP5K3VGphMy8BbH5CcNiA0wv2ZI4V7A=.acd2f93d-157a-431c-97a9-05bf8bd117d4@github.com> On Thu, 19 Oct 2023 08:06:02 GMT, Thomas Schatzl wrote: > This is corroborated by experience from me working with end users actually doing that (i.e. force a whole heap analysis externally). Yes, and that is the beauty of periodic GCs that are driven by GC itself, when it knows it should not start the periodic GCs if there was recent activity. That part is captured in JEP, I think. What this PR does is extend the usefulness of periodic GCs to the cases where you need a whole heap analysis to make the best call about what to do next.
In G1 case, maybe discovering that we need to start running mixed collections to unclutter the heap. We can indeed make an external agent that calls `System.gc()` with `+ExplicitGCInvokesConcurrent`, but that IMO ignores the purpose, usefulness and reliability of periodic GCs... if we make sure they cover the use cases we care about. Otherwise, off to explicit GCs we go! That would feel like a major step backwards. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1771415840 From iwalulya at openjdk.org Thu Oct 19 18:03:38 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Oct 2023 18:03:38 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 15:29:08 GMT, Thomas Schatzl wrote: > Hi all, > > please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. > > Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. > > Testing: gha, tier1-5 almost done > > Thanks, > Thomas LGTM! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16270#pullrequestreview-1688267191 From iwalulya at openjdk.org Thu Oct 19 18:07:01 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Oct 2023 18:07:01 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: On Mon, 2 Oct 2023 12:56:27 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`. 
Reasons: > > * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) > * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). > * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. > * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. > > Testing: gha > > Thanks, > Thomas Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16011#pullrequestreview-1688249138 From iwalulya at openjdk.org Thu Oct 19 18:07:04 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Oct 2023 18:07:04 GMT Subject: RFR: 8318510: Serial: Remove TenuredGeneration::block_size In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 12:41:53 GMT, Albert Mingkun Yang wrote: > Simple one-line change inside `complete_loaded_archive_space`, then `block_size` becomes unused. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16266#pullrequestreview-1688259538 From tschatzl at openjdk.org Thu Oct 19 18:49:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 18:49:35 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 16:24:57 GMT, Richard Reingruber wrote: >> src/hotspot/share/gc/parallel/psCardTable.cpp line 315: >> >>> 313: >>> 314: // Reset cached object >>> 315: cached_obj = {nullptr, old_gen_bottom}; >> >> Suggestion: >> >> // Prepare for actual scavenge. 
>> const size_t stripe_size_in_words = num_cards_in_stripe * _card_size_in_words; >> const size_t slice_size_in_words = stripe_size_in_words * n_stripes; >> >> cached_obj = {nullptr, old_gen_bottom}; >> >> >> (I do not feel that "Reset cached object" adds a lot) > > The reset is needed because the cache requires that queries are monotonic. There was an assertion checking monotonicity. Albert thought it wouldn't be needed. I meant the comment "Reset Cached Object". I also think that the code is required :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365959327 From tschatzl at openjdk.org Thu Oct 19 18:49:37 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 18:49:37 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 16:01:20 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/parallel/psCardTable.hpp line 49: >> >>> 47: // Old gen top is not card aligned. >>> 48: size_t copy_length = align_up(stripe.byte_size(), _card_size) >> _card_shift; >>> 49: size_t clear_length = align_down(stripe.byte_size(), _card_size) >> _card_shift; >> >> Can you explain why `align_down` is needed here? I remember some reason why this needs to be the case at least for the old code, and @albertnetymk also explained it to me recently, but just now I can't figure it out (and it may not be required any more). Please add a comment, this is not obvious. > > Since old-gen-top before scavenging might not be card-aligned, it's unsafe to clear it; hence the conservative (align-down) calculation. However, the right shift will do implicit align-down as well, so it is probably not needed. Better be explicit, I was thinking. Either is fine, I guess. 
The highest value `byte_size()` can have is old_gen-end - old_gen_bottom (both card-aligned; one stripe, one slice), which is the exact length needed when covering all cards. Any top value != end must have a committed corresponding card table entry, otherwise marking the card that contains `top` would crash. Also the copying would fail then, reading from uncommitted areas beyond the card table. The code does not do that afaics. So the only problematic one I can see would be clearing the card exactly starting at old_gen-end, which an `align_up()` wouldn't do either. So I do not completely get why clearing the card containing top would be unsafe. Can you give an example? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365962741 From rrich at openjdk.org Thu Oct 19 19:06:49 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 19:06:49 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: <2sSmoJUYGi_R0UTfD_3FA2o2q_n_zljpP2hYxqDWiZQ=.d8967c04-ee42-4338-b5ba-67c9c7633953@github.com> On Thu, 19 Oct 2023 18:38:03 GMT, Thomas Schatzl wrote: >> Since old-gen-top before scavenging might not be card-aligned, it's unsafe to clear it; hence the conservative (align-down) calculation. However, the right shift will do implicit align-down as well, so it is probably not needed. Better be explicit, I was thinking. Either is fine, I guess. > > The highest value `byte_size()` can have is old_gen-end - old_gen_bottom (both card-aligned; one stripe, one slice), which is the exact length needed when covering all cards. > Any top value != end must have a committed corresponding card table entry, otherwise marking the card that contains `top` would crash. Also the copying would fail then, reading from uncommitted areas beyond the card table. The code does not do that afaics. 
> > So the only problematic one I can see would be clearing the card exactly starting at old_gen-end, which an `align_up()` wouldn't do either. > > So I do not completely get why clearing the card containing top would be unsafe. Can you give an example? Likely I do not completely understand what you are saying but this would be my explanation why the `align_down` for `clear_length` is needed. `T := old_gen->object_space()->top()` is not necessarily card aligned at scavenge start. We must not clear the card for `T` if an object was copied there because it was promoted and it has a reference to a young object on that card. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1365992310 From rrich at openjdk.org Thu Oct 19 21:27:09 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 21:27:09 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v27] In-Reply-To: References: Message-ID: <8qkepkG9gUUj8f_GfFCZww2VPlK8htgBmLV1c4EbIWk=.25825e8f-46b0-412a-9373-17eef459a5d9@github.com> > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
> > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Review Thomas ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/bd853c4a..fd5d0725 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=25-26 Stats: 164 lines in 3 files changed: 82 ins; 65 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Oct 19 21:30:17 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 21:30:17 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v27] In-Reply-To: <8qkepkG9gUUj8f_GfFCZww2VPlK8htgBmLV1c4EbIWk=.25825e8f-46b0-412a-9373-17eef459a5d9@github.com> References: <8qkepkG9gUUj8f_GfFCZww2VPlK8htgBmLV1c4EbIWk=.25825e8f-46b0-412a-9373-17eef459a5d9@github.com> Message-ID: <2WssUSGC3w_LXQw8t3saKVnyBmoqox5gnbMfB3gr3XI=.c82c3412-0210-4007-b8e5-41bd8ceec597@github.com> On Thu, 19 Oct 2023 21:27:09 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. 
>> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ...
> > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Review Thomas Thanks for all the feedback Thomas! // Haven't incorporated all of it yet. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1771730013 From tschatzl at openjdk.org Thu Oct 19 21:46:45 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 19 Oct 2023 21:46:45 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: <2sSmoJUYGi_R0UTfD_3FA2o2q_n_zljpP2hYxqDWiZQ=.d8967c04-ee42-4338-b5ba-67c9c7633953@github.com> Message-ID: <-z08i1g6s8t-is8iRk-C6Br94tGw0ZQwRvJsFjvTIlw=.51bfed2c-3d04-4f6e-8e67-69d4b652c676@github.com> On Thu, 19 Oct 2023 20:06:59 GMT, Richard Reingruber wrote: >> Likely I do not completely understand what you are saying but this would be my explanation why the `align_down` for `clear_length` is needed. >> `T := old_gen->object_space()->top()` is not necessarily card aligned at scavenge start. We must not clear the card for `T` if an object was copied there because it was promoted and it has a reference to a young object on that card. > > Example > > We cannot clear card n containing old gen top T because we won't scan the promoted > objects on card n and we can only clear cards if we scan all objects on them afterwards. > > > card n-1 card n card n+1 > +--------------------+--------------------+-------------------- > | | . Promoted | > | | . Objects | > | | . | > +--------------------+--------------------+-------------------- > ^ > | > T > old gen top at scavenge start > / end of last stripe Okay, I think I understood this now. I would probably just fill up the last card before scavenge, but this explanation makes sense. Please add a comment there for the next person. 
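The example in this exchange can also be sketched as a toy model; this is not HotSpot code. It assumes 512-byte cards and a card table modelled as a plain vector, and `Card`, `kCardBytes` and `clear_fully_scanned_cards` are hypothetical names. The sketch clears only cards that end at or below the old gen top `T` recorded at scavenge start, so the card that contains `T`, and may later hold promoted objects, keeps its dirty mark.

```cpp
#include <cstddef>
#include <vector>

// Toy card state; real card values in HotSpot differ.
enum class Card : unsigned char { Clean, Dirty };

// Assumed card size for illustration only.
constexpr std::size_t kCardBytes = 512;

// Clears every card that lies entirely below `top_offset`, the byte offset of
// old gen top from old gen bottom at scavenge start. The card containing
// `top_offset` is left untouched. Returns the number of cards cleared.
inline std::size_t clear_fully_scanned_cards(std::vector<Card>& cards,
                                             std::size_t top_offset) {
  std::size_t full_cards = top_offset / kCardBytes;  // integer division rounds down
  if (full_cards > cards.size()) {
    full_cards = cards.size();
  }
  for (std::size_t i = 0; i < full_cards; ++i) {
    cards[i] = Card::Clean;
  }
  return full_cards;
}
```

With three dirty cards and `T` at byte offset 700, only card 0 is cleared in this model; card 1 contains `T` and stays dirty, which corresponds to "card n" in the diagram.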
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366158560 From rrich at openjdk.org Thu Oct 19 20:10:12 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 20:10:12 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: <2sSmoJUYGi_R0UTfD_3FA2o2q_n_zljpP2hYxqDWiZQ=.d8967c04-ee42-4338-b5ba-67c9c7633953@github.com> References: <2sSmoJUYGi_R0UTfD_3FA2o2q_n_zljpP2hYxqDWiZQ=.d8967c04-ee42-4338-b5ba-67c9c7633953@github.com> Message-ID: On Thu, 19 Oct 2023 19:04:23 GMT, Richard Reingruber wrote: >> The highest value `byte_size()` can have is old_gen-end - old_gen_bottom (both card-aligned; one stripe, one slice), which is the exact length needed when covering all cards. >> Any top value != end must have a committed corresponding card table entry, otherwise marking the card that contains `top` would crash. Also the copying would fail then, reading from uncommitted areas beyond the card table. The code does not do that afaics. >> >> So the only problematic one I can see would be clearing the card exactly starting at old_gen-end, which an `align_up()` wouldn't do either. >> >> So I do not completely get why clearing the card containing top would be unsafe. Can you give an example? > > Likely I do not completely understand what you are saying but this would be my explanation why the `align_down` for `clear_length` is needed. > `T := old_gen->object_space()->top()` is not necessarily card aligned at scavenge start. We must not clear the card for `T` if an object was copied there because it was promoted and it has a reference to a young object on that card. Example We cannot clear card n containing old gen top T because we won't scan the promoted objects on card n and we can only clear cards if we scan all objects on them afterwards. card n-1 card n card n+1 +--------------------+--------------------+-------------------- | | . Promoted | | | . 
Objects | | | . | +--------------------+--------------------+-------------------- ^ | T old gen top at scavenge start / end of last stripe ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366051508 From rrich at openjdk.org Thu Oct 19 20:34:12 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 19 Oct 2023 20:34:12 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 14:51:52 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> preprocess_card_table_parallel should be private > > src/hotspot/share/gc/parallel/psCardTable.cpp line 235: > >> 233: } >> 234: } >> 235: } > > I think this code becomes more clear if the nested-ifs are replaced by negation and `continue`. I also added some additional comments giving reasons for the conditions. > > Suggestion: > > for (CardValue* cur_card = byte_for(old_gen_bottom) + stripe_index * num_cards_in_stripe; // this may be left outside, your call, it is a bit long. > cur_card < end_card; > cur_card += num_cards_in_slice) { > HeapWord* stripe_addr = addr_for(cur_card); > if (is_dirty(cur_card)) { > // The first card of this stripe is already dirty, no need to see if the reaching-in object is a potentially imprecisely marked non-array object. > continue; > } > HeapWord* first_obj_addr = object_start(stripe_addr); > if (first_obj_addr == stripe_addr) { // (random comment) can't be > I think > // No object reaching into this stripe. > continue; > } > oop first_obj = cast_to_oop(first_obj_addr); > if (!first_obj->is_array() && is_dirty(byte_for(first_obj_addr))) { > // Found a non-array object reaching into the stripe assigned to this thread that has potentially been marked imprecisely. > // Mark first card of stripe dirty so that this thread will process it later. 
> *cur_card = dirty_card_val(); > } > } It's actually just a minor detail that the thread that marks the first card dirty will also process that stripe. The assignment could be changed without effect. I'll leave that part out. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366079466 From rrich at openjdk.org Fri Oct 20 05:48:49 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 05:48:49 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v23] In-Reply-To: References: <1HXBowkNZo1iyNgOVA6qZYcyz0alZ8r5FBMQ3FqAvTE=.8a95691e-f02d-486a-96c3-88d5ec836228@github.com> Message-ID: On Thu, 19 Oct 2023 13:51:10 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 36 additional commits since the last revision: >> >> - Use better name: _preprocessing_active_workers >> - Merge branch 'master' >> - Remove obsolete comment >> - Feedback Albert >> - Merge branch 'master' >> - Re-cleanup (was accidentally reverted) >> - Make sure to scan obj reaching in just once >> - Simplification suggested by Albert >> - Don't overlap card table processing with scavenging for simplicity >> - Cleanup >> - ... and 26 more: https://git.openjdk.org/jdk/compare/355d3adc...f7965512 > > src/hotspot/share/gc/parallel/psCardTable.hpp line 91: > >> 89: return end; >> 90: } >> 91: }; > > Could these implementations be moved into the .cpp file? They are only ever referenced by that and should be inlined anyway to not clog the interface/hpp file too much. Moved the class StripeShadowTable to psCardTable.cpp (and renamed it). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366499415 From rrich at openjdk.org Fri Oct 20 05:55:40 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 05:55:40 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 14:37:06 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> preprocess_card_table_parallel should be private > > src/hotspot/share/gc/parallel/psCardTable.cpp line 242: > >> 240: SpinYield spin_yield; >> 241: while (Atomic::load_acquire(&_preprocessing_active_workers) > 0) { >> 242: spin_yield.wait(); > > I would prefer to have the synchronization as part of `scavenge_contents_parallel`; i.e. the logic there being > > Prepare Scavenge > Synchronize > Scavenge > > Here the synchronization feels out of place and surprising for a method that nowhere indicates that it is doing anything other than preprocessing the table. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366503671 From rrich at openjdk.org Fri Oct 20 06:37:37 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 06:37:37 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v24] In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 18:35:03 GMT, Thomas Schatzl wrote: >> The reset is needed because the cache requires that queries are monotonic. There was an assertion checking monotonicity. Albert thought it wouldn't be needed. > > I meant the comment "Reset Cached Object". I also think that the code is required :) Resetting would be redundant if we checked `addr >= cached_obj.start_addr`. The logic couldn't be misused then either. I'll add a comment that the queries are expected to be monotonic. 
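The monotonicity requirement mentioned above can be sketched with a tiny standalone lookup class. This is not the code from the PR: `CachedObjectStart`, its members, and the use of plain integers as addresses are assumptions for illustration. The cache only ever walks forward, so a query below the cached object's start would return a wrong result, which is why the precondition is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object-start lookup with a one-entry cache.
// Addresses are plain integers; object i covers [starts[i], starts[i + 1]).
class CachedObjectStart {
 public:
  explicit CachedObjectStart(std::vector<std::size_t> starts)
      : _starts(std::move(starts)), _cached_index(0) {
    assert(!_starts.empty());
  }

  // Returns the start of the object covering addr.
  // Precondition: queries are monotonic, i.e. addr is never below the start
  // of the object returned by the previous query.
  std::size_t object_start(std::size_t addr) {
    assert(addr >= _starts[_cached_index] && "queries must be monotonic");
    // Walk forward from the cached object; never backwards.
    while (_cached_index + 1 < _starts.size() &&
           _starts[_cached_index + 1] <= addr) {
      ++_cached_index;
    }
    return _starts[_cached_index];
  }

 private:
  std::vector<std::size_t> _starts;  // sorted object start addresses
  std::size_t _cached_index;         // index of the object returned last
};
```

With the `addr >= start` check in place, an explicit reset of the cache is only needed when a caller wants to start over at a lower address.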
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366534590 From tschatzl at openjdk.org Fri Oct 20 07:33:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 20 Oct 2023 07:33:43 GMT Subject: RFR: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: <24PQu5F8UyuLPpat6tVbKzzpIXquqsZqUhJKeiNObTY=.3a9dd5fe-0284-4e39-8621-a17371d4cb59@github.com> On Thu, 19 Oct 2023 16:59:39 GMT, Ivan Walulya wrote: >> Hi all, >> >> please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`. Reasons: >> >> * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) >> * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). >> * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. >> * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. >> >> Testing: gha >> >> Thanks, >> Thomas > > Marked as reviewed by iwalulya (Reviewer). 
Thanks @walulyai @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/16011#issuecomment-1772224539 From tschatzl at openjdk.org Fri Oct 20 07:33:45 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 20 Oct 2023 07:33:45 GMT Subject: Integrated: 8317350: Move code cache purging out of CodeCache::UnloadingScope In-Reply-To: References: Message-ID: <4Y8Eh6q0_GqZyQwNc3JqCBY6yjXw-GfeMfZJ0wAuvHY=.57cdd6f9-61c2-49ee-a0b9-eab9bf96fbf5@github.com> On Mon, 2 Oct 2023 12:56:27 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that moves actual code cache flushing/purging out of `CodeCache::UnloadingScope`. Reasons: > > * I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959) > * imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to `UnloadingBehaviorScope` too in a separate CR). > * other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading. > * putting it there makes future logging of the various phases a little bit easier, not having `GCTraceTimer` et al. in various places. > > Testing: gha > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: bd3bc2c6 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/bd3bc2c6181668b5856732666dc251136b7fbb99 Stats: 60 lines in 7 files changed: 26 ins; 6 del; 28 mod 8317350: Move code cache purging out of CodeCache::UnloadingScope Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16011 From ayang at openjdk.org Fri Oct 20 08:40:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 20 Oct 2023 08:40:37 GMT Subject: RFR: 8318296: Move Space::initialize to ContiguousSpace In-Reply-To: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> References: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> Message-ID: On Tue, 17 Oct 2023 11:57:59 GMT, Albert Mingkun Yang wrote: > Simple refactoring of moving methods to subclass. (`clear` is also moved, as it is called inside `initialize`.) Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16219#issuecomment-1772317623 From ayang at openjdk.org Fri Oct 20 08:43:46 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 20 Oct 2023 08:43:46 GMT Subject: Integrated: 8318296: Move Space::initialize to ContiguousSpace In-Reply-To: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> References: <9aCHpkZSDcV7X_GeMhxNyTyHtVcPumNOEq-5xJ7lS8g=.82309dfd-f075-4ca6-9ecd-cb39597ae8cc@github.com> Message-ID: On Tue, 17 Oct 2023 11:57:59 GMT, Albert Mingkun Yang wrote: > Simple refactoring of moving methods to subclass. (`clear` is also moved, as it is called inside `initialize`.) This pull request has now been integrated. 
Changeset: cd25d1a2 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/cd25d1a2bf4530d8fd4d0515b69e2199df9c102f Stats: 51 lines in 2 files changed: 17 ins; 29 del; 5 mod 8318296: Move Space::initialize to ContiguousSpace Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16219 From rrich at openjdk.org Fri Oct 20 09:02:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 09:02:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v28] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. 
Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... 
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Cleanup/improve comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/fd5d0725..7c20c9f1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=26-27 Stats: 31 lines in 2 files changed: 16 ins; 12 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 20 09:13:48 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 09:13:48 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v28] In-Reply-To: References: Message-ID: <70mj0_esW4xbhmSqjbtzuw5rdfUvuQlZYrLYsKiZzCw=.8712242c-d72d-402d-bb5a-53703058b6c7@github.com> On Fri, 20 Oct 2023 09:02:50 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
>> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup/improve comments I think I've addressed all remarks now. Let me know if I should revisit anything again. 
I haven't yet moved the summarizing comment from the definition of `scavenge_contents_parallel` in the cpp file to the header file. I thought it should remain close to the implementing code. Is that ok or should it be moved to the header? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1772366594 From tschatzl at openjdk.org Fri Oct 20 09:45:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 20 Oct 2023 09:45:43 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v28] In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 09:02:50 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
>> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup/improve comments Please fix the `PS` prefix issue for `StripeShadowCardTable`. Maybe incorporate the other comment related changes too. Other than that it looks great! src/hotspot/share/gc/parallel/psCardTable.cpp line 141: > 139: } > 140: > 141: class StripeShadowCardTable { Unfortunately, due to C++ having a global namespace, please prefix with `PS` even if it is local here. 
src/hotspot/share/gc/parallel/psCardTable.cpp line 161: > 159: size_t stripe_byte_size = pointer_delta(end, start) * HeapWordSize; > 160: size_t copy_length = align_up(stripe_byte_size, _card_size) >> _card_shift; > 161: size_t clear_length = align_down(stripe_byte_size, _card_size) >> _card_shift; Some reordering of the comment, moving it closer to the code it describes, some rewording. Suggestion: size_t stripe_byte_size = pointer_delta(end, start) * HeapWordSize; size_t copy_length = align_up(stripe_byte_size, _card_size) >> _card_shift; // The end of the last stripe may not be card aligned as it is equal to old // gen top at scavenge start. We should not clear the card containing old gen // top if not card aligned because there can be promoted objects on that // same card. If it were marked dirty because of the promoted object and we // cleared it, we would lose a card mark. size_t clear_length = align_down(stripe_byte_size, _card_size) >> _card_shift; src/hotspot/share/gc/parallel/psCardTable.cpp line 216: > 214: // The "shadow" table is a copy of the card table entries of the current stripe. > 215: // It is used to separate card reading, clearing and redirtying which reduces > 216: // complexity significantly. That would be a perfect comment to put just before the `ShadowCardTable` class definition. Not seeing the point of putting this at the place we instantiate it. src/hotspot/share/gc/parallel/psCardTable.cpp line 284: > 282: CardValue* const end_card = byte_for(old_gen_top - 1) + 1; > 283: > 284: for ( /* empty */ ; cur_card < end_card; cur_card += num_cards_in_slice) { Suggestion: for (/* empty */; cur_card < end_card; cur_card += num_cards_in_slice) { To be consistent with the other place in the file. Regular initializations also do not have a space after the bracket. 
src/hotspot/share/gc/parallel/psCardTable.cpp line 382: > 380: preprocess_card_table_parallel(object_start, old_gen_bottom, old_gen_top, stripe_index, n_stripes); > 381: > 382: // Sync with other workers Suggestion: // Sync with other workers. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1689634534 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366726043 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366729626 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366731067 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366731713 PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366733572 From tschatzl at openjdk.org Fri Oct 20 09:54:59 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 20 Oct 2023 09:54:59 GMT Subject: RFR: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope Message-ID: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Hi all, please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging. Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). There is more discussion in https://github.com/openjdk/jdk/pull/16011. 
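The distinction between unlinking and purging can be sketched in a few lines. This is not the HotSpot class: `UnlinkingScopeSketch`, `g_unloading_behavior`, `g_events`, `purge_sketch` and `unload_sketch` are hypothetical names chosen for illustration. The scope object only installs a behavior and restores the previous one in its destructor; purging is a separate, explicit step made after the scope has ended.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical global "current behavior" slot and event log, for illustration.
const char* g_unloading_behavior = "default";
std::vector<std::string> g_events;

// RAII scope that only sets and restores a behavior; the destructor does no
// substantial work.
class UnlinkingScopeSketch {
 public:
  explicit UnlinkingScopeSketch(const char* behavior)
      : _saved(g_unloading_behavior) {
    g_unloading_behavior = behavior;
  }
  ~UnlinkingScopeSketch() { g_unloading_behavior = _saved; }
  UnlinkingScopeSketch(const UnlinkingScopeSketch&) = delete;
  UnlinkingScopeSketch& operator=(const UnlinkingScopeSketch&) = delete;

 private:
  const char* _saved;  // behavior that was active before this scope
};

// Purging is an explicit call, so it can be timed and logged on its own.
void purge_sketch() { g_events.push_back("purge"); }

// Unloading = unlinking (inside the scope) + purging (outside of it).
void unload_sketch() {
  {
    UnlinkingScopeSketch scope("gc-specific");
    g_events.push_back(std::string("unlink with ") + g_unloading_behavior);
  }
  purge_sketch();
}
```

Keeping the expensive step out of the destructor is what makes the narrower name fit: the scope covers unlinking only.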
Testing: GHA Thanks, Thomas ------------- Commit messages: - 8318585 initial version Changes: https://git.openjdk.org/jdk/pull/16283/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16283&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318585 Stats: 14 lines in 7 files changed: 3 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/16283.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16283/head:pull/16283 PR: https://git.openjdk.org/jdk/pull/16283 From ayang at openjdk.org Fri Oct 20 10:19:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 20 Oct 2023 10:19:27 GMT Subject: RFR: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope In-Reply-To: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> References: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Message-ID: On Fri, 20 Oct 2023 07:53:04 GMT, Thomas Schatzl wrote: > Hi all, > > please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. > > The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging. Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). > > There is more discussion in https://github.com/openjdk/jdk/pull/16011. > > Testing: GHA > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16283#pullrequestreview-1689708342 From ayang at openjdk.org Fri Oct 20 10:25:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 20 Oct 2023 10:25:45 GMT Subject: RFR: 8318510: Serial: Remove TenuredGeneration::block_size In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 12:41:53 GMT, Albert Mingkun Yang wrote: > Simple one-line change inside `complete_loaded_archive_space`, then `block_size` becomes unused. 
Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16266#issuecomment-1772480871 From ayang at openjdk.org Fri Oct 20 10:25:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 20 Oct 2023 10:25:45 GMT Subject: Integrated: 8318510: Serial: Remove TenuredGeneration::block_size In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 12:41:53 GMT, Albert Mingkun Yang wrote: > Simple one-line change inside `complete_loaded_archive_space`, then `block_size` becomes unused. This pull request has now been integrated. Changeset: 6f1d8962 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/6f1d8962df05e2b298f3ec354430159041b51bcd Stats: 27 lines in 6 files changed: 0 ins; 26 del; 1 mod 8318510: Serial: Remove TenuredGeneration::block_size Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16266 From iwalulya at openjdk.org Fri Oct 20 11:09:27 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 20 Oct 2023 11:09:27 GMT Subject: RFR: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope In-Reply-To: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> References: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Message-ID: On Fri, 20 Oct 2023 07:53:04 GMT, Thomas Schatzl wrote: > Hi all, > > please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. > > The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging. Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). > > There is more discussion in https://github.com/openjdk/jdk/pull/16011. > > Testing: GHA > > Thanks, > Thomas Marked as reviewed by iwalulya (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16283#pullrequestreview-1689802405 From rrich at openjdk.org Fri Oct 20 13:13:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 13:13:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v28] In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 09:36:34 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup/improve comments > > src/hotspot/share/gc/parallel/psCardTable.cpp line 216: > >> 214: // The "shadow" table is a copy of the card table entries of the current stripe. >> 215: // It is used to separate card reading, clearing and redirtying which reduces >> 216: // complexity significantly. > > That would be a perfect comment to put just before the `ShadowCardTable` class definition. Not seeing the point of putting this at the place we instantiate it. I will move the comment to the decl. of `PSStripeShadowCardTable`. I put it in `PSCardTable::process_range` because `PSStripeShadowCardTable` feels like an implementation detail of it. I'd move the whole class there if it wasn't too big. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1366956932 From rrich at openjdk.org Fri Oct 20 13:18:58 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 13:18:58 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v29] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. 
This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well.
Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Accepting Thomas' (smaller) suggestions Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/7c20c9f1..67416b20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=27-28 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 20 13:27:57 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 13:27:57 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v30] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems.
> > The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix.
I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Review Thomas (PSStripeShadowCardTable) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/67416b20..a4430424 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=28-29 Stats: 14 lines in 2 files changed: 5 ins; 5 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 20 13:34:59 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 13:34:59 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v31] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe.
This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea... Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Forgot to move comment to PSStripeShadowCardTable.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/a4430424..71b08484 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=30 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=29-30 Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Fri Oct 20 14:28:39 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 20 Oct 2023 14:28:39 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v11] In-Reply-To: References: Message-ID: On Wed, 27 Sep 2023 14:13:23 GMT, Thomas Schatzl wrote: >> Richard Reingruber has updated the pull request incrementally with five additional commits since the last revision: >> >> - Eliminate special case for scanning the large array end >> - First card of large array should be cleared if dirty >> - Do all large array scanning in separate method >> - Limit stripe size to 1m with at least 8 threads >> - Small clean-ups > > Hi, > >> > I experimented with the aforementioned read-only card table idea a bit and here is the draft: >> > https://github.com/openjdk/jdk/compare/master...albertnetymk:jdk:pgc-precise-obj-arr?expand=1 >> >> This looks very nice! The code is a lot easier to follow than the baseline and this pr. >> >> With your draft I found out too that the regressions with just 2 threads come from the remaining `object_start` calls. Larger stripes mean fewer of them. The caching used in your draft is surely better. >> >> So by default 1 card table byte per 512b card is needed. The shadow card table will require 2M per gigabyte used old generation. I guess that's affordable. >> >> Would you think that your solution can be backported?
> > I had a brief look at @albertnetymk's suggestion, a few comments: > > * it uses another card table - while "just" another 0.2% of the heap, we should try to avoid such regressions. G1 also does not need another card table... maybe some more effort should be put into optimizing that one away. > * obviously allocating and freeing during the pause is suboptimal wrt pause time so the prototype should be improved in that regard :) > * the copying will stay (if there is a second card table), I would be interested in pause time changes for more throughput'y applications (jbb2005, timefold/optaplanner https://timefold.ai/blog/2023/java-21-performance) > * anything can be backported, but the question is whether the individual maintainers of these versions are going to. It does have a good case though which may make it easier to convince maintainers. > > Hth, > Thomas Thanks a lot @tschatzl for reviewing. I hope I've covered all of your feedback. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1772842725 From kirk at kodewerk.com Fri Oct 20 16:01:56 2023 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Fri, 20 Oct 2023 09:01:56 -0700 Subject: RFR: 8317755: G1: Periodic GC interval should test for the last whole heap GC In-Reply-To: References: Message-ID: <340068D8-3302-4BE6-994B-4D58AD8BC9A4@kodewerk.com> Don't we already have a GCInterval with a default value of Long.MAX_VALUE? When I hear the word idle I immediately start thinking, CPU idle. In this case however, I quickly shifted to memory idle which I think translates nicely into how idle the allocators are. Thus basing heap sizing ergonomics on allocation rates seems like a reasonable metric until you consider the edge cases. The most significant "edge case" IME is when GC overheads start exceeding 20%. In those cases GC will throttle allocations and that in turn causes ergonomics to reduce sizes instead of increasing them (to reduce GC overhead).
My conclusion from this is that ergonomics should consider both allocation rates and GC overhead when deciding on how to resize heap at the end of a collection. Fortunately there is a steady stream of GC events that creates a convenient point in time to make an ergonomic adjustment. Not having allocations and as a result not having the collector run implies one has to manufacture a convenient point in time to make an ergonomic sizing decision. Unfortunately time based triggers are speculative and the history of speculatively triggered GC cycles has been less than wonderful (think DGC as but one case). My last consulting engagement prior to joining MS involved me tuning an application where the OEM's recommended configuration (set out of the box) was to run a full collection every two minutes. As one can imagine, the results were devastating. It took 3 days of meetings with various stakeholders and managers to get permission to turn that setting off. If the application is truly idle then it's a no harm, no foul situation. However there are entire classes of applications that are light allocators and the question would be, what would be the impact of speculative collections on those applications? As for returning memory, two issues: first, there appears to be no definition for "unused memory". Secondly, what I can say after looking at 1000s of GC logs is that the amount of floating garbage that G1 leaves behind even after several concurrent cycles is not insignificant. I also wrote a G1 heap fragmentation viewer and what it revealed is that heap remains highly fragmented and scattered after each GC cycle. All this suggests that heap will need to be compacted with a full collection in order to return a large enough block of memory to make the entire effort worthwhile. Again, if the application is idle, then no harm no foul. However, for those applications that are memory-idle but not CPU-idle this might not be a great course of action.
In my mind, any trigger for a speculative collection would need to take into consideration allocation rates, GC overhead, and mutator busyness (for cases when GC and allocator activity is low to 0). Kind regards, Kirk > On Oct 19, 2023, at 10:43 AM, Aleksey Shipilev wrote: > > On Wed, 18 Oct 2023 14:24:44 GMT, Aleksey Shipilev wrote: > >>> See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior. >>> >>> Additional testing: >>> - [x] Linux x86_64 fastdebug `tier1 tier2 tier3` >> >> Thanks for looking at it! >> >> Re-reading JEP 346: "Promptly Return Unused Committed Memory from G1"... >> >> In the scenario we are seeing, we do have lots of unused committed memory that would not be reclaimed promptly until concurrent cycle executes. The need for that cleanup is in worst case driven by heap connectivity changes that do not readily reflect in other observable heap metrics. A poster example would be a cache sitting perfectly in "oldgen", eventually dropping the entries, producing a huge garbage patch. We would only discover this after whole heap marking. In some cases, the tuning based on occupancy (e.g. soft max heap size, heap reserve, etc.) would help if we promote enough stuff to actually trigger the concurrent cycle. But if we keep churning very efficient young collections, we would get there very slowly. >> >> Therefore, I'd argue the current behavior is against _the spirit_ of the JEP 346, even though _the letter_ says that we track "any" GC for periodic GC. There is no explicit mention why young GCs should actually be treated as recent GCs, even though, with the benefit of hindsight, they throw away promptness guarantees. Aside: Shenandoah periodic GC does not make any claims it would run only in idle phases; albeit the story is simpler without young GCs.
But Generational Shenandoah would follow the same route: the periodic whole heap GC would start periodically regardless of young collections running in between or not. >> >> The current use of "idle" is also awkward in JEP 346. If user enables `G1PeriodicGCInterval` without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. I guess we can think about "idle" as "GC is idle", but then arguably not figuring out the whole heap situation _for hours_ can be described as "GC is idle". I think the much larger point of the JEP is to reclaim memory promptly, which in turn requires whole heap GC. Looks like JEP somewhat painted itself in the corner by considering all GCs, including young. >> >> I doubt that users would mind if we change the behavior of `G1PeriodicGCInterval` like this: the option is explicitly opt-in, the configurations I see in prod are running with huge intervals, etc. So we are asking for a relatively rare concurrent GC even when application is doing young GCs. But I agree that departing from the current behavior might still have undesired consequences, for which we need to plan the escape route. There is also a need to ... > >> @shipilev : I have not made up my mind about the other parts of your proposal, but: >> >>> The current use of "idle" is also awkward in JEP 346. If user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when application is churning at 100% CPU, but without any recent GC in sight. >> >> This is the reason for the `G1PeriodicGCSystemLoadThreshold` option and is handled by the feature/JEP. > > Yes, that is why I said "without doing anything else". With that example, I wanted to point out that the definition of "idle" is already quite murky even with current JEP, where we have an additional option to tell if "idle" includes the actual system load.
In this view, having another option that would tell if "idle" includes young GCs fits well, I think. > > ------------- > > PR Comment: https://git.openjdk.org/jdk/pull/16107#issuecomment-1771405208 From tschatzl at openjdk.org Fri Oct 20 16:07:40 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 20 Oct 2023 16:07:40 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v31] In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 13:34:59 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> #### Implementation (Updated 2023-10-20) >> >> Comment copied from `PSCardTable::scavenge_contents_parallel`: >> >> ```c++ >> // Scavenging and accesses to the card table are strictly limited to the stripe. >> // In particular scavenging of an object crossing stripe boundaries is shared >> // among the threads assigned to the stripes it resides on. This reduces >> // complexity and enables shared scanning of large objects. >> // It requires preprocessing of the card table though where imprecise card marks of >> // objects crossing stripe boundaries are propagated to the first card of >> // each stripe covered by the individual object. >> >> >> The baseline was refactored to make use of a read-only copy of the card table. That "shadow" table (`PSStripeShadowCardTable`) separates reading, clearing and redirtying of table entries which allows for a much simpler implementation. >> >> Scanning of object arrays is limited to dirty card chunks. 
>> >> ## Everything below refers to the Outdated Initial Implementation >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its refere... > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Forgot to move comment to PSStripeShadowCardTable. Ship it! :) ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1690399343 From dcubed at openjdk.org Fri Oct 20 21:27:57 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 20 Oct 2023 21:27:57 GMT Subject: RFR: 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode Message-ID: A trivial fix to ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode. 
------------- Commit messages: - 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode Changes: https://git.openjdk.org/jdk/pull/16296/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16296&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318622 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16296.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16296/head:pull/16296 PR: https://git.openjdk.org/jdk/pull/16296 From naoto at openjdk.org Fri Oct 20 21:36:40 2023 From: naoto at openjdk.org (Naoto Sato) Date: Fri, 20 Oct 2023 21:36:40 GMT Subject: RFR: 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 21:20:06 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode. Marked as reviewed by naoto (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16296#pullrequestreview-1690923460 From dcubed at openjdk.org Fri Oct 20 21:36:41 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 20 Oct 2023 21:36:41 GMT Subject: RFR: 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 21:30:32 GMT, Naoto Sato wrote: >> A trivial fix to ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode. > > Marked as reviewed by naoto (Reviewer). @naotoj - Thanks for the fast review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16296#issuecomment-1773420094 From dcubed at openjdk.org Fri Oct 20 21:36:42 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Fri, 20 Oct 2023 21:36:42 GMT Subject: Integrated: 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode In-Reply-To: References: Message-ID: <3CYYQRehc2JzNl07aKqgeJ5H0a3NatthVZ4EoEF6wRc=.d16f8ebf-8e7a-4e9d-b9b0-be788bc3b29d@github.com> On Fri, 20 Oct 2023 21:20:06 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode. This pull request has now been integrated. Changeset: af2f4bfa Author: Daniel D. Daugherty URL: https://git.openjdk.org/jdk/commit/af2f4bfa837a18964e00de1e3077119cfa4c68e0 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8318622: ProblemList gc/cslocker/TestCSLocker.java on linux-x64 in Xcomp mode Reviewed-by: naoto ------------- PR: https://git.openjdk.org/jdk/pull/16296 From mli at openjdk.org Sat Oct 21 10:45:33 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 21 Oct 2023 10:45:33 GMT Subject: RFR: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope In-Reply-To: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> References: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Message-ID: On Fri, 20 Oct 2023 07:53:04 GMT, Thomas Schatzl wrote: > Hi all, > > please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. > > The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging. Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). > > There is more discussion in https://github.com/openjdk/jdk/pull/16011. > > Testing: GHA > > Thanks, > Thomas LGTM. ------------- Marked as reviewed by mli (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16283#pullrequestreview-1691233443 From mli at openjdk.org Sat Oct 21 11:21:35 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 21 Oct 2023 11:21:35 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 15:29:08 GMT, Thomas Schatzl wrote: > Hi all, > > please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. > > Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. > > Testing: gha, tier1-5 almost done > > Thanks, > Thomas src/hotspot/share/gc/g1/g1RemSet.cpp line 1200: > 1198: // collecting remembered set entries for humongous regions that were not > 1199: // reclaimed. > 1200: r->rem_set()->set_state_complete(); In the original code, `set_state_complete()` sets `_state = Complete;`, but in the new patch `_state = Complete;` is no longer set. Is this expected? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16270#discussion_r1367712935 From mli at openjdk.org Sat Oct 21 11:55:51 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 21 Oct 2023 11:55:51 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate Message-ID: Hi, Can you help to review this simple refactoring? Thanks!
------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/16300/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16300&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318629 Stats: 5 lines in 1 file changed: 0 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16300/head:pull/16300 PR: https://git.openjdk.org/jdk/pull/16300 From mli at openjdk.org Sat Oct 21 12:10:25 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 21 Oct 2023 12:10:25 GMT Subject: RFR: 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 10:28:46 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change to fix a lock rank issue after [JDK-8317440](https://bugs.openjdk.org/browse/JDK-8317440). The `JfrStacktrace_lock` rank needs to be adjusted because it may be used when gathering CHT statistics in generational ZGC (afaiu this operation may trigger a relocation which may trigger page allocation which may trigger a JFR event which may trigger JFR trace writing). > > The `JfrStacktrace_lock` code seems to be a lock only guarding I/O too, so moving it to the lowest rank seems safe after code inspection, but also tier1-7 testing and around 100 runs of the failed application did not make it fail (reproduction is like 2/100 runs without this change). > > Testing: tier1-7, ~100 runs of the failing application with no error > > Hth, > Thomas LGTM. ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16242#pullrequestreview-1691247115 From ayang at openjdk.org Mon Oct 23 07:22:34 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 07:22:34 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 15:29:08 GMT, Thomas Schatzl wrote: > actually the second one is wrong, there may be code roots attached to the region. `occupancy_less_or_equal_than` should have ensured that code-roots is empty. (The method name is not very obvious on that, IMO. Breaking it into two might be better.) I believe one can `assert(r->rem_set()->is_complete() && r->rem_set()->is_empty(), "inv");` after `clear(...)`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16270#issuecomment-1774574130 From tschatzl at openjdk.org Mon Oct 23 07:28:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 07:28:36 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 07:20:11 GMT, Albert Mingkun Yang wrote: > > actually the second one is wrong, there may be code roots attached to the region. > > `occupancy_less_or_equal_than` should have ensured that code-roots is empty. (The method name is not very obvious on that, IMO. Breaking it into two might be better.) I believe one can `assert(r->rem_set()->is_complete() && r->rem_set()->is_empty(), "inv");` after `clear(...)`. This is similar to @Hamlin-Li's question.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16270#issuecomment-1774581765 From tschatzl at openjdk.org Mon Oct 23 07:28:38 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 07:28:38 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Sat, 21 Oct 2023 11:18:43 GMT, Hamlin Li wrote: >> Hi all, >> >> please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. >> >> Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. >> >> Testing: gha, tier1-5 almost done >> >> Thanks, >> Thomas > > src/hotspot/share/gc/g1/g1RemSet.cpp line 1200: > >> 1198: // collecting remembered set entries for humongous regions that were not >> 1199: // reclaimed. >> 1200: r->rem_set()->set_state_complete(); > > In the original code, `set_state_complete()` sets `_state = Complete;`, but in the new patch `_state = Complete;` is no longer set. Is this expected? Only remembered sets in state `Complete` reach here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16270#discussion_r1368226168 From tschatzl at openjdk.org Mon Oct 23 07:38:42 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 07:38:42 GMT Subject: RFR: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope In-Reply-To: References: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Message-ID: On Fri, 20 Oct 2023 10:16:32 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. >> >> The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging.
Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). >> >> There is more discussion in https://github.com/openjdk/jdk/pull/16011. >> >> Testing: GHA >> >> Thanks, >> Thomas > > Marked as reviewed by ayang (Reviewer). Thanks @albertnetymk @walulyai @Hamlin-Li for your reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16283#issuecomment-1774592962 From tschatzl at openjdk.org Mon Oct 23 07:38:44 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 07:38:44 GMT Subject: Integrated: 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope In-Reply-To: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> References: <6rGG1pj1rRi-6_J2gr64KNM7LSTcxWlRf2kX0DHKai8=.1d325f38-08c8-40fe-997d-f1cabc7ba0cc@github.com> Message-ID: On Fri, 20 Oct 2023 07:53:04 GMT, Thomas Schatzl wrote: > Hi all, > > please review this renaming of `CodeCache::UnloadingScope` to `UnlinkingScope`. > > The idea is that this scope object does not cover the whole unloading process, but only the unlinking part. (Where Unloading = Unlinking + Purging. Note that the current class/code unloading code isn't very consistent about naming but cleanup has to start somewhere). > > There is more discussion in https://github.com/openjdk/jdk/pull/16011. > > Testing: GHA > > Thanks, > Thomas This pull request has now been integrated.
Changeset: 4eab39d9 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/4eab39d9415b2ec5c2984d0d3c110e9364090835 Stats: 14 lines in 7 files changed: 3 ins; 0 del; 11 mod 8318585: Rename CodeCache::UnloadingScope to UnlinkingScope Reviewed-by: ayang, iwalulya, mli ------------- PR: https://git.openjdk.org/jdk/pull/16283 From ayang at openjdk.org Mon Oct 23 07:43:58 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 07:43:58 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates [v2] In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 07:41:05 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. >> >> Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. >> >> Testing: gha, tier1-5 almost done >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > add assert Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16270#pullrequestreview-1691887521 From tschatzl at openjdk.org Mon Oct 23 07:43:58 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 07:43:58 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates [v2] In-Reply-To: References: Message-ID: > Hi all, > > please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. > > Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. 
> > Testing: gha, tier1-5 almost done > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: add assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16270/files - new: https://git.openjdk.org/jdk/pull/16270/files/6af1e899..2af41e00 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16270&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16270&range=00-01 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16270.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16270/head:pull/16270 PR: https://git.openjdk.org/jdk/pull/16270 From mli at openjdk.org Mon Oct 23 07:47:31 2023 From: mli at openjdk.org (Hamlin Li) Date: Mon, 23 Oct 2023 07:47:31 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates [v2] In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 07:43:58 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. >> >> Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. >> >> Testing: gha, tier1-5 almost done >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > add assert LGTM. ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16270#pullrequestreview-1691896341 From tschatzl at openjdk.org Mon Oct 23 08:19:34 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 08:19:34 GMT Subject: RFR: 8318507: G1: Improve remset clearing for humongous candidates [v2] In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 07:45:01 GMT, Hamlin Li wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> add assert > > LGTM. Thanks @Hamlin-Li @albertnetymk @walulyai for your reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16270#issuecomment-1774661096 From ayang at openjdk.org Mon Oct 23 08:22:02 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 08:22:02 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable Message-ID: The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. There is some duplication with G1's implementation, which should probably be dealt with in its own PR. 
------------- Commit messages: - s1-bot Changes: https://git.openjdk.org/jdk/pull/16304/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318647 Stats: 791 lines in 8 files changed: 24 ins; 627 del; 140 mod Patch: https://git.openjdk.org/jdk/pull/16304.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16304/head:pull/16304 PR: https://git.openjdk.org/jdk/pull/16304 From tschatzl at openjdk.org Mon Oct 23 08:22:46 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 08:22:46 GMT Subject: Integrated: 8318507: G1: Improve remset clearing for humongous candidates In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 15:29:08 GMT, Thomas Schatzl wrote: > Hi all, > > please review this small cleanup that merges two calls to `HeapRegionRemSet::clear()/HeapRegionRemSet::set_state_complete()` to a single call to `clear()` with appropriate parameters. > > Also remove now obsolete asserts that are done in `clear()` already; actually the second one is wrong, there may be code roots attached to the region. > > Testing: gha, tier1-5 almost done > > Thanks, > Thomas This pull request has now been integrated. Changeset: 729f4c5d Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/729f4c5d141cdc272249c4c69efd05f96a654137 Stats: 11 lines in 1 file changed: 0 ins; 6 del; 5 mod 8318507: G1: Improve remset clearing for humongous candidates Reviewed-by: iwalulya, ayang, mli ------------- PR: https://git.openjdk.org/jdk/pull/16270 From ayang at openjdk.org Mon Oct 23 09:36:01 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 09:36:01 GMT Subject: RFR: 8318649: G1: Remove unimplemented HeapRegionRemSet::add_code_root_locked Message-ID: Trivial removing dead code. 
------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/16305/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16305&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318649 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16305.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16305/head:pull/16305 PR: https://git.openjdk.org/jdk/pull/16305 From tschatzl at openjdk.org Mon Oct 23 09:54:34 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 09:54:34 GMT Subject: RFR: 8318649: G1: Remove unimplemented HeapRegionRemSet::add_code_root_locked In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 09:30:02 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16305#pullrequestreview-1692140428 From ayang at openjdk.org Mon Oct 23 11:05:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 11:05:37 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate In-Reply-To: References: Message-ID: On Sat, 21 Oct 2023 11:49:16 GMT, Hamlin Li wrote: > Hi, > Can you help to review this simple refactoring? > Thanks! Can't say the new code is easier to read. I wonder what you think of sth like: assert(is_young || is_old || is_humongous); if (is_old) { ... } else { ... } ------------- PR Review: https://git.openjdk.org/jdk/pull/16300#pullrequestreview-1692262420 From ayang at openjdk.org Mon Oct 23 11:08:42 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 11:08:42 GMT Subject: RFR: 8318649: G1: Remove unimplemented HeapRegionRemSet::add_code_root_locked In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 09:30:02 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16305#issuecomment-1774950050 From ayang at openjdk.org Mon Oct 23 11:08:43 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Oct 2023 11:08:43 GMT Subject: Integrated: 8318649: G1: Remove unimplemented HeapRegionRemSet::add_code_root_locked In-Reply-To: References: Message-ID: On Mon, 23 Oct 2023 09:30:02 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 7c0a8288 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/7c0a8288b23c11d455472762b56d5b20ac5b9f03 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8318649: G1: Remove unimplemented HeapRegionRemSet::add_code_root_locked Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16305 From tschatzl at openjdk.org Mon Oct 23 14:02:39 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 14:02:39 GMT Subject: RFR: 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking In-Reply-To: References: Message-ID: On Thu, 19 Oct 2023 17:06:16 GMT, Ivan Walulya wrote: >> Hi all, >> >> please review this change to fix a lock rank issue after [JDK-8317440](https://bugs.openjdk.org/browse/JDK-8317440). The `JfrStacktrace_lock` rank needs to be adjusted because it may be used when gathering CHT statistics in generational ZGC (afaiu this operation may trigger a relocation which may trigger page allocation which may trigger a JFR event which may trigger JFR trace writing). >> >> The `JfrStacktrace_lock` code seems to be a lock only guarding I/O too, so moving it to the lowest rank seems safe after code inspection, but also tier1-7 testing and around 100 runs of the failed application did not make it fail (reproduction is like 2/100 runs without this change). >> >> Testing: tier1-7, ~100 runs of the failing application with no error >> >> Hth, >> Thomas > > Marked as reviewed by iwalulya (Reviewer). 
Thanks @walulyai @Hamlin-Li for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/16242#issuecomment-1775270782 From tschatzl at openjdk.org Mon Oct 23 14:02:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Oct 2023 14:02:41 GMT Subject: Integrated: 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking In-Reply-To: References: Message-ID: On Wed, 18 Oct 2023 10:28:46 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change to fix a lock rank issue after [JDK-8317440](https://bugs.openjdk.org/browse/JDK-8317440). The `JfrStacktrace_lock` rank needs to be adjusted because it may be used when gathering CHT statistics in generational ZGC (afaiu this operation may trigger a relocation which may trigger page allocation which may trigger a JFR event which may trigger JFR trace writing). > > The `JfrStacktrace_lock` code seems to be a lock only guarding I/O too, so moving it to the lowest rank seems safe after code inspection, but also tier1-7 testing and around 100 runs of the failed application did not make it fail (reproduction is like 2/100 runs without this change). > > Testing: tier1-7, ~100 runs of the failing application with no error > > Hth, > Thomas This pull request has now been integrated. 
Changeset: 9f767aa4 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/9f767aa44b4699ed5404b934ac751f2cdd0ba824 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8318109: Writing JFR records while a CHT has taken its lock asserts in rank checking Reviewed-by: iwalulya, mli ------------- PR: https://git.openjdk.org/jdk/pull/16242 From mli at openjdk.org Mon Oct 23 14:37:59 2023 From: mli at openjdk.org (Hamlin Li) Date: Mon, 23 Oct 2023 14:37:59 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: References: Message-ID: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> > Hi, > Can you help to review this simple refactoring? > Thanks! Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: refine code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16300/files - new: https://git.openjdk.org/jdk/pull/16300/files/a7332e76..9c4149e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16300&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16300&range=00-01 Stats: 9 lines in 1 file changed: 3 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16300/head:pull/16300 PR: https://git.openjdk.org/jdk/pull/16300 From mli at openjdk.org Mon Oct 23 14:38:00 2023 From: mli at openjdk.org (Hamlin Li) Date: Mon, 23 Oct 2023 14:38:00 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: References: Message-ID: <4Y6s3rGw0Nz1Ju4ZioWtZhRRo_9wvfooaB3LaJox8Nc=.a82b780a-48a7-4109-b5ca-bacae212348d@github.com> On Mon, 23 Oct 2023 11:02:58 GMT, Albert Mingkun Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> refine code > > Can't say the new code is easier to read. 
> > I wonder what you think of sth like: > > > assert(is_young || is_old || is_humongous); > > if (is_old) { > ... > } else { > ... > } @albertnetymk Thanks for the suggestion, I think it's better. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16300#issuecomment-1775344959 From zgu at openjdk.org Mon Oct 23 19:30:27 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 23 Oct 2023 19:30:27 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> Message-ID: On Thu, 19 Oct 2023 16:11:56 GMT, Leela Mohan Venati wrote: >> src/hotspot/share/gc/shenandoah/shenandoahVMOperations.cpp line 64: >> >>> 62: OopMapCache::cleanup_old_entries(); >>> 63: } >>> 64: >> >> Do you think, VM_ShenandoahFinalMarkStartEvac walks the stack roots. If yes, i recommend adding OopMapCache::cleanup_old_entries() in VM_ShenandoahOperation::doit_epilogue(). And this would make the change simple and also revert the change in this [PR](https://github.com/openjdk/jdk/pull/15921) > > I stand corrected. > > My question is still relevant >>> Do you think, VM_ShenandoahFinalMarkStartEvac walks the stack roots. > > My recommendation is incorrect. No, `VM_ShenandoahFinalMarkStartEvac ` does not walk the stack roots, it signals the end of mark phase. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1369175629 From zgu at openjdk.org Mon Oct 23 19:30:30 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 23 Oct 2023 19:30:30 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> References: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> Message-ID: On Mon, 16 Oct 2023 19:56:52 GMT, Leela Mohan Venati wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup old oop map cache entry after class redefinition > > src/hotspot/share/oops/method.cpp line 311: > >> 309: void Method::mask_for(int bci, InterpreterOopMap* mask) { >> 310: methodHandle h_this(Thread::current(), this); >> 311: method_holder()->mask_for(h_this, bci, mask); > > Removing this condition allows all the threads including java threads to use/mutate oopMapCache. > > For ex: Java threads calls [JVM_CallStackWalk](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/jvm.cpp#L586) which walks the stack and calls locals() and expressions [here](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/stackwalk.cpp#L345) which access oopMapCache. The `oopMapCache` now is fully concurrent, it can be used/modified by Java threads. 
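The pattern under discussion in this thread (a cache that any thread may update, with replaced entries parked on an "old entries" list until a later cleanup call) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the HotSpot `OopMapCache` code: `Entry`, `ConcurrentCache`, the slot count, and the integer key/value payload are all hypothetical, and only the name `cleanup_old_entries()` is borrowed from the thread. As in the discussion, freeing is only safe at a point where no reader can still hold a retired entry.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical cache entry; the real oop map cache entry looks different.
struct Entry {
  int key;
  int value;
  Entry* next_old;  // link used only after the entry has been retired
  Entry(int k, int v) : key(k), value(v), next_old(nullptr) {}
};

// Direct-mapped cache whose slots are replaced with atomic exchanges.
// Replaced entries are not freed immediately; they are pushed onto a
// lock-free list and released later by cleanup_old_entries().
class ConcurrentCache {
  static constexpr std::size_t kSlots = 8;  // assumed size, for illustration
  std::atomic<Entry*> _slots[kSlots];
  std::atomic<Entry*> _old_entries;

  static std::size_t slot_for(int key) {
    return static_cast<std::size_t>(key) % kSlots;
  }

  // Push a replaced entry onto the old-entries list (CAS loop).
  void retire(Entry* e) {
    assert(e != nullptr);
    Entry* head = _old_entries.load(std::memory_order_relaxed);
    do {
      e->next_old = head;
    } while (!_old_entries.compare_exchange_weak(
        head, e, std::memory_order_release, std::memory_order_relaxed));
  }

public:
  ConcurrentCache() : _old_entries(nullptr) {
    for (auto& s : _slots) s.store(nullptr, std::memory_order_relaxed);
  }

  ~ConcurrentCache() {
    cleanup_old_entries();
    for (auto& s : _slots) delete s.load(std::memory_order_relaxed);
  }

  // Returns true and fills *value if the key is currently cached.
  bool lookup(int key, int* value) const {
    Entry* e = _slots[slot_for(key)].load(std::memory_order_acquire);
    if (e != nullptr && e->key == key) {
      *value = e->value;
      return true;
    }
    return false;
  }

  // Install a new entry; whatever occupied the slot before is retired.
  void insert(int key, int value) {
    Entry* fresh = new Entry(key, value);
    Entry* old = _slots[slot_for(key)].exchange(fresh, std::memory_order_acq_rel);
    if (old != nullptr) retire(old);
  }

  // Free all retired entries and return how many were freed. The caller
  // must guarantee that no reader still uses a retired entry.
  std::size_t cleanup_old_entries() {
    Entry* e = _old_entries.exchange(nullptr, std::memory_order_acquire);
    std::size_t freed = 0;
    while (e != nullptr) {
      Entry* next = e->next_old;
      delete e;
      e = next;
      ++freed;
    }
    return freed;
  }
};
```

In this sketch retired entries stay allocated until someone calls `cleanup_old_entries()`, which is the accumulation concern raised in the review: if only GC operations call it, entries retired by other threads linger until the next such operation.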
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1369176823 From duke at openjdk.org Mon Oct 23 20:56:36 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Mon, 23 Oct 2023 20:56:36 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: On Wed, 11 Oct 2023 18:50:14 GMT, Zhengyu Gu wrote: >> Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. >> >> GCs use a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. >> >> This patch is intended to enable `OopMapCache` for concurrent GCs. >> >> Test: >> tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32. > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup old oop map cache entry after class redefinition Before this change, VM_Operations that enabled "is_gc_active" and walked the stack to collect roots had a call to OopMapCache::cleanup_old_entries() in their corresponding doit_epilogue(). This frees the old entries that are collected during the safepoint. Now the scope of the change isn't clear. We seem to extend this to concurrent GCs during their concurrent phases (not just the safepoints of concurrent GCs). Calling OopMapCache::cleanup_old_entries() in doit_epilogue() would effectively clean up old entries accumulated during the concurrent phase of the GC and also during the safepoint. But the change also allows Java threads to accumulate old entries. When, and by whom, is cleanup_old_entries() called in this case? These entries need to wait until a future GC that does the cleanup in doit_epilogue(). However, at least theoretically, we can have large time windows without GCs. Old entries accumulated by Java threads can be seen as a leak (until the next GC). A separate thought to make the change simpler.
Can cleanup_old_entries become a [list of Cleanup tasks](https://github.com/openjdk/jdk/blob/8d9a4b43f4fff30fd217dab2c224e641cb913c18/src/hotspot/share/runtime/safepoint.hpp#L72) VM thread does during cleanup phase. ------------- PR Review: https://git.openjdk.org/jdk/pull/16074#pullrequestreview-1693431578 From duke at openjdk.org Mon Oct 23 20:56:38 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Mon, 23 Oct 2023 20:56:38 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> Message-ID: <5-zr90Y1fr5aTEKhVxgYJOgqiwGYtil5jW66h3mgE1w=.ae65f301-52ac-4ce1-93c2-a39b31d8d56f@github.com> On Mon, 23 Oct 2023 19:26:03 GMT, Zhengyu Gu wrote: >> I stand corrected. >> >> My question is still relevant >>>> Do you think, VM_ShenandoahFinalMarkStartEvac walks the stack roots. >> >> My recommendation is incorrect. > > No, `VM_ShenandoahFinalMarkStartEvac ` does not walk the stack roots, it signals the end of mark phase. Note we don't need to call OopMapCache::cleanup_old_entries() after this [PR](https://github.com/openjdk/jdk/pull/15921). VM_ShenandoahFinalMarkStartEvac is derived from VM_ShenandoahReferenceOperation. 
VM_ShenandoahReferenceOperation::doit_epilogue calls OopMapCache::cleanup_old_entries(); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1369207915 From duke at openjdk.org Mon Oct 23 20:56:41 2023 From: duke at openjdk.org (Leela Mohan Venati) Date: Mon, 23 Oct 2023 20:56:41 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: <39f-0nlOdQBABHr1cQOq7jITuRLzM9yLDUEUm1--0N8=.f3aeed11-7151-4885-8376-7d91ca84e8a7@github.com> Message-ID: On Mon, 23 Oct 2023 19:27:26 GMT, Zhengyu Gu wrote: >> src/hotspot/share/oops/method.cpp line 311: >> >>> 309: void Method::mask_for(int bci, InterpreterOopMap* mask) { >>> 310: methodHandle h_this(Thread::current(), this); >>> 311: method_holder()->mask_for(h_this, bci, mask); >> >> Removing this condition allows all the threads including java threads to use/mutate oopMapCache. >> >> For ex: Java threads calls [JVM_CallStackWalk](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/jvm.cpp#L586) which walks the stack and calls locals() and expressions [here](https://github.com/openjdk/jdk/blob/741ae06c55de65dcdfe38e328022bd8dde4fa007/src/hotspot/share/prims/stackwalk.cpp#L345) which access oopMapCache. > > The `oopMapCache` now is fully concurrent, it can be used/modified by Java threads. Which thread takes up responsibility of cleaning up old entries accumulated by java threads. Allowing for java threads definitely increases the scope (than it is mentioned in the bug ). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16074#discussion_r1369212468 From tonyp at openjdk.org Mon Oct 23 22:42:50 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 23 Oct 2023 22:42:50 GMT Subject: Withdrawn: 8315794: RISC-V: build fails with GCC 12.3 In-Reply-To: References: Message-ID: On Wed, 6 Sep 2023 14:29:56 GMT, Antonios Printezis wrote: > The build failure happens when building on RISC-V with GCC 12.3. Is there a better way to address this than disabling the stringop-overflow warnings for the two files in question? This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/15593 From tonyp at openjdk.org Mon Oct 23 22:42:49 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 23 Oct 2023 22:42:49 GMT Subject: RFR: 8315794: RISC-V: build fails with GCC 12.3 [v2] In-Reply-To: References: Message-ID: <3A1ge4vdgpt4K3Lpkc9nP45nND-u72oCTEnlQ_L1Ru4=.638816db-1215-4aca-abd8-36977c6cb9bb@github.com> On Tue, 19 Sep 2023 14:14:35 GMT, Antonios Printezis wrote: >> The build failure happens when building on RISC-V with GCC 12.3. Is there a better way to address this than disabling the stringop-overflow warnings for the two files in question? > > Antonios Printezis has updated the pull request incrementally with one additional commit since the last revision: > > changes based on code review feedback I'm actually going to close this. The issue seems to have disappeared. I did review some of the changes that went in and I'm not sure which one is responsible for fixing it. But I can build the current HEAD on both a RISC-V vm and my VisionFive 2 board with GCC 12.3 and with no failures. If it re-occurs, the fix suggested by @stefank is probably the best trade-off. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15593#issuecomment-1776133936 From zgu at openjdk.org Mon Oct 23 23:48:34 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 23 Oct 2023 23:48:34 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v2] In-Reply-To: References: Message-ID: <1u5Usl2mr-Q2jvGdBW0V0qVeC4u-Aqh0MdZS03lIUJk=.45e08b12-60f0-41b0-a76f-5ad2a0aff4ea@github.com> On Mon, 23 Oct 2023 20:53:39 GMT, Leela Mohan Venati wrote: > Before this change, VM_Operations that enabled "is_gc_active" and walked the stack to collect roots had a call to OopMapCache::cleanup_old_entries() in their corresponding doit_epilogue(). This frees the old entries that are collected during the safepoint. > > Now the scope of the change isn't clear. > > We seem to extend this to concurrent GCs during their concurrent phases (not just the safepoints of concurrent GCs). Calling OopMapCache::cleanup_old_entries() in doit_epilogue() would effectively clean up old entries accumulated during the concurrent phase of the GC and also during the safepoint. > > But the change also allows Java threads to accumulate old entries. When, and by whom, is cleanup_old_entries() called in this case? These entries need to wait until a future GC that does the cleanup in doit_epilogue(). However, at least theoretically, we can have large time windows without GCs. Old entries accumulated by Java threads can be seen as a leak (until the next GC). > Yes, it could accumulate a few. But from what I saw, there are not many. They are cleaned up by: - Serial, Parallel and G1: any `VM_GC_Operation`s - ZGC: any `VM_ZOperation`s - Shenandoah: I should have added `OopMapCache::cleanup_old_entries()` in `VM_ShenandoahOperation`, which should help the jdk11u backport. > A separate thought to make the change simpler. Can cleanup_old_entries become a [list of Cleanup tasks](https://github.com/openjdk/jdk/blob/8d9a4b43f4fff30fd217dab2c224e641cb913c18/src/hotspot/share/runtime/safepoint.hpp#L72) VM thread does during cleanup phase. No.
There is no point in adding safepoint latency when it can be done outside of a safepoint. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16074#issuecomment-1776227534 From rrich at openjdk.org Tue Oct 24 07:09:00 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 24 Oct 2023 07:09:00 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v21] In-Reply-To: <-tM40FGW10TWooaxzhFrtU7Xx9sQga4nhvZdUqDRLnQ=.dc86d37b-3502-43b4-8818-6ed13454d31b@github.com> References: <1s9_eK30_SkOiLxFIRRv5w_JEbmEz93C3zsZpNaYK0Q=.9c91e63a-9122-4e81-82b3-93104d9444a2@github.com> <-tM40FGW10TWooaxzhFrtU7Xx9sQga4nhvZdUqDRLnQ=.dc86d37b-3502-43b4-8818-6ed13454d31b@github.com> Message-ID: On Fri, 13 Oct 2023 14:48:23 GMT, Albert Mingkun Yang wrote: >> Should https://bugs.openjdk.org/browse/JDK-8309960 be reverted? Better in a follow-up I guess. > >> Should https://bugs.openjdk.org/browse/JDK-8309960 be reverted? Better in a follow-up I guess. > > Better in its own PR, IMO. Thanks to @albertnetymk the final version is a good deal simpler than the _baseline_. The extra effort was totally worth it. In the end parallel processing of large arrays is like a side effect of the refactoring. Thanks everybody for the review work and feedback! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1776641794 From rrich at openjdk.org Tue Oct 24 07:09:01 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 24 Oct 2023 07:09:01 GMT Subject: Integrated: 8310031: Parallel: Implement better work distribution for large object arrays in old gen In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 08:05:59 GMT, Richard Reingruber wrote: > This PR introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC.
This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > #### Implementation (Updated 2023-10-20) > > Comment copied from `PSCardTable::scavenge_contents_parallel`: > > ```c++ > // Scavenging and accesses to the card table are strictly limited to the stripe. > // In particular scavenging of an object crossing stripe boundaries is shared > // among the threads assigned to the stripes it resides on. This reduces > // complexity and enables shared scanning of large objects. > // It requires preprocessing of the card table though where imprecise card marks of > // objects crossing stripe boundaries are propagated to the first card of > // each stripe covered by the individual object. > > > The baseline was refactored to make use of a read-only copy of the card table. That "shadow" table (`PSStripeShadowCardTable`) separates reading, clearing and redirtying of table entries which allows for a much simpler implementation. > > Scanning of object arrays is limited to dirty card chunks. > > ## Everything below refers to the Outdated Initial Implementation > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
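The striping idea in the description above, where each worker only touches the part of an object that lies inside its own stripes, can be illustrated with a small sketch. This is not the `PSCardTable` code: `CardRange`, `clip_to_stripe`, and `stripes_for_worker` are hypothetical names, the stripe size in the usage note is arbitrary, and the round-robin assignment of stripes to workers is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range of card indices [begin, end).
struct CardRange {
  std::size_t begin;
  std::size_t end;
  bool empty() const { return begin >= end; }
};

// Part of an object's card range that falls inside the given stripe.
// A worker that owns this stripe would scan only this part of the object.
CardRange clip_to_stripe(CardRange object, std::size_t stripe_index,
                         std::size_t stripe_size) {
  assert(stripe_size > 0);
  std::size_t stripe_begin = stripe_index * stripe_size;
  std::size_t stripe_end = stripe_begin + stripe_size;
  CardRange clipped{std::max(object.begin, stripe_begin),
                    std::min(object.end, stripe_end)};
  if (clipped.begin > clipped.end) {
    clipped.end = clipped.begin;  // no overlap: normalize to an empty range
  }
  return clipped;
}

// Stripes handled by one worker, assuming stripes are dealt out round-robin.
std::vector<std::size_t> stripes_for_worker(std::size_t worker,
                                            std::size_t n_workers,
                                            std::size_t n_stripes) {
  assert(n_workers > 0);
  std::vector<std::size_t> result;
  for (std::size_t s = worker; s < n_stripes; s += n_workers) {
    result.push_back(s);
  }
  return result;
}
```

With a stripe size of 128 cards, a large array covering cards [100, 300) is split into [100, 128), [128, 256) and [256, 300), so three different stripes (and potentially three different workers) each scan one piece rather than a single worker scanning the whole array.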
> > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the g... This pull request has now been integrated. Changeset: 4bfe2268 Author: Richard Reingruber URL: https://git.openjdk.org/jdk/commit/4bfe226870a15306b1e015c38fe3835f26b41fe6 Stats: 334 lines in 5 files changed: 177 ins; 70 del; 87 mod 8310031: Parallel: Implement better work distribution for large object arrays in old gen Co-authored-by: Albert Mingkun Yang Reviewed-by: tschatzl, ayang ------------- PR: https://git.openjdk.org/jdk/pull/14846 From tschatzl at openjdk.org Tue Oct 24 09:19:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Oct 2023 09:19:03 GMT Subject: RFR: 8318702: G1: Fix nonstandard indentation in g1HeapTransition.cpp Message-ID: Hi all, please review this trivial change to indentation in [g1HeapTransition.cpp](https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:submit/8318702-indentation-g1heaptransition#diff-8b92b84617e96611627362128515c65c52cadc1cfec3ec6c986c780a6d744e5f). Test: local compilation. 
Thanks, Thomas ------------- Commit messages: - 8318702 initial version Changes: https://git.openjdk.org/jdk/pull/16339/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16339&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318702 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16339.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16339/head:pull/16339 PR: https://git.openjdk.org/jdk/pull/16339 From ayang at openjdk.org Tue Oct 24 11:35:54 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 24 Oct 2023 11:35:54 GMT Subject: RFR: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms Message-ID: One line fix in max-age used in predicting the num of surviving bytes. ------------- Commit messages: - g1-eden-count Changes: https://git.openjdk.org/jdk/pull/16344/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16344&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318713 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16344.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16344/head:pull/16344 PR: https://git.openjdk.org/jdk/pull/16344 From iwalulya at openjdk.org Tue Oct 24 13:50:36 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 24 Oct 2023 13:50:36 GMT Subject: RFR: 8318702: G1: Fix nonstandard indentation in g1HeapTransition.cpp In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 09:13:03 GMT, Thomas Schatzl wrote: > Hi all, > > please review this trivial change to indentation in [g1HeapTransition.cpp](https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:submit/8318702-indentation-g1heaptransition#diff-8b92b84617e96611627362128515c65c52cadc1cfec3ec6c986c780a6d744e5f). > > Test: local compilation. > > Thanks, > Thomas Trivial! ------------- Marked as reviewed by iwalulya (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16339#pullrequestreview-1694915013 From tschatzl at openjdk.org Tue Oct 24 14:51:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Oct 2023 14:51:50 GMT Subject: RFR: 8318702: G1: Fix nonstandard indentation in g1HeapTransition.cpp In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 13:48:16 GMT, Ivan Walulya wrote: >> Hi all, >> >> please review this trivial change to indentation in [g1HeapTransition.cpp](https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:submit/8318702-indentation-g1heaptransition#diff-8b92b84617e96611627362128515c65c52cadc1cfec3ec6c986c780a6d744e5f). >> >> Test: local compilation. >> >> Thanks, >> Thomas > > Trivial! Thanks @walulyai ------------- PR Comment: https://git.openjdk.org/jdk/pull/16339#issuecomment-1777390585 From tschatzl at openjdk.org Tue Oct 24 14:51:51 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Oct 2023 14:51:51 GMT Subject: Integrated: 8318702: G1: Fix nonstandard indentation in g1HeapTransition.cpp In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 09:13:03 GMT, Thomas Schatzl wrote: > Hi all, > > please review this trivial change to indentation in [g1HeapTransition.cpp](https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:submit/8318702-indentation-g1heaptransition#diff-8b92b84617e96611627362128515c65c52cadc1cfec3ec6c986c780a6d744e5f). > > Test: local compilation. > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 6f352740 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/6f352740cb5e7c47d226fd4039cfb977c0622488 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod 8318702: G1: Fix nonstandard indentation in g1HeapTransition.cpp Reviewed-by: iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16339 From tschatzl at openjdk.org Tue Oct 24 15:05:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Oct 2023 15:05:28 GMT Subject: RFR: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 11:29:07 GMT, Albert Mingkun Yang wrote: > One line fix in max-age used in predicting the num of surviving bytes. I would prefer to have spaces between the operator and the operands like surrounding code, but lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16344#pullrequestreview-1695161510 From duke at openjdk.org Tue Oct 24 16:26:40 2023 From: duke at openjdk.org (duke) Date: Tue, 24 Oct 2023 16:26:40 GMT Subject: Withdrawn: JDK-8310111: Shenandoah wastes memory when running with very large page sizes In-Reply-To: References: Message-ID: <6u_CRw9ULZ1L6bh854iwWjfi3aZO1USonxA8JZlLk2Q=.59a6e485-3fc1-47cf-811a-b6bc60049baf@github.com> On Tue, 20 Jun 2023 12:55:26 GMT, Thomas Stuefe wrote: > This proposal changes the reservation of bitmaps and region storage to reduce the wastage associated with running with very large page sizes (e.g. 1GB on x64, 512M on arm64) for both non-THP- and THP-mode. > > This patch does: > - introducing the notion of "allowed overdraft factor" - allocations for a given page size are rejected if they would cause more wastage than the factor allows > - if it makes sense, it places mark- and aux-bitmap into a contiguous region to let them share a ginourmous page. E.g. for a heap of 16G, both bitmaps would now share a single GB page. 
> > Examples: > > Note: annoyingly, huge page usage does not show up in RSS. I therefore use a script that parses /proc/pid/smaps and counts hugepage usage to count cost for the following examples: > > Example 1: > > A machine configured with 1G pages (and nothing else). Heap is allocated with 1G pages, the bitmaps fall back to 4K pages because JVM figures 1GB would be wasted: > > > thomas at starfish$ ./images/jdk/bin/java -Xmx4600m -Xlog:pagesize -XX:+UseShenandoahGC -XX:+UseLargePages > ... > [0.028s][info][pagesize] Mark Bitmap: req_size=160M req_page_size=4K base=0x00007f8149fff000 size=160M page_size=4K > [0.028s][info][pagesize] Aux Bitmap: req_size=160M req_page_size=4K base=0x00007f813fffe000 size=160M page_size=4K > [0.028s][info][pagesize] Region Storage: req_size=320K req_page_size=4K base=0x00007f817c06f000 size=320K page_size=4K > > > Cost before: 8GB. Cost now: 5GB + (2*160M) > > Example 2: JVM with 14GB heap: mark and aux bitmap together are large enough to justify another 1G page, so they share it. Notice how we also place the region storage on this page: > > > thomas at starfish:/shared/projects/openjdk/jdk-jdk/output-release$ ./images/jdk/bin/java -Xmx14g -Xlog:pagesize > -XX:+UseShenandoahGC -XX:+UseLargePages -cp $REPROS_JAR de.stuefe.repros.Simple > [0.003s][info][pagesize] Heap: req_size=14G req_page_size=1G base=0x0000000480000000 size=14G page_size=1G > [0.003s][info][pagesize] Mark+Aux Bitmap: req_size=896M req_page_size=1G base=0x00007fee00000000 size=1G page_size=1G > [0.003s][info][pagesize] Region Storage: piggy-backing on Mark Bitmap: base=0x00007fee38000000 size=1792 > > > > Cost before: 17GB. Cost now: 15GB. > > From a bang-for-hugepages-buck multiples of 16GB are a sweet spot here since (on x64 with 1GB pages) since this allows us to put both 512m bitmaps onto a single huge page. > > ----------- > > No test yet, since I wanted to see if people reject this proposal. It does add a bit of complexity. If this is well-r... 
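The cost figures in the two quoted examples can be reproduced with plain rounding arithmetic. The following is a minimal sketch, not the Shenandoah code: the helper names are hypothetical, and it assumes that every reservation is rounded up to a whole number of pages and that each bitmap is half of the combined 896M shown in the log for the 14G heap. It does not model the proposed overdraft factor, whose exact rule is not given in the quoted text.

```cpp
#include <cstdint>

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;
constexpr std::uint64_t kGiB = 1024 * kMiB;

// Round size up to the next multiple of page (page must be non-zero).
constexpr std::uint64_t round_up(std::uint64_t size, std::uint64_t page) {
  return (size + page - 1) / page * page;
}

// "Before": heap, mark bitmap, aux bitmap and region storage each get their
// own reservation, every one rounded up to the large page size.
constexpr std::uint64_t cost_separate(std::uint64_t heap, std::uint64_t bitmap,
                                      std::uint64_t region_storage,
                                      std::uint64_t large_page) {
  return round_up(heap, large_page) + 2 * round_up(bitmap, large_page) +
         round_up(region_storage, large_page);
}

// Both bitmaps and the region storage share one contiguous large-page
// reservation (the second quoted example).
constexpr std::uint64_t cost_shared(std::uint64_t heap, std::uint64_t bitmap,
                                    std::uint64_t region_storage,
                                    std::uint64_t large_page) {
  return round_up(heap, large_page) +
         round_up(2 * bitmap + region_storage, large_page);
}

// Bitmaps and region storage fall back to small pages (the first quoted example).
constexpr std::uint64_t cost_fallback(std::uint64_t heap, std::uint64_t bitmap,
                                      std::uint64_t region_storage,
                                      std::uint64_t large_page,
                                      std::uint64_t small_page) {
  return round_up(heap, large_page) + 2 * round_up(bitmap, small_page) +
         round_up(region_storage, small_page);
}

// Example 1: -Xmx4600m, 1G pages: 5G heap + 3 * 1G = 8G before.
static_assert(cost_separate(4600 * kMiB, 160 * kMiB, 320 * kKiB, kGiB) == 8 * kGiB,
              "example 1, cost before");
// Example 2: -Xmx14g, 1G pages: 17G before, 15G with a shared page.
static_assert(cost_separate(14 * kGiB, 448 * kMiB, 1792, kGiB) == 17 * kGiB,
              "example 2, cost before");
static_assert(cost_shared(14 * kGiB, 448 * kMiB, 1792, kGiB) == 15 * kGiB,
              "example 2, cost now");
```

Under the same assumptions, `cost_fallback(4600 * kMiB, 160 * kMiB, 320 * kKiB, kGiB, 4 * kKiB)` gives 5G + 2*160M + 320K, which matches the "cost now" quoted for the first example once the small region storage is included.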
This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/14559 From duke at openjdk.org Tue Oct 24 19:41:00 2023 From: duke at openjdk.org (Elif Aslan) Date: Tue, 24 Oct 2023 19:41:00 GMT Subject: RFR: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests Message-ID: The commit includes changes to unblock parallelism for more `hotspot:tier4` tests in `test/hotspot/jtreg/vmTestbase/gc/concurrent`. Below are the before and after test run comparisons: Before: 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total After: 'linux-x86_64-server-release' 3159.71s user 264.93s system 2891% cpu 1:58.45 total ------------- Commit messages: - 8318727:Enable parallelism in vmTestbase/vm/gc/concurrent tests Changes: https://git.openjdk.org/jdk/pull/16350/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16350&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318727 Stats: 943 lines in 41 files changed: 0 ins; 943 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16350.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16350/head:pull/16350 PR: https://git.openjdk.org/jdk/pull/16350 From duke at openjdk.org Tue Oct 24 19:41:01 2023 From: duke at openjdk.org (Elif Aslan) Date: Tue, 24 Oct 2023 19:41:01 GMT Subject: RFR: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 19:34:52 GMT, Elif Aslan wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4` tests in `test/hotspot/jtreg/vmTestbase/gc/concurrent`. > > Below are the before and after test run comparisons: > > Before: > 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total > After: > 'linux-x86_64-server-release' 3159.71s user 264.93s system 2891% cpu 1:58.45 total @lmesnik could you please test this on your system and report the results? 
TIA ------------- PR Comment: https://git.openjdk.org/jdk/pull/16350#issuecomment-1777907514 From zgu at openjdk.org Tue Oct 24 23:10:42 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Tue, 24 Oct 2023 23:10:42 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 Message-ID: Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that are still in the table. This patch empties the table before the `unsafe_reset()` call to avoid a memory leak. Test: tier1 and tier2 (fastdebug and release) on Linux x86_64 ------------- Commit messages: - Remove empty line - 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 Changes: https://git.openjdk.org/jdk/pull/16352/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16352&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318720 Stats: 19 lines in 1 file changed: 18 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16352/head:pull/16352 PR: https://git.openjdk.org/jdk/pull/16352 From ayang at openjdk.org Wed Oct 25 08:09:30 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 08:09:30 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> References: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> Message-ID: On Mon, 23 Oct 2023 14:37:59 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this simple refactoring? >> Thanks! > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16300#pullrequestreview-1696637598 From tschatzl at openjdk.org Wed Oct 25 08:15:30 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 08:15:30 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 23:04:28 GMT, Zhengyu Gu wrote: > Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that are still in the table. This patch empties the table before the `unsafe_reset()` call to avoid a memory leak. > > Test: > tier1 and tier2 (fastdebug and release) on Linux x86_64 Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/g1/g1CodeRootSet.cpp line 164: > 162: Atomic::store(&_num_entries, (size_t)0); > 163: } > 164: _table.unsafe_reset(); Thanks for catching this! I believe something like a call to `_table.clean([] (nmethod** value) { return true; });` should do roughly the same. (It is fine to name the lambda with something meaningful, like `always_true`. Or `delete_all`. `do_value` is just too unspecific to me). ------------- PR Review: https://git.openjdk.org/jdk/pull/16352#pullrequestreview-1696648989 PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371335267 From ayang at openjdk.org Wed Oct 25 08:19:04 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 08:19:04 GMT Subject: RFR: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms [v2] In-Reply-To: References: Message-ID: > One line fix in max-age used in predicting the num of surviving bytes. 
Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16344/files - new: https://git.openjdk.org/jdk/pull/16344/files/b7096f37..d1a36583 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16344&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16344&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16344.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16344/head:pull/16344 PR: https://git.openjdk.org/jdk/pull/16344 From iwalulya at openjdk.org Wed Oct 25 08:19:05 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 25 Oct 2023 08:19:05 GMT Subject: RFR: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms [v2] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 08:16:13 GMT, Albert Mingkun Yang wrote: >> One line fix in max-age used in predicting the num of surviving bytes. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16344#pullrequestreview-1696656626 From tschatzl at openjdk.org Wed Oct 25 08:23:39 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 08:23:39 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> References: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> Message-ID: On Mon, 23 Oct 2023 14:37:59 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this simple refactoring? >> Thanks! 
> > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code Fwiw, the original intent of the existing code structure has been to be _very_ obvious and clear with what happens to which region to avoid mistakes. Imo this change makes the code more clear, but if more people think it does not matter I'm not opposed to it. src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp line 49: > 47: } > 48: // Always collect remembered set for young regions. > 49: // Collect remembered sets for humongous regions by default to allow eager reclaim. Suggestion: // Always collect remembered set for young regions and for humongous regions. Humongous regions need that for eager reclaim. ------------- PR Review: https://git.openjdk.org/jdk/pull/16300#pullrequestreview-1696671890 PR Review Comment: https://git.openjdk.org/jdk/pull/16300#discussion_r1371343740 From shade at openjdk.org Wed Oct 25 08:31:40 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 25 Oct 2023 08:31:40 GMT Subject: RFR: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests In-Reply-To: References: Message-ID: <2q2NwpNvVz6hMl1ery5uhFjbUzmmay27zetu6wEj9ps=.19e0fee2-1d77-4baf-93ce-17ae9a27a623@github.com> On Tue, 24 Oct 2023 19:34:52 GMT, Elif Aslan wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4` tests in `test/hotspot/jtreg/vmTestbase/gc/concurrent`. > > Below are the before and after test run comparisons: > > Before: > 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total > After: > 'linux-x86_64-server-release' 3159.71s user 264.93s system 2891% cpu 1:58.45 total This looks good to me. @lmesnik should approve before we integrate. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16350#issuecomment-1778770047 From ayang at openjdk.org Wed Oct 25 08:33:47 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 08:33:47 GMT Subject: RFR: 8318801: Parallel: Remove unused verify_all_young_refs_precise Message-ID: Trivial removing dead code. ------------- Commit messages: - pgc-precise Changes: https://git.openjdk.org/jdk/pull/16355/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16355&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318801 Stats: 76 lines in 3 files changed: 0 ins; 76 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16355.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16355/head:pull/16355 PR: https://git.openjdk.org/jdk/pull/16355 From ayang at openjdk.org Wed Oct 25 08:46:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 08:46:45 GMT Subject: RFR: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms [v2] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 08:19:04 GMT, Albert Mingkun Yang wrote: >> One line fix in max-age used in predicting the num of surviving bytes. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16344#issuecomment-1778792620 From ayang at openjdk.org Wed Oct 25 08:46:46 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 08:46:46 GMT Subject: Integrated: 8318713: G1: Use more accurate age in predict_eden_copy_time_ms In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 11:29:07 GMT, Albert Mingkun Yang wrote: > One line fix in max-age used in predicting the num of surviving bytes. This pull request has now been integrated. 
Changeset: d2d1592d Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/d2d1592dd94e897fae6fc4098e43b4fffb6d6750 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8318713: G1: Use more accurate age in predict_eden_copy_time_ms Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/16344 From tschatzl at openjdk.org Wed Oct 25 09:35:38 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 09:35:38 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> References: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> Message-ID: On Mon, 23 Oct 2023 14:37:59 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this simple refactoring? >> Thanks! > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code Then again, it's not that important, but please fix the comment first to not sound that disjointed. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16300#pullrequestreview-1696823862 From mli at openjdk.org Wed Oct 25 09:52:54 2023 From: mli at openjdk.org (Hamlin Li) Date: Wed, 25 Oct 2023 09:52:54 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v3] In-Reply-To: References: Message-ID: > Hi, > Can you help to review this simple refactoring? > Thanks! 
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: refine comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16300/files - new: https://git.openjdk.org/jdk/pull/16300/files/9c4149e1..ce6f9892 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16300&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16300&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16300/head:pull/16300 PR: https://git.openjdk.org/jdk/pull/16300 From mli at openjdk.org Wed Oct 25 09:52:54 2023 From: mli at openjdk.org (Hamlin Li) Date: Wed, 25 Oct 2023 09:52:54 GMT Subject: RFR: 8318629: G1: Refine code a bit in G1RemSetTrackingPolicy::update_at_allocate [v2] In-Reply-To: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> References: <_HMUkoI3-kS-CLYli6CfHXnvtcs_uuczDdTbsfqKuX4=.d440fa75-fb3b-44cd-801b-4df08f98c5f8@github.com> Message-ID: <4kARTLjL9eVkZQDf-8gHHr6SOBPDQB4qahpsC2kx5Eo=.e885fc7b-1e40-4ee2-9270-570b99e127f2@github.com> On Mon, 23 Oct 2023 14:37:59 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this simple refactoring? >> Thanks! > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code Thanks Thomas for the suggestions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16300#issuecomment-1778902851 From tschatzl at openjdk.org Wed Oct 25 10:18:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 10:18:35 GMT Subject: RFR: 8318801: Parallel: Remove unused verify_all_young_refs_precise In-Reply-To: References: Message-ID: <8oI7wCazkep2CdclvfFYzB-rrLWaqRYYhuVmV_T5aKs=.57051dc0-fd09-46b0-b8ba-47e1c791a349@github.com> On Wed, 25 Oct 2023 08:27:04 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Trivial and lgtm. 
------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16355#pullrequestreview-1696905902 From tschatzl at openjdk.org Wed Oct 25 11:01:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 11:01:43 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable In-Reply-To: References: Message-ID: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> On Mon, 23 Oct 2023 08:14:46 GMT, Albert Mingkun Yang wrote: > The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. > > There is some duplication with G1's implementation, which should probably be dealt with in its own PR. Initial batch of comments. src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 84: > 82: } > 83: > 84: // Write the backskip value for each region. Suggestion: // Write the backskip value for the given region. src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 120: > 118: HeapWord* blk_end) { > 119: HeapWord* const cur_card_boundary = align_up_by_card_size(blk_start); > 120: size_t const offset_card = _array->index_for(cur_card_boundary); Suggestion: size_t const offset_card = _array->index_for(cur_card_boundary); src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 131: > 129: if (offset_card != end_card) { > 130: // Handling remaining cards. > 131: size_t start_card_for_region = offset_card + 1; Maybe `start_card_for/to_update` is a better name? src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 135: > 133: // -1 so that the reach ends in this region and not at the start > 134: // of the next. 
> 135: size_t reach = offset_card + BOTConstants::power_to_cards_back(i+1) - 1; Suggestion: size_t reach = offset_card + BOTConstants::power_to_cards_back(i + 1) - 1; src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 43: > 41: // systems using card-table-based write barriers, the efficiency of this > 42: // operation may be important. Implementations of the BlockOffsetTable > 43: // class may be useful in providing such efficient implementations. The comment is completely outdated, talking about subtypes where there are none any more of this class. src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 45: > 43: // class may be useful in providing such efficient implementations. > 44: > 45: class BlockOffsetSharedArray: public CHeapObj { `BlockOffsetSharedArray` could probably be moved to the .cpp file. src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 45: > 43: // class may be useful in providing such efficient implementations. > 44: > 45: class BlockOffsetSharedArray: public CHeapObj { Should be prefixed by `Serial`, same as `BlockOffsetTable` (since this is a complete reimplementation anyway). src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 55: > 53: // address. > 54: VirtualSpace _vs; > 55: u_char* _offset_array; // byte array keeping backwards offsets I always wondered whether `u_char` is what we want here, maybe `uint8_t` is more appropriate? src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 141: > 139: HeapWord* const obj_end) { > 140: HeapWord* cur_card_boundary = align_up_by_card_size(obj_start); > 141: // strictly greater-than This comment should not only mention what the code below does but also why. src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 157: > 155: }; > 156: > 157: extra newline src/hotspot/share/gc/serial/serialBlockOffsetTable.inline.hpp line 31: > 29: > 30: #include "gc/shared/space.hpp" > 31: #include "runtime/safepoint.hpp" These includes seem superfluous. 
src/hotspot/share/gc/serial/serialBlockOffsetTable.inline.hpp line 35: > 33: ////////////////////////////////////////////////////////////////////////// > 34: // BlockOffsetSharedArray inlines > 35: ////////////////////////////////////////////////////////////////////////// I think this comment does not add value and should be removed. ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16304#pullrequestreview-1696951303 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371541243 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371540936 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371545013 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371542603 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371530149 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371527773 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371531382 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371545829 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371535236 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371535817 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371539016 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371537058 From tschatzl at openjdk.org Wed Oct 25 11:01:45 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 11:01:45 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable In-Reply-To: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> Message-ID: On Wed, 25 Oct 2023 10:54:25 GMT, Thomas Schatzl wrote: >> The diff is too large; maybe it's better to read the new impl directly, 
which I believe is much easier to follow. >> >> There is some duplication with G1's implementation, which should probably be dealt with in its own PR. > > src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 84: > >> 82: } >> 83: >> 84: // Write the backskip value for each region. > > Suggestion: > > // Write the backskip value for the given region. Note that region isn't defined very exactly. Here it is (apparently) the memory from `blk_start` to `blk_end`, but later it is also used as "area containing the same backskip value". Maybe there is some better wording to be found. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371544873 From ayang at openjdk.org Wed Oct 25 11:20:36 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 11:20:36 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable In-Reply-To: References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> Message-ID: On Wed, 25 Oct 2023 10:57:45 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 84: >> >>> 82: } >>> 83: >>> 84: // Write the backskip value for each region. >> >> Suggestion: >> >> // Write the backskip value for the given region. > > Note that region isn't defined very exactly. Here it is (apparently) the memory from `blk_start` to `blk_end`, but later it is also used as "area containing the same backskip value". Maybe there is some better wording to be found. "region" in this file always means "logarithmic region". If we refer to `[blk_start, blk_end]`, "block" should be used. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371565110 From ayang at openjdk.org Wed Oct 25 11:20:38 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 11:20:38 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable In-Reply-To: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> Message-ID: <37m3F1LozKUUwBuLMGMlQ-w-6k3I3qJAorgo2dPwF7o=.d3769416-4459-4d1e-a0bb-4287fe002e9d@github.com> On Wed, 25 Oct 2023 10:43:56 GMT, Thomas Schatzl wrote: >> The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. >> >> There is some duplication with G1's implementation, which should probably be dealt with in its own PR. > > src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 43: > >> 41: // systems using card-table-based write barriers, the efficiency of this >> 42: // operation may be important. Implementations of the BlockOffsetTable >> 43: // class may be useful in providing such efficient implementations. > > The comment is completely outdated, talking about subtypes where there are none any more of this class. I think the subtypes refer to heaps from different collectors. Not sth around `BlockOffsetArray`. > src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 45: > >> 43: // class may be useful in providing such efficient implementations. >> 44: >> 45: class BlockOffsetSharedArray: public CHeapObj { > > Should be prefixed by `Serial`, same as `BlockOffsetTable` (since this is a complete reimplementation anyway). Maybe this renaming can be done in its own (trivial) PR for easier reviewing. > src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 55: > >> 53: // address. 
>> 54: VirtualSpace _vs; >> 55: u_char* _offset_array; // byte array keeping backwards offsets > > I always wondered whether `u_char` is what we want here, maybe `uint8_t` is more appropriate? Can probably be done in its own PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371567566 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371566528 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371565659 From ayang at openjdk.org Wed Oct 25 11:26:04 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 11:26:04 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: References: Message-ID: > The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. > > There is some duplication with G1's implementation, which should probably be dealt with in its own PR. Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16304/files - new: https://git.openjdk.org/jdk/pull/16304/files/40513f3e..2d12fb0d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=00-01 Stats: 10 lines in 3 files changed: 0 ins; 7 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16304.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16304/head:pull/16304 PR: https://git.openjdk.org/jdk/pull/16304 From ayang at openjdk.org Wed Oct 25 11:26:04 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 11:26:04 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> 
Message-ID: <9Q47rlxSChpI6HMiEmqDhNMDA57sSP3RPvQEDq8W5m4=.9a4e4a7a-0079-4faa-a2e8-a5d22f2f2bf8@github.com> On Wed, 25 Oct 2023 10:41:36 GMT, Thomas Schatzl wrote: >> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 45: > >> 43: // class may be useful in providing such efficient implementations. >> 44: >> 45: class BlockOffsetSharedArray: public CHeapObj { > > `BlockOffsetSharedArray` could probably be moved to the .cpp file. `resize` is called by `TenuredGeneration` for heap-resizing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371570721 From simonis at openjdk.org Wed Oct 25 11:30:44 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 25 Oct 2023 11:30:44 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v27] In-Reply-To: References: <0AHhZD1JncFwC7z5rp1uufF4FAMrKR8mQXdDuwKNL4s=.02114c12-5a6b-4d3e-8b7a-421bb3eb8d47@github.com> Message-ID: <2leJ4ArjJxwon1DwmNr470VEWeXBt6AH780uIY5orcU=.00e89ec2-6a4d-450c-9d8d-58765d5916b1@github.com> On Thu, 5 Oct 2023 03:04:39 GMT, Jonathan Joo wrote: >> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: >> >> add comment and change if defined to ifdef > > Resolved comments and sanity checks pass on all builds: https://github.com/jjoo172/jdk/actions/runs/6411637099 > > I believe this PR should be RFR once again. @jjoo172, https://github.com/openjdk/jdk/pull/16252 has now been integrated. Can you please merge it in and use the 64-bit Atomics for incrementing the counters? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1779058633 From tschatzl at openjdk.org Wed Oct 25 11:55:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 11:55:36 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 11:26:04 GMT, Albert Mingkun Yang wrote: >> The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. >> >> There is some duplication with G1's implementation, which should probably be dealt with in its own PR. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 143: > 141: } > 142: _array->set_offset_array(start_card_for_region, reach, value); > 143: start_card_for_region = reach + 1; Suggestion: _array->set_offset_array(start_card_for_region, MIN2(end_card, reach), value); start_card_for_region = reach + 1; if (reach >= end_card) { // Set all the cards covered by this region. break; } I would probably try to merge these two paths as much as possible as suggested above (untested). 
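The merged loop suggested above can be sketched in isolation as follows. This is a minimal standalone model, not the HotSpot code: the constants `kLogBase`, `kNumPowers` and `kCardSizeInWords`, their values, and the helper `find_offset_card` are assumptions made for illustration, and the table is a plain `std::vector` rather than the shared offset array. It only shows the shape of the single clamped `set_offset_array` call followed by the early exit, and it assumes a 64-bit `size_t`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the block offset table constants (assumed values).
constexpr unsigned kLogBase = 4;           // each logarithmic region skips 16x further back
constexpr unsigned kNumPowers = 14;        // number of distinct backskip exponents
constexpr uint8_t kCardSizeInWords = 64;   // entries >= this value encode a backskip

// Number of cards to skip back for exponent i: 2^(kLogBase * i).
constexpr size_t power_to_cards_back(unsigned i) {
  return size_t{1} << (kLogBase * i);
}

// Fill table[first..last] (both inclusive) with value; empty if first > last.
void set_offset_array(std::vector<uint8_t>& table, size_t first, size_t last,
                      uint8_t value) {
  std::fill(table.begin() + first, table.begin() + last + 1, value);
}

// offset_card already holds the real offset of the block start. The cards in
// (offset_card, end_card] receive backskip values, one logarithmic region at
// a time, using a single clamped fill per region.
void set_remaining_cards(std::vector<uint8_t>& table, size_t offset_card,
                         size_t end_card) {
  size_t start_card_for_region = offset_card + 1;
  for (unsigned i = 0; i < kNumPowers; i++) {
    // Last card that may still use exponent i.
    size_t reach = offset_card + power_to_cards_back(i + 1) - 1;
    uint8_t value = static_cast<uint8_t>(kCardSizeInWords + i);
    set_offset_array(table, start_card_for_region, std::min(end_card, reach), value);
    start_card_for_region = reach + 1;
    if (reach >= end_card) {
      break;  // all cards covered by this block have been set
    }
  }
}

// Illustrative lookup: follow backskips until a card with a real offset is found.
size_t find_offset_card(const std::vector<uint8_t>& table, size_t card) {
  while (table[card] >= kCardSizeInWords) {
    card -= power_to_cards_back(table[card] - kCardSizeInWords);
  }
  return card;
}
```

In this model every card in the covered range walks back to `offset_card` and no card past `end_card` is written, which is the property the clamp with the minimum of `end_card` and `reach` is meant to preserve.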
------------- PR Review: https://git.openjdk.org/jdk/pull/16304#pullrequestreview-1697078033 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371607397 From tschatzl at openjdk.org Wed Oct 25 12:08:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 12:08:35 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: <37m3F1LozKUUwBuLMGMlQ-w-6k3I3qJAorgo2dPwF7o=.d3769416-4459-4d1e-a0bb-4287fe002e9d@github.com> References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> <37m3F1LozKUUwBuLMGMlQ-w-6k3I3qJAorgo2dPwF7o=.d3769416-4459-4d1e-a0bb-4287fe002e9d@github.com> Message-ID: On Wed, 25 Oct 2023 11:18:10 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 43: >> >>> 41: // systems using card-table-based write barriers, the efficiency of this >>> 42: // operation may be important. Implementations of the BlockOffsetTable >>> 43: // class may be useful in providing such efficient implementations. >> >> The comment is completely outdated, talking about subtypes where there are none any more of this class. > > I think the subtypes refer to heaps from different collectors. Not sth around `BlockOffsetArray`. Still it isn't worth (and somewhat confusing) imo to talk about subtypes when there are none, particularly because the change deleted the comment about subtypes below. >> src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 45: >> >>> 43: // class may be useful in providing such efficient implementations. >>> 44: >>> 45: class BlockOffsetSharedArray: public CHeapObj { >> >> Should be prefixed by `Serial`, same as `BlockOffsetTable` (since this is a complete reimplementation anyway). > > Maybe this renaming can be done in its own (trivial) PR for easier reviewing. sure np. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371633574 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371634485 From tschatzl at openjdk.org Wed Oct 25 12:08:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 12:08:29 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> Message-ID: On Wed, 25 Oct 2023 11:15:42 GMT, Albert Mingkun Yang wrote: >> Note that region isn't defined very exactly. Here it is (apparently) the memory from `blk_start` to `blk_end`, but later it is also used as "area containing the same backskip value". Maybe there is some better wording to be found. > > "region" in this file always means "logarithmic region". If we refer to `[blk_start, blk_end]`, "block" should be used. I noticed just now. It would be nice to define that somewhere. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371630310 From tschatzl at openjdk.org Wed Oct 25 12:13:37 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 12:13:37 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v2] In-Reply-To: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> Message-ID: On Wed, 25 Oct 2023 10:57:52 GMT, Thomas Schatzl wrote: >> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/gc/serial/serialBlockOffsetTable.cpp line 131: > >> 129: if (offset_card != end_card) { >> 130: // Handling remaining cards. >> 131: size_t start_card_for_region = offset_card + 1; > > Maybe `start_card_for/to_update` is a better name? Obsolete given a definition of a "region".
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371644612 From tschatzl at openjdk.org Wed Oct 25 12:17:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 12:17:35 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 08:12:17 GMT, Thomas Schatzl wrote: >> Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that still in the table. This patch empties the table before the `unsafe_reset()` call to avoid memory leak. >> >> Test: >> tier1 and tier2 (fastdebug and release) on Linux x86_64 > > src/hotspot/share/gc/g1/g1CodeRootSet.cpp line 164: > >> 162: Atomic::store(&_num_entries, (size_t)0); >> 163: } >> 164: _table.unsafe_reset(); > > Thanks for catching this! > > I believe something like a call to `_table.clean([] (nmethod** value) { return true; });` should do roughly the same. (It is fine to name the lambda with something meaningful, like `always_true`. Or `delete_all`. `do_value` is just too unspecific to me). Another option is to provide a reset function in the CHT itself that roughly does what the destructor does right now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371652342 From zgu at openjdk.org Wed Oct 25 12:51:58 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 12:51:58 GMT Subject: RFR: 8317466: Enable interpreter oopMapCache for concurrent GCs [v3] In-Reply-To: References: Message-ID: <3oEH8zFRDSOC9kcENyJXxj0Gb2rbk2Dimnzr__4d4jE=.ccfde890-e51b-4b1c-83c8-34516307c5cb@github.com> > Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. > > GCs uses a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. > > This patch is intended to enable `OopMapCache` for concurrent GCs. 
> > Test: > tier1 and tier2 fastdebug and release on MacOSX, Linux x86_64 and Linux x86_32. Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Refactor ShenandoahOperation to be symmetric to other GCs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16074/files - new: https://git.openjdk.org/jdk/pull/16074/files/015d4fb3..b63a105e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16074&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16074&range=01-02 Stats: 6 lines in 2 files changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16074.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16074/head:pull/16074 PR: https://git.openjdk.org/jdk/pull/16074 From zgu at openjdk.org Wed Oct 25 13:21:48 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 13:21:48 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v2] In-Reply-To: References: Message-ID: > Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that are still in the table. This patch empties the table before the `unsafe_reset()` call to avoid a memory leak.
> > Test: > tier1 and tier2 (fastdebug and release) on Linux x86_64 Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Thomas' comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16352/files - new: https://git.openjdk.org/jdk/pull/16352/files/f98d3967..6469fc1f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16352&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16352&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16352/head:pull/16352 PR: https://git.openjdk.org/jdk/pull/16352 From zgu at openjdk.org Wed Oct 25 13:34:38 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 13:34:38 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v2] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 12:14:29 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1CodeRootSet.cpp line 164: >> >>> 162: Atomic::store(&_num_entries, (size_t)0); >>> 163: } >>> 164: _table.unsafe_reset(); >> >> Thanks for catching this! >> >> I believe something like a call to `_table.clean([] (nmethod** value) { return true; });` should do roughly the same. (It is fine to name the lambda with something meaningful, like `always_true`. Or `delete_all`. `do_value` is just too unspecific to me). > > Another option is to provide a reset function in the CHT itself that roughly does what the destructor does right now. That's the first place I looked. But `free_nodes` is not MT-safe, I am not sure it will fit well with other APIs that all seem to be MT-safe. I also thought about move `G1CodeRootSetHashTable::clear()` implementation to CHT, but likely I would need another version for safepoint? (e.g. use `_table.bulk_delete` instead). Hope to have a chance to return to JVM development fulltime, so I will have time to think about it. 
For now, I will let you to worry about the refactoring :-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371769768 From shade at openjdk.org Wed Oct 25 13:51:30 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 25 Oct 2023 13:51:30 GMT Subject: RFR: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 19:34:52 GMT, Elif Aslan wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4 tests.` in `test/hotspot/jtreg/vmTestbase/gc/concurrent` tests. > > Below are the before and after test run comparisons: > > Before: > 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total > After: > linux-x86_64-server-release': 3159.71s user 264.93s system 2891% cpu 1:58.45 total Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16350#pullrequestreview-1697406435 From ayang at openjdk.org Wed Oct 25 14:01:55 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 14:01:55 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v3] In-Reply-To: References: Message-ID: > The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. > > There is some duplication with G1's implementation, which should probably be dealt with in its own PR. 
Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16304/files - new: https://git.openjdk.org/jdk/pull/16304/files/2d12fb0d..a4f6ea8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=01-02 Stats: 15 lines in 2 files changed: 4 ins; 10 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16304.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16304/head:pull/16304 PR: https://git.openjdk.org/jdk/pull/16304 From ayang at openjdk.org Wed Oct 25 14:01:56 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Oct 2023 14:01:56 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v3] In-Reply-To: References: <9V1MfyYeT2Vv5KMJoXETNwnU8dUYlpt9lWidoB-04W0=.1221e7fc-a04a-4e3d-8b67-180898ec48a1@github.com> <37m3F1LozKUUwBuLMGMlQ-w-6k3I3qJAorgo2dPwF7o=.d3769416-4459-4d1e-a0bb-4287fe002e9d@github.com> Message-ID: <5-7mKVC1MFf0zMrB-u4ecmovNNWBxWIWwmRrvdB9Zjs=.09292da0-71a7-4183-9ecf-ffb360e3a481@github.com> On Wed, 25 Oct 2023 12:05:06 GMT, Thomas Schatzl wrote: >> I think the subtypes refer to heaps from different collectors. Not sth around `BlockOffsetArray`. > > Still it isn't worth (and somewhat confusing) imo to talk about subtypes when there are none, particularly because the change deleted the comment about subtypes below. OK, removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1371812324 From tschatzl at openjdk.org Wed Oct 25 14:21:05 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 14:21:05 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 Message-ID: The JEP covers the idea very well, so I'm only covering some implementation details here: * regions get a "pin count" (reference count). 
As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. The code could be clever and handle this situation by splitting the to be filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. gc code never touches them anyway). * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and automatically skipped. However assuming that the pinning is of short length, we put them into the candidates when we can. * there is the problem that if an applications pins a region for a long time g1 will skip evacuating that region over and over. that may lead to issues with the current policy in marking regions (only exit mixed phase when there are no marking candidates) and just waste of processing time (when the candidate stays in the retained candidates) The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. I.e. 
pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in addition to something like: `GC(6) Pause Young (Normal) (Evacuation Failure) 1M->1M(22M) 36.16ms` there is that new tag `(Pinned)` that indicates that one or more regions that were pinned were encountered during gc. E.g. `GC(6) Pause Young (Normal) (Pinned) (Evacuation Failure) 1M->1M(22M) 36.16ms` `Pinned` and `Evacuation Failure` tags are not exclusive. GC might have encountered both pinned regions and evacuation failed regions in the same collection or even in the same region. (I am open to a better name for the `(Pinned)` tag) Testing: tier1-8 ------------- Commit messages: - Improve somewhat unstable test - Fix typo in src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegion.java so that resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java does not fail - Fix minimal build - Region pinning in G1/JEP-423 Changes: https://git.openjdk.org/jdk/pull/16342/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318706 Stats: 1547 lines in 54 files changed: 922 ins; 422 del; 203 mod Patch: https://git.openjdk.org/jdk/pull/16342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16342/head:pull/16342 PR: https://git.openjdk.org/jdk/pull/16342 From tschatzl at openjdk.org Wed Oct 25 14:21:06 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 14:21:06 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 09:56:57 GMT, Thomas Schatzl wrote: > The JEP covers the idea very well, so I'm only covering some implementation details here: > > * regions 
get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. > > * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: > > * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. > > * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. The code could be clever and handle this situation by splitting the to be filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. gc code never touches them anyway). > > * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and automatically skipped. However assuming that the pinning is of short length, we put them into the candidates when we can. > > * there is the problem that if an applications pins a region for a long time g1 will skip evacuating that region over and over. 
that may lead to issues with the current policy in marking regions (only exit mixed phase when there are no marking candidates) and just waste of processing time (when the candidate stays in the retained candidates) > > The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. > > I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. > > * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in addition to something like: > > `GC(6) P... The new [TestPinnedOldObjectsEvacuation.java](https://github.com/openjdk/jdk/pull/16342/files#diff-b141a3be9b9c2ba59c78f42ee1af4f65f04a32a15db245c1ed68711953939258) test isn't stable, otherwise passes tier1-8. No perf changes. I'm opening this PR for review even if this is the case, this is not a blocker for review, and fix it later. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16342#issuecomment-1779381807 From tschatzl at openjdk.org Wed Oct 25 14:54:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 14:54:28 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v2] In-Reply-To: References: Message-ID: <7sV3VdBEktiFbdUavF__gDroCNhmS-QDVvrFY8CKv70=.3300acc9-1ca6-4836-bdd0-13a4ac06de39@github.com> On Wed, 25 Oct 2023 13:31:44 GMT, Zhengyu Gu wrote: >> Another option is to provide a reset function in the CHT itself that roughly does what the destructor does right now. > > That's the first place I looked. But `free_nodes` is not MT-safe, I am not sure it will fit well with other APIs that all seem to be MT-safe. > > I also thought about move `G1CodeRootSetHashTable::clear()` implementation to CHT, but likely I would need another version for safepoint? (e.g. use `_table.bulk_delete` instead). 
> > Hope to have a chance to return to JVM development fulltime, so I will have time to think about it. For now, I will let you to worry about the refactoring :-) I would strongly prefer something like [this](https://github.com/openjdk/jdk/commit/521b66294bc22e7c0760c60e7ce2f1ce31347fc3) after some "worrying about the refactoring" ;) Passes gc/g1 testing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371899503 From tschatzl at openjdk.org Wed Oct 25 14:54:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 14:54:29 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v2] In-Reply-To: <7sV3VdBEktiFbdUavF__gDroCNhmS-QDVvrFY8CKv70=.3300acc9-1ca6-4836-bdd0-13a4ac06de39@github.com> References: <7sV3VdBEktiFbdUavF__gDroCNhmS-QDVvrFY8CKv70=.3300acc9-1ca6-4836-bdd0-13a4ac06de39@github.com> Message-ID: On Wed, 25 Oct 2023 14:48:47 GMT, Thomas Schatzl wrote: >> That's the first place I looked. But `free_nodes` is not MT-safe, I am not sure it will fit well with other APIs that all seem to be MT-safe. >> >> I also thought about move `G1CodeRootSetHashTable::clear()` implementation to CHT, but likely I would need another version for safepoint? (e.g. use `_table.bulk_delete` instead). >> >> Hope to have a chance to return to JVM development fulltime, so I will have time to think about it. For now, I will let you to worry about the refactoring :-) > > I would strongly prefer something like [this](https://github.com/openjdk/jdk/commit/521b66294bc22e7c0760c60e7ce2f1ce31347fc3) after some "worrying about the refactoring" ;) > Passes gc/g1 testing. It's small enough to copy it in here: void clear() { // Remove all entries. 
auto always_true = [] (nmethod** value) { return true; }; clean(always_true); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371903515 From zgu at openjdk.org Wed Oct 25 15:13:51 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 15:13:51 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v3] In-Reply-To: References: Message-ID: > Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that still in the table. This patch empties the table before the `unsafe_reset()` call to avoid memory leak. > > Test: > tier1 and tier2 (fastdebug and release) on Linux x86_64 Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Adopt Thomas' code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16352/files - new: https://git.openjdk.org/jdk/pull/16352/files/6469fc1f..205300b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16352&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16352&range=01-02 Stats: 19 lines in 1 file changed: 0 ins; 14 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/16352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16352/head:pull/16352 PR: https://git.openjdk.org/jdk/pull/16352 From zgu at openjdk.org Wed Oct 25 15:13:52 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 15:13:52 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v3] In-Reply-To: References: <7sV3VdBEktiFbdUavF__gDroCNhmS-QDVvrFY8CKv70=.3300acc9-1ca6-4836-bdd0-13a4ac06de39@github.com> Message-ID: On Wed, 25 Oct 2023 14:51:06 GMT, Thomas Schatzl wrote: >> I would strongly prefer something like [this](https://github.com/openjdk/jdk/commit/521b66294bc22e7c0760c60e7ce2f1ce31347fc3) after some "worrying about the refactoring" ;) >> Passes gc/g1 testing. 
> > It's small enough to copy it in here: > > void clear() { > // Remove all entries. > auto always_true = [] (nmethod** value) { > return true; > }; > > clean(always_true); > } Ah, I was confused by `_table.clean()` and could not find `clean()` in CHT. Updated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16352#discussion_r1371930031 From lmesnik at openjdk.org Wed Oct 25 15:22:29 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 25 Oct 2023 15:22:29 GMT Subject: RFR: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 19:34:52 GMT, Elif Aslan wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4 tests.` in `test/hotspot/jtreg/vmTestbase/gc/concurrent` tests. > > Below are the before and after test run comparisons: > > Before: > 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total > After: > linux-x86_64-server-release': 3159.71s user 264.93s system 2891% cpu 1:58.45 total Testing is also good. ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16350#pullrequestreview-1697659383 From tschatzl at openjdk.org Wed Oct 25 15:23:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 25 Oct 2023 15:23:36 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v3] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 15:13:51 GMT, Zhengyu Gu wrote: >> Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that still in the table. This patch empties the table before the `unsafe_reset()` call to avoid memory leak. >> >> Test: >> tier1 and tier2 (fastdebug and release) on Linux x86_64 > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Adopt Thomas' code Lgtm :) ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16352#pullrequestreview-1697663397 From duke at openjdk.org Wed Oct 25 15:26:44 2023 From: duke at openjdk.org (Elif Aslan) Date: Wed, 25 Oct 2023 15:26:44 GMT Subject: Integrated: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 19:34:52 GMT, Elif Aslan wrote: > The commit includes changes to unblock parallelism for more `hotspot:tier4 tests.` in `test/hotspot/jtreg/vmTestbase/gc/concurrent` tests. > > Below are the before and after test run comparisons: > > Before: > 'linux-x86_64-server-release' 9066.56s user 2289.94s system 880% cpu 21:29.15 total > After: > linux-x86_64-server-release': 3159.71s user 264.93s system 2891% cpu 1:58.45 total This pull request has now been integrated. Changeset: 29d462a0 Author: Elif Aslan Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/29d462a07239a57b83850b9a8662573291fdbdf7 Stats: 943 lines in 41 files changed: 0 ins; 943 del; 0 mod 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests Reviewed-by: shade, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/16350 From jjoo at openjdk.org Wed Oct 25 18:30:07 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Wed, 25 Oct 2023 18:30:07 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v31] In-Reply-To: References: Message-ID: > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 31 additional commits since the last revision: - Merge branch 'openjdk:master' into master - Add call to publish in parallel gc and update counter names - Add Copyright header to test and formatting changes - Fix test - add comment and change if defined to ifdef - Remove header and fix long to jlong - Update logic to use cmpxchg rather than add - Fix build issues - Fix logic for publishing total cpu time and convert atomic jlong to long - Fix more broken headers for sanity checks - ... and 21 more: https://git.openjdk.org/jdk/compare/a3b77dde...b57aa467 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/19fe9b3a..b57aa467 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=30 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=29-30 Stats: 128020 lines in 3548 files changed: 59061 ins; 26067 del; 42892 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: https://git.openjdk.org/jdk/pull/15082 From jjoo at openjdk.org Wed Oct 25 20:37:46 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Wed, 25 Oct 2023 20:37:46 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32] In-Reply-To: References: Message-ID: > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: Use 64-bit atomic add for incrementing counters ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/b57aa467..ebafa2b5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=31 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=30-31 Stats: 13 lines in 1 file changed: 0 ins; 13 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch 
https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: https://git.openjdk.org/jdk/pull/15082 From manc at openjdk.org Wed Oct 25 21:25:38 2023 From: manc at openjdk.org (Man Cao) Date: Wed, 25 Oct 2023 21:25:38 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 20:37:46 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Use 64-bit atomic add for incrementing counters The change LGTM except for a small suggestion. Could someone from hotspot-gc-dev also take a look, as @albertnetymk added the tag? src/hotspot/share/gc/shared/stringdedup/stringDedupProcessor.cpp line 200: > 198: log_statistics(); > 199: if (UsePerfData && os::is_thread_cpu_time_supported()) { > 200: ThreadTotalCPUTimeClosure tttc(_concurrent_dedup_thread_cpu_time, true); I think it is better not to classify the StringDedup thread as a GC thread, so remove the "true" parameter. Although StringDedup requests are initiated during GC (search `_string_dedup_requests.add`), they are not part of the GC. ------------- Marked as reviewed by manc (Committer).
PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1698265647 PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1372333757 From zgu at openjdk.org Wed Oct 25 23:50:49 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 23:50:49 GMT Subject: RFR: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 [v3] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 15:20:52 GMT, Thomas Schatzl wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Adopt Thomas' code > > Lgtm :) Thanks, @tschatzl ------------- PR Comment: https://git.openjdk.org/jdk/pull/16352#issuecomment-1780206409 From zgu at openjdk.org Wed Oct 25 23:50:50 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 25 Oct 2023 23:50:50 GMT Subject: Integrated: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 23:04:28 GMT, Zhengyu Gu wrote: > Calling `ConcurrentHashTable:table.unsafe_reset()` in `G1CodeRootSet::clear()` does not free the entries that still in the table. This patch empties the table before the `unsafe_reset()` call to avoid memory leak. > > Test: > tier1 and tier2 (fastdebug and release) on Linux x86_64 This pull request has now been integrated. Changeset: 811b436e Author: Zhengyu Gu URL: https://git.openjdk.org/jdk/commit/811b436e5de972bedd3a0fa25952b2e1beddd9c3 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503 Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16352 From ayang at openjdk.org Thu Oct 26 10:55:40 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Oct 2023 10:55:40 GMT Subject: RFR: 8318801: Parallel: Remove unused verify_all_young_refs_precise In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 08:27:04 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. 
Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16355#issuecomment-1780881467 From ayang at openjdk.org Thu Oct 26 10:55:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Oct 2023 10:55:41 GMT Subject: Integrated: 8318801: Parallel: Remove unused verify_all_young_refs_precise In-Reply-To: References: Message-ID: <7wF-L4wywTbQ1DlJIhPur-7HtZoQLJB1FW7FFhkjhus=.edde45ea-d145-40cf-81ee-9a9569541af4@github.com> On Wed, 25 Oct 2023 08:27:04 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: ec1bf23d Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/ec1bf23d012f007c126cb472fcff146cf7f41b1a Stats: 76 lines in 3 files changed: 0 ins; 76 del; 0 mod 8318801: Parallel: Remove unused verify_all_young_refs_precise Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16355 From ayang at openjdk.org Thu Oct 26 11:32:16 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Oct 2023 11:32:16 GMT Subject: RFR: 8318894: G1: Use uint for age in G1SurvRateGroup Message-ID: Simple refactoring to use unsigned type for region-age. 
-------------

Commit messages:
 - g1-age-uint

Changes: https://git.openjdk.org/jdk/pull/16376/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16376&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8318894
Stats: 41 lines in 5 files changed: 6 ins; 10 del; 25 mod
Patch: https://git.openjdk.org/jdk/pull/16376.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16376/head:pull/16376

PR: https://git.openjdk.org/jdk/pull/16376

From simonis at openjdk.org Thu Oct 26 12:10:38 2023
From: simonis at openjdk.org (Volker Simonis)
Date: Thu, 26 Oct 2023 12:10:38 GMT
Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32]
In-Reply-To:
References:
Message-ID:

On Wed, 25 Oct 2023 20:37:46 GMT, Jonathan Joo wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision:
>
>   Use 64-bit atomic add for incrementing counters

Thanks for your patience :) Looks good to me.

-------------

Marked as reviewed by simonis (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1699438077

From ayang at openjdk.org Thu Oct 26 12:52:42 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Thu, 26 Oct 2023 12:52:42 GMT
Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32]
In-Reply-To:
References:
Message-ID:

On Wed, 25 Oct 2023 20:37:46 GMT, Jonathan Joo wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision:
>
>   Use 64-bit atomic add for incrementing counters

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 138:

> 136: _vtime_accum = (os::elapsedVTime() - _vtime_start);
> 137:
> 138: cm()->update_concurrent_mark_threads_cpu_time();

Is there some overlap between this and the existing `_vtime_accum`?
If so, can they be consolidated somehow? I believe the purpose of calling `update_concurrent_mark_threads_cpu_time` in multiple places is to get more up-to-date conc-cpu-time. Reading through the JBS ticket, I don't see the motivation for maintaining such a "fresh" value. Finally, is CSR required for this feature? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1373116720 From simonis at openjdk.org Thu Oct 26 14:34:40 2023 From: simonis at openjdk.org (Volker Simonis) Date: Thu, 26 Oct 2023 14:34:40 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32] In-Reply-To: References: Message-ID: On Thu, 26 Oct 2023 12:49:21 GMT, Albert Mingkun Yang wrote: >> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: >> >> Use 64-bit atomic add for incrementing counters > > src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 138: > >> 136: _vtime_accum = (os::elapsedVTime() - _vtime_start); >> 137: >> 138: cm()->update_concurrent_mark_threads_cpu_time(); > > Is there some overlapping btw this and the existing `_vtime_accum`. If so, can they be consolidated somehow? > > I believe the purpose of calling `update_concurrent_mark_threads_cpu_time` in multiple places is to get more up-to-date conc-cpu-time. Reading through the JBS ticket, I don't see the motivation for maintaining such a "fresh" value. > > Finally, is CSR required for this feature? @albertnetymk, the hsperf counters are a non-public API and the new counters have been added to the non-standard `sun.threads.cpu_time` name space which is "[unstable and unsupported](https://github.com/openjdk/jdk/blob/9864951dceb0ddc4479ced04b6d5a2363f1e307d/src/hotspot/share/runtime/perfData.cpp#L56)" so I don't think a CSR is required. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1373278962

From ayang at openjdk.org Thu Oct 26 15:10:42 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Thu, 26 Oct 2023 15:10:42 GMT
Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue
Message-ID:

Simple removing some dead code; after this, a card is either clean or dirty.

-------------

Commit messages:
 - pgc-trim-card-enum

Changes: https://git.openjdk.org/jdk/pull/16380/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16380&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8318908
Stats: 37 lines in 2 files changed: 0 ins; 35 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/16380.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16380/head:pull/16380

PR: https://git.openjdk.org/jdk/pull/16380

From wkemper at openjdk.org Thu Oct 26 18:28:17 2023
From: wkemper at openjdk.org (William Kemper)
Date: Thu, 26 Oct 2023 18:28:17 GMT
Subject: RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded [v7]
In-Reply-To:
References:
Message-ID: <_VKFq1r7AzgsP99bp0MtA2IbF1RmggmqLE1LlknyDBY=.e6886387-1dc5-4907-aa3b-9c1210ea85d6@github.com>

> Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3).

William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
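[Editorial sketch, not part of the original message.] The quoted description of 8316632 amounts to counting full GC cycles that fail to make progress and giving up once a threshold is reached. A minimal illustration follows, assuming a hypothetical `NoProgressTracker` class; treating the failures as consecutive and using `>=` for the comparison are assumptions here, not details confirmed by the message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tracker, not Shenandoah's actual data structure.
class NoProgressTracker {
 public:
  // threshold plays the role of ShenandoahNoProgressThreshold (default 3).
  explicit NoProgressTracker(size_t threshold)
      : _threshold(threshold), _no_progress_count(0) {
    assert(threshold > 0);
  }

  // Call after each full GC, passing whether the cycle made progress.
  void record_full_gc(bool made_progress) {
    if (made_progress) {
      _no_progress_count = 0;   // assumption: any progress resets the streak
    } else {
      ++_no_progress_count;
    }
  }

  // True once enough full GCs in a row have failed to make progress,
  // i.e. the point at which an OutOfMemoryError would be raised.
  bool should_raise_oome() const {
    return _no_progress_count >= _threshold;
  }

 private:
  size_t _threshold;
  size_t _no_progress_count;
};
```

The point of such a counter is to fail fast instead of letting back-to-back full GCs stall the mutators while memory slowly runs out.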
The pull request contains 16 additional commits since the last revision: - Merge remote-tracking branch 'openjdk/master' into shenandoah-oome-redux - Remove unnecessary nesting and other differences - Update formatting - Make explicit field for Shenandoah - Restore original retry logic, pull gc overhead check back over retry - Merge branch 'openjdk-master' into shenandoah-oome-redux - Merge check for no-progress into retry allocation block - Revert change to TEST.groups - Merge remote-tracking branch 'openjdk/master' into shenandoah-oome-redux - Extend exemption for EATests that rely on timely OOME to Shenandoah - ... and 6 more: https://git.openjdk.org/jdk/compare/1787b19e...95f3379a ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15852/files - new: https://git.openjdk.org/jdk/pull/15852/files/4154370c..95f3379a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15852&range=05-06 Stats: 17285 lines in 730 files changed: 10879 ins; 3673 del; 2733 mod Patch: https://git.openjdk.org/jdk/pull/15852.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15852/head:pull/15852 PR: https://git.openjdk.org/jdk/pull/15852 From manc at openjdk.org Thu Oct 26 20:36:40 2023 From: manc at openjdk.org (Man Cao) Date: Thu, 26 Oct 2023 20:36:40 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32] In-Reply-To: References: Message-ID: On Thu, 26 Oct 2023 14:31:23 GMT, Volker Simonis wrote: >> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 138: >> >>> 136: _vtime_accum = (os::elapsedVTime() - _vtime_start); >>> 137: >>> 138: cm()->update_concurrent_mark_threads_cpu_time(); >> >> Is there some overlapping btw this and the existing `_vtime_accum`. If so, can they be consolidated somehow? >> >> I believe the purpose of calling `update_concurrent_mark_threads_cpu_time` in multiple places is to get more up-to-date conc-cpu-time. 
Reading through the JBS ticket, I don't see the motivation for maintaining such a "fresh" value. >> >> Finally, is CSR required for this feature? > > @albertnetymk, the hsperf counters are a non-public API and the new counters have been added to the non-standard `sun.threads.cpu_time` name space which is "[unstable and unsupported](https://github.com/openjdk/jdk/blob/9864951dceb0ddc4479ced04b6d5a2363f1e307d/src/hotspot/share/runtime/perfData.cpp#L56)" so I don't think a CSR is required. > Is there some overlapping btw this and the existing _vtime_accum. If so, can they be consolidated somehow? We don't really like `_vtime_accum` for monitoring: - `os::elapsedVTime()` could silently fall back from CPU time to wall time, according to os_linux.cpp. We'd rather to have true CPU time, or nothing. Mixing up CPU time and wall time is confusing to users. - `_vtime_accum` only tracks the time consumed by the concurrent marking main thread, but not the concurrent worker threads. There's a `G1ConcurrentMarkThread::vtime_accum()` that seems to account for concurrent worker threads. It looks only used for logging. It might be possible to replace the existing logging code with the value of the `sun.threads.cpu_time.gc_conc_mark` hsperf counter. However, it is better to do that separately, and I'll create an RFE. > I believe the purpose of calling update_concurrent_mark_threads_cpu_time in multiple places is to get more up-to-date conc-cpu-time. Reading through the JBS ticket, I don't see the motivation for maintaining such a "fresh" value. Yes. The reason is that a concurrent mark cycle could take several minutes for a large heap. For tools like [AHS](https://mail.openjdk.org/pipermail/hotspot-dev/2022-September/064190.html) that reads these CPU hsperf counters, they could read these CPU data every 1 to 5 seconds. The current mechanism still needs to wait for the whole `G1ConcurrentMark::mark_from_roots()` to complete, which could still take minutes. 
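[Editorial sketch, not part of the original message.] The preference stated above, "true CPU time, or nothing", can be expressed as a query that reports failure instead of substituting wall-clock time. A minimal illustration follows, assuming a POSIX system that provides `clock_gettime` with `CLOCK_THREAD_CPUTIME_ID`; the function name `current_thread_cpu_time_ns` is hypothetical and this is not how HotSpot's `os` layer is written.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Returns true and stores the calling thread's CPU time in nanoseconds,
// or returns false and leaves *out_ns untouched. There is deliberately
// no fallback to wall-clock time.
bool current_thread_cpu_time_ns(int64_t* out_ns) {
  assert(out_ns != nullptr);
  timespec ts;
  if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
    return false;  // report "nothing" rather than mixing in wall time
  }
  *out_ns = static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
  return true;
}
```

A caller that gets `false` can skip the counter update entirely, so a reader of the counter never sees CPU time and wall time mixed in one value.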
I can create a separate RFE to make it update more frequently inside `G1CMConcurrentMarkingTask::work()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1373784494 From jjoo at openjdk.org Thu Oct 26 21:01:53 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Thu, 26 Oct 2023 21:01:53 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: References: Message-ID: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: Remove StringDedup from GC thread list ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/ebafa2b5..2fc508f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=32 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=31-32 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: https://git.openjdk.org/jdk/pull/15082 From manc at openjdk.org Thu Oct 26 22:46:36 2023 From: manc at openjdk.org (Man Cao) Date: Thu, 26 Oct 2023 22:46:36 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> References: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> Message-ID: On Thu, 26 Oct 2023 21:01:53 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Remove StringDedup from GC thread list Marked as reviewed 
by manc (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1700652459 From manc at openjdk.org Thu Oct 26 22:46:38 2023 From: manc at openjdk.org (Man Cao) Date: Thu, 26 Oct 2023 22:46:38 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v32] In-Reply-To: References: Message-ID: On Thu, 26 Oct 2023 20:33:31 GMT, Man Cao wrote: >> @albertnetymk, the hsperf counters are a non-public API and the new counters have been added to the non-standard `sun.threads.cpu_time` name space which is "[unstable and unsupported](https://github.com/openjdk/jdk/blob/9864951dceb0ddc4479ced04b6d5a2363f1e307d/src/hotspot/share/runtime/perfData.cpp#L56)" so I don't think a CSR is required. > >> Is there some overlapping btw this and the existing _vtime_accum. If so, can they be consolidated somehow? > > We don't really like `_vtime_accum` for monitoring: > - `os::elapsedVTime()` could silently fall back from CPU time to wall time, according to os_linux.cpp. We'd rather to have true CPU time, or nothing. Mixing up CPU time and wall time is confusing to users. > - `_vtime_accum` only tracks the time consumed by the concurrent marking main thread, but not the concurrent worker threads. > > There's a `G1ConcurrentMarkThread::vtime_accum()` that seems to account for concurrent worker threads. It looks only used for logging. It might be possible to replace the existing logging code with the value of the `sun.threads.cpu_time.gc_conc_mark` hsperf counter. However, it is better to do that separately, and I'll create an RFE. > >> I believe the purpose of calling update_concurrent_mark_threads_cpu_time in multiple places is to get more up-to-date conc-cpu-time. Reading through the JBS ticket, I don't see the motivation for maintaining such a "fresh" value. > > Yes. The reason is that a concurrent mark cycle could take many seconds or even minutes for a large heap. 
For tools like [AHS](https://mail.openjdk.org/pipermail/hotspot-dev/2022-September/064190.html) that reads these CPU hsperf counters, they could read these CPU data every 1 to 5 seconds. > > The current mechanism still needs to wait for the whole `G1ConcurrentMark::mark_from_roots()` to complete, which could still take minutes. I can create a separate RFE to make it update more frequently inside `G1CMConcurrentMarkingTask::work()`. I've created https://bugs.openjdk.org/browse/JDK-8318937 and https://bugs.openjdk.org/browse/JDK-8318941. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1373893351 From dholmes at openjdk.org Fri Oct 27 04:35:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 27 Oct 2023 04:35:41 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> References: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> Message-ID: <6XB0v77G_T6Expnc20NX5dyeJMFuxdTrb_IsSyHurTo=.a7351987-1928-4f67-bb30-693b6bec60fd@github.com> On Thu, 26 Oct 2023 21:01:53 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Remove StringDedup from GC thread list > `os::elapsedVTime()` could silently fall back from CPU time to wall time, according to os_linux.cpp The only allowed failure modes for `getrusage` are: EFAULT usage points outside the accessible address space. EINVAL who is invalid. 
neither of which are going to occur in practice ------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1782275672 From wkemper at openjdk.org Fri Oct 27 08:21:42 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 27 Oct 2023 08:21:42 GMT Subject: Integrated: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded In-Reply-To: References: Message-ID: On Wed, 20 Sep 2023 22:41:52 GMT, William Kemper wrote: > Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3). This pull request has now been integrated. Changeset: 5b5fd369 Author: William Kemper Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/5b5fd3694ac6ef224af311a7ab62547dac976da4 Stats: 73 lines in 7 files changed: 54 ins; 2 del; 17 mod 8316632: Shenandoah: Raise OOME when gc threshold is exceeded Reviewed-by: kdnilsen, ysr, shade ------------- PR: https://git.openjdk.org/jdk/pull/15852 From ayang at openjdk.org Fri Oct 27 14:03:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 27 Oct 2023 14:03:37 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> References: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> Message-ID: <9tV7khSiXH_2_Ju1_egmea6dyYQMC6HKSmcfblg0xSw=.18b97f9d-0fda-4043-8d71-eac0e857fd19@github.com> On Thu, 26 Oct 2023 21:01:53 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Remove StringDedup from GC thread list > For tools like 
AHS that reads these CPU hsperf counters, they could read these CPU data every 1 to 5 seconds. Okay, these counters can be accessed frequently, but is it necessary for them to provide up-to-date information on every access? If not, what level of delay is acceptable? I assume this depends on how often AHS resizes the heap. (In the Parallel case, I believe the counters can be outdated for the duration of the full-gc.) My primary concern is that the change in G1 is too intrusive -- the logic for tracking/collecting thread-CPU is scattered in many places. Additionally, the situation is expected to worsen in the near future, based on the statement "I can create a separate RFE to make it update more frequently..." Also, why isn't `G1ServiceThread` part of the change? I would expect all subclasses of `ConcurrentGCThread` to be taken into account. Is this omission intentional? Finally, thread-CPU time is something tracked at the OS level, so it's a bit odd that one has to instrument the VM to get that information. > - Looking into the `/proc/` method, it also is not super intuitive as to which value corresponds to GC time, and it might be difficult to reliably identify all PIDs of the relevant GC threads. According to https://man7.org/linux/man-pages/man5/proc.5.html, "(14) utime" + "(15) stime %lu" == thread-cpu-time. `cat /proc//task/*/stat` lists all VM internal threads, including GC, JIT, and etc. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1782969884 From tschatzl at openjdk.org Mon Oct 30 08:50:31 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 08:50:31 GMT Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: References: Message-ID: <4wbcOpVKWRVQ88cWECDjooLFtnkAveHzxgv75hqdE2U=.4616e0a3-76ad-4d82-9a5c-7a73bb5f0f15@github.com> On Thu, 26 Oct 2023 15:03:27 GMT, Albert Mingkun Yang wrote: > Simple removing some dead code; after this, a card is either clean or dirty. 
lgtm

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16380#pullrequestreview-1703562832

From tschatzl at openjdk.org Mon Oct 30 09:02:35 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 30 Oct 2023 09:02:35 GMT
Subject: RFR: 8318894: G1: Use uint for age in G1SurvRateGroup
In-Reply-To:
References:
Message-ID:

On Thu, 26 Oct 2023 11:23:57 GMT, Albert Mingkun Yang wrote:

> Simple refactoring to use unsigned type for region-age.

lgtm

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16376#pullrequestreview-1703584926

From tschatzl at openjdk.org Mon Oct 30 09:08:32 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 30 Oct 2023 09:08:32 GMT
Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v3]
In-Reply-To:
References:
Message-ID: <3FyIV17xeD_Rd94n04AAXG9QA1d_hDNM30KiqPd4CEc=.6cf27b91-227f-4cd0-ba2a-8ee710250464@github.com>

On Wed, 25 Oct 2023 14:01:55 GMT, Albert Mingkun Yang wrote:

>> The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow.
>>
>> There is some duplication with G1's implementation, which should probably be dealt with in its own PR.
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   review

I would have renamed the classes as suggested as well when renaming the files; there does not seem to be much meaning to just renaming the files to me. However assuming you will finish the renaming in an upcoming change anyway, I'll approve this.

src/hotspot/share/gc/serial/serialBlockOffsetTable.hpp line 111:

> 109: // how far back one must go to find the start of the chunk that includes the
> 110: // first word of the subregion.
> 111: class BlockOffsetTable {

The file has been renamed which is great, but the class is still called `BlockOffsetTable` without the prefix.
------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16304#pullrequestreview-1703590658 PR Review Comment: https://git.openjdk.org/jdk/pull/16304#discussion_r1375879808 From ayang at openjdk.org Mon Oct 30 09:27:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Oct 2023 09:27:37 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 In-Reply-To: References: Message-ID: On Tue, 24 Oct 2023 09:56:57 GMT, Thomas Schatzl wrote: > The JEP covers the idea very well, so I'm only covering some implementation details here: > > * regions get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. > > * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: > > * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. > > * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. The code could be clever and handle this situation by splitting the to be filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. gc code never touches them anyway). 
> > * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and automatically skipped. However assuming that the pinning is of short length, we put them into the candidates when we can. > > * there is the problem that if an applications pins a region for a long time g1 will skip evacuating that region over and over. that may lead to issues with the current policy in marking regions (only exit mixed phase when there are no marking candidates) and just waste of processing time (when the candidate stays in the retained candidates) > > The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. > > I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. > > * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in addition to something like: > > `GC(6) P... src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 444: > 442: } > 443: > 444: if (succeeded) { Can these two `if`s can be merged into one, `if (succeeded) { return result; }`? src/hotspot/share/gc/g1/g1EvacFailureRegions.hpp line 36: > 34: class HeapRegionClaimer; > 35: > 36: // This class records for every region on the heap whether it has to be retained I feel the term "retain" has two diff meanings in this PR: 1. retain == pinned or evac-fail 2. should_retain_evac_failed_region 1 is during scavenging while 2 is after scavenging. src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 82: > 80: } else { > 81: assert(hr->containing_set() == nullptr, "already cleared by PrepareRegionsClosure"); > 82: if (hr->has_pinned_objects() || This `do_heap_region` method is hard to follow; there multiple occurrences of same predicates. 
I wonder if one can reorganize these if-else a bit. Inlining `should_compact` should make all `if` on the same level at least. src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 494: > 492: // undo_allocation() method too. > 493: undo_allocation(dest_attr, obj_ptr, word_sz, node_index); > 494: return handle_evacuation_failure_par(old, old_mark, word_sz, true /* cause_pinned */); Why is this `cause_pinned == true`? This obj can be arbitrary, not necessarily type-array. src/hotspot/share/gc/g1/heapRegion.cpp line 734: > 732: // ranges passed in here corresponding to the space between live objects, it is > 733: // possible that there is a pinned object that is not any more referenced by > 734: // Java code (only by native). Can such obj becomes referenced by java again later on? IOW, a pointer passed from native to java. src/hotspot/share/gc/g1/heapRegion.inline.hpp line 262: > 260: } > 261: > 262: inline bool HeapRegion::can_reclaim() const { I'd suggest inline this method to callers, because "can reclaim" is sth caller context sensitive. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1374835226 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1375035713 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1375023324 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1375009476 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1375304500 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1375030685 From ayang at openjdk.org Mon Oct 30 10:57:59 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Oct 2023 10:57:59 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v4] In-Reply-To: References: Message-ID: > The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. 
> > There is some duplication with G1's implementation, which should probably be dealt with in its own PR. Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - uint8 - s1-rename - s1-rename - Merge branch 'master' into s1-bot - review - review - s1-bot ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16304/files - new: https://git.openjdk.org/jdk/pull/16304/files/a4f6ea8f..eabe68de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16304&range=02-03 Stats: 17445 lines in 1039 files changed: 9157 ins; 3690 del; 4598 mod Patch: https://git.openjdk.org/jdk/pull/16304.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16304/head:pull/16304 PR: https://git.openjdk.org/jdk/pull/16304 From ayang at openjdk.org Mon Oct 30 10:58:01 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Oct 2023 10:58:01 GMT Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Wed, 25 Oct 2023 14:01:55 GMT, Albert Mingkun Yang wrote: >> The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow. >> >> There is some duplication with G1's implementation, which should probably be dealt with in its own PR. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Renamed class and `u_char`. Split into their own commits for easier reviewing. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16304#issuecomment-1784938755

From tschatzl at openjdk.org Mon Oct 30 11:10:37 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 30 Oct 2023 11:10:37 GMT
Subject: RFR: 8318647: Serial: Refactor BlockOffsetTable [v4]
In-Reply-To:
References:
Message-ID:

On Mon, 30 Oct 2023 10:57:59 GMT, Albert Mingkun Yang wrote:

>> The diff is too large; maybe it's better to read the new impl directly, which I believe is much easier to follow.
>>
>> There is some duplication with G1's implementation, which should probably be dealt with in its own PR.
>
> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
>
>  - uint8
>  - s1-rename
>  - s1-rename
>  - Merge branch 'master' into s1-bot
>  - review
>  - review
>  - s1-bot

Thank you.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16304#pullrequestreview-1703833288

From ayang at openjdk.org Mon Oct 30 11:46:39 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Mon, 30 Oct 2023 11:46:39 GMT
Subject: RFR: 8319106: Remove unimplemented TaskTerminator::do_delay_step
Message-ID:

Trivial removing dead code.
------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/16415/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16415&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319106 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16415.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16415/head:pull/16415 PR: https://git.openjdk.org/jdk/pull/16415 From tschatzl at openjdk.org Mon Oct 30 12:53:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 12:53:36 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 In-Reply-To: References: Message-ID: On Fri, 27 Oct 2023 20:18:24 GMT, Albert Mingkun Yang wrote: >> The JEP covers the idea very well, so I'm only covering some implementation details here: >> >> * regions get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. >> >> * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: >> >> * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. >> >> * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. 
The code could be clever and handle this situation by splitting the to-be-filled area so that this can't happen, but the solution taken here is to allow filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. GC code never touches them anyway). >> >> * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and are automatically skipped. However, assuming that the pinning is of short length, we put them into the candidates when we can. >> >> * there is the problem that if an application pins a region for a long time, G1 will skip evacuating that region over and over. That may lead to issues with the current policy in marking regions (only exit mixed phase when there are no marking candidates) and simply wastes processing time (when the candidate stays in the retained candidates). >> >> The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. >> >> I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. >> >> * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in a... > > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 494: > >> 492: // undo_allocation() method too. >> 493: undo_allocation(dest_attr, obj_ptr, word_sz, node_index); >> 494: return handle_evacuation_failure_par(old, old_mark, word_sz, true /* cause_pinned */); > > Why is this `cause_pinned == true`? This obj can be arbitrary, not necessarily type-array. Fixed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376162391 From tschatzl at openjdk.org Mon Oct 30 13:19:39 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 13:19:39 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 In-Reply-To: References: Message-ID: <7hjNlLCs4_Vwl-pz8bmXSCZV3EakunUIFNt8Q6yDIcs=.fe32d0f5-ba84-4b2a-a629-6491b99bc627@github.com> On Fri, 27 Oct 2023 20:38:19 GMT, Albert Mingkun Yang wrote:
> src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 82: > >> 80: } else { >> 81: assert(hr->containing_set() == nullptr, "already cleared by PrepareRegionsClosure"); >> 82: if (hr->has_pinned_objects() || > > This `do_heap_region` method is hard to follow; there multiple occurrences of same predicates. I wonder if one can reorganize these if-else a bit. Inlining `should_compact` should make all `if` on the same level at least. Apart from having an early return in the `should_compact`-if, one option would be making `has_pinned_objects()` more clever by stating something like: bool has_pinned_objects() const { return pinned_count() > 0 || (is_continues_humongous() && humongous_start_region()->pinned_count() > 0); } Then this predicate would get shorter. Or add a local helper for that (as suggested in the next commit). Either is fine with me.
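For readers following along, the "more clever" `has_pinned_objects()` idea can be modelled outside HotSpot roughly as follows. This is only a minimal sketch under stated assumptions: `Region`, `pin_count` and `humongous_start` are hypothetical stand-ins invented for illustration, not the actual `HeapRegion` fields or the code in the PR.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a heap region; NOT the HotSpot HeapRegion class.
// All member names here are hypothetical.
struct Region {
  std::size_t pin_count = 0;                // outstanding pins on this region
  const Region* humongous_start = nullptr;  // set only for "continues humongous" regions

  bool is_continues_humongous() const { return humongous_start != nullptr; }

  void pin() { ++pin_count; }

  void unpin() {
    assert(pin_count > 0 && "unpin without matching pin");
    --pin_count;
  }

  // A region counts as pinned if it is pinned itself, or if it continues a
  // humongous object whose start region is pinned.
  bool has_pinned_objects() const {
    return pin_count > 0 ||
           (is_continues_humongous() && humongous_start->pin_count > 0);
  }
};
```

With the humongous case folded into the predicate, a caller only needs a single `has_pinned_objects()` check per region. Whether that belongs in the region class or in a local helper is the judgment call discussed in this thread.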
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376208039 From tschatzl at openjdk.org Mon Oct 30 13:27:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 13:27:32 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 In-Reply-To: References: Message-ID: On Fri, 27 Oct 2023 20:53:29 GMT, Albert Mingkun Yang wrote:
> src/hotspot/share/gc/g1/g1EvacFailureRegions.hpp line 36: > >> 34: class HeapRegionClaimer; >> 35: >> 36: // This class records for every region on the heap whether it has to be retained > > I feel the term "retain" has two diff meanings in this PR: > > 1. retain == pinned or evac-fail > 2. should_retain_evac_failed_region > > 1 is during scavenging while 2 is after scavenging. Maybe rename `should_retain_evac_failed_region` to `should_keep_retained_region[_in_old]` or something?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376219339 From ayang at openjdk.org Mon Oct 30 13:50:14 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Oct 2023 13:50:14 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v2] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 13:25:07 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1EvacFailureRegions.hpp line 36: >> >>> 34: class HeapRegionClaimer; >>> 35: >>> 36: // This class records for every region on the heap whether it has to be retained >> >> I feel the term "retain" has two diff meanings in this PR: >> >> 1. retain == pinned or evac-fail >> 2. should_retain_evac_failed_region >> >> 1 is during scavenging while 2 is after scavenging. > > Maybe rename `should_retain_evac_failed_region` to `should_keep_retained_region[_in_old]` or something? Is it possible to drop 1 so that an obj is evac-fail iff it's pinned or OOM? (I feel "retain" is on region-level.) >> src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 82: >> >>> 80: } else { >>> 81: assert(hr->containing_set() == nullptr, "already cleared by PrepareRegionsClosure"); >>> 82: if (hr->has_pinned_objects() || >> >> This `do_heap_region` method is hard to follow; there multiple occurrences of same predicates. I wonder if one can reorganize these if-else a bit. Inlining `should_compact` should make all `if` on the same level at least. > > Apart from having an early return in the `should_compact`-if, one option would be making `has_pinned_objects()` more clever by stating something like: > > > bool has_pinned_objects() const { > return pinned_count() > 0 || (is_continues_humongous() && humongous_start_region()->pinned_count() > 0); > } > > > Then this predicate would get shorter. Or add a local helper for that (as suggested in the next commit). Either is fine with me. A local helper is possibly clearer here, IMO. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376247318 PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376251206 From tschatzl at openjdk.org Mon Oct 30 13:50:14 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 13:50:14 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v2] In-Reply-To: References: Message-ID: <3UQDBc7erChTvsbzlTxJNd5umBOxAVkSCiooAAJLGUo=.6a670f22-bb90-402e-9a85-e23d75056704@github.com>
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16342/files - new: https://git.openjdk.org/jdk/pull/16342/files/b882dd60..e6646399 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=00-01 Stats: 69 lines in 8 files changed: 12 ins; 25 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/16342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16342/head:pull/16342 PR: https://git.openjdk.org/jdk/pull/16342 From tschatzl at openjdk.org Mon Oct 30 13:50:15 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 13:50:15 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v2] In-Reply-To: References: Message-ID: On Sat, 28 Oct 2023 18:32:56 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review1 > > src/hotspot/share/gc/g1/heapRegion.cpp line 734: > >> 732: // ranges passed in here corresponding to the space between live objects, it is >> 733: // possible that there is a pinned object that is not any more referenced by >> 734: // Java code (only by native). > > Can such obj becomes referenced by java again later on? IOW, a pointer passed from native to java. I do not think so. I will do some more testing about this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376243855 From tschatzl at openjdk.org Mon Oct 30 16:33:25 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 16:33:25 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v3] In-Reply-To: References: Message-ID:
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Move tests into gc.g1.pinnedobjs package ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16342/files - new: https://git.openjdk.org/jdk/pull/16342/files/e6646399..1b1d8ba9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=01-02 Stats: 11 lines in 5 files changed: 2 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/16342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16342/head:pull/16342 PR: https://git.openjdk.org/jdk/pull/16342 From tschatzl at openjdk.org Mon Oct 30 17:14:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 17:14:32 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v3] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 13:41:02 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/heapRegion.cpp line 734: >> >>> 732: //
ranges passed in here corresponding to the space between live objects, it is >>> 733: // possible that there is a pinned object that is not any more referenced by >>> 734: // Java code (only by native). >> >> Can such an obj become referenced by Java again later on? IOW, a pointer passed from native to Java. > > I do not think so. I will do some more testing about this. I (still) do not think it is possible after some more re-testing. There are the following situations I can think of: * string deduplication is a case that needs to be supported, where only the C code may have a reference to a pinned object: thread A enters a critical section on a string, gets the char array address, locking the region containing the char array. Then string dedup goes ahead and replaces the original char array with something else. Now the C code has the only reference to that char array. There is no API to convert a raw array pointer back to a Java object, so destroying the header is fine; unpinning does not need the header. * there is some other case I can think of that could be problematic, but is actually a spec violation: the array is critical-locked by thread A, then shared with other C code (not critical-unlocked), and thread A resumes with Java code that forgets that reference. At some point other C code accesses that locked memory and (hopefully) critically-unlocks it. Again, there is no API to convert a raw array pointer back to a Java object, so destroying the header is fine. In all other cases I can think of there is always a reference to the encapsulating Java object, either from the stack frame (when passing the object into the JNI function, it is part of the oop maps) or, if you create a new array object (via `NewArray`) and lock it, the VM will add a handle to it. There is also no API to inspect the array header using the raw pointer (e.g.
passing the raw pointer to `GetArrayLength` - doesn't compile as it expects a `jarray`, and in debug VMs there is actually a check that the passed argument is something that resembles a handle), so modifications are already invalid, and the change is fine imo. hth, Thomas ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376560372 From tschatzl at openjdk.org Mon Oct 30 17:41:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 17:41:32 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v3] In-Reply-To: References: Message-ID: <8fZa33eLsVSrL-9U6EiETW9yRSQBN-YbefzFVD__DMs=.31e7e53d-547f-46bb-ab56-a49654fc9282@github.com> On Mon, 30 Oct 2023 13:46:21 GMT, Albert Mingkun Yang wrote: >> Apart from having an early return in the `should_compact`-if, one option would be making `has_pinned_objects()` more clever by stating something like: >> >> >> bool has_pinned_objects() const { >> return pinned_count() > 0 || (is_continues_humongous() && humongous_start_region()->pinned_count() > 0); >> } >> >> >> Then this predicate would get shorter. Or add a local helper for that (as suggested in the next commit). Either is fine with me. > > A local helper is possibly clearer here, IMO. Done in one of the latest commits. Resolving. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376594853 From tschatzl at openjdk.org Mon Oct 30 17:58:34 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 17:58:34 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v3] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 13:43:33 GMT, Albert Mingkun Yang wrote: >> Maybe rename `should_retain_evac_failed_region` to `should_keep_retained_region[_in_old]` or something? > > Is it possible to drop 1 so that an obj is evac-fail iff it's pinned or OOM? (I feel "retain" is on region-level.) 
The `G1EvacFailureRegions` class is on region level. There is a need for a term that encompasses both pinned and evac-failed regions. So far we used "retained" (i.e. contents partially not moved), also in logging. I am obviously open for suggestions, but I am not sure "OOM" in any variant is a good name to replace current "evac-fail" regions. Right now I don't see a good name. Maybe if we change `handle_evacuation_failure_par()` to something else? Like `handle_unmovable_object` or so to get rid of "evacuation failure" for objects? To some degree, we don't "fail" evacuation due to pinning, pinning is a conscious decision. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1376612711 From tschatzl at openjdk.org Mon Oct 30 18:33:31 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Oct 2023 18:33:31 GMT Subject: RFR: 8319106: Remove unimplemented TaskTerminator::do_delay_step In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 11:41:08 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16415#pullrequestreview-1704829047 From jjoo at openjdk.org Mon Oct 30 23:25:37 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Mon, 30 Oct 2023 23:25:37 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: <9tV7khSiXH_2_Ju1_egmea6dyYQMC6HKSmcfblg0xSw=.18b97f9d-0fda-4043-8d71-eac0e857fd19@github.com> References: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> <9tV7khSiXH_2_Ju1_egmea6dyYQMC6HKSmcfblg0xSw=.18b97f9d-0fda-4043-8d71-eac0e857fd19@github.com> Message-ID: On Fri, 27 Oct 2023 14:00:43 GMT, Albert Mingkun Yang wrote: > Okay, these counters can be accessed frequently, but is it necessary for them to provide up-to-date information on every access? If not, what level of delay is acceptable? 
I assume this depends on how often AHS resizes the heap. (In the Parallel case, I believe the counters can be outdated for the duration of the full-gc.) The way AHS is implemented, we access these counters upon every heap resize, so in theory every call to `expand()` and `shrink()` will rely to some extent on these counters. If the counters do happen to be stale, AHS won't break, but will certainly be less effective. For the rest of the points, Man do you have any additional insight? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1786197139 From manc at openjdk.org Tue Oct 31 00:55:39 2023 From: manc at openjdk.org (Man Cao) Date: Tue, 31 Oct 2023 00:55:39 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v33] In-Reply-To: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> References: <2VxMwwQKNjq7EeXnyQ7fo_aXYj2EGCJyW2MrAUkoSQs=.902af266-1470-43ed-8e4c-a8b7611cc154@github.com> Message-ID: <8Ln25BTyHKzq_AfWrvQ_pZweFj6AkT1uvUa8GFndfmM=.c9d28dc5-daec-44ce-a6b2-4a8da1230db0@github.com> On Thu, 26 Oct 2023 21:01:53 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Remove StringDedup from GC thread list > Okay, these counters can be accessed frequently, but is it necessary for them to provide up-to-date information on every access? If not, what level of delay is acceptable? (In the Parallel case, I believe the counters can be outdated for the duration of the full-gc.) Besides Jonathan's point, in our experience updating these counters once every 1 second is good enough for AHS. It might even be OK for once every 2-3 seconds. We just don't want the counters to be outdated for tens or hundreds of seconds. 
Also the "tens or hundreds of seconds" delay is mainly a problem for concurrent mark, but less of a problem for GC pauses like the full-GC, because: - Multi-second GC pauses are uncommon compared to multi-second concurrent mark. - GC pauses are frequent. It is OK for AHS to get slightly outdated info for the ongoing GC pause, because AHS can quickly influence the next GC pause after the ongoing GC pause finishes and updates the counters. This is also the reason we only refresh `sun.threads.total_gc_cpu_time` after a GC pause. > My primary concern is that the change in G1 is too intrusive -- the logic for tracking/collecting thread-CPU is scattered in many places. Additionally, the situation is expected to worsen in the near future, based on the statement "I can create a separate RFE to make it update more frequently..." I think most G1 changes in this PR are straightforward and easy to maintain. With the exception of concurrent mark (`sun.threads.cpu_time.gc_conc_mark`), each counter is only updated at exactly one place. As part of [JDK-8318941](https://bugs.openjdk.org/browse/JDK-8318941), we could find a way to update `sun.threads.cpu_time.gc_conc_mark` at only one place as well. In addition, we think it is well worth the effort for G1 (or any modern garbage collector) to keep track of CPU time spent by their GC threads. Besides monitoring benefits of users and external tools like AHS, it could open up opportunity for G1 to develop better heuristics based on CPU time. We find CPU time spent by GC threads is a better measure for GC overhead, than wall (pause) time. There was [a discussion](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2021-May/035241.html) about `GCTimeRatio` and CPU time. Today even after [JDK-8253413](https://bugs.openjdk.org/browse/JDK-8253413), `GCTimeRatio` still only accounts for pause time. We hope JVM could provide a flag `GCCpuRatio` and resizes its heap to respect `GCCpuRatio`. 
AHS actually tries to partly achieve the effect of a `GCCpuRatio` flag. > Also, why isn't `G1ServiceThread` part of the change? I would expect all subclasses of ConcurrentGCThread to be taken into account. Is this omission intentional? Thanks. We missed this thread in our accounting due to an oversight. It was named `G1YoungRemSetSamplingThread` before. @jjoo172, could you add an hsperf counter for `G1ServiceThread`? > Finally, thread-CPU time is something tracked at the OS level, so it's a bit odd that one has to instrument the VM to get that information. > According to https://man7.org/linux/man-pages/man5/proc.5.html, "(14) utime" + "(15) stime %lu" == thread-cpu-time. `cat /proc/<pid>/task/*/stat` lists all VM internal threads, including GC, JIT, etc. It is possible, but it is a lot of work for the users. `/proc/<pid>/task/*/stat` lists all Java threads from the application as well. Users would need deep JVM knowledge to find out which threads are GC threads, which are JIT threads, etc. Quite a few other issues come along as well: - What if the JVM's internal threads are renamed across different JDK versions? (E.g. `GC Thread#N` was named `Gang worker#N` in JDK 8.) - Different tools that need to read CPU time would each need to implement a parser for `/proc/<pid>/task/*/stat`, and deal with similar problems. - What if a tool needs to support multiple OSes like Windows, which do not have `/proc` FS?
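To give a feel for the per-tool parsing work mentioned above, here is a minimal sketch of extracting fields 14 (utime) and 15 (stime) from a single line in the `/proc/<pid>/task/<tid>/stat` format. The names `CpuTicks` and `parse_stat_line` are made up for illustration and are not part of any JDK or tool API. A real tool would additionally have to enumerate the task directories, classify threads by name, and convert clock ticks to time.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical holder for the two proc(5) stat fields of interest.
struct CpuTicks {
  unsigned long utime = 0;  // field 14: time in user mode, in clock ticks
  unsigned long stime = 0;  // field 15: time in kernel mode, in clock ticks
};

// Parses one stat line; returns false if the line is malformed.
bool parse_stat_line(const std::string& line, CpuTicks& out) {
  // Field 2 (comm) is wrapped in parentheses and may itself contain spaces
  // or ')' (thread names like "GC Thread#0"), so continue after the LAST ')'.
  std::size_t close = line.rfind(')');
  if (close == std::string::npos) return false;

  std::istringstream rest(line.substr(close + 1));
  std::string skipped;
  // Skip fields 3 (state) through 13 (cmajflt).
  for (int field = 3; field <= 13; ++field) {
    if (!(rest >> skipped)) return false;
  }
  CpuTicks ticks;
  if (!(rest >> ticks.utime >> ticks.stime)) return false;
  out = ticks;
  return true;
}
```

Even this small parser has to special-case the thread name, which illustrates the point being made: every external tool would need to repeat this kind of work, and it only applies on systems that provide `/proc`.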
------------- PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1786267159 From jjoo at openjdk.org Tue Oct 31 04:17:07 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Tue, 31 Oct 2023 04:17:07 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v34] In-Reply-To: References: Message-ID: > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: Implement hsperf counter for G1ServiceThread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/2fc508f7..0ef70468 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=33 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=32-33 Stats: 19 lines in 4 files changed: 17 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: https://git.openjdk.org/jdk/pull/15082 From jjoo at openjdk.org Tue Oct 31 04:23:13 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Tue, 31 Oct 2023 04:23:13 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v35] In-Reply-To: References: Message-ID: > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: Replace NULL with nullptr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/0ef70468..be104e16 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=34 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=33-34 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: 
https://git.openjdk.org/jdk/pull/15082 From ayang at openjdk.org Tue Oct 31 09:17:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 31 Oct 2023 09:17:41 GMT Subject: RFR: 8319106: Remove unimplemented TaskTerminator::do_delay_step In-Reply-To: References: Message-ID: <__CoJ3HkSAU9WTro5zGf0X4hDLfUVU39CBJGdJnCZRs=.aced4715-966b-4e31-8885-3d1cd0358fc4@github.com> On Mon, 30 Oct 2023 11:41:08 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16415#issuecomment-1786804534 From ayang at openjdk.org Tue Oct 31 09:17:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 31 Oct 2023 09:17:41 GMT Subject: Integrated: 8319106: Remove unimplemented TaskTerminator::do_delay_step In-Reply-To: References: Message-ID: <3tcjvMFBO7bLOKF17LKGCT8oT6ak53weuOORQdN4dao=.aef8cae0-cd92-43f5-8a7e-08e6c83207fb@github.com> On Mon, 30 Oct 2023 11:41:08 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. 
Changeset: 5411ad2a Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/5411ad2a5ca3abcc663778f903c6f2f3e8a18431 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod 8319106: Remove unimplemented TaskTerminator::do_delay_step Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16415 From lkorinth at openjdk.org Tue Oct 31 10:02:11 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 31 Oct 2023 10:02:11 GMT Subject: RFR: 8319158: Parallel: Make TestObjectTenuringFlags use createTestJavaProcessBuilder Message-ID: Tested with: `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS='` -> pass 1 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC'` -> pass 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+NeverTenure'` -> pass 0 I will later (before integrating) run this test together with more tests through high tier testing to see that we do not fail when rotating flags. 
------------- Commit messages: - 8319158: Parallel: Make TestObjectTenuringFlags use createTestJavaProcessBuilder Changes: https://git.openjdk.org/jdk/pull/16432/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16432&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319158 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16432.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16432/head:pull/16432 PR: https://git.openjdk.org/jdk/pull/16432 From sjohanss at openjdk.org Tue Oct 31 12:47:31 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 31 Oct 2023 12:47:31 GMT Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: References: Message-ID: <8zHCKMqfSJNvQeyW4z4N4MTaTctQThvj8rZdBwgBtiY=.fc4027d2-6762-480f-8a02-0228696ba41c@github.com> On Thu, 26 Oct 2023 15:03:27 GMT, Albert Mingkun Yang wrote: > Simple removing some dead code; after this, a card is either clean or dirty. Looks good, just a comment/question? Feel free to integrate if you don't see a need to change the name. src/hotspot/share/gc/parallel/psCardTable.cpp line 390: > 388: } > 389: > 390: bool PSCardTable::addr_is_marked_imprecise(void *addr) { Does it make sense to keep this name when we don't have the "precise" counterpart and what all it does is checking if the card is dirty? ------------- Marked as reviewed by sjohanss (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16380#pullrequestreview-1706212146 PR Review Comment: https://git.openjdk.org/jdk/pull/16380#discussion_r1377530439 From ayang at openjdk.org Tue Oct 31 13:03:33 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 31 Oct 2023 13:03:33 GMT Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: <8zHCKMqfSJNvQeyW4z4N4MTaTctQThvj8rZdBwgBtiY=.fc4027d2-6762-480f-8a02-0228696ba41c@github.com> References: <8zHCKMqfSJNvQeyW4z4N4MTaTctQThvj8rZdBwgBtiY=.fc4027d2-6762-480f-8a02-0228696ba41c@github.com> Message-ID: On Tue, 31 Oct 2023 12:44:24 GMT, Stefan Johansson wrote: >> Simple removing some dead code; after this, a card is either clean or dirty. > > src/hotspot/share/gc/parallel/psCardTable.cpp line 390: > >> 388: } >> 389: >> 390: bool PSCardTable::addr_is_marked_imprecise(void *addr) { > > Does it make sense to keep this name when we don't have the "precise" counterpart and what all it does is checking if the card is dirty? Indeed odd. I will rename it to sth else in a follow-up PR. Even "marked" is weird here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16380#discussion_r1377550662 From sjohanss at openjdk.org Tue Oct 31 13:31:34 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 31 Oct 2023 13:31:34 GMT Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: References: <8zHCKMqfSJNvQeyW4z4N4MTaTctQThvj8rZdBwgBtiY=.fc4027d2-6762-480f-8a02-0228696ba41c@github.com> Message-ID: <96vHs7z-GKU-Qs1JzfahSeYeT3-JZaH1t2N9RPhU3f0=.0313348e-679e-4275-9b55-3aaffc15525e@github.com> On Tue, 31 Oct 2023 13:00:46 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/parallel/psCardTable.cpp line 390: >> >>> 388: } >>> 389: >>> 390: bool PSCardTable::addr_is_marked_imprecise(void *addr) { >> >> Does it make sense to keep this name when we don't have the "precise" counterpart and what all it does is checking if the card is dirty? > > Indeed odd. 
I will rename it to sth else in a follow-up PR. Even "marked" is weird here. Sounds good, my first thought was `addr_is_dirty(void* addr)`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16380#discussion_r1377588481 From ayang at openjdk.org Tue Oct 31 14:23:34 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 31 Oct 2023 14:23:34 GMT Subject: RFR: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: References: Message-ID: On Thu, 26 Oct 2023 15:03:27 GMT, Albert Mingkun Yang wrote: > Simple removing some dead code; after this, a card is either clean or dirty. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16380#issuecomment-1787317055 From ayang at openjdk.org Tue Oct 31 14:26:42 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 31 Oct 2023 14:26:42 GMT Subject: Integrated: 8318908: Parallel: Remove ExtendedCardValue In-Reply-To: References: Message-ID: On Thu, 26 Oct 2023 15:03:27 GMT, Albert Mingkun Yang wrote: > Simple removing some dead code; after this, a card is either clean or dirty. This pull request has now been integrated. 
Changeset: f4c5db92 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/f4c5db92ea0546e331d6c8dcebb5a48b052bba23 Stats: 37 lines in 2 files changed: 0 ins; 35 del; 2 mod 8318908: Parallel: Remove ExtendedCardValue Reviewed-by: tschatzl, sjohanss ------------- PR: https://git.openjdk.org/jdk/pull/16380 From mli at openjdk.org Tue Oct 31 14:57:33 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 31 Oct 2023 14:57:33 GMT Subject: RFR: 8319158: Parallel: Make TestObjectTenuringFlags use createTestJavaProcessBuilder In-Reply-To: References: Message-ID: On Tue, 31 Oct 2023 09:55:17 GMT, Leo Korinth wrote: > Tested with: > `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS='` -> pass 1 > `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC'` -> pass 0 > `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestObjectTenuringFlags.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+NeverTenure'` -> pass 0 > > I will later (before integrating) run this test together with more tests through high tier testing to see that we do not fail when rotating flags. LGTM. ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16432#pullrequestreview-1706525778 From lkorinth at openjdk.org Tue Oct 31 16:32:43 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 31 Oct 2023 16:32:43 GMT Subject: RFR: 8319161: GC: Make TestParallelGCThreads use createTestJavaProcessBuilder Message-ID: `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS='` -> pass 1 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:ParallelGCThreads=42' ` --> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseG1GC' ` --> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseParallelGC' ` -> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseSerialGC' ` -> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseZGC' `-> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UseShenandoahGC' `-> total 0 `make run-test TEST=jtreg:test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java JTREG='VERBOSE=all;JAVA_OPTIONS=-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC' ` -> total 0 I will later (before integrating) run this test together with more tests through high tier testing to see that we do not fail when rotating flags. 
------------- Commit messages: - 8319161: GC: Make TestParallelGCThreads use createTestJavaProcessBuilder Changes: https://git.openjdk.org/jdk/pull/16434/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16434&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319161 Stats: 9 lines in 1 file changed: 4 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/16434.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16434/head:pull/16434 PR: https://git.openjdk.org/jdk/pull/16434 From tschatzl at openjdk.org Tue Oct 31 18:08:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 31 Oct 2023 18:08:00 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v4] In-Reply-To: References: Message-ID: <8BH2UtnHn-DYz3c80Su4v9BF_v0w-N4fHkASCXP_E2c=.70c7ff8f-32e2-4970-87e3-fe22f7b08e6b@github.com> > The JEP covers the idea very well, so I'm only covering some implementation details here: > > * regions get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. > > * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: > > * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. > > * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. 
The code could be clever and handle this situation by splitting the to-be-filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. GC code never touches them anyway). > > * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and are automatically skipped. However, assuming that the pinning is short-lived, we put them into the candidates when we can. > > * there is the problem that if an application pins a region for a long time, G1 will skip evacuating that region over and over. That may lead to issues with the current policy in marking regions (only exit the mixed phase when there are no marking candidates) and to wasted processing time (when the candidate stays in the retained candidates). > > The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. > > I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. > > * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in addition to something like: > > `GC(6) P... 
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Improve TestPinnedOldObjectsEvacuation test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16342/files - new: https://git.openjdk.org/jdk/pull/16342/files/1b1d8ba9..78cb9df0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=02-03 Stats: 206 lines in 2 files changed: 190 ins; 7 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/16342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16342/head:pull/16342 PR: https://git.openjdk.org/jdk/pull/16342 From tschatzl at openjdk.org Tue Oct 31 18:57:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 31 Oct 2023 18:57:32 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v4] In-Reply-To: <8BH2UtnHn-DYz3c80Su4v9BF_v0w-N4fHkASCXP_E2c=.70c7ff8f-32e2-4970-87e3-fe22f7b08e6b@github.com> References: <8BH2UtnHn-DYz3c80Su4v9BF_v0w-N4fHkASCXP_E2c=.70c7ff8f-32e2-4970-87e3-fe22f7b08e6b@github.com> Message-ID: On Tue, 31 Oct 2023 18:08:00 GMT, Thomas Schatzl wrote: >> The JEP covers the idea very well, so I'm only covering some implementation details here: >> >> * regions get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. >> >> * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. This implementation uses this in various ways: >> >> * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. >> >> * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. 
These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. The code could be clever and handle this situation by splitting the to-be-filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. GC code never touches them anyway). >> >> * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and are automatically skipped. However, assuming that the pinning is short-lived, we put them into the candidates when we can. >> >> * there is the problem that if an application pins a region for a long time, G1 will skip evacuating that region over and over. That may lead to issues with the current policy in marking regions (only exit the mixed phase when there are no marking candidates) and to wasted processing time (when the candidate stays in the retained candidates). >> >> The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. >> >> I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. Its current value is fairly random. >> >> * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in a... 
> > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Improve TestPinnedOldObjectsEvacuation test Had a discussion with @albertnetymk and we came to the following agreement about naming: "allocation failure" - allocation failed in the to-space due to memory exhaustion "pinned" - the region/object has been pinned "evacuation failure" - either pinned or allocation failure I will apply this new naming asap. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16342#issuecomment-1787818668 From manc at openjdk.org Tue Oct 31 19:01:40 2023 From: manc at openjdk.org (Man Cao) Date: Tue, 31 Oct 2023 19:01:40 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v35] In-Reply-To: References: Message-ID: On Tue, 31 Oct 2023 04:23:13 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: > > Replace NULL with nullptr LGTM still, thanks! ------------- Marked as reviewed by manc (Committer). PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1707049281 From tschatzl at openjdk.org Tue Oct 31 19:14:13 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 31 Oct 2023 19:14:13 GMT Subject: RFR: 8318706: Implementation of JDK-8276094: JEP 423: Region Pinning for G1 [v5] In-Reply-To: References: Message-ID: > The JEP covers the idea very well, so I'm only covering some implementation details here: > > * regions get a "pin count" (reference count). As long as it is non-zero, we conservatively never reclaim that region even if there is no reference in there. JNI code might have references to it. > > * the JNI spec only requires us to provide pinning support for typeArrays, nothing else. 
This implementation uses this in various ways: > > * when evacuating from a pinned region, we evacuate everything live but the typeArrays to get more empty regions to clean up later. > > * when formatting dead space within pinned regions we use filler objects. Pinned regions may be referenced by JNI code only, so we can't overwrite contents of any dead typeArray either. These dead but referenced typeArrays luckily have the same header size of our filler objects, so we can use their headers for our fillers. The problem is that previously there has been that restriction that filler objects are half a region size at most, so we can end up with the need for placing a filler object header inside a typeArray. The code could be clever and handle this situation by splitting the to-be-filled area so that this can't happen, but the solution taken here is allowing filler arrays to cover a whole region. They are not referenced by Java code anyway, so there is no harm in doing so (i.e. GC code never touches them anyway). > > * G1 currently only ever actually evacuates young pinned regions. Old pinned regions of any kind are never put into the collection set and are automatically skipped. However, assuming that the pinning is short-lived, we put them into the candidates when we can. > > * there is the problem that if an application pins a region for a long time, G1 will skip evacuating that region over and over. That may lead to issues with the current policy in marking regions (only exit the mixed phase when there are no marking candidates) and to wasted processing time (when the candidate stays in the retained candidates). > > The cop-out chosen here is to "age out" the regions from the candidates and wait until the next marking happens. > > I.e. pinned marking candidates are immediately moved to retained candidates, and if in total the region has been pinned for `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the candidates. 
Its current value is fairly random. > > * G1 pauses got a new tag if there were pinned regions in the collection set. I.e. in addition to something like: > > `GC(6) P... Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Fix compilation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16342/files - new: https://git.openjdk.org/jdk/pull/16342/files/78cb9df0..e5dfbb73 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16342&range=03-04 Stats: 3 lines in 1 file changed: 1 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16342/head:pull/16342 PR: https://git.openjdk.org/jdk/pull/16342
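The candidate "age out" policy described in the PR text above can be illustrated with a small toy model. This is only a sketch in Java under stated assumptions, not the HotSpot implementation (which is C++): the class `PinnedCandidateSketch`, the `Region` fields, and the constant `KEEP_UNRECLAIMABLE` (standing in for `G1NumCollectionsKeepUnreclaimable`, with an arbitrary value) are all hypothetical. It also assumes that a candidate is dropped once its skip count reaches the threshold.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only; every name and value here is illustrative.
final class PinnedCandidateSketch {

    // Hypothetical stand-in for G1NumCollectionsKeepUnreclaimable.
    static final int KEEP_UNRECLAIMABLE = 3;

    static final class Region {
        int pinCount;            // non-zero: JNI code may still reference the region
        int skippedWhilePinned;  // collections in which this candidate was skipped

        boolean isPinned() {
            return pinCount > 0;
        }
    }

    /**
     * Models one collection over the retained candidates. Unpinned regions are
     * selected for evacuation; pinned ones are skipped and aged, and are dropped
     * from the candidates once they have been skipped KEEP_UNRECLAIMABLE times.
     */
    static List<Region> selectForEvacuation(List<Region> retained) {
        List<Region> selected = new ArrayList<>();
        for (Iterator<Region> it = retained.iterator(); it.hasNext(); ) {
            Region r = it.next();
            if (!r.isPinned()) {
                selected.add(r);
                it.remove();
            } else if (++r.skippedWhilePinned >= KEEP_UNRECLAIMABLE) {
                it.remove(); // aged out; the region waits for the next marking
            }
        }
        return selected;
    }
}
```

In this model a region that stays pinned stops costing candidate-processing time after a bounded number of collections, which is the trade-off the description calls a cop-out.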