From ayang at openjdk.org Thu Jan 2 06:34:43 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 2 Jan 2025 06:34:43 GMT Subject: RFR: 8345374: Ubsan: runtime error: division by zero In-Reply-To: References: Message-ID: On Mon, 30 Dec 2024 01:07:44 GMT, Kim Barrett wrote: > Please review this change to G1HeapSizingPolicy to avoid a float division by > zero when calculating the maximum desired capacity with a MaxHeapFreeRatio > value of 100%. > > Testing: mach5 tier1 with G1 and MaxHeapFreeRatio=100. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22893#pullrequestreview-2527075314 From ayang at openjdk.org Thu Jan 2 06:38:43 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 2 Jan 2025 06:38:43 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC Message-ID: This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the `GCTimeRatio`. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously. 
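A minimal sketch of the kind of check such a strategy could use; it is not the code from the PR. `GCTimeRatio=N` is documented as a goal of roughly 1/(1+N) of total time spent in GC, so one plausible rule (an assumption here) is to start another GC only when the application has run for at least N times the previous pause since that pause ended, and to try the old generation otherwise. The class and method names are hypothetical, and Java is used purely for illustration, since the HotSpot change itself is C++.

```java
import java.time.Duration;

/** Hypothetical illustration of a GCTimeRatio-based "GC now or allocate in old gen" decision. */
final class GcTimeRatioSketch {

    /**
     * Returns true if the application has run long enough since the previous
     * pause ended that another GC would stay within the 1:gcTimeRatio budget.
     * This rule is an assumption for illustration, not the PR's actual logic.
     */
    static boolean isLongEnoughSincePrevPause(Duration sincePrevPauseEnd,
                                              Duration prevPause,
                                              int gcTimeRatio) {
        // Application time that the previous pause "pays for" under the ratio.
        Duration required = prevPause.multipliedBy(gcTimeRatio);
        return sincePrevPauseEnd.compareTo(required) >= 0;
    }

    public static void main(String[] args) {
        Duration pause = Duration.ofMillis(10);
        // With a ratio of 99, a 10 ms pause corresponds to 990 ms of application time.
        System.out.println(isLongEnoughSincePrevPause(Duration.ofMillis(500), pause, 99));  // false
        System.out.println(isLongEnoughSincePrevPause(Duration.ofMillis(1000), pause, 99)); // true
    }
}
```

Under this reading, a `false` result would favor attempting the allocation in the old generation instead of pausing again so soon.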
Test: tier1-3 ------------- Commit messages: - s1-gc-time-ratio Changes: https://git.openjdk.org/jdk/pull/22899/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22899&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346920 Stats: 26 lines in 3 files changed: 23 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/22899.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22899/head:pull/22899 PR: https://git.openjdk.org/jdk/pull/22899 From amitkumar at openjdk.org Thu Jan 2 06:54:37 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 2 Jan 2025 06:54:37 GMT Subject: RFR: 8345374: Ubsan: runtime error: division by zero In-Reply-To: References: Message-ID: On Mon, 30 Dec 2024 01:07:44 GMT, Kim Barrett wrote: > Please review this change to G1HeapSizingPolicy to avoid a float division by > zero when calculating the maximum desired capacity with a MaxHeapFreeRatio > value of 100%. > > Testing: mach5 tier1 with G1 and MaxHeapFreeRatio=100. Marked as reviewed by amitkumar (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22893#pullrequestreview-2527088351 From kbarrett at openjdk.org Thu Jan 2 08:14:39 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 2 Jan 2025 08:14:39 GMT Subject: RFR: 8345374: Ubsan: runtime error: division by zero In-Reply-To: <5J21YcGO5FJukhpN1W3G1dYu1KQudSVANgR2jUTF6JI=.4a46b4cf-dea2-473a-a036-a29004b722e9@github.com> References: <5J21YcGO5FJukhpN1W3G1dYu1KQudSVANgR2jUTF6JI=.4a46b4cf-dea2-473a-a036-a29004b722e9@github.com> Message-ID: On Tue, 31 Dec 2024 06:55:40 GMT, Julian Waters wrote: >> Please review this change to G1HeapSizingPolicy to avoid a float division by >> zero when calculating the maximum desired capacity with a MaxHeapFreeRatio >> value of 100%. >> >> Testing: mach5 tier1 with G1 and MaxHeapFreeRatio=100. 
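The arithmetic behind the report can be sketched as follows, assuming the maximum desired capacity is derived as used / (1 - MaxHeapFreeRatio/100); that formula, the guard, and all names below are illustrative assumptions rather than the actual G1HeapSizingPolicy code. With a ratio of 100 the divisor becomes exactly zero. In Java a floating-point division by zero silently yields Infinity rather than a sanitizer report, so the sketch only mirrors the arithmetic and one possible guard.

```java
/** Hypothetical sketch of the arithmetic behind the reported division by zero. */
final class MaxCapacitySketch {

    /**
     * Largest capacity that keeps the free share at or below maxHeapFreeRatio percent,
     * assuming the formula used / (1 - maxHeapFreeRatio / 100), capped at maxCapacity.
     */
    static long maximumDesiredCapacity(long usedBytes, int maxHeapFreeRatio, long maxCapacity) {
        double minimumUsedFraction = 1.0 - maxHeapFreeRatio / 100.0;
        if (minimumUsedFraction <= 0.0) {
            // A ratio of 100 allows any amount of free space: skip the division
            // and fall back to the cap (an assumed choice for this sketch).
            return maxCapacity;
        }
        double desired = usedBytes / minimumUsedFraction;
        return (long) Math.min(desired, (double) maxCapacity);
    }

    public static void main(String[] args) {
        System.out.println(maximumDesiredCapacity(256, 50, 1024));  // 512
        System.out.println(maximumDesiredCapacity(256, 100, 1024)); // 1024
    }
}
```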
> > Looks alright, but I think the title needs to be changed to match the one on the tracker Thanks for reviews @TheShermanTanker , @albertnetymk , and @offamitkumar ------------- PR Comment: https://git.openjdk.org/jdk/pull/22893#issuecomment-2567406190 From kbarrett at openjdk.org Thu Jan 2 08:14:40 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 2 Jan 2025 08:14:40 GMT Subject: Integrated: 8345374: Ubsan: runtime error: division by zero In-Reply-To: References: Message-ID: On Mon, 30 Dec 2024 01:07:44 GMT, Kim Barrett wrote: > Please review this change to G1HeapSizingPolicy to avoid a float division by > zero when calculating the maximum desired capacity with a MaxHeapFreeRatio > value of 100%. > > Testing: mach5 tier1 with G1 and MaxHeapFreeRatio=100. This pull request has now been integrated. Changeset: a87bc7e4 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/a87bc7e4f0e797a108f447a1c9801abe39b700da Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod 8345374: Ubsan: runtime error: division by zero Reviewed-by: jwaters, ayang, amitkumar ------------- PR: https://git.openjdk.org/jdk/pull/22893 From zgu at openjdk.org Thu Jan 2 16:22:35 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Thu, 2 Jan 2025 16:22:35 GMT Subject: RFR: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References:

Message-ID: On Thu, 19 Dec 2024 23:33:04 GMT, William Kemper wrote: > Good catch! How'd you find this? Thank you for the review. I have a script to capture allocations that have not been seen before; I guess it is largely obsoleted by --enable-lsan. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22812#issuecomment-2568026495 From zgu at openjdk.org Thu Jan 2 16:22:36 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Thu, 2 Jan 2025 16:22:36 GMT Subject: RFR: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References: Message-ID: <_g_TssIoBU2kwwm1XvAO-4noeadthqn1pPuVnNtW8jg=.dfec6482-ce98-4f8b-9e1b-c56a195cd309@github.com> On Wed, 18 Dec 2024 14:46:57 GMT, Zhengyu Gu wrote: > Worker thread initializes ShenandoahThreadLocalData twice, from Thread's constructor and ShenandoahWorkerThreads::on_create_worker(), that results in leaking ShenandoahEvacuationStats. Can I have a (R)eview? @rkennke and @shipilev? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22812#issuecomment-2568028950 From zgu at openjdk.org Fri Jan 3 21:48:47 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Fri, 3 Jan 2025 21:48:47 GMT Subject: RFR: 8339668: Parallel: Adopt PartialArrayState to consolidate marking stack in Full GC [v4] In-Reply-To: References: Message-ID: > Please review this patch that adopts `PartialArrayState`, introduced by [JDK-8337709](https://bugs.openjdk.org/browse/JDK-8337709), to consolidate `_oop_task_queues` and `_objarray_task_queues` into a single `_marking_stacks`. > > The change mirrors Kim's [JDK-8311163](https://bugs.openjdk.org/browse/JDK-8311163) work; therefore, there are methods that can be consolidated and simplified, but I would like to defer that to a follow-up CR. Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 15 additional commits since the last revision: - Adopt latest PartialArrayStats changes - Merge branch 'master' into JDK-8339668 - Merge branch 'master' into JDK-8339668 - @tschatzl's ScannerTask changes - @tschatzl's comment - v8 - v7 - v6 - v5 - v4 - ... and 5 more: https://git.openjdk.org/jdk/compare/37e5a031...0ed1a358 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/21089/files - new: https://git.openjdk.org/jdk/pull/21089/files/fd756b3b..0ed1a358 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=21089&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=21089&range=02-03 Stats: 832172 lines in 10732 files changed: 537899 ins; 230171 del; 64102 mod Patch: https://git.openjdk.org/jdk/pull/21089.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/21089/head:pull/21089 PR: https://git.openjdk.org/jdk/pull/21089 From shade at openjdk.org Mon Jan 6 09:57:35 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 6 Jan 2025 09:57:35 GMT Subject: RFR: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References: Message-ID: On Wed, 18 Dec 2024 14:46:57 GMT, Zhengyu Gu wrote: > Worker thread initializes ShenandoahThreadLocalData twice, from Thread's constructor and ShenandoahWorkerThreads::on_create_worker(), that results in leaking ShenandoahEvacuationStats. This makes sense, thanks. I see that in all other implementations, `BarrierSet` is responsible for creating thread-local data. AFAICS, this only becomes a problem when we run with generational mode that leaks `ShenandoahEvacuationStats`. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22812#pullrequestreview-2531783206 From zgu at openjdk.org Mon Jan 6 13:47:42 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 6 Jan 2025 13:47:42 GMT Subject: RFR: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References: Message-ID: <5yrh2oRRSs-L4QZTgyFUTxd-jS0hDSkgWp-Uke5Cg4U=.41fadf56-e372-4b9d-a966-f4803fb6a235@github.com> On Wed, 18 Dec 2024 14:46:57 GMT, Zhengyu Gu wrote: > Worker thread initializes ShenandoahThreadLocalData twice, from Thread's constructor and ShenandoahWorkerThreads::on_create_worker(), that results in leaking ShenandoahEvacuationStats. Thanks, @shipilev ------------- PR Comment: https://git.openjdk.org/jdk/pull/22812#issuecomment-2573140414 From zgu at openjdk.org Mon Jan 6 13:47:42 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 6 Jan 2025 13:47:42 GMT Subject: Integrated: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References: Message-ID: <-O5aGBTWtR__shbWdwHgYg-vWEmktBh59kxQhss9O88=.4e238976-5e83-4fb1-8b3e-5c28f7b2340f@github.com> On Wed, 18 Dec 2024 14:46:57 GMT, Zhengyu Gu wrote: > Worker thread initializes ShenandoahThreadLocalData twice, from Thread's constructor and ShenandoahWorkerThreads::on_create_worker(), that results in leaking ShenandoahEvacuationStats. This pull request has now been integrated. 
Changeset: dfaa8916 Author: Zhengyu Gu URL: https://git.openjdk.org/jdk/commit/dfaa89162a35acd20b1ed35e147f9626a181510a Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak Reviewed-by: wkemper, shade ------------- PR: https://git.openjdk.org/jdk/pull/22812 From gli at openjdk.org Mon Jan 6 14:17:37 2025 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 6 Jan 2025 14:17:37 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC In-Reply-To: References: Message-ID: On Thu, 2 Jan 2025 06:30:00 GMT, Albert Mingkun Yang wrote: > This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the `GCTimeRatio`. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously. > > Test: tier1-3 Nice improvement. src/hotspot/share/gc/serial/defNewGeneration.hpp line 53: > 51: class DefNewGeneration: public Generation { > 52: friend class VMStructs; > 53: friend class SerialHeap; I'm not sure whether this is a good approach. Maybe we can expose a method `DefNewGeneration::gc_timer` instead. src/hotspot/share/gc/serial/serialHeap.cpp line 334: > 332: > 333: bool first_only = !should_try_older_generation_allocation(size) > 334: && is_long_enough_from_prev_gc_pause_end(); Is the `first_only` actually `young_only`? It means that the allocation is only attempted in young generation? src/hotspot/share/gc/serial/serialHeap.cpp line 643: > 641: Ticks prev_gc_pause_end; > 642: Tickspan gc_pause; > 643: if (full_gc_pause_end < young_gc_pause_end ) { An unnecessary space after `young_gc_pause_end`. ------------- Changes requested by gli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22899#pullrequestreview-2532209941 PR Review Comment: https://git.openjdk.org/jdk/pull/22899#discussion_r1904201771 PR Review Comment: https://git.openjdk.org/jdk/pull/22899#discussion_r1904191931 PR Review Comment: https://git.openjdk.org/jdk/pull/22899#discussion_r1904196305 From ayang at openjdk.org Mon Jan 6 14:36:23 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 6 Jan 2025 14:36:23 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC [v2] In-Reply-To: References: Message-ID: > This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the `GCTimeRatio`. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously. > > Test: tier1-3 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - review - Merge branch 'master' into s1-gc-time-ratio - s1-gc-time-ratio ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22899/files - new: https://git.openjdk.org/jdk/pull/22899/files/ee6300d7..52aa4ce4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22899&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22899&range=00-01 Stats: 1173 lines in 32 files changed: 31 ins; 1082 del; 60 mod Patch: https://git.openjdk.org/jdk/pull/22899.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22899/head:pull/22899 PR: https://git.openjdk.org/jdk/pull/22899 From ayang at openjdk.org Mon Jan 6 14:40:36 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 6 Jan 2025 14:40:36 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC [v2] In-Reply-To: References:

Message-ID: On Mon, 6 Jan 2025 14:05:04 GMT, Guoxiong Li wrote: >> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - review >> - Merge branch 'master' into s1-gc-time-ratio >> - s1-gc-time-ratio > > src/hotspot/share/gc/serial/serialHeap.cpp line 334: > >> 332: >> 333: bool first_only = !should_try_older_generation_allocation(size) >> 334: && is_long_enough_from_prev_gc_pause_end(); > > Is the `first_only` actually `young_only`? It means that the allocation is only attempted in young generation? That's true. I believe Serial used to support >2 generations, so some generic names are left as is. Fixing it requires updating in multiple places, such as the method name `older_generation`, arg in `attempt_allocation`, etc. How about we do that invasive but pure refactoring in its own PR? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22899#discussion_r1904230427 From wkemper at openjdk.org Mon Jan 6 18:08:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 6 Jan 2025 18:08:08 GMT Subject: [jdk24] RFR: 8345970: pthread_getcpuclockid related crashes in shenandoah tests Message-ID: <3queiTTYxaqjTtFWIMIQ6AMERNOr4BF4iLpp_5iVvRs=.506092ae-2458-4e41-97af-4e90630456fb@github.com> Clean backport. Fixes acute issue with musl libc (used by Alpine Linux). 
------------- Commit messages: - Backport 2ce53e88481659734bc5424c643c5e31c116bc5d Changes: https://git.openjdk.org/jdk/pull/22933/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22933&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345970 Stats: 18 lines in 4 files changed: 15 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22933.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22933/head:pull/22933 PR: https://git.openjdk.org/jdk/pull/22933 From shade at openjdk.org Mon Jan 6 18:19:40 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 6 Jan 2025 18:19:40 GMT Subject: [jdk24] RFR: 8345970: pthread_getcpuclockid related crashes in shenandoah tests In-Reply-To: <3queiTTYxaqjTtFWIMIQ6AMERNOr4BF4iLpp_5iVvRs=.506092ae-2458-4e41-97af-4e90630456fb@github.com> References: <3queiTTYxaqjTtFWIMIQ6AMERNOr4BF4iLpp_5iVvRs=.506092ae-2458-4e41-97af-4e90630456fb@github.com> Message-ID: On Mon, 6 Jan 2025 18:03:20 GMT, William Kemper wrote: > Clean backport. Fixes acute issue with musl libc (used by Alpine Linux). Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22933#pullrequestreview-2532708736 From wkemper at openjdk.org Mon Jan 6 18:27:41 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 6 Jan 2025 18:27:41 GMT Subject: [jdk24] Integrated: 8345970: pthread_getcpuclockid related crashes in shenandoah tests In-Reply-To: <3queiTTYxaqjTtFWIMIQ6AMERNOr4BF4iLpp_5iVvRs=.506092ae-2458-4e41-97af-4e90630456fb@github.com> References: <3queiTTYxaqjTtFWIMIQ6AMERNOr4BF4iLpp_5iVvRs=.506092ae-2458-4e41-97af-4e90630456fb@github.com> Message-ID: On Mon, 6 Jan 2025 18:03:20 GMT, William Kemper wrote: > Clean backport. Fixes acute issue with musl libc (used by Alpine Linux). This pull request has now been integrated. 
Changeset: cc7c293b Author: William Kemper URL: https://git.openjdk.org/jdk/commit/cc7c293bce8a564943606dbbcad64db96909d68a Stats: 18 lines in 4 files changed: 15 ins; 3 del; 0 mod 8345970: pthread_getcpuclockid related crashes in shenandoah tests Reviewed-by: shade Backport-of: 2ce53e88481659734bc5424c643c5e31c116bc5d ------------- PR: https://git.openjdk.org/jdk/pull/22933 From kdnilsen at openjdk.org Mon Jan 6 19:02:35 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 6 Jan 2025 19:02:35 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint In-Reply-To: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Wed, 11 Dec 2024 19:08:08 GMT, William Kemper wrote: > Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. > > The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. > > Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. Thanks. Looks very clean for how significant the change is in behavior... 
------------- Marked as reviewed by kdnilsen (Author). PR Review: https://git.openjdk.org/jdk/pull/22688#pullrequestreview-2532779642 From duke at openjdk.org Mon Jan 6 20:11:43 2025 From: duke at openjdk.org (duke) Date: Mon, 6 Jan 2025 20:11:43 GMT Subject: Withdrawn: 8343658: Parallel: Implement block_start for Young generation In-Reply-To: References: Message-ID: On Wed, 6 Nov 2024 08:04:17 GMT, Albert Mingkun Yang wrote: > Simple block_start implementation for Parallel young-gen. Related to https://github.com/openjdk/jdk/pull/21870 > > Test: tier1-3 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/21919 From gli at openjdk.org Tue Jan 7 06:56:34 2025 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 7 Jan 2025 06:56:34 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC [v2] In-Reply-To: References:

Message-ID: <28lTO6UU1Ro2zUtCYj5Fwnw_4W0JLlnkEsGBVTfP55Q=.32a8761d-4ca5-4b00-8b1b-c71d288600a5@github.com> On Mon, 6 Jan 2025 14:37:47 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/serial/serialHeap.cpp line 334: >> >>> 332: >>> 333: bool first_only = !should_try_older_generation_allocation(size) >>> 334: && is_long_enough_from_prev_gc_pause_end(); >> >> Is the `first_only` actually `young_only`? It means that the allocation is only attempted in young generation? > > That's true. I believe Serial used to support >2 generations, so some generic names are left as is. Fixing it requires updating in multiple places, such as the method name `older_generation`, arg in `attempt_allocation`, etc. How about we do that invasive but pure refactoring in its own PR? Yes, such refactoring should be finished in other issues. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22899#discussion_r1904984586 From gli at openjdk.org Tue Jan 7 07:08:35 2025 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 7 Jan 2025 07:08:35 GMT Subject: RFR: 8346920: Serial: Support allocation in old generation before GC [v2] In-Reply-To: References:

Message-ID: <7v0-msW0bLb0qqi1dPLZeCWJH1yRkQ6Wi2tS3eNG2YQ=.12c38d9c-5de0-4f37-a7bb-df253a620f8a@github.com> On Mon, 6 Jan 2025 14:36:23 GMT, Albert Mingkun Yang wrote: >> This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the `GCTimeRatio`. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - review > - Merge branch 'master' into s1-gc-time-ratio > - s1-gc-time-ratio Looks good. ------------- Marked as reviewed by gli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22899#pullrequestreview-2533524913 From ayang at openjdk.org Tue Jan 7 08:53:47 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 7 Jan 2025 08:53:47 GMT Subject: RFR: 8343658: Parallel: Implement block_start for Young generation In-Reply-To: References: Message-ID: On Wed, 6 Nov 2024 08:04:17 GMT, Albert Mingkun Yang wrote: > Simple block_start implementation for Parallel young-gen. Related to https://github.com/openjdk/jdk/pull/21870 > > Test: tier1-3 @simonis My intention was simply to align Parallel with the existing implementations in Serial and G1, thereby removing the oddly structured `DebuggingContext` code. While it's possible to invest additional effort in object iteration to make failure-reporting more robust, it's worth noting that the default GCs (Serial and G1) haven't, to my knowledge, required such measures. Therefore, I wonder if we might be overthinking this without a specific, concrete need. 
If you still think this is a step in the wrong direction, I will close the corresponding ticket. ------------- PR Comment: https://git.openjdk.org/jdk/pull/21919#issuecomment-2574718495 From ayang at openjdk.org Tue Jan 7 08:54:23 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 7 Jan 2025 08:54:23 GMT Subject: RFR: 8347094: Inline CollectedHeap::increment_total_full_collections Message-ID: Trivial inlining a method to its sole caller. ------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/22940/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22940&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347094 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22940.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22940/head:pull/22940 PR: https://git.openjdk.org/jdk/pull/22940 From stefank at openjdk.org Tue Jan 7 11:59:34 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 7 Jan 2025 11:59:34 GMT Subject: RFR: 8347094: Inline CollectedHeap::increment_total_full_collections In-Reply-To: References: Message-ID: <3y5lqLyLoBVQ1mNdfcbC5f7wNRLsz4sJmUYBxMlorwE=.f534955a-0cdc-4d20-8138-de967d94c3e2@github.com> On Tue, 7 Jan 2025 08:48:22 GMT, Albert Mingkun Yang wrote: > Trivial inlining a method to its sole caller. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22940#pullrequestreview-2534154583 From eosterlund at openjdk.org Tue Jan 7 12:05:43 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 7 Jan 2025 12:05:43 GMT Subject: RFR: 8347094: Inline CollectedHeap::increment_total_full_collections In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 08:48:22 GMT, Albert Mingkun Yang wrote: > Trivial inlining a method to its sole caller. Marked as reviewed by eosterlund (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/22940#pullrequestreview-2534167170 From simonis at openjdk.org Tue Jan 7 16:07:42 2025 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 7 Jan 2025 16:07:42 GMT Subject: RFR: 8343658: Parallel: Implement block_start for Young generation In-Reply-To: References:

Message-ID: On Tue, 7 Jan 2025 08:51:21 GMT, Albert Mingkun Yang wrote: >> Simple block_start implementation for Parallel young-gen. Related to https://github.com/openjdk/jdk/pull/21870 >> >> Test: tier1-3 > > @simonis My intention was simply to align Parallel with the existing implementations in Serial and G1, thereby removing the oddly structured `DebuggingContext` code. > > While it's possible to invest additional effort in object iteration to make failure-reporting more robust, it's worth noting that the default GCs (Serial and G1) haven't, to my knowledge, required such measures. Therefore, I wonder if we might be overthinking this without a specific, concrete need. > > If you still think this is a step in the wrong direction, I will close the corresponding ticket. @albertnetymk, I will not veto this if you want to proceed, but as I've already written, the current implementation for Serial is definitely wrong and can lead to secondary crashes during error reporting (which I regularly see in hs_err files) and, in the worst case, to infinite loops. I haven't analyzed the G1 code yet because it is more complex, but I'm pretty sure it is also wrong if called at arbitrary places during VMError reporting (e.g. it depends on `G1HeapRegion::_parsable_bottom` being set correctly, which isn't necessarily the case if the VM crashes). If you proceed with this PR, I suggest at least using [`LocationPrinter::is_valid_obj()`](https://github.com/openjdk/jdk/blob/ac82a8f89c7066fb1d379b12bcfd68053cb39ba4/src/hotspot/share/gc/shared/locationPrinter.cpp#L33), which internally uses [`Klass::is_valid()`](https://github.com/openjdk/jdk/blob/ac82a8f89c7066fb1d379b12bcfd68053cb39ba4/src/hotspot/share/oops/klass.cpp#L1038), to check the validity of an oop in `MutableSpace::block_start()` instead of just using `cast_to_oop()` and asserting `oopDesc::is_oop()`. This will give us at least some safety belts. 
Also, if you proceed with this PR, please update the Serial implementation to use `LocationPrinter::is_valid_obj()` as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/21919#issuecomment-2575665010 From shade at openjdk.org Tue Jan 7 18:13:12 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 7 Jan 2025 18:13:12 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed Message-ID: One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. The solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/22954/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22954&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347126 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/22954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22954/head:pull/22954 PR: https://git.openjdk.org/jdk/pull/22954 From shade at openjdk.org Tue Jan 7 18:13:12 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 7 Jan 2025 18:13:12 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 18:08:33 GMT, Aleksey Shipilev wrote: > One of my testing nodes caught the OOM kill for the VM carrying the test. 
The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. > > The solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. On my M1, the test now configures itself to a much more reasonable heap size.
# Before
0s: Using 7 workers, each allocating: ~936M
# After
0s: Using 10 workers, each allocating: ~81M
------------- PR Comment: https://git.openjdk.org/jdk/pull/22954#issuecomment-2575932718 From gli at openjdk.org Wed Jan 8 06:46:36 2025 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 8 Jan 2025 06:46:36 GMT Subject: RFR: 8347094: Inline CollectedHeap::increment_total_full_collections In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 08:48:22 GMT, Albert Mingkun Yang wrote: > Trivial inlining a method to its sole caller. Marked as reviewed by gli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22940#pullrequestreview-2536137690 From gli at openjdk.org Wed Jan 8 07:43:35 2025 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 8 Jan 2025 07:43:35 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 18:08:33 GMT, Aleksey Shipilev wrote: > One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. 
about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. > > I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. test/hotspot/jtreg/gc/stress/TestStressG1Uncommit.java line 84: > 82: > 83: // Figure out suitable number of workers (~1 per 100M). > 84: int allocationChunk = (int) Math.ceil((double) allocationSize * 100 / M); If we want to use one worker per 100M, should the equation be `allocationSize / (100 * M)`? Did I miss anything? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22954#discussion_r1906618799 From aboldtch at openjdk.org Wed Jan 8 08:21:34 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 8 Jan 2025 08:21:34 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed In-Reply-To: References:

Message-ID: On Wed, 8 Jan 2025 07:41:22 GMT, Guoxiong Li wrote: >> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. >> >> I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. > > test/hotspot/jtreg/gc/stress/TestStressG1Uncommit.java line 84: > >> 82: >> 83: // Figure out suitable number of workers (~1 per 100M). >> 84: int allocationChunk = (int) Math.ceil((double) allocationSize * 100 / M); > > If we want to use one worker per 100M, should the equation be `allocationSize / (100 * M)`? Did I miss anything? Yes. It looked weird that he got 10 workers. This test should now always result in `min(8, num_procs)` workers. Especially since this uses `executeLimitedTestJava`, so it will not propagate flags. So I am not sure how we ever run this test with a different heap without editing the test file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22954#discussion_r1906708622 From lgxbslgx at gmail.com Wed Jan 8 08:30:02 2025 From: lgxbslgx at gmail.com (Guoxiong Li) Date: Wed, 8 Jan 2025 16:30:02 +0800 Subject: [Discussion] Serial GC: Expand young generation size Message-ID: Hi all, Currently, the young generation in SerialGC cannot be expanded and always stays at its initial size. This is generally not a big problem, but in some situations, like JDK-8333386 [1], it will crash the VM (unnecessary OutOfMemoryError). 
So I want to implement the feature to expand the young generation size (absolutely, not exceeding the max young generation size). What do you think about it? Any ideas will be appreciated. Best Regards, -- Guoxiong Related links: [1] https://bugs.openjdk.org/browse/JDK-8333386 [2] https://bugs.openjdk.org/browse/JDK-8335925 From shade at openjdk.org Wed Jan 8 09:20:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 8 Jan 2025 09:20:16 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: References: Message-ID: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> > One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. > > I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. 
Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Fix math ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22954/files - new: https://git.openjdk.org/jdk/pull/22954/files/35e051ad..75eed4ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22954&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22954&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22954/head:pull/22954 PR: https://git.openjdk.org/jdk/pull/22954 From shade at openjdk.org Wed Jan 8 09:20:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 8 Jan 2025 09:20:16 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: References:

Message-ID: <00OyVsTrmwh3IUqWmD_WXrQpC8LQ5K5pRYhDKezqjfo=.8a34be18-d1ac-4c70-bc67-72669e22ba29@github.com> On Wed, 8 Jan 2025 08:18:41 GMT, Axel Boldt-Christmas wrote: >> test/hotspot/jtreg/gc/stress/TestStressG1Uncommit.java line 84: >> >>> 82: >>> 83: // Figure out suitable number of workers (~1 per 100M). >>> 84: int allocationChunk = (int) Math.ceil((double) allocationSize * 100 / M); >> >> If we want to use one worker per 100M, should the equation be `allocationSize / (100 * M)`? Did I miss anything? > > Yes. It looked weird that he got 10 workers. This test should now always result in `min(8, num_procs)` workers. Especially since this uses `executeLimitedTestJava`, so it will not propagate flags. So I am not sure how we ever run this test with a different heap without editing the test file. Well, "he" made a math mistake that Guoxiong noticed. New calculation yields 9 workers on M1, because we take `ceil(0.8*Xmx)`: 0s: Using 9 workers, each allocating: ~91M 0s: Allocation size 858993459, allocation chunks: 9 Fixed in new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22954#discussion_r1906859432 From ayang at openjdk.org Wed Jan 8 09:49:44 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 8 Jan 2025 09:49:44 GMT Subject: RFR: 8347094: Inline CollectedHeap::increment_total_full_collections In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 08:48:22 GMT, Albert Mingkun Yang wrote: > Trivial inlining a method to its sole caller. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22940#issuecomment-2577230627 From ayang at openjdk.org Wed Jan 8 09:49:44 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 8 Jan 2025 09:49:44 GMT Subject: Integrated: 8347094: Inline CollectedHeap::increment_total_full_collections In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 08:48:22 GMT, Albert Mingkun Yang wrote: > Trivial inlining a method to its sole caller. 
This pull request has now been integrated. Changeset: 98724219 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/98724219a87c1cdb1e7942ade1a4d49b201a0a94 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod 8347094: Inline CollectedHeap::increment_total_full_collections Reviewed-by: stefank, eosterlund, gli ------------- PR: https://git.openjdk.org/jdk/pull/22940 From ayang at openjdk.org Wed Jan 8 10:07:40 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 8 Jan 2025 10:07:40 GMT Subject: RFR: 8343658: Parallel: Implement block_start for Young generation In-Reply-To: References:

Message-ID: On Tue, 7 Jan 2025 16:04:45 GMT, Volker Simonis wrote: > the current implementation for Serial is definitely wrong and can lead to secondary crashes during error reporting (which I regularly see in hs_err files) @simonis Can you create a bug/ticket for that secondary crash with repro info? That sounds like a real bug. (If not all necessary info/files are readily available, maybe you can collect them in the next occurrence.) Depending on how it goes with Serial, we can probably do sth similar in other GCs (e.g. Parallel). What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/21919#issuecomment-2577275582 From albert.m.yang at oracle.com Wed Jan 8 10:27:39 2025 From: albert.m.yang at oracle.com (Albert Yang) Date: Wed, 8 Jan 2025 10:27:39 +0000 Subject: [Discussion] Serial GC: Expand young generation size In-Reply-To: References: Message-ID: Re https://bugs.openjdk.org/browse/JDK-8333386, I think your suggestion, "add the option `-XX:NewSize=65m`", is the way to go. As for adding young-gen expansion support to Serial, it probably should have its own enhancement ticket. I am currently working on placing from/to spaces before eden inside young-gen as part of Parallel heap-auto-sizing. I believe Serial can use the same layout (from/to/eden, instead of eden/from/to) to facilitate eden/young-gen expansion. My 2c. /Albert ________________________________________ From: hotspot-gc-dev on behalf of Guoxiong Li Sent: Wednesday, January 8, 2025 09:30 To: hotspot-gc-dev at openjdk.org Subject: [Discussion] Serial GC: Expand young generation size Hi all, Currently, the young generation in SerialGC can't be expanded now and is always the initial young generation size. It is not a very big problem generally. But in some situations, like JDK-8333386 [1], it will crash the VM (unnecessary OutOfMemoryError). So I want to implement the feature to expand the young generation size (absolutely, not exceeding the max young generation size). 
What do you think about it? Any ideas will be appreciated. Best Regards, -- Guoxiong Related links: [1] https://bugs.openjdk.org/browse/JDK-8333386 [2] https://bugs.openjdk.org/browse/JDK-8335925 From shade at openjdk.org Wed Jan 8 11:03:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 8 Jan 2025 11:03:17 GMT Subject: RFR: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level Message-ID: For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. Output before: $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups Hello! $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java [0.008s][info][gc] Using Epsilon [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups Hello! [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used Output after: $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java Hello! 
$ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java [0.009s][info][gc] Using Epsilon [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups Hello! [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used ------------- Commit messages: - Leave it in epsilonInitLogger - Fix Changes: https://git.openjdk.org/jdk/pull/22965/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22965&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347256 Stats: 24 lines in 1 file changed: 12 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22965.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22965/head:pull/22965 PR: https://git.openjdk.org/jdk/pull/22965 From tschatzl at openjdk.org Wed Jan 8 14:36:34 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 8 Jan 2025 14:36:34 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> References: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> Message-ID: On Wed, 8 Jan 2025 09:20:16 GMT, Aleksey Shipilev wrote: >> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. >> >> I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. 
I looked around at sibling tests and 1G seems to be a common heap size for these tests. > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix math Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22954#pullrequestreview-2537314050 From tschatzl at openjdk.org Wed Jan 8 15:03:35 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 8 Jan 2025 15:03:35 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> References: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> Message-ID: On Wed, 8 Jan 2025 09:20:16 GMT, Aleksey Shipilev wrote: >> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. >> >> I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix math Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/22954#pullrequestreview-2537386212 From gli at openjdk.org Wed Jan 8 15:10:49 2025 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 8 Jan 2025 15:10:49 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> References: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> Message-ID: On Wed, 8 Jan 2025 09:20:16 GMT, Aleksey Shipilev wrote: >> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. >> >> I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix math Marked as reviewed by gli (Reviewer). 
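[Editor's illustration] The "Fix math" exchange in this thread comes down to where the factor of 100 sits in the chunk calculation. The following is a minimal sketch, not the actual test source: the class and method names are hypothetical, and the 1G heap and the 0.8 factor are assumptions taken from the numbers quoted in the thread ("Allocation size 858993459", `ceil(0.8*Xmx)`).

```java
// Illustrative sketch only; class and method names are hypothetical,
// not the actual contents of TestStressG1Uncommit.java.
class WorkerChunks {
    static final long M = 1024 * 1024;

    // Roughly one chunk (and hence one worker) per 100M of allocation,
    // i.e. the form suggested in the review: allocationSize / (100 * M).
    static int chunksPer100M(long allocationSize) {
        return (int) Math.ceil((double) allocationSize / (100 * M));
    }

    // The expression quoted from line 84 before the fix, kept for comparison.
    static int chunksBeforeFix(long allocationSize) {
        return (int) Math.ceil((double) allocationSize * 100 / M);
    }

    public static void main(String[] args) {
        long xmx = 1024 * M;                      // assumed 1G heap
        long allocationSize = (long) (0.8 * xmx); // assumed 80% of the heap
        int chunks = chunksPer100M(allocationSize);
        System.out.println("Allocation size " + allocationSize
                + ", allocation chunks: " + chunks);
        System.out.println("Each chunk: ~" + (allocationSize / chunks / M) + "M");
        System.out.println("Pre-fix expression would give: "
                + chunksBeforeFix(allocationSize));
    }
}
```

With these assumed inputs the corrected form gives 9 chunks of about 91M each, which agrees with the log lines quoted by Aleksey, while the pre-fix expression yields a value several orders of magnitude larger. How the real test then clamps that value to the processor count is not shown in this thread and is left out here.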
------------- PR Review: https://git.openjdk.org/jdk/pull/22954#pullrequestreview-2537417062 From tschatzl at openjdk.org Wed Jan 8 15:52:37 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 8 Jan 2025 15:52:37 GMT Subject: RFR: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 10:58:17 GMT, Aleksey Shipilev wrote: > For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. > > I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. > > Output before: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.008s][info][gc] Using Epsilon > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used > > > Output after: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > Hello! 
> > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.009s][info][gc] Using Epsilon > [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22965#pullrequestreview-2537531580 From ysr at openjdk.org Wed Jan 8 16:43:02 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 8 Jan 2025 16:43:02 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint In-Reply-To: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Wed, 11 Dec 2024 19:08:08 GMT, William Kemper wrote: > Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. > > The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. 
> > Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. Looks good... Left a few documentation request comments. I haven't fully wrapped my head around the correctness of this yet (sorry, slow start to the new year :-), and will go over it again and complete it a bit later today after I get to the office. src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1166: > 1164: } > 1165: > 1166: if (VerifyAfterGC) { What are the conventions of when to use Verify{Before,After,During}GC on the one hand, vs ShenandoahVerify, G1Verify* etc., on the other? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2650: > 2648: bool ShenandoahHeap::is_gc_state(GCState state) const { > 2649: return _gc_state_changed ? _gc_state.is_set(state) : ShenandoahThreadLocalData::is_gc_state(state); > 2650: } This needs a documentation comment, please; e.g. why we check `_gc_state_changed` before we check the global state. Is the transition of the local and global states wrt the phase described in a comment somewhere else already? src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 371: > 369: public: > 370: char gc_state() const; > 371: bool is_gc_state(GCState state) const; Can you write a 1-line documentation comment for this method? It would make its implementation clearer. (See my comment in the method's implementation.) src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 374: > 372: > 373: // This copies the global gc state into a thread local variable for all threads. > 374: // It is primarily intended to support quick access at barriers. All threads are Instead of "It ..." say "The thread local gc state ..." src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp line 150: > 148: return 3; > 149: } > 150: if (heap->is_concurrent_mark_in_progress() || heap->is_concurrent_weak_root_in_progress() || heap->is_full_gc_in_progress()) { naive question: where are the counters/encoding used? 
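[Editor's illustration] One plausible reading of the `is_gc_state` one-liner questioned above is that the global state is authoritative while a change is pending, and the thread-local copy is used once the change has been propagated. The sketch below models that reading in Java for a single thread. It is not HotSpot code: the class, the method names, and the bit values are hypothetical, and the assumption that the "changed" flag is set on update and cleared on propagation is the editor's, not something confirmed in this thread.

```java
// Hypothetical single-thread model of a global GC state plus a cached
// per-thread copy. Not HotSpot code; names and bit values are illustrative.
class GcStateModel {
    static final int EVACUATION  = 1 << 0;
    static final int UPDATE_REFS = 1 << 1;
    static final int WEAK_ROOTS  = 1 << 2;

    private int globalState;       // stands in for _gc_state
    private boolean stateChanged;  // stands in for _gc_state_changed
    private int threadLocalState;  // stands in for one thread's local copy

    // Update the global state and remember that local copies are now stale
    // (assumed behaviour).
    void setGlobal(int flag, boolean on) {
        globalState = on ? (globalState | flag) : (globalState & ~flag);
        stateChanged = true;
    }

    // Copy the global state into the thread-local copy and clear the
    // pending-change marker (assumed behaviour).
    void propagateToThread() {
        threadLocalState = globalState;
        stateChanged = false;
    }

    // Mirrors the shape of the quoted C++ expression:
    // changed ? global.is_set(state) : thread_local.is_gc_state(state)
    boolean isGcState(int flag) {
        int source = stateChanged ? globalState : threadLocalState;
        return (source & flag) != 0;
    }

    // What a reader of the thread-local copy alone would see.
    boolean localCopyHas(int flag) {
        return (threadLocalState & flag) != 0;
    }

    public static void main(String[] args) {
        GcStateModel heap = new GcStateModel();
        heap.setGlobal(WEAK_ROOTS, true);
        System.out.println("before propagation: isGcState=" + heap.isGcState(WEAK_ROOTS)
                + ", localCopy=" + heap.localCopyHas(WEAK_ROOTS));
        heap.propagateToThread();
        System.out.println("after propagation:  isGcState=" + heap.isGcState(WEAK_ROOTS)
                + ", localCopy=" + heap.localCopyHas(WEAK_ROOTS));
    }
}
```

Under this reading, the query returns the up-to-date answer even in the window where the thread-local copy has not yet been refreshed, which may be what the request for a comment about returning the "right" value at non-safepoints is getting at. The real code also has to deal with concurrency and memory ordering, which this model deliberately ignores.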
------------- PR Review: https://git.openjdk.org/jdk/pull/22688#pullrequestreview-2536114812 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1907457433 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1906544298 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1906551170 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1906548845 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1906541100 From ysr at openjdk.org Wed Jan 8 16:43:02 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 8 Jan 2025 16:43:02 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Wed, 8 Jan 2025 06:30:45 GMT, Y. Srinivas Ramakrishna wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. >> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. 
> > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2650: > >> 2648: bool ShenandoahHeap::is_gc_state(GCState state) const { >> 2649: return _gc_state_changed ? _gc_state.is_set(state) : ShenandoahThreadLocalData::is_gc_state(state); >> 2650: } > > This needs a documentation comment, please; e.g. why we check `_gc_state_changed` before we check the global state. Is the transition of the local and global states wrt the phase described in a comment somewhere else already? Or is this a common idiom used elsewhere as well, and already well-documented? > src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 371: > >> 369: public: >> 370: char gc_state() const; >> 371: bool is_gc_state(GCState state) const; > > Can you write a 1-line documentation comment for this method? It would make its implementation clearer. (See my comment in the method's implementation.) (e.g. that, unlike comment at line 366, this must return the "right" value even at non-safepoints.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1907477306 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1906552223 From wkemper at openjdk.org Wed Jan 8 20:28:25 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 8 Jan 2025 20:28:25 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> <__kORuPC0guQED9-jn2Xg9CFIJ15wVRojwZoy_VqcPs=.0e5c812f-9e4e-4396-8acd-1e84a5e598c5@github.com>

Message-ID: On Sat, 21 Dec 2024 01:48:10 GMT, Xiaolong Peng wrote: >> The old cycle may be preempted by young collections, but it is only really _cancelled_ by global cycles or full GCs. Control thread will resume old marking, but this operates independently from young bitmap regions. I think we can reset young region bitmaps even when concurrent old marking is ongoing. > > I think we are talking about the same thing, old gen could be preempted by young gc and resumed after the cycle. I have seen a crash from gc verification caused by this, an old gc was bootstrapped but it was preempted/canceled multiple times right after the old gc started, eventually caused a crash from verifier because it expected the object in young is marked. I will share the gc log on slack later. That sounds like an issue with the verifier then? Once a young cycle is complete, nothing should depend on the state of the bitmaps for young regions (if, for no other reason, evacuation could have moved objects so that the bitmaps no longer represent the addresses of marked objects that were evacuated). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907802968 From wkemper at openjdk.org Wed Jan 8 20:41:47 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 8 Jan 2025 20:41:47 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Wed, 8 Jan 2025 06:26:38 GMT, Y. Srinivas Ramakrishna wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. >> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. 
Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. > > src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp line 150: > >> 148: return 3; >> 149: } >> 150: if (heap->is_concurrent_mark_in_progress() || heap->is_concurrent_weak_root_in_progress() || heap->is_full_gc_in_progress()) { > > naive question: where are the counters/encoding used? They get put in `PerfData` variables. They also may be serialized in a log. The [Shenandoah Visualizer](https://github.com/openjdk/shenandoah-visualizer) is able to render them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1907839891 From wkemper at openjdk.org Wed Jan 8 20:45:48 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 8 Jan 2025 20:45:48 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Wed, 8 Jan 2025 16:25:23 GMT, Y. Srinivas Ramakrishna wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. 
>> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1166: > >> 1164: } >> 1165: >> 1166: if (VerifyAfterGC) { > > What are the conventions of when to use Verify{Before,After,During}GC on the one hand, vs ShenandoahVerify, G1Verify* etc., on the other? I don't really think there is a convention. In this particular case, it was "verifying" before concurrent reference processing was complete, which could lead to erroneous verification failures. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1907844225 From kdnilsen at openjdk.org Wed Jan 8 20:52:44 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 8 Jan 2025 20:52:44 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> Message-ID: On Mon, 30 Dec 2024 22:54:27 GMT, Xiaolong Peng wrote: >> Reset marking bitmaps after collection cycle; for GenShen only do this for young generation, also choose not do this for Degen and full GC since both are running at safepoint, we should leave safepoint as ASAP. 
>> >> I have run same workload for 30s with Shenandoah in generational mode and classic mode, average average time of concurrent reset dropped significantly since in most case bitmap for young gen should have been reset after pervious concurrent cycle finishes if there is no need to preserve bitmap states. >> >> GenShen: >> Before: >> >> [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878) >> >> >> After: >> >> [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670) >> [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872) >> >> >> Shenandoah: >> Before: >> >> [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118) >> >> After: >> >> [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542) >> [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661) >> >> >> Additional changes: >> * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions. >> * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, two benefits from this: >> - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 >> - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorate the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in closure. >> * When `_do_old_gc_bootstrap is true`, instead of reset mark bitmap for old gen separately, simply reset the global generations, so we don't need walk the all regions twice. 
>> * Clean up FullGC code, remove duplicate code. >> >> ... > > Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: > > - Merge branch 'openjdk:master' into reset-bitmap > - Address review comments > - Merge branch 'openjdk:master' into reset-bitmap > - Remove ShenandoahResetUpdateRegionStateClosure > - Always set_mark_incomplete when reset mark bitmap > - Fix > - Add comments > - fix > - Not reset_mark_bitmap after cycle when is_concurrent_old_mark_in_progress or is_prepare_for_old_mark_in_progress > - Not invoke set_mark_incomplete when reset bitmap after cycle > - ... and 7 more: https://git.openjdk.org/jdk/compare/82c2f771...f82fdfaa src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 242: > 240: // Instead of always reset before collect, some reset can be done after collect to save > 241: // the time before before the cycle so the cycle can be started as soon as possible. > 242: entry_reset_after_collect(); For comment, I would say: "Instead of always resetting immediately before the start of a new GC, we can often reset at the end of the previous GC. This allows us to start the next GC cycle more quickly after a trigger condition is detected, reducing the likelihood that GC will degenerate." src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 592: > 590: // If it is old GC bootstrap cycle, always clear bitmap for global gen > 591: // to ensure bitmap for old gen is clear for old GC cycle after this. > 592: if (_do_old_gc_bootstrap) { This may deserve a comment. It seems we ought to clear the old-gen mark bitmap at the end of coalesce-and-fill. 
But that does not allow us to avoid clearing old-gen mark bitmaps at start of bootstrap because when young-gen regions are promoted in place, the mark bitmap is preserved for those regions, and since they are considered old at the end of the GC cycle during which they were promoted, those bitmaps will not be cleared by op_reset_after_collect(). Is there a way to improve this behavior? For example, in op_reset_after_collect(), maybe we should clear old-gen bitmaps also (at least for recently promoted in place regions) unless old marking is in process and/or mixed evacuations are in progress. Maybe this can be tackled in a separate PR, but would be good to file JBS ticket now if there is agreement on the approach. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907828996 PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907845214 From kdnilsen at openjdk.org Wed Jan 8 20:52:45 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 8 Jan 2025 20:52:45 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> <__kORuPC0guQED9-jn2Xg9CFIJ15wVRojwZoy_VqcPs=.0e5c812f-9e4e-4396-8acd-1e84a5e598c5@github.com>

Message-ID: <6kG3_NLd3D4G9fYnrmdKw-s25Fsmu1rLwOV_6eRDrfI=.cd23eb62-96b9-459f-a313-2d4ae7762284@github.com> On Wed, 8 Jan 2025 20:15:30 GMT, William Kemper wrote: >> I think we are talking about the same thing: old gen could be preempted by young gc and resumed after the cycle. I have seen a crash from gc verification caused by this: an old gc was bootstrapped but it was preempted/canceled multiple times right after the old gc started, which eventually caused a crash from the verifier because it expected the object in young to be marked. I will share the gc log on slack later. > > That sounds like an issue with the verifier then? Once a young cycle is complete, nothing should depend on the state of the bitmaps for young regions (if, for no other reason, evacuation could have moved objects so that the bitmaps no longer represent the addresses of marked objects that were evacuated). I agree with @earthling-amzn that we should be able to reset young-generation mark bitmap even if this is old_gc_bootstrap and even if old marking is in progress. We should dive deeper to figure out the crash you observed. It seems we don't fully understand the root cause. I also suggest rewording the comment. trigged? (See other comments about increasing generality of this approach.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907848997 From xpeng at openjdk.org Wed Jan 8 21:40:59 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 8 Jan 2025 21:40:59 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>

Message-ID: On Wed, 8 Jan 2025 20:43:36 GMT, Kelvin Nilsen wrote: >> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into reset-bitmap >> - Address review comments >> - Merge branch 'openjdk:master' into reset-bitmap >> - Remove ShenandoahResetUpdateRegionStateClosure >> - Always set_mark_incomplete when reset mark bitmap >> - Fix >> - Add comments >> - fix >> - Not reset_mark_bitmap after cycle when is_concurrent_old_mark_in_progress or is_prepare_for_old_mark_in_progress >> - Not invoke set_mark_incomplete when reset bitmap after cycle >> - ... and 7 more: https://git.openjdk.org/jdk/compare/5c258fa2...f82fdfaa > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 592: > >> 590: // If it is old GC bootstrap cycle, always clear bitmap for global gen >> 591: // to ensure bitmap for old gen is clear for old GC cycle after this. >> 592: if (_do_old_gc_bootstrap) { > > This may deserve a comment. It seems we ought to clear the old-gen mark bitmap at the end of coalesce-and-fill. But that does not allow us to avoid clearing old-gen mark bitmaps at start of bootstrap because when young-gen regions are promoted in place, the mark bitmap is preserved for those regions, and since they are considered old at the end of the GC cycle during which they were promoted, those bitmaps will not be cleared by op_reset_after_collect(). Is there a way to improve this behavior? > > For example, in op_reset_after_collect(), maybe we should clear old-gen bitmaps also (at least for recently promoted in place regions) unless old marking is in process and/or mixed evacuations are in progress. > > Maybe this can be tackled in a separate PR, but would be good to file JBS ticket now if there is agreement on the approach. 
Yes, we can reset the bitmaps of old regions when there is in-place promotion, and of all old regions after coalesce-and-fill for old gen. Thanks Kelvin, I'll create a JBS ticket for this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907893752 From xpeng at openjdk.org Wed Jan 8 22:58:30 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 8 Jan 2025 22:58:30 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>

Message-ID: On Wed, 8 Jan 2025 20:30:38 GMT, Kelvin Nilsen wrote: >> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into reset-bitmap >> - Address review comments >> - Merge branch 'openjdk:master' into reset-bitmap >> - Remove ShenandoahResetUpdateRegionStateClosure >> - Always set_mark_incomplete when reset mark bitmap >> - Fix >> - Add comments >> - fix >> - Not reset_mark_bitmap after cycle when is_concurrent_old_mark_in_progress or is_prepare_for_old_mark_in_progress >> - Not invoke set_mark_incomplete when reset bitmap after cycle >> - ... and 7 more: https://git.openjdk.org/jdk/compare/099e4ed4...f82fdfaa > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 242: > >> 240: // Instead of always reset before collect, some reset can be done after collect to save >> 241: // the time before before the cycle so the cycle can be started as soon as possible. >> 242: entry_reset_after_collect(); > > For comment, I would say: "Instead of always resetting immediately before the start of a new GC, we can often reset at the end of the previous GC. This allows us to start the next GC cycle more quickly after a trigger condition is detected, reducing the likelihood that GC will degenerate." I'll update comments, thanks Kelvin! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1907947027 From wkemper at openjdk.org Wed Jan 8 23:31:51 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 8 Jan 2025 23:31:51 GMT Subject: [jdk24] RFR: 8346737: GenShen: Generational memory pools should not report zero for maximum capacity Message-ID: Clean backport. Fixes many SA tests. 
------------- Commit messages: - Backport 249f141211c94afcce70d9d536d84e108e07b4e5 Changes: https://git.openjdk.org/jdk/pull/22984/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22984&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346737 Stats: 6 lines in 2 files changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22984/head:pull/22984 PR: https://git.openjdk.org/jdk/pull/22984 From kdnilsen at openjdk.org Thu Jan 9 00:22:35 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 9 Jan 2025 00:22:35 GMT Subject: [jdk24] RFR: 8346737: GenShen: Generational memory pools should not report zero for maximum capacity In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 23:26:53 GMT, William Kemper wrote: > Clean backport. Fixes many SA tests. Marked as reviewed by kdnilsen (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/22984#pullrequestreview-2538518161 From ysr at openjdk.org Thu Jan 9 00:59:42 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 9 Jan 2025 00:59:42 GMT Subject: [jdk24] RFR: 8346737: GenShen: Generational memory pools should not report zero for maximum capacity In-Reply-To: References: Message-ID: <2SlOW_Bx_aU_KyYjRVjHmZcmkgp0M1qjNV29EULbdtk=.41d35bd0-7a53-43cb-89fb-7fa820a49538@github.com> On Wed, 8 Jan 2025 23:26:53 GMT, William Kemper wrote: > Clean backport. Fixes many SA tests. Marked as reviewed by ysr (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/22984#pullrequestreview-2538573609 From syan at openjdk.org Thu Jan 9 03:56:03 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 9 Jan 2025 03:56:03 GMT Subject: RFR: 8347279: Problemlist TestEvilSyncBug.java#generational Message-ID: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> Hi all, Test gc/shenandoah/TestEvilSyncBug.java#generational has been observed to time out on many platforms (various Linux, and Windows too) several times recently. Should we Problemlist this test until the root cause of the failure has been fixed? ------------- Commit messages: - 8347279: Problemlist TestEvilSyncBug.java#generational Changes: https://git.openjdk.org/jdk/pull/22996/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22996&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347279 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22996.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22996/head:pull/22996 PR: https://git.openjdk.org/jdk/pull/22996 From shade at openjdk.org Thu Jan 9 09:51:40 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 9 Jan 2025 09:51:40 GMT Subject: RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed [v2] In-Reply-To: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> References: <5Fu1wC8AtzoNWOIR6b6y4wt9MNLh6df2o9m_weuLEnI=.dd2423b5-0394-4a57-9559-d89cdca8d5cb@github.com> Message-ID: On Wed, 8 Jan 2025 09:20:16 GMT, Aleksey Shipilev wrote: >> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB).
Naturally, this runs into a high chance of being OOM killed under high test parallelism. >> >> I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix math Thanks for reviews! Here goes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22954#issuecomment-2579619953 From shade at openjdk.org Thu Jan 9 09:51:41 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 9 Jan 2025 09:51:41 GMT Subject: Integrated: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 18:08:33 GMT, Aleksey Shipilev wrote: > One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the VM that test runs as the driver into a memory hog. On 48-core / 64G machine, the test configured itself to take 13 workers each allocating 1G. This ballooned the heap size to 13G -- e.g. about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2GB). Naturally, this runs into a high chance of being OOM killed under high test parallelism. > > I think the solution is to cut down the heap size we run with, and balance the number of workers a bit more finely. I looked around at sibling tests and 1G seems to be a common heap size for these tests. This pull request has now been integrated. 
Changeset: dff5719e Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/dff5719e6f95f9ce50a5d49adf13541e22f7b5b1 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed Reviewed-by: tschatzl, gli ------------- PR: https://git.openjdk.org/jdk/pull/22954 From shade at openjdk.org Thu Jan 9 10:05:40 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 9 Jan 2025 10:05:40 GMT Subject: RFR: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level In-Reply-To: References:

Message-ID: <4I5h1g5_Mn_qoCX8W8KCrCY6BxgbEPwnwR3MycwVaqw=.fa92dc34-71b7-4d35-9cc7-200c33038a00@github.com> On Wed, 8 Jan 2025 15:50:28 GMT, Thomas Schatzl wrote: >> For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. >> >> I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. >> >> Output before: >> >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java >> [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups >> [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups >> Hello! >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java >> [0.008s][info][gc] Using Epsilon >> [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups >> [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups >> Hello! >> [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used >> >> >> Output after: >> >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java >> Hello! >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java >> [0.009s][info][gc] Using Epsilon >> [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups >> [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups >> Hello! 
>> [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used > > lgtm. Thank you @tschatzl! I think I need another Review to merge this. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22965#issuecomment-2579655476 From tschatzl at openjdk.org Thu Jan 9 12:28:41 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Jan 2025 12:28:41 GMT Subject: RFR: 8347279: Problemlist TestEvilSyncBug.java#generational In-Reply-To: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> References: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> Message-ID: On Thu, 9 Jan 2025 03:46:29 GMT, SendaoYan wrote: > Hi all, > Test gc/shenandoah/TestEvilSyncBug.java#generational has been observed to time out on many platforms (various Linux, and Windows too) several times recently. Should we Problemlist this test until the root cause of the failure has been fixed? Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22996#pullrequestreview-2539789222 From simonis at openjdk.org Thu Jan 9 14:04:42 2025 From: simonis at openjdk.org (Volker Simonis) Date: Thu, 9 Jan 2025 14:04:42 GMT Subject: RFR: 8343658: Parallel: Implement block_start for Young generation In-Reply-To: References: Message-ID: On Wed, 6 Nov 2024 08:04:17 GMT, Albert Mingkun Yang wrote: > Simple block_start implementation for Parallel young-gen. Related to https://github.com/openjdk/jdk/pull/21870 > > Test: tier1-3 It's hard to create a reproducible test case for a crash in print location because it requires you to load a specific bogus value (which points into the unallocated part of the heap) into a register just before the crash. It is easy, though, to trigger it manually in the debugger as described in https://github.com/openjdk/jdk/pull/21919#issuecomment-2577275582.
------------- PR Comment: https://git.openjdk.org/jdk/pull/21919#issuecomment-2580274187 From wkemper at openjdk.org Thu Jan 9 17:09:41 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 9 Jan 2025 17:09:41 GMT Subject: [jdk24] Integrated: 8346737: GenShen: Generational memory pools should not report zero for maximum capacity In-Reply-To: References: Message-ID: <0za45SpKfNjBbXsh5lBROU3kBFsbCZ-Cyh8uPoG7Mto=.9bff78a4-6a0a-4824-8757-0ce15eab06f9@github.com> On Wed, 8 Jan 2025 23:26:53 GMT, William Kemper wrote: > Clean backport. Fixes many SA tests. This pull request has now been integrated. Changeset: ff9b8e46 Author: William Kemper URL: https://git.openjdk.org/jdk/commit/ff9b8e4607e28cf2b165f3ff170b17e6b6d8a8a5 Stats: 6 lines in 2 files changed: 0 ins; 6 del; 0 mod 8346737: GenShen: Generational memory pools should not report zero for maximum capacity Reviewed-by: kdnilsen, ysr Backport-of: 249f141211c94afcce70d9d536d84e108e07b4e5 ------------- PR: https://git.openjdk.org/jdk/pull/22984 From wkemper at openjdk.org Thu Jan 9 17:21:38 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 9 Jan 2025 17:21:38 GMT Subject: RFR: 8347279: Problemlist TestEvilSyncBug.java#generational In-Reply-To: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> References: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> Message-ID: <82aA2LLwQY71hSKd48xTwMuN0YDFg8uP2hWUISGNJO4=.a79f224b-ad1c-409b-826e-baabf04d9a35@github.com> On Thu, 9 Jan 2025 03:46:29 GMT, SendaoYan wrote: > Hi all, > Test gc/shenandoah/TestEvilSyncBug.java#generational has been observed to time out on many platforms (various Linux, and Windows too) several times recently. Should we Problemlist this test until the root cause of the failure has been fixed? Thank you for this. ------------- Marked as reviewed by wkemper (Committer).
PR Review: https://git.openjdk.org/jdk/pull/22996#pullrequestreview-2540509493 From wkemper at openjdk.org Thu Jan 9 17:47:26 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 9 Jan 2025 17:47:26 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v2] In-Reply-To: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: > Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. > > The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. > > Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision: - Improve comments - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint - Fix comments - Fix comment, revert unnecessary change - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint - Fix phase encoding to handle weak roots - WIP: Use Threads::threads_do for propagating gc state (consolidated) - WIP: Use Threads::threads_do for propagating gc state - Remove unnecessary gc state propagations - Encapsulate gc state - ... and 22 more: https://git.openjdk.org/jdk/compare/967c77a7...83ac7b49 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22688/files - new: https://git.openjdk.org/jdk/pull/22688/files/9aaef708..83ac7b49 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=00-01 Stats: 31726 lines in 2167 files changed: 20848 ins; 5389 del; 5489 mod Patch: https://git.openjdk.org/jdk/pull/22688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22688/head:pull/22688 PR: https://git.openjdk.org/jdk/pull/22688 From xpeng at openjdk.org Thu Jan 9 19:10:37 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 19:10:37 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: <6kG3_NLd3D4G9fYnrmdKw-s25Fsmu1rLwOV_6eRDrfI=.cd23eb62-96b9-459f-a313-2d4ae7762284@github.com> References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> <__kORuPC0guQED9-jn2Xg9CFIJ15wVRojwZoy_VqcPs=.0e5c812f-9e4e-4396-8acd-1e84a5e598c5@github.com>

<6kG3_NLd3D4G9fYnrmdKw-s25Fsmu1rLwOV_6eRDrfI=.cd23eb62-96b9-459f-a313-2d4ae7762284@github.com> Message-ID: On Wed, 8 Jan 2025 20:47:44 GMT, Kelvin Nilsen wrote: >> That sounds like an issue with the verifier then? Once a young cycle is complete, nothing should depend on the state of the bitmaps for young regions (if, for no other reason, evacuation could have moved objects so that the bitmaps no longer represent the addresses of marked objects that were evacuated). > > I agree with @earthling-amzn that we should be able to reset young-generation mark bitmap even if this is old_gc_bootstrap and even if old marking is in progress. We should dive deeper to figure out the crash you observed. It seems we don't fully understand the root cause. > > I also suggest rewording the comment. trigged? (See other comments about increasing generality of this approach.) I have tested it after removing `if (!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress())`, and I always get a crash in the stress test, like: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/codebuild/output/src48/src/s3/00/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp:1270), pid=1578, tid=1595 # Error: Remembered set violation at init-update-references; clean card should be dirty Referenced from: interior location: 0x00000007f8000008 inside Java heap not in collection set region: | 2528|R |O|BTE 7f8000000, 7f8400000, 7f8400000|TAMS 7f8400000|UWM 7f8400000|U 4096K|T 0B|G 4096K|P 0B|S 0B|L 672B|CP 0 Object: 0x00000007f5dc8b58 - klass 0x0000078000249400 java.lang.invoke.MethodType not allocated after mark start not after update watermark marked strong not marked weak not in collection set age: 8 mark: mark(is_unlocked no_hash age=8) region: | 2519|R |Y|BTE 7f5c00000, 7f6000000, 7f6000000|TAMS 7f6000000|UWM 7f6000000|U 4096K|T 0B|G 4096K|P 0B|S 0B|L 4091K|CP 0 Something could be wrong in the remembered set scan; resetting young region bitmaps somehow tickles the
issue. I have created another [JBS ticket](https://bugs.openjdk.org/browse/JDK-8347371) to track the issue in the remembered set scan, and will keep this test for now. I'll update the comments in the code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1909312883 From xpeng at openjdk.org Thu Jan 9 19:28:08 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 19:28:08 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v4] In-Reply-To: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> Message-ID: <5u5owTlpSq3Y69GNr2LGLerK6uTR0i0_-rYZ1Q6wrnc=.1af37f2c-00ad-4589-a3d9-666a216fb1af@github.com> > Reset marking bitmaps after the collection cycle; for GenShen, only do this for the young generation. Also choose not to do this for Degen and full GC: both run at a safepoint, and we should leave the safepoint ASAP. > > I have run the same workload for 30s with Shenandoah in generational mode and classic mode. The average time of concurrent reset dropped significantly, since in most cases the bitmap for young gen has already been reset after the previous concurrent cycle finishes, if there is no need to preserve bitmap states.
> > GenShen: > Before: > > [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878) > > > After: > > [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670) > [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872) > > > Shenandoah: > Before: > > [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118) > > After: > > [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542) > [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661) > > > Additional changes: > * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions. > * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions; there are two benefits from this: > - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 > - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorates the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in the closure. > * When `_do_old_gc_bootstrap` is true, instead of resetting the mark bitmap for old gen separately, simply reset the global generation, so we don't need to walk all the regions twice. > * Clean up FullGC code, remove duplicate code. > > Additional tests: > - [x] CONF=macosx-aarch64-server-fastdebug make test T...
Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: - Adding condition "!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()" back and address some PR comments - Remove entry_reset_after_collect from ShenandoahOldGC - Remove condition check !_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress() from op_reset_after_collect ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22778/files - new: https://git.openjdk.org/jdk/pull/22778/files/f82fdfaa..04299a76 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=02-03 Stats: 13 lines in 2 files changed: 6 ins; 5 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22778.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22778/head:pull/22778 PR: https://git.openjdk.org/jdk/pull/22778 From xpeng at openjdk.org Thu Jan 9 19:28:08 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 19:28:08 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>

Message-ID: On Wed, 8 Jan 2025 22:46:12 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 242: >> >>> 240: // Instead of always reset before collect, some reset can be done after collect to save >>> 241: // the time before before the cycle so the cycle can be started as soon as possible. >>> 242: entry_reset_after_collect(); >> >> For comment, I would say: "Instead of always resetting immediately before the start of a new GC, we can often reset at the end of the previous GC. This allows us to start the next GC cycle more quickly after a trigger condition is detected, reducing the likelihood that GC will degenerate." > > I'll update comments, thanks Kelvin! Fixed. thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1909330187 From xpeng at openjdk.org Thu Jan 9 19:40:45 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 19:40:45 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v3] In-Reply-To: References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>

Message-ID: On Wed, 8 Jan 2025 21:38:16 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 592: >> >>> 590: // If it is old GC bootstrap cycle, always clear bitmap for global gen >>> 591: // to ensure bitmap for old gen is clear for old GC cycle after this. >>> 592: if (_do_old_gc_bootstrap) { >> >> This may deserve a comment. It seems we ought to clear the old-gen mark bitmap at the end of coalesce-and-fill. But that does not allow us to avoid clearing old-gen mark bitmaps at start of bootstrap because when young-gen regions are promoted in place, the mark bitmap is preserved for those regions, and since they are considered old at the end of the GC cycle during which they were promoted, those bitmaps will not be cleared by op_reset_after_collect(). Is there a way to improve this behavior? >> >> For example, in op_reset_after_collect(), maybe we should clear old-gen bitmaps also (at least for recently promoted in place regions) unless old marking is in process and/or mixed evacuations are in progress. >> >> Maybe this can be tackled in a separate PR, but would be good to file JBS ticket now if there is agreement on the approach. > > Yes, we can reset the bitmaps of old regions when there is in-place promotion, and of all old regions after coalesce-and-fill for old gen. > > Thanks Kelvin, I'll create a JBS ticket for this. This is ShenandoahConcurrentGC::op_reset(), which is executed when a cycle starts. I have removed lines 361 to 371, which traversed all regions and applied the reset to old regions when `_do_old_gc_bootstrap` is true, so it used to iterate over the regions twice in that case. With this change, it iterates only once and resets the bitmaps for all regions when `_do_old_gc_bootstrap` is true. Here is the ticket https://bugs.openjdk.org/browse/JDK-8347372 to follow up on the possible improvements to old GC.
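To illustrate the one-pass versus two-pass difference described above, here is a minimal, single-threaded sketch. It is not the HotSpot code: `Affiliation`, `Region`, `ResetStats`, `reset_two_pass` and `reset_single_pass` are hypothetical stand-ins, and the real implementation walks regions in parallel through closures. The only point is that treating the whole heap as the target during an old GC bootstrap resets the same bitmaps while visiting each region once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a heap region (not a HotSpot type).
enum class Affiliation { Free, Young, Old };

struct Region {
  Affiliation affiliation;
  bool bitmap_dirty;  // true while the mark bitmap still holds stale marks
};

struct ResetStats {
  std::size_t regions_visited = 0;
  std::size_t bitmaps_reset = 0;
};

// Old shape (assumed): reset young regions, then walk the heap a second
// time for old regions when this cycle bootstraps an old GC.
ResetStats reset_two_pass(std::vector<Region>& heap, bool old_gc_bootstrap) {
  ResetStats stats;
  for (Region& r : heap) {
    ++stats.regions_visited;
    if (r.affiliation == Affiliation::Young && r.bitmap_dirty) {
      r.bitmap_dirty = false;
      ++stats.bitmaps_reset;
    }
  }
  if (old_gc_bootstrap) {
    for (Region& r : heap) {
      ++stats.regions_visited;
      if (r.affiliation == Affiliation::Old && r.bitmap_dirty) {
        r.bitmap_dirty = false;
        ++stats.bitmaps_reset;
      }
    }
  }
  return stats;
}

// New shape (assumed): a single walk. On bootstrap every non-free region is
// in scope (the "global generation"); otherwise only young regions are.
ResetStats reset_single_pass(std::vector<Region>& heap, bool old_gc_bootstrap) {
  ResetStats stats;
  for (Region& r : heap) {
    ++stats.regions_visited;
    const bool in_scope = old_gc_bootstrap
                              ? r.affiliation != Affiliation::Free
                              : r.affiliation == Affiliation::Young;
    if (in_scope && r.bitmap_dirty) {
      r.bitmap_dirty = false;
      ++stats.bitmaps_reset;
    }
  }
  return stats;
}
```

Calling both functions on copies of the same heap with `old_gc_bootstrap` set to true should reset the same number of bitmaps, with the single-pass version visiting half as many regions.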
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22778#discussion_r1909344212 From xpeng at openjdk.org Thu Jan 9 19:53:05 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 19:53:05 GMT Subject: [jdk24] RFR: 8345423: Shenandoah: Parallelize concurrent cleanup Message-ID: Clean backport, improve performance of concurrent cleanup of Shenandoah and GenShen, remove the use of heap lock from concurrent cleanup. ------------- Commit messages: - Backport 4da6fd4283a13be1711e7ad948f1d05a0a9148a5 Changes: https://git.openjdk.org/jdk/pull/22991/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22991&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345423 Stats: 228 lines in 13 files changed: 79 ins; 56 del; 93 mod Patch: https://git.openjdk.org/jdk/pull/22991.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22991/head:pull/22991 PR: https://git.openjdk.org/jdk/pull/22991 From xpeng at openjdk.org Thu Jan 9 22:44:45 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 9 Jan 2025 22:44:45 GMT Subject: [jdk24] Withdrawn: 8345423: Shenandoah: Parallelize concurrent cleanup In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 23:57:36 GMT, Xiaolong Peng wrote: > Clean backport, improve performance of concurrent cleanup of Shenandoah and GenShen, remove the use of heap lock from concurrent cleanup. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/22991 From phh at openjdk.org Thu Jan 9 23:26:35 2025 From: phh at openjdk.org (Paul Hohensee) Date: Thu, 9 Jan 2025 23:26:35 GMT Subject: RFR: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 10:58:17 GMT, Aleksey Shipilev wrote: > For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). 
Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. > > I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. > > Output before: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.008s][info][gc] Using Epsilon > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used > > > Output after: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > Hello! > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.009s][info][gc] Using Epsilon > [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used Marked as reviewed by phh (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22965#pullrequestreview-2541142051 From ysr at openjdk.org Fri Jan 10 00:42:47 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Fri, 10 Jan 2025 00:42:47 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v2] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Thu, 9 Jan 2025 17:47:26 GMT, William Kemper wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. >> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision: > > - Improve comments > - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint > - Fix comments > - Fix comment, revert unnecessary change > - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint > - Fix phase encoding to handle weak roots > - WIP: Use Threads::threads_do for propagating gc state (consolidated) > - WIP: Use Threads::threads_do for propagating gc state > - Remove unnecessary gc state propagations > - Encapsulate gc state > - ... and 22 more: https://git.openjdk.org/jdk/compare/e0773235...83ac7b49 src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 368: > 366: // This updates the singular, global gc state. This call must happen on a safepoint. > 367: // However, in some cases (init update refs, e.g.), the gc state may change concurrently > 368: // and will be propagated to all threads by a handshake operation. I am a little bit confused by the statement starting at "However, ...". Did you mean that the "local copy of the global state" may be changed outside of a safepoint but not the global state itself? I notice that `set_gc_state()` still asserts that we are at a safepoint: https://github.com/openjdk/jdk/blob/83ac7b49d34081beb3ff58f1c159d22faacd077a/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L2000 Ah, now I see that you use a different API for setting the global gc state outside of a safepoint. If my understanding is correct, then we should probably rename the APIs such that the one that is expected to be set at a safepoint uses `set_gc_state_at_safepoint()` and the one that doesn't might use `set_gc_state_concurrent()` or something like that. That would be less confusing. It also brings up the issue of what specific state predicates it's safe to test when. E.g. whether `is_gc_state()` can be safely tested any time during a safepoint or concurrently. 
I think it is safe, but explicitly stating this might be useful, not least because we seem to have one state change API that still asserts that we should be at a safepoint. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1909623411 From syan at openjdk.org Fri Jan 10 01:41:49 2025 From: syan at openjdk.org (SendaoYan) Date: Fri, 10 Jan 2025 01:41:49 GMT Subject: RFR: 8347279: Problemlist TestEvilSyncBug.java#generational In-Reply-To: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> References: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> Message-ID: On Thu, 9 Jan 2025 03:46:29 GMT, SendaoYan wrote: > Hi all, > Test gc/shenandoah/TestEvilSyncBug.java#generational was observed times out on a lot of platforms (various Linux and Windows too) several times recently. Should we Problemlist this test before the failure root cause been fixed. Thanks all for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22996#issuecomment-2581565626 From syan at openjdk.org Fri Jan 10 01:41:49 2025 From: syan at openjdk.org (SendaoYan) Date: Fri, 10 Jan 2025 01:41:49 GMT Subject: Integrated: 8347279: Problemlist TestEvilSyncBug.java#generational In-Reply-To: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> References: <3Ytom-NSY6xE6P0PerrVxrCd4lM3LqJ2LYkXSm0NG6c=.3c9b94cb-5c43-498f-b80e-557aa61e2635@github.com> Message-ID: On Thu, 9 Jan 2025 03:46:29 GMT, SendaoYan wrote: > Hi all, > Test gc/shenandoah/TestEvilSyncBug.java#generational was observed times out on a lot of platforms (various Linux and Windows too) several times recently. Should we Problemlist this test before the failure root cause been fixed. This pull request has now been integrated. 
Changeset: f6492aa6 Author: SendaoYan URL: https://git.openjdk.org/jdk/commit/f6492aa63486393593ea8761cef5362ef46abf13 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8347279: Problemlist TestEvilSyncBug.java#generational Reviewed-by: tschatzl, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/22996 From shade at openjdk.org Fri Jan 10 08:45:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 10 Jan 2025 08:45:45 GMT Subject: RFR: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 10:58:17 GMT, Aleksey Shipilev wrote: > For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. > > I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. > > Output before: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.008s][info][gc] Using Epsilon > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! 
> [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used > > > Output after: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > Hello! > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java > [0.009s][info][gc] Using Epsilon > [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! > [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22965#issuecomment-2582068172 From shade at openjdk.org Fri Jan 10 08:45:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 10 Jan 2025 08:45:45 GMT Subject: Integrated: 8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 10:58:17 GMT, Aleksey Shipilev wrote: > For Epsilon, we have added `log_warning` messages when heap size and AlwaysPreTouch configuration is not great with [JDK-8232051](https://bugs.openjdk.org/browse/JDK-8232051). Unfortunately, this means we print this warning all the time, even though users might not actually run into problems there, or when users tried to implement these suggestions and still decided to run against them. > > I think we want to emit the suggestions in the normal GC log instead, so they are not printed all the time. Additionally, I moved the printing at the end of init logger, so it does not come in the middle of `gc,init` block. > > Output before: > > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java > [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups > [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups > Hello! 
>
> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java
> [0.008s][info][gc] Using Epsilon
> [0.008s][warning][gc,init] Consider setting -Xms equal to -Xmx to avoid resizing hiccups
> [0.008s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
> Hello!
> [0.757s][info ][gc ] Heap: 1024M reserved, 129M (12.60%) committed, 23906K (2.28%) used
>
>
> Output after:
>
>
> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC Hello.java
> Hello!
>
> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xlog:gc Hello.java
> [0.009s][info][gc] Using Epsilon
> [0.009s][info][gc] Consider setting -Xms equal to -Xmx to avoid resizing hiccups
> [0.009s][info][gc] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
> Hello!
> [0.753s][info][gc] Heap: 1024M reserved, 129M (12.60%) committed, 23908K (2.28%) used

This pull request has now been integrated.

Changeset: 1a0fe497
Author: Aleksey Shipilev
URL: https://git.openjdk.org/jdk/commit/1a0fe49732187db9e8776f80feefab4373114f75
Stats: 24 lines in 1 file changed: 12 ins; 12 del; 0 mod

8347256: Epsilon: Demote heap size and AlwaysPreTouch warnings to info level

Reviewed-by: tschatzl, phh

-------------

PR: https://git.openjdk.org/jdk/pull/22965

From kbarrett at openjdk.org Fri Jan 10 11:25:55 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Fri, 10 Jan 2025 11:25:55 GMT
Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds
Message-ID: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com>

Please review this change to PSStripeShadowCardTable to avoid several examples of UB in its internal calculations. We avoid the UB by switching to the integer domain (using uintptr_t) for all of the internal calculations, with casts between pointers and uintptr_t as needed at the boundaries.
This applies not just to the various pointer adjustments, but also to pointer comparisons. In particular, the prior range check assertions using pointer comparisons could have been partially or even completely "optimized" away based on the no-UB assumption.

Testing: mach5 tier1-5
local (linux-x64) tier1 with -XX:+UseParallelGC

-------------

Commit messages:
 - avoid UB

Changes: https://git.openjdk.org/jdk/pull/23032/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23032&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8346971
Stats: 35 lines in 1 file changed: 29 ins; 0 del; 6 mod
Patch: https://git.openjdk.org/jdk/pull/23032.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/23032/head:pull/23032

PR: https://git.openjdk.org/jdk/pull/23032

From szaldana at redhat.com Fri Jan 10 14:15:34 2025
From: szaldana at redhat.com (Sonia Zaldana Calles)
Date: Fri, 10 Jan 2025 09:15:34 -0500
Subject: GCOverheadLimit support for G1
Message-ID:

Hi folks,

Upon migration from ParallelGC to G1, we have a report that G1 is showcasing a slow death when too much time is spent in garbage collection, in contrast to ParallelGC, which would trigger an "OutOfMemoryError: GC Overhead limit exceeded".

Note the single class reproducer below [0]. Running with java ... -Xlog:gc -Xmx4G, we can observe long pauses (~3-4 seconds on my machine), the VM attempts ~20 Full GC cycles where the last full GCs take a lot longer than the pause time goal of 200ms. Ideally, we would like the JVM to stop trying at some point early (similarly to ParallelGC) and we have not found a way to accomplish that.

We found this is likely because UseGCOverheadLimit is only supported (and enabled by default) for the ParallelGC. We came across JDK-8212084 and we were wondering if there was a particular reason this didn't move forward? [1] Is there anything we can do to help?

[0]

import java.util.LinkedList;

public class GCOverheadReproducer {
    private static final LinkedList fixedData = new LinkedList();
    private static final int FIXED_DATA_ITEM_SIZE = 32;
    private static final int TEMPORARY_DATA_ITEM_SIZE = FIXED_DATA_ITEM_SIZE / 2 - 1;

    public static void main(String[] args) throws InterruptedException {
        // Phase 1: fill the heap with live data until allocation fails.
        System.out.println("Consuming all memory");
        while (true) {
            try {
                fixedData.add(new byte[FIXED_DATA_ITEM_SIZE]);
            } catch (OutOfMemoryError oome) {
                System.out.println(oome);
                System.out.printf("OOME triggered. Releasing %s bytes of memory.\n", FIXED_DATA_ITEM_SIZE);
                fixedData.removeLast();
                break;
            }
        }
        // Phase 2: keep allocating short-lived objects in the nearly full heap.
        System.out.println("Running allocate-release loop");
        while (true) {
            Object temporaryData = new byte[TEMPORARY_DATA_ITEM_SIZE];
        }
    }
}

[1] https://bugs.openjdk.org/browse/JDK-8212084

Thanks,
Sonia

--
Sonia Zaldaña Calles
Software Engineer, OpenJDK
Red Hat

From xpeng at openjdk.org Fri Jan 10 17:08:10 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 10 Jan 2025 17:08:10 GMT
Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v5]
In-Reply-To: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
Message-ID:

> Reset marking bitmaps after collection cycle; for GenShen only do this for young generation, also choose not to do this for Degen and full GC since both are running at safepoint, we should leave the safepoint ASAP.
>
> I have run the same workload for 30s with Shenandoah in generational mode and classic mode, the average time of concurrent reset dropped significantly since in most cases the bitmap for young gen should have been reset after the previous concurrent cycle finishes if there is no need to preserve bitmap states.
> > GenShen: > Before: > > [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878) > > > After: > > [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670) > [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872) > > > Shenandoah: > Before: > > [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118) > > After: > > [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542) > [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661) > > > Additional changes: > * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions. > * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, two benefits from this: > - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 > - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorate the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in closure. > * When `_do_old_gc_bootstrap is true`, instead of reset mark bitmap for old gen separately, simply reset the global generations, so we don't need walk the all regions twice. > * Clean up FullGC code, remove duplicate code. > > Additional tests: > - [x] CONF=macosx-aarch64-server-fastdebug make test T... Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 21 additional commits since the last revision: - Merge branch 'openjdk:master' into reset-bitmap - Adding condition "!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()" back and address some PR comments - Remove entry_reset_after_collect from ShenandoahOldGC - Remove condition check !_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress() from op_reset_after_collect - Merge branch 'openjdk:master' into reset-bitmap - Address review comments - Merge branch 'openjdk:master' into reset-bitmap - Remove ShenandoahResetUpdateRegionStateClosure - Always set_mark_incomplete when reset mark bitmap - Fix - ... and 11 more: https://git.openjdk.org/jdk/compare/15c49f96...5a181473 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22778/files - new: https://git.openjdk.org/jdk/pull/22778/files/04299a76..5a181473 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=03-04 Stats: 20964 lines in 584 files changed: 5664 ins; 12721 del; 2579 mod Patch: https://git.openjdk.org/jdk/pull/22778.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22778/head:pull/22778 PR: https://git.openjdk.org/jdk/pull/22778 From wkemper at openjdk.org Fri Jan 10 19:35:46 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 10 Jan 2025 19:35:46 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v2] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com>

Message-ID: On Fri, 10 Jan 2025 00:40:26 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: >> >> - Improve comments >> - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint >> - Fix comments >> - Fix comment, revert unnecessary change >> - Merge remote-tracking branch 'jdk/master' into remove-init-update-refs-safepoint >> - Fix phase encoding to handle weak roots >> - WIP: Use Threads::threads_do for propagating gc state (consolidated) >> - WIP: Use Threads::threads_do for propagating gc state >> - Remove unnecessary gc state propagations >> - Encapsulate gc state >> - ... and 22 more: https://git.openjdk.org/jdk/compare/34b4faa5...83ac7b49 > > src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 368: > >> 366: // This updates the singular, global gc state. This call must happen on a safepoint. >> 367: // However, in some cases (init update refs, e.g.), the gc state may change concurrently >> 368: // and will be propagated to all threads by a handshake operation. > > I am a little bit confused by the statement starting at "However, ...". Did you mean that the "local copy of the global state" may be changed outside of a safepoint but not the global state itself? > I notice that `set_gc_state()` still asserts that we are at a safepoint: > https://github.com/openjdk/jdk/blob/83ac7b49d34081beb3ff58f1c159d22faacd077a/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L2000 > > Ah, now I see that you use a different API for setting the global gc state outside of a safepoint. 
> > If my understanding is correct, then we should probably rename the APIs such that the one that is expected to be set at a safepoint uses `set_gc_state_at_safepoint()` and the one that doesn't might use `set_gc_state_concurrent()` or something like that. That would be less confusing. > > It also brings up the issue of what specific state predicates it's safe to test when. E.g. whether `is_gc_state()` can be safely tested any time during a safepoint or concurrently. I think it is safe, but explicitly stating this might be useful, not least because we seem to have one state change API that still asserts that we should be at a safepoint. It is always safe to call `is_gc_state` because it uses the most recently changed value. That is, if the `_gc_state` changes on a safepoint, `is_gc_state` will use the value set during the safepoint. Otherwise, it uses the value which was either changed concurrently through the thread local handshake (init-update-refs) or the value which was propagated to all threads at the end of the safepoint. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1911094315 From wkemper at openjdk.org Fri Jan 10 19:49:17 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 10 Jan 2025 19:49:17 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: > Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. > > The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. 
Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators.
>
> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled.

William Kemper has updated the pull request incrementally with one additional commit since the last revision:

  Improve comments and method names

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/22688/files
  - new: https://git.openjdk.org/jdk/pull/22688/files/83ac7b49..89c20a14

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=01-02

Stats: 14 lines in 2 files changed: 2 ins; 2 del; 10 mod
Patch: https://git.openjdk.org/jdk/pull/22688.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/22688/head:pull/22688

PR: https://git.openjdk.org/jdk/pull/22688

From kirk at kodewerk.com Fri Jan 10 20:12:29 2025
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Fri, 10 Jan 2025 21:12:29 +0100
Subject: [Discussion] Serial GC: Expand young generation size
In-Reply-To:
References:
Message-ID: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com>

Hi Albert,

We are working on adding automated heap sizing (AHS) to the serial collector. One of the issues that we're looking at is how to expand (or contract) heap as needed. One of the challenges in expanding young gen is the location of data in the heap after a full collection.
One of the experiments that we're currently working on is a rearrangement of the heap so that young can be expanded after a full collection without the need to copy all of the surviving data to accommodate an enlargement of young.

Kind regards,
Kirk

> On Jan 8, 2025, at 11:27 AM, Albert Yang wrote:
>
> Re https://bugs.openjdk.org/browse/JDK-8333386, I think your suggestion, "add the option `-XX:NewSize=65m`", is the way to go.
>
> As for adding young-gen expansion support to Serial, it probably should have its own enhancement ticket.
>
> I am currently working on placing from/to spaces before eden inside young-gen as part of Parallel heap-auto-sizing. I believe Serial can use the same layout (from/to/eden, instead of eden/from/to) to facilitate eden/young-gen expansion. My 2c.
>
> /Albert
>
> ________________________________________
> From: hotspot-gc-dev on behalf of Guoxiong Li
> Sent: Wednesday, January 8, 2025 09:30
> To: hotspot-gc-dev at openjdk.org
> Subject: [Discussion] Serial GC: Expand young generation size
>
> Hi all,
>
> Currently, the young generation in SerialGC can't be expanded now
> and is always the initial young generation size. It is not a very big problem generally.
> But in some situations, like JDK-8333386 [1], it will crash the VM (unnecessary OutOfMemoryError).
>
> So I want to implement the feature to expand the young generation size
> (absolutely, not exceeding the max young generation size).
> What do you think about it? Any ideas will be appreciated.
>
> Best Regards,
> -- Guoxiong
>
> Related links:
> [1] https://bugs.openjdk.org/browse/JDK-8333386
> [2] https://bugs.openjdk.org/browse/JDK-8335925
>

From ysr at openjdk.org Sat Jan 11 02:04:49 2025
From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Sat, 11 Jan 2025 02:04:49 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com> On Fri, 10 Jan 2025 19:49:17 GMT, William Kemper wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. >> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Improve comments and method names Left a few more comments, and will make one more final readthrough and approve. Thanks for your continued patience with my slow review! 
src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 196: > 194: > 195: // Evacuation is complete, retire gc labs > 196: heap->concurrent_prepare_for_update_refs(); For consistency with other related method naming, can we use "updaterefs" instead of "update_refs" (makes IDE searches easier to locate related methods). src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1245: > 1243: void do_thread(Thread* thread) override { > 1244: _propagator.do_thread(thread); > 1245: if (ShenandoahThreadLocalData::gclab(thread) != nullptr) { Which thread may have this be null? (I am looking at the ShenandoahRetireGCLabClosure which insists that this should be non-null.) I assume we have some threads here that have a gc state that must be updated but which don't have a gc lab. I am wondering if the check for an initialized gclab and in the generational case the plab can be pushed down into the closure rather than being exposed here. At that place, we would want to document (or as needed assert) why some threads targeted by the closure may have null gclab or plab. src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1267: > 1265: > 1266: // This will propagate the gc state and retire gclabs and plabs for threads that require it. > 1267: ShenandoahPrepareForUpdateRefs prepare_for_update_refs(_gc_state.raw_value()); In looking at this I see that we do not set `_gc_state_changed` here because we don't want individual threads to observe the global state, but only their local state (when it's propagated below). It would be good to emphasise this in the documetation of `_gc_state_changed` use protocol. Indeed, as I had suggested before, I think this might be better encapsulated with a `set_gc_state_concurrent()` that is analogous to `set_gc_state_at_safepoint()` that takes the appropriate state value as an argument, and uses the appropriate `_gc_state_changed` protocol. IIUC, this will be re-used when other safepoints are eliminated in the future. 
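
To make the suggested naming concrete, here is a minimal sketch of how the two setters could be separated. It is a toy model under stated assumptions, not the HotSpot code: `ToyHeap`, `begin_safepoint`, `end_safepoint`, `handshake_all_threads` and `thread_state` are hypothetical names, and the rule that only the safepoint variant raises `_gc_state_changed` is my reading of the protocol discussed above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag values for illustration only.
enum : uint8_t { EVACUATION = 1, UPDATE_REFS = 2, WEAK_ROOTS = 4 };

class ToyHeap {
  std::atomic<uint8_t> _gc_state{0};     // the single, global gc state
  bool _gc_state_changed = false;        // "propagate at end of safepoint" flag
  bool _at_safepoint = false;
  std::vector<uint8_t> _thread_state;    // per-thread local copies

  // Computes the global state with the bits in `mask` set or cleared.
  uint8_t updated(uint8_t mask, bool value) const {
    uint8_t s = _gc_state.load();
    return static_cast<uint8_t>(value ? (s | mask) : (s & ~mask));
  }

  // Copies the global state into every thread-local copy.
  void propagate() {
    uint8_t s = _gc_state.load();
    for (auto& local : _thread_state) local = s;
  }

 public:
  explicit ToyHeap(std::size_t threads) : _thread_state(threads, 0) {}

  void begin_safepoint() { _at_safepoint = true; }

  // Leaving the safepoint propagates the state only if it was changed there.
  void end_safepoint() {
    if (_gc_state_changed) { propagate(); _gc_state_changed = false; }
    _at_safepoint = false;
  }

  // Safepoint variant: asserts the safepoint and requests propagation on exit.
  void set_gc_state_at_safepoint(uint8_t mask, bool value) {
    assert(_at_safepoint && "must be at a safepoint");
    _gc_state.store(updated(mask, value));
    _gc_state_changed = true;
  }

  // Concurrent variant: updates the global state without touching
  // _gc_state_changed; the caller must follow up with a handshake.
  void set_gc_state_concurrent(uint8_t mask, bool value) {
    assert(!_at_safepoint && "must not be at a safepoint");
    _gc_state.store(updated(mask, value));
  }

  // Stand-in for a handshake that updates each thread's local copy.
  void handshake_all_threads() { propagate(); }

  uint8_t thread_state(std::size_t i) const { return _thread_state[i]; }
};
```

In this model a thread keeps seeing its old local state after `set_gc_state_concurrent()` until `handshake_all_threads()` runs, which is the property the naming is meant to make obvious at the call site.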

-------------

PR Review: https://git.openjdk.org/jdk/pull/22688#pullrequestreview-2544388098
PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1911793884
PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1911797849
PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1911803283

From thomas.schatzl at oracle.com Mon Jan 13 09:24:50 2025
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 13 Jan 2025 10:24:50 +0100
Subject: GCOverheadLimit support for G1
In-Reply-To:
References:
Message-ID:

Hi Sonia,

On 10.01.25 15:15, Sonia Zaldana Calles wrote:
> Hi folks,
>
>
> Upon migration from ParallelGC to G1, we have a report that G1 is
> showcasing a slow death when too much time is spent in garbage
> collection, in contrast to ParallelGC, which would trigger an
> "OutOfMemoryError: GC Overhead limit exceeded".
>
>
> Note the single class reproducer below [0]. Running with java ... -
> Xlog:gc -Xmx4G, we can observe long pauses (~3-4 seconds on my machine),
> the VM attempts ~20 Full GC cycles where the last full GCs take a lot
> longer than the pause time goal of 200ms. Ideally, we would like the JVM
> to stop trying at some point early (similarly to ParallelGC) and we have
> not found a way to accomplish that.
>
>
> We found this is likely because UseGCOverheadLimit is only supported
> (and enabled by default) for the ParallelGC. We came across JDK-8212084
> and we were wondering if there was a particular reason this didn't move
> forward? [1] Is there anything we can do to help?
>
>

as most of the time, I do not think there is a particular reason this work has not been completed. Just as the comments in that CR indicate, that the original problem went away for them and interest has been lost.
Hth, Thomas From thomas.schatzl at oracle.com Mon Jan 13 09:28:47 2025 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 13 Jan 2025 10:28:47 +0100 Subject: GCOverheadLimit support for G1 In-Reply-To: References:

Message-ID:

Hi again,

On 13.01.25 10:24, Thomas Schatzl wrote:
> Hi Sonia,
>
> On 10.01.25 15:15, Sonia Zaldana Calles wrote:
>> Hi folks,
>>
>>
>> Upon migration from ParallelGC to G1, we have a report that G1 is
>> showcasing a slow death when too much time is spent in garbage
>> collection, in contrast to ParallelGC, which would trigger an
>> "OutOfMemoryError: GC Overhead limit exceeded".
>>
>>
>> Note the single class reproducer below [0]. Running with java ...
>> -Xlog:gc -Xmx4G, we can observe long pauses (~3-4 seconds on my
>> machine), the VM attempts ~20 Full GC cycles where the last full GCs
>> take a lot longer than the pause time goal of 200ms. Ideally, we would
>> like the JVM to stop trying at some point early (similarly to
>> ParallelGC) and we have not found a way to accomplish that.
>>
>>
>> We found this is likely because UseGCOverheadLimit is only supported
>> (and enabled by default) for the ParallelGC. We came across
>> JDK-8212084 and we were wondering if there was a particular reason
>> this didn't move forward? [1] Is there anything we can do to help?
>>
>>
>
> as most of the time, I do not think there is a particular reason this
> work has not been completed. Just as the comments in that CR indicate,
> that the original problem went away for them and interest has been lost.
>

Drop the last sentence - having re-read the discussion at https://mail.openjdk.org/pipermail/hotspot-gc-dev/2018-October/023525.html, I do not think there is a particular reason why it has not been implemented apart from losing interest.
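
For readers unfamiliar with the behavior being requested, the following is a minimal sketch of an overhead-limit check, loosely modeled on the documented ParallelGC behavior (an OutOfMemoryError when, by default, more than 98% of the time is spent in GC while less than 2% of the heap is recovered). It is not the ParallelGC or a proposed G1 implementation. The class name `GcOverheadLimit`, its parameters, and the "consecutive collections" threshold are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy object; not HotSpot code.
class GcOverheadLimit {
  double _time_limit;         // fraction of elapsed time spent in GC, e.g. 0.98
  double _free_limit;         // fraction of heap free after a full GC, e.g. 0.02
  unsigned _threshold;        // consecutive violations needed before giving up
  unsigned _consecutive = 0;  // violations seen in a row so far

public:
  GcOverheadLimit(double time_limit, double free_limit, unsigned threshold)
    : _time_limit(time_limit), _free_limit(free_limit), _threshold(threshold) {}

  // Record one full collection. Returns true once the limit has been exceeded
  // for _threshold collections in a row; any "good" collection resets the count.
  bool record_full_gc(double gc_seconds, double total_seconds,
                      std::size_t free_bytes, std::size_t capacity_bytes) {
    const bool too_slow =
        total_seconds > 0.0 && (gc_seconds / total_seconds) > _time_limit;
    const bool too_full =
        capacity_bytes > 0 &&
        (static_cast<double>(free_bytes) / static_cast<double>(capacity_bytes)) < _free_limit;
    if (too_slow && too_full) {
      ++_consecutive;
    } else {
      _consecutive = 0;
    }
    return _consecutive >= _threshold;
  }
};
```

A caller would construct, say, `GcOverheadLimit limit(0.98, 0.02, 5)` and report an out-of-memory condition the first time `record_full_gc` returns true, instead of attempting yet another full collection.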
Thomas

From rcastanedalo at openjdk.org Mon Jan 13 09:33:59 2025
From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano)
Date: Mon, 13 Jan 2025 09:33:59 GMT
Subject: RFR: 8345067: C2: enable implicit null checks for ZGC reads
In-Reply-To:
References:
Message-ID:

On Wed, 11 Dec 2024 09:59:44 GMT, Roberto Castañeda Lozano wrote:

> Currently, C2 cannot exploit late-expanded GC memory accesses as implicit null checks because of their use of temporary operands, which prevents `PhaseCFG::implicit_null_check` from [hoisting the memory accesses to the test basic block](https://github.com/openjdk/jdk/blob/f88c1c6ff86b8f29a71647e46136b6432bb67619/src/hotspot/share/opto/lcm.cpp#L319-L335).
>
> This changeset extends the scope of the implicit null check optimization so that it can exploit ZGC object loads. It introduces a platform-dependent predicate (`MachNode::has_initial_implicit_null_check_candidate`) to mark late-expanded instructions that emit a suitable memory access as a first instruction as candidates, and extends the optimization to recognize and hoist candidate memory accesses that use temporary operands:
>
> ![example](https://github.com/user-attachments/assets/b5f9bbc8-d75d-4cf3-841e-73db3dbae753)
>
> Exploiting ZGC loads increases the effectiveness of the implicit null check optimization (measured in percent of explicit null checks turned into implicit ones at compile time) by around 10% in the DaCapo chopin benchmarks:
>
> ![C2-inc-hit-rate-jdk-25+1-vs-jdk-25+1-with-8345067](https://github.com/user-attachments/assets/8d114058-c6b2-4254-a374-0d0b220af718)
>
> The larger number of implicit null checks results in slight performance improvements (in the 1-2% range) in a few DaCapo and SPECjvm2008 benchmarks and an overall slight improvement across Renaissance benchmarks.
>
> A further extension of the optimization to arbitrary memory access instructions (including e.g.
G1 object stores, which emit multiple memory accesses at arbitrary address offsets) will be investigated separately as part of [JDK-8344627](https://bugs.openjdk.org/browse/JDK-8344627). > > #### Testing > - tier1-5, compiler stress test (linux-x64, macosx-x64, windows-x64, linux-aarch64, macosx-aarch64; release and debug mode). Putting this PR on hold until the related RFE [JDK-8341611](https://bugs.openjdk.org/browse/JDK-8341611) (with PR https://github.com/openjdk/jdk/pull/22862 under review) is resolved. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22678#issuecomment-2586601599 From tschatzl at openjdk.org Mon Jan 13 09:35:57 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 13 Jan 2025 09:35:57 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds In-Reply-To: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: On Fri, 10 Jan 2025 11:21:31 GMT, Kim Barrett wrote: > Please review this change to PSStripeShadowCardTable to avoid several examples > of UB in it's internal calculations. We avoid the UB by switching to the > integer domain (using uintptr_t) for all of the internal calculations, with > casts between pointers and uintptr_t as needed at the boundaries. > > This applies not just to the various pointer adjustments, but also to pointer > comparisons. In particular, the prior range check assertions using pointer > comparisons could have been partially or even completely "optimized" away > based on the no-UB assumption. > > Testing: mach5 tier1-5 > local (linux-x64) tier1 with -XX:+UseParallelGC src/hotspot/share/gc/parallel/psCardTable.cpp line 129: > 127: // Avoid UB pointer operations by using integers internally. 
> 128:
> 129: static_assert(sizeof(intptr_t) == sizeof(CardValue*), "simplifying assumption");

Why check `sizeof(intptr_t)` instead of `uintptr_t` here?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23032#discussion_r1912898906

From tschatzl at openjdk.org Mon Jan 13 10:22:35 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 13 Jan 2025 10:22:35 GMT
Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds
In-Reply-To:
References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com>
Message-ID:

On Mon, 13 Jan 2025 09:33:30 GMT, Thomas Schatzl wrote:

>> Please review this change to PSStripeShadowCardTable to avoid several examples
>> of UB in its internal calculations. We avoid the UB by switching to the
>> integer domain (using uintptr_t) for all of the internal calculations, with
>> casts between pointers and uintptr_t as needed at the boundaries.
>>
>> This applies not just to the various pointer adjustments, but also to pointer
>> comparisons. In particular, the prior range check assertions using pointer
>> comparisons could have been partially or even completely "optimized" away
>> based on the no-UB assumption.
>>
>> Testing: mach5 tier1-5
>> local (linux-x64) tier1 with -XX:+UseParallelGC
>
> src/hotspot/share/gc/parallel/psCardTable.cpp line 129:
>
>> 127: // Avoid UB pointer operations by using integers internally.
>> 128:
>> 129: static_assert(sizeof(intptr_t) == sizeof(CardValue*), "simplifying assumption");
>
> Why check `sizeof(intptr_t)` instead of `uintptr_t` here?

I mean, the change uses `uintptr_t` throughout, and probably the size of `intptr_t` and `uintptr_t` must be(?) the same, but why not check the actually used type?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23032#discussion_r1912960879

From albert.m.yang at oracle.com Mon Jan 13 10:27:13 2025
From: albert.m.yang at oracle.com (Albert Yang)
Date: Mon, 13 Jan 2025 10:27:13 +0000
Subject: [External] : Re: [Discussion] Serial GC: Expand young generation size
In-Reply-To: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com>
References: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com>
Message-ID:

Hi Kirk,

> a rearrangement of the heap so that young can be expanded after a full collection without the need to copy all of the surviving data to accommodate an enlargement of young

Can you elaborate on how you rearranged it? My previous suggestion (from/to before eden) is also a type of rearrangement. I'm curious about how the two compare.

/Albert

________________________________________
From: Kirk Pepperdine
Sent: Friday, January 10, 2025 21:12
To: Albert Yang
Cc: Guoxiong Li; hotspot-gc-dev at openjdk.org
Subject: [External] : Re: [Discussion] Serial GC: Expand young generation size

Hi Albert,

We are working on adding automated heap sizing (AHS) to the serial collector. One of the issues that we're looking at is how to expand (or contract) heap as needed. One of the challenges in expanding young gen is the location of data in the heap after a full collection. One of the experiments that we're currently working on is a rearrangement of the heap so that young can be expanded after a full collection without the need to copy all of the surviving data to accommodate an enlargement of young.

Kind regards,
Kirk

> On Jan 8, 2025, at 11:27 AM, Albert Yang wrote:
>
> Re https://bugs.openjdk.org/browse/JDK-8333386, I think your suggestion, "add the option `-XX:NewSize=65m`", is the way to go.
>
> As for adding young-gen expansion support to Serial, it probably should have its own enhancement ticket.
>
> I am currently working on placing from/to spaces before eden inside young-gen as part of Parallel heap-auto-sizing.
I believe Serial can use the same layout (from/to/eden, instead of eden/from/to) to facilitate eden/young-gen expansion. My 2c. > > /Albert > > ________________________________________ > From: hotspot-gc-dev on behalf of Guoxiong Li > Sent: Wednesday, January 8, 2025 09:30 > To: hotspot-gc-dev at openjdk.org > Subject: [Discussion] Serial GC: Expand young generation size > > Hi all, > > Currently, the young generation in SerialGC can't be expanded now > and is always the initial young generation size. It is not a very big problem generally. > But in some situations, like JDK-8333386 [1], it will crash the VM (unnecessary OutOfMemoryError). > > So I want to implement the feature to expand the young generation size > (absolutely, not exceeding the max young generation size). > What do you think about it? Any ideas will be appreciated. > > Best Regards, > -- Guoxiong > > Related links: > [1] https://bugs.openjdk.org/browse/JDK-8333386 > [2] https://bugs.openjdk.org/browse/JDK-8335925 > From kbarrett at openjdk.org Mon Jan 13 10:42:33 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 10:42:33 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: > Please review this change to PSStripeShadowCardTable to avoid several examples > of UB in it's internal calculations. We avoid the UB by switching to the > integer domain (using uintptr_t) for all of the internal calculations, with > casts between pointers and uintptr_t as needed at the boundaries. > > This applies not just to the various pointer adjustments, but also to pointer > comparisons. 
In particular, the prior range check assertions using pointer > comparisons could have been partially or even completely "optimized" away > based on the no-UB assumption. > > Testing: mach5 tier1-5 > local (linux-x64) tier1 with -XX:+UseParallelGC Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: typo: intptr_t => uintptr_t ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23032/files - new: https://git.openjdk.org/jdk/pull/23032/files/374add7f..58c704f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23032&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23032&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23032.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23032/head:pull/23032 PR: https://git.openjdk.org/jdk/pull/23032 From kbarrett at openjdk.org Mon Jan 13 10:42:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 10:42:34 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com>

Message-ID: On Mon, 13 Jan 2025 10:19:47 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/parallel/psCardTable.cpp line 129: >> >>> 127: // Avoid UB pointer operations by using integers internally. >>> 128: >>> 129: static_assert(sizeof(intptr_t) == sizeof(CardValue*), "simplifying assumption"); >> >> Why check `sizeof(intptr_t)` instead of `uintptr_t` here? > > I mean, the change uses `uintptr_t` throughput, and probably the size of `intptr_t` and `uintptr_t` must be(?) the same, but why not check the actually used type? typo - fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23032#discussion_r1912984142 From tschatzl at openjdk.org Mon Jan 13 11:29:41 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 13 Jan 2025 11:29:41 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: <7sweSVqwZcmnqFepnd4Cnl9nJCbuIfreU-mZt3n0a6I=.695c61d4-b2e3-4bc0-be69-827d1c79fbb6@github.com> On Mon, 13 Jan 2025 10:42:33 GMT, Kim Barrett wrote: >> Please review this change to PSStripeShadowCardTable to avoid several examples >> of UB in it's internal calculations. We avoid the UB by switching to the >> integer domain (using uintptr_t) for all of the internal calculations, with >> casts between pointers and uintptr_t as needed at the boundaries. >> >> This applies not just to the various pointer adjustments, but also to pointer >> comparisons. In particular, the prior range check assertions using pointer >> comparisons could have been partially or even completely "optimized" away >> based on the no-UB assumption. >> >> Testing: mach5 tier1-5 >> local (linux-x64) tier1 with -XX:+UseParallelGC > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > typo: intptr_t => uintptr_t Lgtm. 
------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23032#pullrequestreview-2546358162 From ayang at openjdk.org Mon Jan 13 12:27:44 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 13 Jan 2025 12:27:44 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: On Mon, 13 Jan 2025 10:42:33 GMT, Kim Barrett wrote: >> Please review this change to PSStripeShadowCardTable to avoid several examples >> of UB in it's internal calculations. We avoid the UB by switching to the >> integer domain (using uintptr_t) for all of the internal calculations, with >> casts between pointers and uintptr_t as needed at the boundaries. >> >> This applies not just to the various pointer adjustments, but also to pointer >> comparisons. In particular, the prior range check assertions using pointer >> comparisons could have been partially or even completely "optimized" away >> based on the no-UB assumption. >> >> Testing: mach5 tier1-5 >> local (linux-x64) tier1 with -XX:+UseParallelGC > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > typo: intptr_t => uintptr_t src/hotspot/share/gc/parallel/psCardTable.cpp line 149: > 147: assert(iaddr(card) >= iaddr(_table), "out of bounds"); > 148: assert(iaddr(card) <= (iaddr(_table) + sizeof(_table)), "out of bounds"); > 149: } The two impls look identical to me. Also, can you change `check` to `verify` to make it more explicit that they are for verification only? 
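
The integer-domain range checks under discussion can be sketched as below. This is a self-contained illustration, not the code from the PR: `iaddr` and the `static_assert` follow the quoted diff, while `ShadowTableModel`, `is_card` and `is_card_or_end` are hypothetical names, and the split into an exclusive and an inclusive upper bound is an assumption about why two separate checks would exist.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using CardValue = std::uint8_t;  // hypothetical stand-in for HotSpot's CardValue

// Check the type that is actually used for the integer arithmetic.
static_assert(sizeof(std::uintptr_t) == sizeof(CardValue*), "simplifying assumption");

// Convert a pointer to an integer so that arithmetic and comparisons on
// out-of-range values are well defined.
inline std::uintptr_t iaddr(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p);
}

struct ShadowTableModel {
  CardValue _table[64];

  // True if card addresses an element of _table: [begin, end).
  bool is_card(std::uintptr_t card) const {
    return card >= iaddr(_table) && card < iaddr(_table) + sizeof(_table);
  }

  // True if card addresses an element or the one-past-the-end position,
  // which is valid as an iteration bound: [begin, end].
  bool is_card_or_end(std::uintptr_t card) const {
    return card >= iaddr(_table) && card <= iaddr(_table) + sizeof(_table);
  }
};
```

Because both operands are unsigned integers, a compiler cannot drop either comparison on the grounds that an out-of-bounds pointer would have been undefined behavior.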
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23032#discussion_r1913094129 From kbarrett at openjdk.org Mon Jan 13 15:19:52 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 15:19:52 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com>

Message-ID: On Mon, 13 Jan 2025 12:12:17 GMT, Albert Mingkun Yang wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> typo: intptr_t => uintptr_t > > src/hotspot/share/gc/parallel/psCardTable.cpp line 149: > >> 147: assert(iaddr(card) >= iaddr(_table), "out of bounds"); >> 148: assert(iaddr(card) <= (iaddr(_table) + sizeof(_table)), "out of bounds"); >> 149: } > > The two impls look identical to me. Also, can you change `check` to `verify` to make it more explicit that they are for verification only? They certainly used to be different; not sure how that crept in. I'll push after rerunning tests. Rename suggestion is fine, and adopted. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23032#discussion_r1913358972 From lgxbslgx at gmail.com Mon Jan 13 15:52:08 2025 From: lgxbslgx at gmail.com (Guoxiong Li) Date: Mon, 13 Jan 2025 23:52:08 +0800 Subject: [Discussion] Serial GC: Expand young generation size In-Reply-To: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com> References: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com> Message-ID: Hi Albert, Sorry for any delay. I bisect the related patches in several days. > Re https://bugs.openjdk.org/browse/JDK-8333386, I think your suggestion, "add the option `-XX:NewSize=65m`", is the way to go. When I run the test case `TestAbortOnVMOperationTimeout` [1] in current main-line code, the test case passed in the client VM. I bisected all the related patches these months and found the change (the test case passed) began at JDK-8333786 [2]. Then I tried partial code of JDK-8333786 [2] in its previous patch (the code is in [3]), and finally I found the method `SerialHeap::should_try_older_generation_allocation` [4] in JDK-8333786 [2] makes the test case pass in the client VM. It is because such a change can make the `DefNewGeneration::compute_new_size` effective, and then the young-gen size can be expanded after the GC. 
So I think the JDK-8333386 [1] is not a bug now and the ticket can be closed. What do you think about it?

> As for adding young-gen expansion support to Serial, it probably should have its own enhancement ticket.
>
> I am currently working on placing from/to spaces before eden inside young-gen as part of Parallel heap-auto-sizing. I believe Serial can use the same layout (from/to/eden, instead of eden/from/to) to facilitate eden/young-gen expansion. My 2c.

Yes, the expansion should have its own ticket and needs more investigation and discussion. And this email thread is the beginning of the discussion.

Best Regards,
-- Guoxiong

[1] https://bugs.openjdk.org/browse/JDK-8333386
[2] https://bugs.openjdk.org/browse/JDK-8333786
[3] https://bugs.openjdk.org/browse/JDK-8335007
[4] https://github.com/openjdk/jdk/commit/6b961acb87c29027f2158c6b7a764f1276a0bf52#diff-d53bf148d2758636cb8c6a54595515610a4c52d953e23e7bfdeef1106c14f626R286-R290

From wkemper at openjdk.org Mon Jan 13 18:22:38 2025
From: wkemper at openjdk.org (William Kemper)
Date: Mon, 13 Jan 2025 18:22:38 GMT
Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3]
In-Reply-To: <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>
References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>
Message-ID:

On Sat, 11 Jan 2025 01:50:39 GMT, Y.
Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comments and method names > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1245: > >> 1243: void do_thread(Thread* thread) override { >> 1244: _propagator.do_thread(thread); >> 1245: if (ShenandoahThreadLocalData::gclab(thread) != nullptr) { > > Which thread may have this be null? (I am looking at the ShenandoahRetireGCLabClosure which insists that this should be non-null.) > > I assume we have some threads here that have a gc state that must be updated but which don't have a gc lab. > > I am wondering if the check for an initialized gclab and in the generational case the plab can be pushed down into the closure rather than being exposed here. At that place, we would want to document (or as needed assert) why some threads targeted by the closure may have null gclab or plab. Only worker threads and java threads are required to have gclabs. In other use cases (`shHeap::make_labs_parsable`, `shHeap::retire_gclabs`), this closure is _only_ used on java and worker threads, so pushing the test into the closure would be redundant for other uses. I will put in a comment here instead? Additionally, I noticed an inconsistency between `make_labs_parsable` (which skips the safepoint workers) and `retire_gclabs` (which also visited the control thread). An earlier version of this PR had given gclabs to the control and vm threads, but these threads will only perform evacuations in very rare circumstances, so I've removed their gclabs. > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1267: > >> 1265: >> 1266: // This will propagate the gc state and retire gclabs and plabs for threads that require it. 
>> 1267: ShenandoahPrepareForUpdateRefs prepare_for_update_refs(_gc_state.raw_value());
>
> In looking at this I see that we do not set `_gc_state_changed` here because we don't want individual threads to observe the global state, but only their local state (when it's propagated below). It would be good to emphasise this in the documentation of `_gc_state_changed` use protocol.
>
> Indeed, as I had suggested before, I think this might be better encapsulated with a `set_gc_state_concurrent()` that is analogous to `set_gc_state_at_safepoint()` that takes the appropriate state value as an argument, and uses the appropriate `_gc_state_changed` protocol.
>
> IIUC, this will be re-used when other safepoints are eliminated in the future.

I'll encapsulate the access here and improve the documentation for `_gc_state_changed`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1913617339
PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1913620624

From wkemper at openjdk.org Mon Jan 13 18:32:47 2025
From: wkemper at openjdk.org (William Kemper)
Date: Mon, 13 Jan 2025 18:32:47 GMT
Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3]
In-Reply-To: <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>
References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>
Message-ID:

On Sat, 11 Jan 2025 01:35:06 GMT, Y.
Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comments and method names > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 196: > >> 194: >> 195: // Evacuation is complete, retire gc labs >> 196: heap->concurrent_prepare_for_update_refs(); > > For consistency with other related method naming, can we use "updaterefs" instead of "update_refs" (makes IDE searches easier to locate related methods). I'm all for making this consistent, but it seems that `update_refs` is more commonly used in method and variable declarations: [0] % grep -r --include "*.hpp" updaterefs src/hotspot/share/gc/shenandoah | wc -l 17 [0] % grep -r --include "*.hpp" update_refs src/hotspot/share/gc/shenandoah | wc -l 27 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1913637958 From wkemper at openjdk.org Mon Jan 13 18:36:44 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 13 Jan 2025 18:36:44 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com> Message-ID: On Mon, 13 Jan 2025 18:29:52 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 196: >> >>> 194: >>> 195: // Evacuation is complete, retire gc labs >>> 196: heap->concurrent_prepare_for_update_refs(); >> >> For consistency with other related method naming, can we use "updaterefs" instead of "update_refs" (makes IDE searches easier to locate related methods). 
> > I'm all for making this consistent, but it seems that `update_refs` is more commonly used in method and variable declarations: > > [0] % grep -r --include "*.hpp" updaterefs src/hotspot/share/gc/shenandoah | wc -l > 17 > > [0] % grep -r --include "*.hpp" update_refs src/hotspot/share/gc/shenandoah | wc -l > 27 If it's okay with you, I will do this on a separate PR so that this current PR is not cluttered by the change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1913642618 From wkemper at openjdk.org Mon Jan 13 18:45:23 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 13 Jan 2025 18:45:23 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v4] In-Reply-To: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: > Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. > > The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. 
> > Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. William Kemper has updated the pull request incrementally with two additional commits since the last revision: - Encapsulate and document a method for making concurrent gc_state changes - Control thread doesn't need a gc lab, also make gclabs for safepoint workers parsable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22688/files - new: https://git.openjdk.org/jdk/pull/22688/files/89c20a14..26e382c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22688&range=02-03 Stats: 21 lines in 2 files changed: 14 ins; 3 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/22688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22688/head:pull/22688 PR: https://git.openjdk.org/jdk/pull/22688 From wkemper at openjdk.org Mon Jan 13 18:55:50 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 13 Jan 2025 18:55:50 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>

Message-ID: On Mon, 13 Jan 2025 18:34:06 GMT, William Kemper wrote: >> I'm all for making this consistent, but it seems that `update_refs` is more commonly used in method and variable declarations: >> >> [0] % grep -r --include "*.hpp" updaterefs src/hotspot/share/gc/shenandoah | wc -l >> 17 >> >> [0] % grep -r --include "*.hpp" update_refs src/hotspot/share/gc/shenandoah | wc -l >> 27 > > If it's okay with you, I will do this on a separate PR so that this current PR is not cluttered by the change. https://bugs.openjdk.org/browse/JDK-8347617 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1913666119 From wkemper at openjdk.org Mon Jan 13 20:13:19 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 13 Jan 2025 20:13:19 GMT Subject: RFR: 8347620: Shenandoah: Use 'free' tag for free set related logging Message-ID: Without a distinguishing tag, debug logging is too voluminous to enable when we really only want the free set's debug messages. ------------- Commit messages: - Use 'free' tag with free set messages Changes: https://git.openjdk.org/jdk/pull/23086/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23086&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347620 Stats: 78 lines in 1 file changed: 7 ins; 7 del; 64 mod Patch: https://git.openjdk.org/jdk/pull/23086.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23086/head:pull/23086 PR: https://git.openjdk.org/jdk/pull/23086 From gli at openjdk.org Tue Jan 14 07:55:45 2025 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 14 Jan 2025 07:55:45 GMT Subject: RFR: 8331723: Serial: Remove the unused parameter of the method SerialHeap::gc_prologue In-Reply-To: References:

Message-ID: On Mon, 13 May 2024 01:57:58 GMT, xiaotaonan wrote: >> Serial: Remove the unused parameter of the method SerialHeap::gc_prologue > > @lgxbslgx @xiaotaonan What about this patch? Kindly ping. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19207#issuecomment-2589239444 From albert.m.yang at oracle.com Tue Jan 14 10:54:04 2025 From: albert.m.yang at oracle.com (Albert Yang) Date: Tue, 14 Jan 2025 10:54:04 +0000 Subject: [External] : Re: [Discussion] Serial GC: Expand young generation size In-Reply-To: References: <2DB4422C-73F3-4E27-B41B-6028A6BFF51E@kodewerk.com> Message-ID: Hi Guoxiong, Thank you for the archeology work -- seems that the simplification patch "fixes" the failure by accident. Then, JDK-8333386 can be closed as duplicate of JDK-8333786. /Albert ________________________________________ From: Guoxiong Li Sent: Monday, January 13, 2025 16:52 To: Albert Yang Cc: hotspot-gc-dev at openjdk.org Subject: [External] : Re: [Discussion] Serial GC: Expand young generation size Hi Albert, Sorry for any delay. I bisect the related patches in several days. > Re https://bugs.openjdk.org/browse/JDK-8333386, I think your suggestion, "add the option `-XX:NewSize=65m`", is the way to go. When I run the test case `TestAbortOnVMOperationTimeout` [1] in current main-line code, the test case passed in the client VM. I bisected all the related patches these months and found the change (the test case passed) began at JDK-8333786 [2]. Then I tried partial code of JDK-8333786 [2] in its previous patch (the code is in [3]), and finally I found the method `SerialHeap::should_try_older_generation_allocation` [4] in JDK-8333786 [2] makes the test case pass in the client VM. It is because such a change can make the `DefNewGeneration::compute_new_size` effective, and then the young-gen size can be expanded after the GC. So I think the JDK-8333386 [1] is not a bug now and the ticket can be closed. What do you think about it? 
> As for adding young-gen expansion support to Serial, it probably should have its own enhancement ticket. > > I am currently working on placing from/to spaces before eden inside young-gen as part of Parallel heap-auto-sizing. I believe Serial can use the same layout (from/to/eden, instead of eden/from/to) to facilitate eden/young-gen expansion. My 2c. Yes, the expansion should have its own ticket and needs more investigation and discussion. This email thread is the beginning of that discussion. Best Regards, -- Guoxiong [1] https://bugs.openjdk.org/browse/JDK-8333386 [2] https://bugs.openjdk.org/browse/JDK-8333786 [3] https://bugs.openjdk.org/browse/JDK-8335007 [4] https://github.com/openjdk/jdk/commit/6b961acb87c29027f2158c6b7a764f1276a0bf52#diff-d53bf148d2758636cb8c6a54595515610a4c52d953e23e7bfdeef1106c14f626R286-R290 From kbarrett at openjdk.org Tue Jan 14 16:27:13 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Jan 2025 16:27:13 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v3] In-Reply-To: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: <0fqcml_YX_eihw0qPEuNsTPpW2pyvXJJYUZGjyaEutc=.b577fa4e-99a2-45f6-8fed-79fe64f39525@github.com> > Please review this change to PSStripeShadowCardTable to avoid several examples > of UB in its internal calculations. We avoid the UB by switching to the > integer domain (using uintptr_t) for all of the internal calculations, with > casts between pointers and uintptr_t as needed at the boundaries. > > This applies not just to the various pointer adjustments, but also to pointer > comparisons. In particular, the prior range check assertions using pointer > comparisons could have been partially or even completely "optimized" away > based on the no-UB assumption.
> > Testing: mach5 tier1-5 > local (linux-x64) tier1 with -XX:+UseParallelGC Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into pscardtable-ubsan - fix exclusive check, rename to verify - typo: intptr_t => uintptr_t - avoid UB ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23032/files - new: https://git.openjdk.org/jdk/pull/23032/files/58c704f7..a6dbfeda Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23032&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23032&range=01-02 Stats: 11917 lines in 695 files changed: 3795 ins; 5185 del; 2937 mod Patch: https://git.openjdk.org/jdk/pull/23032.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23032/head:pull/23032 PR: https://git.openjdk.org/jdk/pull/23032 From ayang at openjdk.org Tue Jan 14 17:01:52 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 14 Jan 2025 17:01:52 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v3] In-Reply-To: <0fqcml_YX_eihw0qPEuNsTPpW2pyvXJJYUZGjyaEutc=.b577fa4e-99a2-45f6-8fed-79fe64f39525@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> <0fqcml_YX_eihw0qPEuNsTPpW2pyvXJJYUZGjyaEutc=.b577fa4e-99a2-45f6-8fed-79fe64f39525@github.com> Message-ID: On Tue, 14 Jan 2025 16:27:13 GMT, Kim Barrett wrote: >> Please review this change to PSStripeShadowCardTable to avoid several examples >> of UB in its internal calculations. We avoid the UB by switching to the >> integer domain (using uintptr_t) for all of the internal calculations, with >> casts between pointers and uintptr_t as needed at the boundaries.
>> >> This applies not just to the various pointer adjustments, but also to pointer >> comparisons. In particular, the prior range check assertions using pointer >> comparisons could have been partially or even completely "optimized" away >> based on the no-UB assumption. >> >> Testing: mach5 tier1-5 >> local (linux-x64) tier1 with -XX:+UseParallelGC > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into pscardtable-ubsan > - fix exclusive check, rename to verify > - typo: intptr_t => uintptr_t > - avoid UB Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/23032#pullrequestreview-2550445506 From coleenp at openjdk.org Tue Jan 14 18:45:27 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 14 Jan 2025 18:45:27 GMT Subject: RFR: 8347730: Replace SIZE_FORMAT in g1 Message-ID: Please review this change to replace SIZE_FORMAT with %zu in the g1 directory. Most edits were by script but SIZE_FORMAT_W(n) was done sort of by hand. Also fixed one place where preexisting formatting was bad. Tested all the GC changes together (other PRs coming) with tier1-4 on x86 and aarch64. ------------- Commit messages: - Fix squished message. 
- Replace SIZE_FORMAT in g1 gc Changes: https://git.openjdk.org/jdk/pull/23114/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23114&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347730 Stats: 177 lines in 37 files changed: 0 ins; 0 del; 177 mod Patch: https://git.openjdk.org/jdk/pull/23114.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23114/head:pull/23114 PR: https://git.openjdk.org/jdk/pull/23114 From kbarrett at openjdk.org Tue Jan 14 18:57:54 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Jan 2025 18:57:54 GMT Subject: RFR: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds [v2] In-Reply-To: <7sweSVqwZcmnqFepnd4Cnl9nJCbuIfreU-mZt3n0a6I=.695c61d4-b2e3-4bc0-be69-827d1c79fbb6@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> <7sweSVqwZcmnqFepnd4Cnl9nJCbuIfreU-mZt3n0a6I=.695c61d4-b2e3-4bc0-be69-827d1c79fbb6@github.com> Message-ID: On Mon, 13 Jan 2025 11:27:28 GMT, Thomas Schatzl wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> typo: intptr_t => uintptr_t > > Lgtm. Thanks for reviews @tschatzl and @albertnetymk ------------- PR Comment: https://git.openjdk.org/jdk/pull/23032#issuecomment-2590874334 From kbarrett at openjdk.org Tue Jan 14 18:57:55 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Jan 2025 18:57:55 GMT Subject: Integrated: 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds In-Reply-To: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> References: <9z8Cc4DgoD1x5QsvkcW6SptNYlEsiAr2V8M5eWvt7gk=.30a3dd75-782e-4a1c-86d9-08449c8c7d93@github.com> Message-ID: On Fri, 10 Jan 2025 11:21:31 GMT, Kim Barrett wrote: > Please review this change to PSStripeShadowCardTable to avoid several examples > of UB in its internal calculations.
We avoid the UB by switching to the > integer domain (using uintptr_t) for all of the internal calculations, with > casts between pointers and uintptr_t as needed at the boundaries. > > This applies not just to the various pointer adjustments, but also to pointer > comparisons. In particular, the prior range check assertions using pointer > comparisons could have been partially or even completely "optimized" away > based on the no-UB assumption. > > Testing: mach5 tier1-5 > local (linux-x64) tier1 with -XX:+UseParallelGC This pull request has now been integrated. Changeset: 4c30933b Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/4c30933b2ab92369d2da449ab3cd030b748e61fb Stats: 35 lines in 1 file changed: 29 ins; 0 del; 6 mod 8346971: [ubsan] psCardTable.cpp:131:24: runtime error: large index is out of bounds Reviewed-by: ayang, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/23032 From ayang at openjdk.org Tue Jan 14 19:22:42 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 14 Jan 2025 19:22:42 GMT Subject: RFR: 8346572: Check is_reserved() before using ReservedSpace instances In-Reply-To: References: Message-ID: On Thu, 19 Dec 2024 09:35:33 GMT, Stefan Karlsson wrote: > There are a number of places where we reserve memory and create a ReservedSpace, and then use the created instance without checking if the memory actually got reserved and the instance got initialized. This mostly affects code paths during JVM initialization and fixing this will mostly give better error handling and tracing. > > The patch also includes some minor restructuring to get early returns and remove redundant null checks after calls to new. > > Tested tier1 and GHA, but will run more tests when/if this gets accepted. Marked as reviewed by ayang (Reviewer).
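For readers following the 8346572 description quoted above, the check-before-use pattern with an early return might be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeReservedSpace`, `fake_reserve`, and `initialize_table` are hypothetical names invented here, not HotSpot's `ReservedSpace` API and not code from the PR, and `malloc` merely stands in for a real virtual memory reservation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a reservation handle; NOT HotSpot's ReservedSpace.
class FakeReservedSpace {
  char*  _base;
  size_t _size;
public:
  FakeReservedSpace() : _base(nullptr), _size(0) {}
  FakeReservedSpace(char* base, size_t size) : _base(base), _size(size) {}
  // True only if the reservation succeeded and the handle was initialized.
  bool   is_reserved() const { return _base != nullptr; }
  char*  base() const { return _base; }
  size_t size() const { return _size; }
};

// Hypothetical reservation helper: returns an empty handle on failure.
// malloc is used here only as a placeholder for reserving address space.
static FakeReservedSpace fake_reserve(size_t size) {
  if (size == 0) {
    return FakeReservedSpace();
  }
  char* p = static_cast<char*>(std::malloc(size));
  return p == nullptr ? FakeReservedSpace() : FakeReservedSpace(p, size);
}

// Early-return style: check is_reserved() before touching base() or size().
static bool initialize_table(size_t size, FakeReservedSpace* out) {
  FakeReservedSpace rs = fake_reserve(size);
  if (!rs.is_reserved()) {
    std::fprintf(stderr, "could not reserve %zu bytes\n", size);
    return false;  // report the failure instead of using an empty handle
  }
  rs.base()[0] = 0;  // safe: the reservation is known to be valid here
  *out = rs;
  return true;
}
```

A caller would test the return value of `initialize_table` and, on success, eventually release the memory (here with `std::free(rs.base())`, since the sketch uses `malloc`).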
------------- PR Review: https://git.openjdk.org/jdk/pull/22825#pullrequestreview-2550823927 From xpeng at openjdk.org Tue Jan 14 19:25:16 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 14 Jan 2025 19:25:16 GMT Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v6] In-Reply-To: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com> Message-ID: > Reset marking bitmaps after the collection cycle. For GenShen, only do this for the young generation. Also choose not to do this for Degen and full GC: both run at a safepoint, and we should leave the safepoint ASAP. > > I ran the same workload for 30s with Shenandoah in generational mode and classic mode. The average time of concurrent reset dropped significantly, since in most cases the bitmap for the young gen should already have been reset after the previous concurrent cycle finished, if there is no need to preserve bitmap states.
> > GenShen: > Before: > > [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878) > > > After: > > [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670) > [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872) > > > Shenandoah: > Before: > > [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118) > > After: > > [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542) > [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661) > > > Additional changes: > * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions. > * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, which has two benefits: > - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 > - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorates the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in the closure. > * When `_do_old_gc_bootstrap` is true, instead of resetting the mark bitmap for old gen separately, simply reset the global generations, so we don't need to walk all the regions twice. > * Clean up FullGC code, remove duplicate code. > > Additional tests: > - [x] CONF=macosx-aarch64-server-fastdebug make test T... Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 22 additional commits since the last revision: - Merge branch 'openjdk:master' into reset-bitmap - Merge branch 'openjdk:master' into reset-bitmap - Adding condition "!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()" back and address some PR comments - Remove entry_reset_after_collect from ShenandoahOldGC - Remove condition check !_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress() from op_reset_after_collect - Merge branch 'openjdk:master' into reset-bitmap - Address review comments - Merge branch 'openjdk:master' into reset-bitmap - Remove ShenandoahResetUpdateRegionStateClosure - Always set_mark_incomplete when reset mark bitmap - ... and 12 more: https://git.openjdk.org/jdk/compare/4692634b...9e7f342d ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22778/files - new: https://git.openjdk.org/jdk/pull/22778/files/5a181473..9e7f342d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22778&range=04-05 Stats: 10312 lines in 616 files changed: 3431 ins; 3917 del; 2964 mod Patch: https://git.openjdk.org/jdk/pull/22778.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22778/head:pull/22778 PR: https://git.openjdk.org/jdk/pull/22778 From ysr at openjdk.org Tue Jan 14 19:42:43 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 14 Jan 2025 19:42:43 GMT Subject: RFR: 8347620: Shenandoah: Use 'free' tag for free set related logging In-Reply-To: References: Message-ID: <9JVPoK0R2zACoySOT5h_uCekBiY7qm0ym53ybXK9ZwM=.5dc1a715-0093-4896-8ab2-bd3a68aca0b6@github.com> On Mon, 13 Jan 2025 20:07:03 GMT, William Kemper wrote: > Without a distinguishing tag, debug logging is too voluminous to enable when we really only want the free set's debug messages. Marked as reviewed by ysr (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23086#pullrequestreview-2550863968 From ysr at openjdk.org Tue Jan 14 20:29:28 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 14 Jan 2025 20:29:28 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v4] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> Message-ID: On Mon, 13 Jan 2025 18:45:23 GMT, William Kemper wrote: >> Shenandoah typically takes 4 safepoints per GC cycle. Although Shenandoah itself does not spend much time on these safepoints, it may still take quite some time for all of the mutator threads to reach the safepoint. The occasionally long time-to-safepoint increases latency in the higher percentiles. >> >> The `init-update-refs` safepoint is responsible for retiring GCLABs (and PLABs) used during evacuation. Once evacuation is complete, no threads will access these LABs. This need not be done on a safepoint. `init-update-refs` is also where the global and thread local copies of the `gc_state` are updated. However, here we are turning off the `WEAK_ROOTS` flag _after_ all of the unmarked weak referents have been `nulled` out, so this does not need to happen atomically with respect to the mutators. Neither is it necessary to change the other state flags (EVACUATION, UPDATE_REFS) atomically across all mutators. >> >> Note that the `init-update-refs` safepoint is still taken if either verification or `ShenandoahPacing` are enabled. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Encapsulate and document a method for making concurrent gc_state changes > - Control thread doesn't need a gc lab, also make gclabs for safepoint workers parsable Left a few comments/questions. Looks good modulo those comments. Thanks again for your patience. Rest looks good.
No need for a further round from me even if you make further changes in response to my comments. src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1268: > 1266: > 1267: // The handshake won't touch non-java threads, so do those separately. > 1268: Threads::non_java_threads_do(&prepare_for_update_refs); Which non-Java threads need to prepare for update refs? (i.e. which make use of this state predicate and/or participate in update refs phase.) Would be good to document that somewhere (or maybe repeat it here). src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 372: > 370: // Critically, this method will _not_ flag that the global gc state has changed and threads > 371: // will continue to use their thread local copy. This is expected to be used in conjunction > 372: // with a handshake operation to propagate the new gc state. Thank you for this comment! ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22688#pullrequestreview-2550382471 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1915539302 PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1915543381 From ysr at openjdk.org Tue Jan 14 20:29:29 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 14 Jan 2025 20:29:29 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com>

Message-ID: On Mon, 13 Jan 2025 18:53:20 GMT, William Kemper wrote: >> If it's okay with you, I will do this on a separate PR so that this current PR is not cluttered by the change. > > https://bugs.openjdk.org/browse/JDK-8347617 Yes, that's OK. I didn't realize there were so many `update_refs` and `updaterefs` each... I had only looked at the `[[vm_]op_]init_updaterefs` class of names (including the ones right below here) and incorrectly assumed there was consistency in the naming, except for this new variant that you had introduced in this PR. Thanks for filing a separate ticket for the rename; it makes sense to do that kind of wholesale renaming later. (The suggestion here had been just for this name, but I see that it isn't inconsistent with the lack of a single convention, so I take back my comment.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1915221534 From ysr at openjdk.org Tue Jan 14 20:29:31 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 14 Jan 2025 20:29:31 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: References: <6ZVLoWPco9LC3XZOturDKG9F42n20Ie4h61f5Ap5iIY=.bbeb52d3-3de0-4778-b504-a69dc6ef7d3b@github.com> <2smNeh6fdjcA_HtcFLFy9IqJBFETW_CRnqzyW1Z7rbI=.8bd30d87-3602-42cd-9f54-c0b818446e7d@github.com> Message-ID: On Mon, 13 Jan 2025 18:18:37 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1245: >> >>> 1243: void do_thread(Thread* thread) override { >>> 1244: _propagator.do_thread(thread); >>> 1245: if (ShenandoahThreadLocalData::gclab(thread) != nullptr) { >> >> Which thread may have this be null? (I am looking at the ShenandoahRetireGCLabClosure which insists that this should be non-null.) >> >> I assume we have some threads here that have a gc state that must be updated but which don't have a gc lab.
>> >> I am wondering if the check for an initialized gclab and in the generational case the plab can be pushed down into the closure rather than being exposed here. At that place, we would want to document (or as needed assert) why some threads targeted by the closure may have null gclab or plab. > > Only worker threads and java threads are required to have gclabs. In other use cases (`shHeap::make_labs_parsable`, `shHeap::retire_gclabs`), this closure is _only_ used on java and worker threads, so pushing the test into the closure would be redundant for other uses. I will put in a comment here instead? > > Additionally, I noticed an inconsistency between `make_labs_parsable` (which skips the safepoint workers) and `retire_gclabs` (which also visited the control thread). An earlier version of this PR had given gclabs to the control and vm threads, but these threads will only perform evacuations in very rare circumstances, so I've removed their gclabs. Para 1: Yes, that makes sense. Para 2: I am not sure I follow. Don't we need to cover the _rare_ case where these threads _have_ a gc lab? I agree that the control and vm threads should not need to participate in actual copying work. What is the rare circumstance in which they do need to? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22688#discussion_r1915522467 From ysr at openjdk.org Tue Jan 14 20:29:31 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 14 Jan 2025 20:29:31 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v3] In-Reply-To: