From xpeng at openjdk.org Sat Mar 1 06:06:01 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Sat, 1 Mar 2025 06:06:01 GMT
Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v9]
In-Reply-To:
References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
Message-ID:

On Fri, 28 Feb 2025 00:08:22 GMT, Xiaolong Peng wrote:

>> Reset marking bitmaps after the collection cycle; for GenShen, only do this for the young generation. Also choose not to do this for Degen and full GC: since both run at a safepoint, we should leave the safepoint ASAP.
>>
>> I have run the same workload for 30s with Shenandoah in generational mode and classic mode; the average time of concurrent reset dropped significantly, since in most cases the bitmap for the young gen should already have been reset after the previous concurrent cycle finishes if there is no need to preserve bitmap states.
>>
>> GenShen:
>> Before:
>>
>> [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878)
>>
>>
>> After:
>>
>> [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670)
>> [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872)
>>
>>
>> Shenandoah:
>> Before:
>>
>> [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118)
>>
>> After:
>>
>> [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542)
>> [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661)
>>
>>
>> Additional changes:
>> * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions.
>> * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, two benefits from this: >> - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 >> - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorate the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in closure. >> * When `_do_old_gc_bootstrap is true`, instead of reset mark bitmap for old gen separately, simply reset the global generations, so we don't need walk the all regions twice. >> * Clean up FullGC code, remove duplicate code. >> >> ... > > Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 25 additional commits since the last revision: > > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Adding condition "!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()" back and address some PR comments > - Remove entry_reset_after_collect from ShenandoahOldGC > - Remove condition check !_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress() from op_reset_after_collect > - Merge branch 'openjdk:master' into reset-bitmap > - Address review comments > - ... and 15 more: https://git.openjdk.org/jdk/compare/8e164a93...7eea9556 Thanks! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/22778#issuecomment-2691984558

From duke at openjdk.org Sat Mar 1 06:06:01 2025
From: duke at openjdk.org (duke)
Date: Sat, 1 Mar 2025 06:06:01 GMT
Subject: RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle [v9]
In-Reply-To:
References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
Message-ID:

On Fri, 28 Feb 2025 00:08:22 GMT, Xiaolong Peng wrote:

>> Reset marking bitmaps after the collection cycle; for GenShen, only do this for the young generation. Also choose not to do this for Degen and full GC: since both run at a safepoint, we should leave the safepoint ASAP.
>>
>> I have run the same workload for 30s with Shenandoah in generational mode and classic mode; the average time of concurrent reset dropped significantly, since in most cases the bitmap for the young gen should already have been reset after the previous concurrent cycle finishes if there is no need to preserve bitmap states.
>>
>> GenShen:
>> Before:
>>
>> [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878)
>>
>>
>> After:
>>
>> [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670)
>> [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872)
>>
>>
>> Shenandoah:
>> Before:
>>
>> [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118)
>>
>> After:
>>
>> [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542)
>> [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661)
>>
>>
>> Additional changes:
>> * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one
iteration over all the regions. >> * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, two benefits from this: >> - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 >> - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorate the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in closure. >> * When `_do_old_gc_bootstrap is true`, instead of reset mark bitmap for old gen separately, simply reset the global generations, so we don't need walk the all regions twice. >> * Clean up FullGC code, remove duplicate code. >> >> ... > > Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 25 additional commits since the last revision: > > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Merge branch 'openjdk:master' into reset-bitmap > - Adding condition "!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()" back and address some PR comments > - Remove entry_reset_after_collect from ShenandoahOldGC > - Remove condition check !_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress() from op_reset_after_collect > - Merge branch 'openjdk:master' into reset-bitmap > - Address review comments > - ... and 15 more: https://git.openjdk.org/jdk/compare/8e164a93...7eea9556 @pengxiaolong Your change (at version 7eea95568115c3ceb976bf83559b4df1d2b490d4) is now ready to be sponsored by a Committer. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/22778#issuecomment-2691985569

From jsjolen at openjdk.org Mon Mar 3 10:10:00 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 3 Mar 2025 10:10:00 GMT
Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag
In-Reply-To:
References:
Message-ID:

On Tue, 25 Feb 2025 09:49:41 GMT, Afshin Zafari wrote:

> With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region.
> Tests:
> linux-x64-debug, gtest:NMT* and runtime/NMT*

LGTM

-------------

Marked as reviewed by jsjolen (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23770#pullrequestreview-2653648133

From dfenacci at openjdk.org Mon Mar 3 12:43:53 2025
From: dfenacci at openjdk.org (Damon Fenacci)
Date: Mon, 3 Mar 2025 12:43:53 GMT
Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v4]
In-Reply-To: <2jI87up85vKeQq7xy6WoI987MOuqTqA6I8G75VvC74g=.e8ef9f9c-b8b3-496d-9b48-28c83dc1fb64@github.com>
References: <2jI87up85vKeQq7xy6WoI987MOuqTqA6I8G75VvC74g=.e8ef9f9c-b8b3-496d-9b48-28c83dc1fb64@github.com>
Message-ID:

On Fri, 28 Feb 2025 20:35:58 GMT, Dean Long wrote:

> Refreshing my memory, isn't the real problem with trying to fix this with a minimum codecache size is that some of these stubs are not allocated during initial single-threaded JVM startup, but later when the first compiler threads start, and that allows other code blobs to fill up the codecache?

Yes, exactly. This seems to be even more of an issue with 2 compiler threads (i.e. C1/C2) since the first can fill up the code cache first at the expense of the other.
The result is that if one compiler thread tries to allocate more space in a full code cache during initialization with one of the 4 call paths above, the VM crashes (but could actually just turn off the compiler thread instead).
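The "turn off the compiler instead of crashing" idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not HotSpot code: `ToyCodeCache`, `new_stub`, `init_compiler`, and `CompilerState` are hypothetical names invented for the example, and only the flag name `alloc_fail_is_fatal` is borrowed from the `RuntimeStub::new_runtime_stub` parameter discussed in this thread.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a fixed-size code cache; not HotSpot's CodeCache.
struct ToyCodeCache {
  std::size_t capacity;
  std::size_t used;

  explicit ToyCodeCache(std::size_t cap) : capacity(cap), used(0) {}

  // Reserves `size` bytes and returns true if they fit, false otherwise.
  bool try_reserve(std::size_t size) {
    if (size > capacity - used) return false;
    used += size;
    return true;
  }
};

// Illustrative allocation routine with an `alloc_fail_is_fatal` switch.
// With the flag set, exhaustion aborts the process; with it cleared,
// the caller receives `false` and decides how to recover.
bool new_stub(ToyCodeCache& cache, std::size_t size, bool alloc_fail_is_fatal) {
  if (cache.try_reserve(size)) return true;
  if (alloc_fail_is_fatal) {
    std::fprintf(stderr, "fatal: code cache exhausted\n");
    std::abort();
  }
  return false;
}

enum class CompilerState { initialized, disabled };

// Hypothetical compiler initialization: allocate `stub_count` stubs and
// disable the compiler (rather than crash) on the first failed allocation.
CompilerState init_compiler(ToyCodeCache& cache, std::size_t stub_size, int stub_count) {
  for (int i = 0; i < stub_count; i++) {
    if (!new_stub(cache, stub_size, /*alloc_fail_is_fatal=*/false)) {
      return CompilerState::disabled;
    }
  }
  return CompilerState::initialized;
}
```

In this sketch a caller would check the returned state and simply stop scheduling compilations when it is `disabled`; whether the real fix reports failure in the same way is not stated here.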
-------------

PR Comment: https://git.openjdk.org/jdk/pull/23630#issuecomment-2694260818

From dfenacci at openjdk.org Mon Mar 3 12:54:26 2025
From: dfenacci at openjdk.org (Damon Fenacci)
Date: Mon, 3 Mar 2025 12:54:26 GMT
Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v5]
In-Reply-To:
References:
Message-ID:

> # Issue
> The test `test/hotspot/jtreg/compiler/startup/StartupOutput.java` fails intermittently due to a crash that happens when trying to allocate code cache space for C1 and C2 in `RuntimeStub::new_runtime_stub` and `SingletonBlob::operator new`.
>
> # Causes
> There are a few call paths during the initialization of C1 and C2 that can lead to the code cache allocations in `RuntimeStub::new_runtime_stub` (through `RuntimeStub::operator new`) and `SingletonBlob::operator new` triggering a fatal error if there is no more space. The paths in question are:
> 1. `Compiler::init_c1_runtime` -> `Runtime1::initialize` -> `Runtime1::generate_blob_for` -> `Runtime1::generate_blob` -> `RuntimeStub::new_runtime_stub`
> 2. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_stub` -> `Compile::Compile` -> `Compile::Code_Gen` -> `PhaseOutput::install` -> `PhaseOutput::install_stub` -> `RuntimeStub::new_runtime_stub`
> 3. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_uncommon_trap_blob` -> `UncommonTrapBlob::create` -> `new UncommonTrapBlob`
> 4. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_exception_blob` -> `ExceptionBlob::create` -> `new ExceptionBlob`
>
> # Solution
> Instead of fatally crashing, we can use the `alloc_fail_is_fatal` flag of `RuntimeStub::new_runtime_stub` to avoid crashing in cases 1 and 2 and add a similar flag to `SingletonBlob::operator new` for cases 3 and 4.
In the latter case we need to adjust all calls accordingly. > > Note: In [JDK-8326615](https://bugs.openjdk.org/browse/JDK-8326615) it was argued that increasing the minimum code cache size would solve the issue but that wasn't entirely accurate: doing so possibly decreases the chances of a failed allocation in these 4 places but doesn't totally avoid it. > > # Testing > The original failing regression test in `test/hotspot/jtreg/compiler/startup/StartupOutput.java` has been modified to run multiple times with randomized values (within the original failing range) to increase the chances of hitting the fatal assertion. > > Tests: Tier 1-4 (windows-x64, linux-x64/aarch64, and macosx-x64/aarch64; release and debug mode) Damon Fenacci has updated the pull request incrementally with one additional commit since the last revision: JDK-8347406: move assert into else clause ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23630/files - new: https://git.openjdk.org/jdk/pull/23630/files/906cd756..722ca508 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23630&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23630&range=03-04 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23630/head:pull/23630 PR: https://git.openjdk.org/jdk/pull/23630 From dfenacci at openjdk.org Mon Mar 3 12:58:53 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 3 Mar 2025 12:58:53 GMT Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v5] In-Reply-To: References: Message-ID: <7p-BfhPDiY8ImbAwlaBaN1Mre-HA0zpEz42NTQWYMoE=.38ad35e1-0e5f-43b7-9f1d-4c0461881f76@github.com> On Fri, 28 Feb 2025 20:43:03 GMT, Dean Long wrote: >> A slightly modified one surely is. Inserted it again. > > I was thinking it could be moved into the `else` clause and simplified further. 
Oh I see. Moved. Thanks!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23630#discussion_r1977476672

From xpeng at openjdk.org Mon Mar 3 17:24:02 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Mon, 3 Mar 2025 17:24:02 GMT
Subject: Integrated: 8338737: Shenandoah: Reset marking bitmaps after the cycle
In-Reply-To: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
References: <6duTgo8vKHyCUnasOsrHp341B2krxcK8jNogKjX09gs=.af63669e-9c8d-4f17-b055-bf3a03a9618e@github.com>
Message-ID:

On Tue, 17 Dec 2024 00:09:25 GMT, Xiaolong Peng wrote:

> Reset marking bitmaps after the collection cycle; for GenShen, only do this for the young generation. Also choose not to do this for Degen and full GC: since both run at a safepoint, we should leave the safepoint ASAP.
>
> I have run the same workload for 30s with Shenandoah in generational mode and classic mode; the average time of concurrent reset dropped significantly, since in most cases the bitmap for the young gen should already have been reset after the previous concurrent cycle finishes if there is no need to preserve bitmap states.
> > GenShen: > Before: > > [33.342s][info][gc,stats ] Concurrent Reset = 0.023 s (a = 1921 us) (n = 12) (lvls, us = 133, 385, 1191, 1836, 8878) > > > After: > > [33.597s][info][gc,stats ] Concurrent Reset = 0.004 s (a = 317 us) (n = 13) (lvls, us = 58, 119, 217, 410, 670) > [33.597s][info][gc,stats ] Concurrent Reset After Collect = 0.018 s (a = 1365 us) (n = 13) (lvls, us = 91, 186, 818, 1836, 3872) > > > Shenandoah: > Before: > > [33.144s][info][gc,stats ] Concurrent Reset = 0.014 s (a = 1067 us) (n = 13) (lvls, us = 139, 277, 898, 1328, 2118) > > After: > > [33.128s][info][gc,stats ] Concurrent Reset = 0.003 s (a = 225 us) (n = 13) (lvls, us = 32, 92, 137, 295, 542) > [33.128s][info][gc,stats ] Concurrent Reset After Collect = 0.009 s (a = 661 us) (n = 13) (lvls, us = 92, 160, 594, 896, 1661) > > > Additional changes: > * Remove `ShenandoahResetBitmapClosure` and `ShenandoahPrepareForMarkClosure`, merge the code with `ShenandoahResetBitmapClosure`, saving one iteration over all the regions. > * Use API `ShenandoahGeneration::parallel_heap_region_iterate_free` to iterate the regions, two benefits from this: > - Underneath it calls `ShenandoahHeap::parallel_heap_region_iterate`, which is faster for very light tasks, see https://bugs.openjdk.org/browse/JDK-8337154 > - `ShenandoahGeneration::parallel_heap_region_iterate_free` decorate the closure with `ShenandoahExcludeRegionClosure`, which simplifies the code in closure. > * When `_do_old_gc_bootstrap is true`, instead of reset mark bitmap for old gen separately, simply reset the global generations, so we don't need walk the all regions twice. > * Clean up FullGC code, remove duplicate code. > > Additional tests: > - [x] CONF=macosx-aarch64-server-fastdebug make test T... This pull request has now been integrated. 
Changeset: 7c187b5d Author: Xiaolong Peng Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/7c187b5d81a653b87fc498101ad9e2d99b72efc6 Stats: 180 lines in 8 files changed: 95 ins; 62 del; 23 mod 8338737: Shenandoah: Reset marking bitmaps after the cycle Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/jdk/pull/22778 From wkemper at openjdk.org Mon Mar 3 18:24:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 3 Mar 2025 18:24:52 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v3] In-Reply-To: References: Message-ID: On Fri, 28 Feb 2025 17:44:36 GMT, William Kemper wrote: >> The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. >> >> ## Testing >> GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Comment tweak Tests with uncommit behavior enabled look good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23760#issuecomment-2695210222 From wkemper at openjdk.org Mon Mar 3 18:30:33 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 3 Mar 2025 18:30:33 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v2] In-Reply-To: References: Message-ID: <5Lr95p3Uwv5w0n3YzDmALQc6KESs9xLnWdGm7p1IwGA=.3df358c6-f5d5-4f10-822d-5905429c050e@github.com> > This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. 
In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots - Fix comments - Add whitespace at end of file - More detail for init update refs event message - Use timing tracker for timing verification - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots - WIP: Fix up phase timings for newly concurrent final roots and init update refs - WIP: Combine satb transfer with state propagation, restore phase timing data - WIP: Transfer pointers out of SATB with a handshake - WIP: Clear weak roots flag concurrently ------------- Changes: https://git.openjdk.org/jdk/pull/23830/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=01 Stats: 291 lines in 14 files changed: 194 ins; 47 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/23830.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23830/head:pull/23830 PR: https://git.openjdk.org/jdk/pull/23830 From xpeng at openjdk.org Mon Mar 3 20:16:32 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 3 Mar 2025 20:16:32 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect Message-ID: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. 
After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in the old gen whose referent is in the young gen, resetting the young gen bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes.

-------------

Commit messages:
 - 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect

Changes: https://git.openjdk.org/jdk/pull/23872/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23872&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8351077
Stats: 5 lines in 1 file changed: 0 ins; 2 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/23872.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/23872/head:pull/23872

PR: https://git.openjdk.org/jdk/pull/23872

From wkemper at openjdk.org Mon Mar 3 20:20:05 2025
From: wkemper at openjdk.org (William Kemper)
Date: Mon, 3 Mar 2025 20:20:05 GMT
Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect
In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
Message-ID:

On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote:

> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect.
>
> After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in the old gen whose referent is in the young gen, resetting the young gen bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes.

Thanks for getting to the bottom of this.

-------------

Marked as reviewed by wkemper (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/23872#pullrequestreview-2655199892

From xpeng at openjdk.org Mon Mar 3 20:30:59 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Mon, 3 Mar 2025 20:30:59 GMT
Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect
In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
Message-ID:

On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote:

> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect.
>
> After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in the old gen whose referent is in the young gen, resetting the young gen bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes.

Thanks for the review, I'll integrate it since it is a trivial change that only touches code comments.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23872#issuecomment-2695462169

From duke at openjdk.org Mon Mar 3 20:31:00 2025
From: duke at openjdk.org (duke)
Date: Mon, 3 Mar 2025 20:31:00 GMT
Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect
In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com>
Message-ID:

On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote:

> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect.
>
> After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in the old gen whose referent is in the young gen, resetting the young gen bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes.

@pengxiaolong
Your change (at version 3764bf7d41619a2b51bb860e7ae4005e7f8c0e37) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23872#issuecomment-2695464781

From cslucas at openjdk.org Mon Mar 3 21:09:48 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Mon, 3 Mar 2025 21:09:48 GMT
Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v4]
In-Reply-To:
References:
Message-ID:

> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code.
>
> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase.
>
> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear.
> In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table.
>
> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address.
>
> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged.
>
> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code.
>
> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them.

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:

 - Fix merge conflict
 - Address PR feedback: no changes to shared files.
 - Merge master
 - Addressing PR comments: some refactorings, ppc fix, off-by-one fix.
- Relocation of Card Tables ------------- Changes: https://git.openjdk.org/jdk/pull/23170/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=03 Stats: 305 lines in 30 files changed: 151 ins; 95 del; 59 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From ysr at openjdk.org Mon Mar 3 21:19:02 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 3 Mar 2025 21:19:02 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote: > This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. > > After doing more test and analysis, we have a better understanding why reset bitmap of young gen after concurrent cycle may cause crash if there is pending old GC cycle to execute: When there is soft reference in old gen, but the referent is in young, reseting bitmap of young will cause wrong state of the soft reference, which may lead to expected cashes. src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1235: > 1233: // Valid bitmap of young generation is needed by concurrent weak references phase of old GC cycle, > 1234: // because it is possible that there is soft reference in old generation with the referent in young generation; > 1235: // therefore mark bitmap of young generation can't be reset if there will be old GC after the concurrent GC cycle. I don't understand the comment. 
If the soft reference in old gen points to its referent in the young gen, then the latter should be either reachable, or should have been cleared (depending on who discovered the soft reference & the soft reference clearing policy). If the former, the old gen card should be dirty. May be I am confused about the change in comment, but this may be pointing to a bug in the reference processing code or the associated card-marking code. Or I am not clearly understanding your comment in context. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23872#discussion_r1978221380 From ysr at openjdk.org Mon Mar 3 23:01:07 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 3 Mar 2025 23:01:07 GMT Subject: RFR: 8349094: GenShen: Race between control and regulator threads may violate assertions [v18] In-Reply-To: References: Message-ID: <9rfQ1rnji3vwQIPlRGqVmh_PwZxLdvcYv-JuukdP7G0=.b4583678-800b-416a-a154-b878535189e4@github.com> On Fri, 28 Feb 2025 17:17:17 GMT, William Kemper wrote: >> There are several changes to the operation of Shenandoah's control threads here. >> * The reason for cancellation is now recorded in `ShenandoahHeap::_cancelled_gc` as a `GCCause`, instead of various member variables in the control thread. >> * The cancellation handling is driven entirely by the cancellation cause >> * The graceful shutdown, alloc failure, humongous alloc failure and preemption requested flags are all removed >> * The shutdown sequence is simpler >> * The generational control thread uses a lock to coordinate updates to the requested cause and generation >> * APIs have been simplified to avoid converting between the generation `type` and the actual generation instance >> * The old heuristic, rather than the control thread itself, is now responsible for resuming old generation cycles >> * The control thread doesn't loop on its own (unless the pacer is enabled). 
>>
>> ## Testing
>> * jtreg hotspot_gc_shenandoah
>> * dacapo, extremem, diluvian, specjbb2015, specjvm2008, heapothesys
>
> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 37 commits:
>
> - Merge remote-tracking branch 'jdk/master' into fix-control-regulator-threads
> - Don't check for shutdown in control thread loop condition
>
> It may cause the thread to exit before it is requested to stop
> - Add assertions about old gen state when resuming old cycles
> - Remove duplicated field pointer for old generation
> - Improve names and comments
> - Merge tag 'jdk-25+11' into fix-control-regulator-threads
>
> Added tag jdk-25+11 for changeset 0131c1bf
> - Address review feedback (better comments, better names)
> - Merge remote-tracking branch 'jdk/master' into fix-control-regulator-threads
> - Old gen bootstrap cycle must make it to init mark
> - Merge remote-tracking branch 'jdk/master' into fix-control-regulator-threads
> - ... and 27 more: https://git.openjdk.org/jdk/compare/e98df71d...37e445d6

-------------

Marked as reviewed by ysr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23475#pullrequestreview-2655535655

From ysr at openjdk.org Mon Mar 3 23:08:06 2025
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Mon, 3 Mar 2025 23:08:06 GMT
Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Feb 2025 17:44:57 GMT, William Kemper wrote:

> That's a good point. I created a branch that enables uncommit for the test pipelines when I made this original change. I'll resurrect that branch and run that configuration again.

Thanks. Any reason not to have (a subset or all) non-performance testing in pipeline run with the default of uncommit enabled?
------------- PR Comment: https://git.openjdk.org/jdk/pull/23760#issuecomment-2695768379 From ysr at openjdk.org Mon Mar 3 23:52:53 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 3 Mar 2025 23:52:53 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v3] In-Reply-To: References: Message-ID: On Fri, 28 Feb 2025 17:44:36 GMT, William Kemper wrote: >> The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. >> >> ## Testing >> GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Comment tweak ? Small documentation suggestion. No re-review needed. If available, please add to the ticket or to the PR the failing test name(s), and a suitable exemplar stack retrace of assertion violation. src/hotspot/share/gc/shenandoah/shenandoahUncommitThread.hpp line 65: > 63: // Iterate and uncommit eligible regions. Return the number of regions uncommitted. > 64: // This operation may be interrupted if the GC calls `forbid_uncommit`. > 65: size_t do_uncommit_work(double shrink_before, size_t shrink_until) const; I'd document the semantics of the parameters too: // Iterate over and uncommit eligible regions unless committed heap would fall below `shrink_until` . // A region is eligible if it's been empty for at least `shrink_before` // Returns the number of regions uncommitted. May be interrupted by `forbid_uncommit`. ------------- Marked as reviewed by ysr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23760#pullrequestreview-2655587421 PR Review Comment: https://git.openjdk.org/jdk/pull/23760#discussion_r1978390429 From ysr at openjdk.org Tue Mar 4 00:12:53 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 4 Mar 2025 00:12:53 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect In-Reply-To: References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Mon, 3 Mar 2025 21:16:32 GMT, Y. Srinivas Ramakrishna wrote: >> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. >> >> After doing more test and analysis, we have a better understanding why reset bitmap of young gen after concurrent cycle may cause crash if there is pending old GC cycle to execute: When there is soft reference in old gen, but the referent is in young, reseting bitmap of young will cause wrong state of the soft reference, which may lead to expected cashes. > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1235: > >> 1233: // Valid bitmap of young generation is needed by concurrent weak references phase of old GC cycle, >> 1234: // because it is possible that there is soft reference in old generation with the referent in young generation; >> 1235: // therefore mark bitmap of young generation can't be reset if there will be old GC after the concurrent GC cycle. > > I don't understand the comment. If the soft reference in old gen points to its referent in the young gen, then the latter should be either reachable, or should have been cleared (depending on who discovered the soft reference & the soft reference clearing policy). If the former, the old gen card should be dirty. > > May be I am confused about the change in comment, but this may be pointing to a bug in the reference processing code or the associated card-marking code. 
> > Or I am not clearly understanding your comment in context. Thanks @earthling-amzn for explaining the issue to me offline. Based on my current understanding of the issue from that explanation, I'd suggest rewording the comment as follows: // If we are in the midst of an old gc bootstrap or an old marking, we want to leave the mark bit map of // the young generation intact. In particular, reference processing in the old generation may potentially // need the reachability of a young generation referent of a Reference object in the old generation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23872#discussion_r1978405645 From ysr at openjdk.org Tue Mar 4 00:12:52 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 4 Mar 2025 00:12:52 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote: > This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. > > After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in old gen but its referent is in young, resetting the young bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes. Small suggested rewording, although what you have also works. (I'll think some more about this to fully understand the context. Thanks.) ------------- Marked as reviewed by ysr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/23872#pullrequestreview-2655609678 From xpeng at openjdk.org Tue Mar 4 00:12:53 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 00:12:53 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect In-Reply-To: References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Tue, 4 Mar 2025 00:02:29 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1235: >> >>> 1233: // Valid bitmap of young generation is needed by concurrent weak references phase of old GC cycle, >>> 1234: // because it is possible that there is soft reference in old generation with the referent in young generation; >>> 1235: // therefore mark bitmap of young generation can't be reset if there will be old GC after the concurrent GC cycle. >> >> I don't understand the comment. If the soft reference in old gen points to its referent in the young gen, then the latter should be either reachable, or should have been cleared (depending on who discovered the soft reference & the soft reference clearing policy). If the former, the old gen card should be dirty. >> >> May be I am confused about the change in comment, but this may be pointing to a bug in the reference processing code or the associated card-marking code. >> >> Or I am not clearly understanding your comment in context. > > Thanks @earthling-amzn for explaining the issue to me offline. Based on my current understanding of the issue from that explanation, I'd suggest rewording the comment as follows: > > // If we are in the midst of an old gc bootstrap or an old marking, we want to leave the mark bit map of > // the young generation intact. In particular, reference processing in the old generation may potentially > // need the reachability of a young generation referent of a Reference object in the old generation. 
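The suggested comment above amounts to a guard on when the young mark bitmap may be reset after a cycle. A minimal sketch of that guard follows, modelled on the condition `!_do_old_gc_bootstrap && !heap->is_concurrent_old_mark_in_progress()` mentioned in the related PR. The `CycleState` struct and the function name are hypothetical and exist only for illustration; this is not the HotSpot code.

```cpp
#include <cassert>

// Hypothetical snapshot of the collector state relevant to the decision.
// In HotSpot these values would come from the concurrent GC object and the heap.
struct CycleState {
  bool do_old_gc_bootstrap;              // this cycle bootstraps an old GC
  bool concurrent_old_mark_in_progress;  // old marking is currently running
};

// The young mark bitmap may only be reset when no old-generation work could
// still consult it, e.g. reference processing for an old Reference object
// whose referent lives in the young generation.
bool can_reset_young_bitmap_after_collect(const CycleState& s) {
  return !s.do_old_gc_bootstrap && !s.concurrent_old_mark_in_progress;
}
```

The point of keeping both terms is that either an old bootstrap or an in-flight old mark is enough to keep the young bitmap alive.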
Thank you Ramki, I'll update the comments and refresh the PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23872#discussion_r1978411180 From wkemper at openjdk.org Tue Mar 4 00:44:58 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 00:44:58 GMT Subject: Integrated: 8349094: GenShen: Race between control and regulator threads may violate assertions In-Reply-To: References: Message-ID: On Wed, 5 Feb 2025 22:30:35 GMT, William Kemper wrote: > There are several changes to the operation of Shenandoah's control threads here. > * The reason for cancellation is now recorded in `ShenandoahHeap::_cancelled_gc` as a `GCCause`, instead of various member variables in the control thread. > * The cancellation handling is driven entirely by the cancellation cause > * The graceful shutdown, alloc failure, humongous alloc failure and preemption requested flags are all removed > * The shutdown sequence is simpler > * The generational control thread uses a lock to coordinate updates to the requested cause and generation > * APIs have been simplified to avoid converting between the generation `type` and the actual generation instance > * The old heuristic, rather than the control thread itself, is now responsible for resuming old generation cycles > * The control thread doesn't loop on its own (unless the pacer is enabled). > > ## Testing > * jtreg hotspot_gc_shenandoah > * dacapo, extremem, diluvian, specjbb2015, specjvm2018, heapothesys This pull request has now been integrated. 
Changeset: 3a8a432c Author: William Kemper URL: https://git.openjdk.org/jdk/commit/3a8a432c05999fe478b94de75b416404b5a515d2 Stats: 963 lines in 18 files changed: 327 ins; 294 del; 342 mod 8349094: GenShen: Race between control and regulator threads may violate assertions Reviewed-by: ysr, kdnilsen ------------- PR: https://git.openjdk.org/jdk/pull/23475 From wkemper at openjdk.org Tue Mar 4 00:57:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 00:57:06 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v4] In-Reply-To: References: Message-ID: > The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. > > ## Testing > GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). 
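The race described above (the waiter checks for "in progress" before the worker has announced that it started) can be illustrated with a small sketch. This is not the patch under review; `UncommitGate` and its methods are hypothetical names, and the sketch only shows one common way to close such a window: each side publishes its own flag first and only then reads the other side's flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical gate between a worker (uncommit) and a requester (control thread).
// Both flags use the default sequentially consistent ordering, so at least one
// side is guaranteed to observe the other's store.
class UncommitGate {
  std::atomic<bool> _forbidden{false};
  std::atomic<bool> _in_progress{false};

public:
  // Worker side: announce the work first, then check whether it is forbidden.
  // Returns true if the worker may proceed.
  bool try_begin() {
    _in_progress.store(true);
    if (_forbidden.load()) {
      _in_progress.store(false);  // back out, the requester got there first
      return false;
    }
    return true;
  }

  void end() { _in_progress.store(false); }

  // Requester side: forbid first, then wait for any announced work to finish.
  void forbid() {
    _forbidden.store(true);
    while (_in_progress.load()) {
      std::this_thread::yield();
    }
  }

  void allow() { _forbidden.store(false); }

  bool in_progress() const { return _in_progress.load(); }
};
```

With this ordering a worker that has not yet announced itself when `forbid()` returns will see the forbidden flag in `try_begin()` and back out, so the requester never proceeds alongside unannounced work.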
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Document parameters for do_uncommit_work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23760/files - new: https://git.openjdk.org/jdk/pull/23760/files/1c32c0e3..e25e6276 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23760&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23760&range=02-03 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/23760.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23760/head:pull/23760 PR: https://git.openjdk.org/jdk/pull/23760 From wkemper at openjdk.org Tue Mar 4 00:57:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 00:57:06 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v3] In-Reply-To: References: Message-ID: On Mon, 3 Mar 2025 23:05:45 GMT, Y. Srinivas Ramakrishna wrote: >> That's a good point. I created a branch that enables uncommit for the test pipelines when I made this original change. I'll resurrect that branch and run that configuration again. Thanks. > >> That's a good point. I created a branch that enables uncommit for the test pipelines when I made this original change. I'll resurrect that branch and run that configuration again. Thanks. > > Any reason not to have (a subset or all) non-performance testing in pipeline run with the default of uncommit enabled? @ysramakrishna , I will enable uncommit for the stress tests. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23760#issuecomment-2695908894 From wkemper at openjdk.org Tue Mar 4 00:57:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 00:57:06 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v3] In-Reply-To: References: Message-ID: On Mon, 3 Mar 2025 23:40:25 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Comment tweak > > src/hotspot/share/gc/shenandoah/shenandoahUncommitThread.hpp line 65: > >> 63: // Iterate and uncommit eligible regions. Return the number of regions uncommitted. >> 64: // This operation may be interrupted if the GC calls `forbid_uncommit`. >> 65: size_t do_uncommit_work(double shrink_before, size_t shrink_until) const; > > I'd document the semantics of the parameters too: > > // Iterate over and uncommit eligible regions unless committed heap would fall below `shrink_until` . > // A region is eligible if it's been empty for at least `shrink_before` > // Returns the number of regions uncommitted. May be interrupted by `forbid_uncommit`. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23760#discussion_r1978440214 From xpeng at openjdk.org Tue Mar 4 00:58:27 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 00:58:27 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect [v2] In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: <6GKNvjF02TlZU_UZMNtWnzbs_BIRVf2x1UeiDIFg4hU=.160089d2-5601-4fc4-9d77-2fb6aa09d18b@github.com> > This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. 
> > After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in old gen but its referent is in young, resetting the young bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes. Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Update code comments as suggested in PR ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23872/files - new: https://git.openjdk.org/jdk/pull/23872/files/3764bf7d..d760471e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23872&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23872&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23872.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23872/head:pull/23872 PR: https://git.openjdk.org/jdk/pull/23872 From xpeng at openjdk.org Tue Mar 4 00:58:27 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 00:58:27 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect [v2] In-Reply-To: References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Tue, 4 Mar 2025 00:08:27 GMT, Y. Srinivas Ramakrishna wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> Update code comments as suggested in PR > > Small suggested rewording, although what you have also works. > > (I'll think some more about this to fully understand the context. Thanks.) Thank you @ysramakrishna and @earthling-amzn! I have updated the comments as you have suggested in the PR review.
------------- PR Comment: https://git.openjdk.org/jdk/pull/23872#issuecomment-2695910096 From wkemper at openjdk.org Tue Mar 4 01:08:54 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 01:08:54 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect [v2] In-Reply-To: <6GKNvjF02TlZU_UZMNtWnzbs_BIRVf2x1UeiDIFg4hU=.160089d2-5601-4fc4-9d77-2fb6aa09d18b@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> <6GKNvjF02TlZU_UZMNtWnzbs_BIRVf2x1UeiDIFg4hU=.160089d2-5601-4fc4-9d77-2fb6aa09d18b@github.com> Message-ID: On Tue, 4 Mar 2025 00:58:27 GMT, Xiaolong Peng wrote: >> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. >> >> After doing more test and analysis, we have a better understanding why reset bitmap of young gen after concurrent cycle may cause crash if there is pending old GC cycle to execute: When there is soft reference in old gen, but the referent is in young, reseting bitmap of young will cause wrong state of the soft reference, which may lead to expected cashes. > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Update code comments as suggested in PR Marked as reviewed by wkemper (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23872#pullrequestreview-2655679787 From duke at openjdk.org Tue Mar 4 01:19:52 2025 From: duke at openjdk.org (duke) Date: Tue, 4 Mar 2025 01:19:52 GMT Subject: RFR: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect [v2] In-Reply-To: <6GKNvjF02TlZU_UZMNtWnzbs_BIRVf2x1UeiDIFg4hU=.160089d2-5601-4fc4-9d77-2fb6aa09d18b@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> <6GKNvjF02TlZU_UZMNtWnzbs_BIRVf2x1UeiDIFg4hU=.160089d2-5601-4fc4-9d77-2fb6aa09d18b@github.com> Message-ID: On Tue, 4 Mar 2025 00:58:27 GMT, Xiaolong Peng wrote: >> This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. >> >> After doing more test and analysis, we have a better understanding why reset bitmap of young gen after concurrent cycle may cause crash if there is pending old GC cycle to execute: When there is soft reference in old gen, but the referent is in young, reseting bitmap of young will cause wrong state of the soft reference, which may lead to expected cashes. > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Update code comments as suggested in PR @pengxiaolong Your change (at version d760471e5a84bc45466ba2d676f97a0efcb477db) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23872#issuecomment-2695934719 From ysr at openjdk.org Tue Mar 4 02:13:54 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 4 Mar 2025 02:13:54 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v4] In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 00:57:06 GMT, William Kemper wrote: >> The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. 
This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. >> >> ## Testing >> GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Document parameters for do_uncommit_work Marked as reviewed by ysr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/23760#pullrequestreview-2655747950 From xpeng at openjdk.org Tue Mar 4 03:58:56 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 03:58:56 GMT Subject: Integrated: 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect In-Reply-To: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> References: <1drXUZ5QM7_IPvLi3eRBKVx14M0ofow8KF0XlnzaJzY=.b37d216f-4c68-4427-ab2d-f591bf00d18f@github.com> Message-ID: On Mon, 3 Mar 2025 20:12:34 GMT, Xiaolong Peng wrote: > This is a trivial PR to update the code comments in ShenandoahConcurrentGC::op_reset_after_collect. > > After more testing and analysis, we have a better understanding of why resetting the bitmap of the young gen after a concurrent cycle may cause a crash if there is a pending old GC cycle to execute: when there is a soft reference in old gen but its referent is in young, resetting the young bitmap leaves the soft reference in a wrong state, which may lead to unexpected crashes. This pull request has now been integrated.
Changeset: 7c173fde Author: Xiaolong Peng URL: https://git.openjdk.org/jdk/commit/7c173fde4274a798f299876492a2cd833eee9fdd Stats: 5 lines in 1 file changed: 0 ins; 2 del; 3 mod 8351077: Shenandoah: Update comments in ShenandoahConcurrentGC::op_reset_after_collect Reviewed-by: wkemper, ysr ------------- PR: https://git.openjdk.org/jdk/pull/23872 From cslucas at openjdk.org Tue Mar 4 04:10:25 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 4 Mar 2025 04:10:25 GMT Subject: RFR: 8351081: Off-by-one error in ShenandoahCardCluster Message-ID: Given certain values for the variables in [this expression](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp#L173), the result of the computation can be equal to `_rs->total_cards()`, which will lead to a segmentation fault, for instance in [starts_object(card_at_end)](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp#L393). The underlying problem is that the `_object_starts` array doesn't have a [guarding entry](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp#L37) at the end. This pull request adjusts the allocation of `_object_starts` to include an additional entry at the end to account for this situation. Tested with JTREG tier 1-4, x86_64 & AArch64 on Linux.
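To make the class of bug described above concrete, here is a minimal sketch, not the HotSpot code. A per-card side table is queried at an index that an end-of-range computation can legitimately produce, namely the total number of cards. Allocating one extra guard entry keeps that query inside the allocation. `ObjectStarts` and `card_at_end` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-card "does an object start here?" table.
class ObjectStarts {
  std::vector<bool> _starts;
  size_t _total_cards;

public:
  // Allocate total_cards + 1 entries. The final slot is a guard that is never
  // set, so a query at index == total_cards stays in bounds.
  explicit ObjectStarts(size_t total_cards)
    : _starts(total_cards + 1, false), _total_cards(total_cards) {}

  void register_object(size_t card_index) {
    assert(card_index < _total_cards);
    _starts[card_index] = true;
  }

  // Accepts the one-past-the-end index thanks to the guard entry.
  bool starts_object(size_t card_index) const {
    assert(card_index <= _total_cards);
    return _starts[card_index];
  }

  size_t total_cards() const { return _total_cards; }
};

// One-past-the-end card index of a range. For a range that ends at the end of
// the table this equals total_cards.
size_t card_at_end(size_t first_card, size_t num_cards) {
  return first_card + num_cards;
}
```

Without the `+ 1` in the constructor, the query at `card_at_end(...) == total_cards()` would read past the end of the table.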
------------- Commit messages: - Adjust allocation of object_starts Changes: https://git.openjdk.org/jdk/pull/23882/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23882&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351081 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23882.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23882/head:pull/23882 PR: https://git.openjdk.org/jdk/pull/23882 From cslucas at openjdk.org Tue Mar 4 04:13:33 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 4 Mar 2025 04:13:33 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: References: Message-ID: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. 
When the `init-mark` pause finishes, the mutator threads operate on the table just cleaned in the `reset` phase; the GC operates on the table that has just become the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. > > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them.
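The alternating read/write scheme described above can be sketched in a few lines. This is a toy model under stated assumptions, not the GenShen implementation: `SwappingCardTables`, `kClean`, `kDirty` and the method names are all hypothetical, and real card tables are raw byte maps whose base address is loaded by JIT-compiled barriers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr uint8_t kClean = 0;
constexpr uint8_t kDirty = 1;

// Hypothetical model of two card tables whose roles alternate each GC cycle.
class SwappingCardTables {
  std::vector<uint8_t> _a;
  std::vector<uint8_t> _b;
  std::vector<uint8_t>* _read;   // table scanned by the GC
  std::vector<uint8_t>* _write;  // table dirtied by mutator write barriers

public:
  explicit SwappingCardTables(size_t cards)
    : _a(cards, kClean), _b(cards, kClean), _read(&_a), _write(&_b) {}

  // Mutator write barrier: dirty a card in the current write table.
  void mark_dirty(size_t card) { (*_write)[card] = kDirty; }

  // Concurrent reset phase: clean the current read table ahead of the swap.
  void concurrent_reset() { std::fill(_read->begin(), _read->end(), kClean); }

  // init-mark: exchange the roles instead of copying and zeroing.
  void init_mark_swap() { std::swap(_read, _write); }

  bool read_is_dirty(size_t card) const { return (*_read)[card] == kDirty; }
  bool write_is_dirty(size_t card) const { return (*_write)[card] == kDirty; }
};
```

After `concurrent_reset()` followed by `init_mark_swap()`, the GC scans the cards the mutators dirtied during the previous cycle, while mutators continue on a table that is already clean. The pause itself only exchanges two pointers.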
Cesar Soares Lucas has updated the pull request incrementally with two additional commits since the last revision: - Revert changes to shared cardTable.hpp - Revert changes to shared cardTable.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23170/files - new: https://git.openjdk.org/jdk/pull/23170/files/6210f026..717b8b44 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=03-04 Stats: 6 lines in 1 file changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From cslucas at openjdk.org Tue Mar 4 04:16:03 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 4 Mar 2025 04:16:03 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v3] In-Reply-To: References: <6_AoWQhldJttOIEOL1T7HSapPzE4Qn2j4WN7E-bI3rM=.2685d3d8-e47c-42a6-845b-b68f50cc568e@github.com> Message-ID: On Thu, 20 Feb 2025 15:33:35 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/gc/shared/cardTable.hpp line 205: >> >>> 203: virtual CardValue* byte_map_base() const { return _byte_map_base; } >>> 204: >>> 205: virtual CardValue* byte_map() const { return _byte_map; } >> >> @shipilev - can you please confirm that this is the part that you didn't like? > > Yes, I am not fond of extending `CardTable` with virtual members, especially if they can be used on high-performance paths. Not sure if the following idea is viable. > > ShenandoahBarrierSet knows where to get card table base: from Shenandoah thread local data. Now it looks like we need to deal with two problems: > 1. Protect ourselves from accidentally calling `CardTable` methods that may reference "incorrect" `_byte_map_(base)`. To do that, it looks it is enough to initialize `CardTable::_byte_map_(base)` to non-sensical values (`nullptr`-s?), and let the testing crash. 
> 2. Allow calls to `CardTable` utility methods with our base. For that, I think we can drill a few new (non-virtual) methods in `CardTable`, and enter from Shenandoah through them. So for example `byte_for_index(const size_t card_index)` becomes:
> ```
> CardValue* byte_for_index(const CardValue* base, const size_t card_index) const {
>   return base + card_index;
> }
> CardValue* byte_for_index(const size_t card_index) const {
>   return byte_for_index(_byte_map, card_index);
> }
> ```
@shipilev - can you please take a look at the latest pushes? I realized that the logic implemented already keeps the fields of the base card table class always updated, therefore I don't really need to make the methods (`_byte_map_(base)`) virtual at all. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1978578378 From shade at openjdk.org Tue Mar 4 11:51:07 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Mar 2025 11:51:07 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allow us to submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what was done and why. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64.
> > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... Great, thanks for the feedback. I think we are going to go with the JEP implementation that removes the easy parts of x86_32 code, and then do the deeper cleanups under [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella. I added some subtasks there, based on the commits from this bulk PR. 
I am closing this PR in favor of about-to-be-created cleaner PR for JEP 503. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2697266596 From shade at openjdk.org Tue Mar 4 11:51:07 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Mar 2025 11:51:07 GMT Subject: Withdrawn: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. 
> > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/22567 From kdnilsen at openjdk.org Tue Mar 4 15:02:04 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 4 Mar 2025 15:02:04 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v2] In-Reply-To: <5Lr95p3Uwv5w0n3YzDmALQc6KESs9xLnWdGm7p1IwGA=.3df358c6-f5d5-4f10-822d-5905429c050e@github.com> References: <5Lr95p3Uwv5w0n3YzDmALQc6KESs9xLnWdGm7p1IwGA=.3df358c6-f5d5-4f10-822d-5905429c050e@github.com> Message-ID: On Mon, 3 Mar 2025 18:30:33 GMT, William Kemper wrote: >> This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). 
> > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - Fix comments > - Add whitespace at end of file > - More detail for init update refs event message > - Use timing tracker for timing verification > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - WIP: Fix up phase timings for newly concurrent final roots and init update refs > - WIP: Combine satb transfer with state propagation, restore phase timing data > - WIP: Transfer pointers out of SATB with a handshake > - WIP: Clear weak roots flag concurrently Thanks. Great improvement. src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.cpp line 458: > 456: > 457: // Step 1. All threads need to 'complete' partially filled, thread local buffers. This > 458: // is accomplished in ShenandoahConcurrentGC::complete_abbreviated_cycle using a Handshake I think we're talking about "complete processing" of thread-local satb buffers. To avoid confusion with tlab, maybe add satb to this comment. ------------- Marked as reviewed by kdnilsen (Committer). PR Review: https://git.openjdk.org/jdk/pull/23830#pullrequestreview-2657883998 PR Review Comment: https://git.openjdk.org/jdk/pull/23830#discussion_r1979620964 From kdnilsen at openjdk.org Tue Mar 4 15:04:59 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 4 Mar 2025 15:04:59 GMT Subject: RFR: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them [v4] In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 00:57:06 GMT, William Kemper wrote: >> The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. 
>> >> ## Testing >> GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Document parameters for do_uncommit_work Repeat approval. ------------- Marked as reviewed by kdnilsen (Committer). PR Review: https://git.openjdk.org/jdk/pull/23760#pullrequestreview-2657921882 From wkemper at openjdk.org Tue Mar 4 17:14:58 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 17:14:58 GMT Subject: Integrated: 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them In-Reply-To: References: Message-ID: On Tue, 25 Feb 2025 01:38:14 GMT, William Kemper wrote: > The protocol which is meant to prevent regions from being uncommitted while their bitmaps are being reset may fail. This happens when the control thread attempts to wait for the uncommit thread to finish, but the uncommit thread has not yet indicated that it has started. > > ## Testing > GHA, Dacapo, Extremem, Heapothesys, Diluvian, SpecJBB2015, SpecJVM2008 (with and without stress flags, asserts). Also have run the JTREG test that failed this assertion over 10K times (and counting). This pull request has now been integrated. 
Changeset: fe806caa Author: William Kemper URL: https://git.openjdk.org/jdk/commit/fe806caa160b2d550db273af17dc08270f143819 Stats: 79 lines in 2 files changed: 41 ins; 24 del; 14 mod 8350605: assert(!heap->is_uncommit_in_progress()) failed: Cannot uncommit bitmaps while resetting them Reviewed-by: kdnilsen, ysr ------------- PR: https://git.openjdk.org/jdk/pull/23760 From wkemper at openjdk.org Tue Mar 4 17:14:54 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 17:14:54 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v2] In-Reply-To: References: <5Lr95p3Uwv5w0n3YzDmALQc6KESs9xLnWdGm7p1IwGA=.3df358c6-f5d5-4f10-822d-5905429c050e@github.com> Message-ID: On Tue, 4 Mar 2025 14:52:23 GMT, Kelvin Nilsen wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: >> >> - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots >> - Fix comments >> - Add whitespace at end of file >> - More detail for init update refs event message >> - Use timing tracker for timing verification >> - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots >> - WIP: Fix up phase timings for newly concurrent final roots and init update refs >> - WIP: Combine satb transfer with state propagation, restore phase timing data >> - WIP: Transfer pointers out of SATB with a handshake >> - WIP: Clear weak roots flag concurrently > > src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.cpp line 458: > >> 456: >> 457: // Step 1. All threads need to 'complete' partially filled, thread local buffers. This >> 458: // is accomplished in ShenandoahConcurrentGC::complete_abbreviated_cycle using a Handshake > > I think we're talking about "complete processing" of thread-local satb buffers. To avoid confusion with tlab, maybe add satb to this comment. Yes, good point. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23830#discussion_r1979884800 From wkemper at openjdk.org Tue Mar 4 17:18:37 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 17:18:37 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v3] In-Reply-To: References: Message-ID: > This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). William Kemper has updated the pull request incrementally with one additional commit since the last revision: Clarify which thread local buffers in comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23830/files - new: https://git.openjdk.org/jdk/pull/23830/files/0b2675af..390de7f9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23830.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23830/head:pull/23830 PR: https://git.openjdk.org/jdk/pull/23830 From wkemper at openjdk.org Tue Mar 4 18:40:55 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 18:40:55 GMT Subject: RFR: 8351081: Off-by-one error in ShenandoahCardCluster In-Reply-To: References: Message-ID: <6todYj98wTBywpKJ8GkvakvJGoPiAvF2Gurs01Pq6t0=.8cfb3200-86a3-4289-91c4-5fdfdb7d82bb@github.com> On Tue, 4 Mar 2025 04:06:00 GMT, Cesar Soares Lucas wrote: > Given certain values for the variables in [this expression](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp#L173) the result of the computation can be equal to `_ 
rs->total_cards()` which will lead to segmentation fault, for instance in [starts_object(card_at_end)](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp#L393). The problem happens, though, because the `_object_starts` array doesn't have a [guarding entry](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp#L37) at the end. This pull request adjusts the allocation of `_object_starts` to include an additional entry at the end to account for this situation. > > Tested with JTREG tier 1-4, x86_64 & AArch64 on Linux. LGTM. ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23882#pullrequestreview-2658615578 From duke at openjdk.org Tue Mar 4 19:18:59 2025 From: duke at openjdk.org (duke) Date: Tue, 4 Mar 2025 19:18:59 GMT Subject: RFR: 8351081: Off-by-one error in ShenandoahCardCluster In-Reply-To: References: Message-ID: <65Nau_mejcjgMsRM1Qli2hkyeEJlXGZxDExGV6vmWcQ=.84f05fff-f04c-4708-bb40-b974a99aff5e@github.com> On Tue, 4 Mar 2025 04:06:00 GMT, Cesar Soares Lucas wrote: > Given certain values for the variables in [this expression](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp#L173) the result of the computation can be equal to `_ rs->total_cards()` which will lead to segmentation fault, for instance in [starts_object(card_at_end)](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp#L393). The problem happens, though, because the `_object_starts` array doesn't have a [guarding entry](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp#L37) at the end. 
This pull request adjusts the allocation of `_object_starts` to include an additional entry at the end to account for this situation. > > Tested with JTREG tier 1-4, x86_64 & AArch64 on Linux. @JohnTortugo Your change (at version 9a4ac53343aaa62b055241f90bd6d610a483ed66) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23882#issuecomment-2698667853 From shade at openjdk.org Tue Mar 4 20:09:58 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Mar 2025 20:09:58 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: References: Message-ID: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> On Tue, 4 Mar 2025 04:13:33 GMT, Cesar Soares Lucas wrote: >> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and remain fixed for the whole application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. >> >> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. >> >> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a completely clean version of the card table.
In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. >> >> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. >> >> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. >> >> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. >> >> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. > > Cesar Soares Lucas has updated the pull request incrementally with two additional commits since the last revision: > > - Revert changes to shared cardTable.hpp > - Revert changes to shared cardTable.hpp Much cleaner, thanks! I'll take another look later, but meanwhile, some comments: src/hotspot/cpu/arm/gc/shared/cardTableBarrierSetAssembler_arm.cpp line 100: > 98: assert(bs->kind() == BarrierSet::CardTableBarrierSet, > 99: "Wrong barrier set kind"); > 100: Unnecessary deletion of blank line? src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 655: > 653: > 654: #ifndef _LP64 > 655: __ pop(tmp1); Sounds like `tmp1` is undefined here. Should be `tmp`?
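(Editorial sketch, not part of the original message.) The read/write card table alternation described in the quoted PR text can be modelled roughly as below. This is a minimal illustration under stated assumptions: `SwappingCardTables`, `kClean`, `kDirty` and every method name are hypothetical and are not HotSpot code, the model is single-threaded, and it ignores how JIT-compiled code would load the table base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card states; the real encoding in HotSpot may differ.
constexpr std::uint8_t kClean = 0;
constexpr std::uint8_t kDirty = 1;

// Hypothetical model of two card tables whose read/write roles alternate
// on every GC cycle. It only illustrates the ordering of the phases.
class SwappingCardTables {
public:
  explicit SwappingCardTables(std::size_t cards)
    : _a(cards, kClean), _b(cards, kClean), _write(&_a), _read(&_b) {}

  // The role pointers refer to our own members, so copying is disabled.
  SwappingCardTables(const SwappingCardTables&) = delete;
  SwappingCardTables& operator=(const SwappingCardTables&) = delete;

  // Mutator-side barrier: always dirties the current write table.
  void mark_dirty(std::size_t card) { (*_write)[card] = kDirty; }

  // Concurrent "reset" phase: clean the current read table ahead of the pause.
  void reset_read_table() { std::fill(_read->begin(), _read->end(), kClean); }

  // "init-mark" pause: only the two role pointers are exchanged, so no
  // copying or zeroing of card data happens inside the pause.
  void swap_at_init_mark() { std::swap(_read, _write); }

  bool read_is_dirty(std::size_t card) const { return (*_read)[card] == kDirty; }
  bool write_is_dirty(std::size_t card) const { return (*_write)[card] == kDirty; }

private:
  std::vector<std::uint8_t> _a;
  std::vector<std::uint8_t> _b;
  std::vector<std::uint8_t>* _write;
  std::vector<std::uint8_t>* _read;
};
```

In this model a card dirtied before the swap shows up in the read table afterwards, while the new write table starts out clean because it was reset concurrently beforehand.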
src/hotspot/os_cpu/linux_arm/javaThread_linux_arm.cpp line 46: > 44: if (UseShenandoahGC) { > 45: _card_table_base = nullptr; > 46: return ; Suggestion: return; src/hotspot/os_cpu/linux_arm/javaThread_linux_arm.cpp line 50: > 48: _card_table_base = nullptr; > 49: } > 50: Unnecessary removals of blank lines? src/hotspot/share/ci/ciUtilities.cpp line 49: > 47: CardTableBarrierSet* ctbs = barrier_set_cast(bs); > 48: CardTable* ct = ctbs->card_table(); > 49: SHENANDOAHGC_ONLY(assert(!UseShenandoahGC, "Shenandoah byte_map_base is not constant.");) Here is a bit of a trick about the `Use${X}GC` flags: you don't need to guard them with `${X}GC_ONLY` macros. They are specifically designed that way: they reside in `gc_globals.hpp` without any feature flags. src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp line 25: > 23: */ > 24: > 25: #include "gc/shenandoah/shenandoahThreadLocalData.hpp" Includes should be sorted alphabetically. src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 268: > 266: > 267: void ShenandoahGeneration::prepare_gc() { > 268: Unnecessary removal. src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 258: > 256: if (ShenandoahCardBarrier) { > 257: ShenandoahThreadLocalData::set_card_table(Thread::current(), bs->card_table()->write_byte_map_base()); > 258: } Er. This sets up card table for VMThread, right? I am surprised we do not need this for other fields in `ShenandoahThreadLocalData`. src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 407: > 405: ShenandoahCardCluster(ShenandoahDirectCardMarkRememberedSet* rs) { > 406: _rs = rs; > 407: _object_starts = NEW_C_HEAP_ARRAY(crossing_info, rs->total_cards()+1, mtGC); What is this `+1`? This is #23882, right? 
------------- PR Review: https://git.openjdk.org/jdk/pull/23170#pullrequestreview-2656931853 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980148491 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1979192037 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980147454 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980121669 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980118417 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980116049 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1979940218 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1979944657 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1979158102 From cslucas at openjdk.org Tue Mar 4 21:08:08 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 4 Mar 2025 21:08:08 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> References: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> Message-ID: On Tue, 4 Mar 2025 10:50:30 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Revert changes to shared cardTable.hpp >> - Revert changes to shared cardTable.hpp > > src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 407: > >> 405: ShenandoahCardCluster(ShenandoahDirectCardMarkRememberedSet* rs) { >> 406: _rs = rs; >> 407: _object_starts = NEW_C_HEAP_ARRAY(crossing_info, rs->total_cards()+1, mtGC); > > What is this `+1`? This is #23882, right? Yes, correct. 
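(Editorial sketch, not part of the original message.) The `+1` discussed here corresponds to the guard entry that #23882 describes for `_object_starts`. A minimal illustration of why one extra slot makes an index equal to the card count safe, assuming a simplified boolean table; `ObjectStartsTable` and its methods are hypothetical names, not the `ShenandoahCardCluster` API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-card "object starts" metadata. It is not
// the HotSpot data structure, only an illustration of the guard entry.
class ObjectStartsTable {
public:
  // Allocate one entry more than the number of cards, so an index equal to
  // total_cards (one past the last real card) still refers to valid storage.
  explicit ObjectStartsTable(std::size_t total_cards)
    : _total_cards(total_cards), _starts(total_cards + 1, false) {}

  std::size_t total_cards() const { return _total_cards; }

  // Only real cards may be marked; the guard entry always stays false.
  void set_starts_object(std::size_t card_index) {
    assert(card_index < _total_cards);
    _starts[card_index] = true;
  }

  // Accepts card_index == total_cards(): the query then reads the guard
  // entry instead of reading past the end of the allocation.
  bool starts_object(std::size_t card_index) const {
    assert(card_index <= _total_cards);
    return _starts[card_index];
  }

private:
  std::size_t _total_cards;
  std::vector<bool> _starts;
};
```

Without the extra slot, a computed end index equal to `total_cards()` would read one element past the array, which matches the failure mode described in the #23882 summary.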
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980229122 From xpeng at openjdk.org Tue Mar 4 21:26:03 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 21:26:03 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained Message-ID: With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should fail an assertion if the global marking context completeness flag is false, but it now always returns the marking context even if marking is not complete, which may hide bugs where we expect the global/generational marking to be completed. This PR fixes the bug in the global marking context completeness flag and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API.
### Test - [x] hotspot_gc_shenandoah - [x] Tier 1 - [ ] Tier 2 ------------- Commit messages: - Revert unnecessary changes in ShenandoahReferenceProcessor - Revert the change in ShenandoahHeap::generation_for - touch up - If GC generation is young and referent is in old, make should_drop return false if old gen marking is not complete - Remove ShenandoahHeap::complete_marking_context() - Fix improper use of heap->complete_marking_context() - promotion in place and reference processor should be aware of heap generation when use complete marking context - JDK-8351091: initial works Changes: https://git.openjdk.org/jdk/pull/23886/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351091 Stats: 61 lines in 17 files changed: 9 ins; 23 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/23886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886 PR: https://git.openjdk.org/jdk/pull/23886 From wkemper at openjdk.org Tue Mar 4 21:26:04 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Mar 2025 21:26:04 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 08:34:16 GMT, Xiaolong Peng wrote: > With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. 
This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should fail an assertion if the global marking context completeness flag is false, but it now always returns the marking context even if marking is not complete, which may hide bugs where we expect the global/generational marking to be completed. > > This PR fixes the bug in the global marking context completeness flag and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API. > > ### Test > - [x] hotspot_gc_shenandoah > - [x] Tier 1 > - [ ] Tier 2 Changes requested by wkemper (Reviewer). src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2837: > 2835: } else if (affiliation == OLD_GENERATION) { > 2836: return old_generation(); > 2837: } else if (affiliation == FREE) { I don't think it makes sense to connect `FREE` regions to the global generation in this way. Free regions are _not_ affiliated with any generation. I think in some of these cases where you want to find the mark context, it would be possible to take it from a `_generation` member variable. src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337: > 335: // If generation is young and referent is in old, marking context of the old > 336: // may or may not be complete, we can safely drop the reference when old gen mark is complete. > 337: if (_generation->is_young() && referent_region->is_old()) { Have you seen this happen? The reference processor for each generation is only supposed to discover references for which the referent is in the collected generation.
See `ShenandoahReferenceProcessor::should_discover`: if (!heap->is_in_active_generation(referent)) { log_trace(gc,ref)("Referent outside of active generation: " PTR_FORMAT, p2i(referent)); return false; } ------------- PR Review: https://git.openjdk.org/jdk/pull/23886#pullrequestreview-2658463721 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1979938123 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1979932540 From xpeng at openjdk.org Tue Mar 4 21:26:04 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 21:26:04 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained In-Reply-To: References: Message-ID: <5EhmY89ZN6u3AyeugsAf1wAVw7AxHU5HD0pfEmPZXZE=.a69c2802-cae9-479d-ab51-47cc69f85c4d@github.com> On Tue, 4 Mar 2025 17:48:58 GMT, William Kemper wrote: >> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should fail an assertion if the global marking context completeness flag is false, but it now always returns the marking context even if marking is not complete, which may hide bugs where we expect the global/generational marking to be completed. >> >> This PR fixes the bug in the global marking context completeness flag and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API.
>> >> ### Test >> - [x] hotspot_gc_shenandoah >> - [x] Tier 1 >> - [ ] Tier 2 > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2837: > >> 2835: } else if (affiliation == OLD_GENERATION) { >> 2836: return old_generation(); >> 2837: } else if (affiliation == FREE) { > > I don't think it makes sense to connect `FREE` regions to the global generation in this way. Free regions are _not_ affiliated with any generation. I think in some of these cases where you want to find the mark context, it would be possible to take it from a `_generation` member variable. Yeah, I don't think it is necessary to change the behavior here either, I'll remove it in later update. > src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337: > >> 335: // If generation is young and referent is in old, marking context of the old >> 336: // may or may not be complete, we can safely drop the reference when old gen mark is complete. >> 337: if (_generation->is_young() && referent_region->is_old()) { > > Have you seen this happen? The reference processor for each generation is only supposed to discover references for which the referent is in the collected generation. See `ShenandoahReferenceProcessor::should_discover`: > > if (!heap->is_in_active_generation(referent)) { > log_trace(gc,ref)("Referent outside of active generation: " PTR_FORMAT, p2i(referent)); > return false; > } Ok, I didn't see happen in any of the jtreg tests yet. Just base on the the behavior we saw in old gc, I assumed this could happen. 
Now I am more curious about the real cause of the crash caused by a reference from old to young. Since we always check whether the referent is in the active generation, that shouldn't have happened if it works as described. My feeling is that there might be something fishy in the places where we use `_active_generation` (the comments say it should be updated only in the STW phases); maybe we should get rid of it. Currently we also use `_gc_generation` directly in many places, and I am not sure whether that could cause inconsistency. I'll revert this part and follow up on the questions in separate work. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1979976173 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980145388 From xpeng at openjdk.org Tue Mar 4 21:26:04 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Mar 2025 21:26:04 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained In-Reply-To: <5EhmY89ZN6u3AyeugsAf1wAVw7AxHU5HD0pfEmPZXZE=.a69c2802-cae9-479d-ab51-47cc69f85c4d@github.com> References: <5EhmY89ZN6u3AyeugsAf1wAVw7AxHU5HD0pfEmPZXZE=.a69c2802-cae9-479d-ab51-47cc69f85c4d@github.com> Message-ID: <4_6n2QkucG-4itVGY9thZovsVDHqZFD_FbgFdBo5Fyg=.03fa388d-d65f-4ab2-b891-109de430fd2c@github.com> On Tue, 4 Mar 2025 18:14:58 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2837: >> >>> 2835: } else if (affiliation == OLD_GENERATION) { >>> 2836: return old_generation(); >>> 2837: } else if (affiliation == FREE) { >> >> I don't think it makes sense to connect `FREE` regions to the global generation in this way. Free regions are _not_ affiliated with any generation. I think in some of these cases where you want to find the mark context, it would be possible to take it from a `_generation` member variable. > > Yeah, I don't think it is necessary to change the behavior here either, I'll remove it in later update.
I have removed the change. >> src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337: >> >>> 335: // If generation is young and referent is in old, marking context of the old >>> 336: // may or may not be complete, we can safely drop the reference when old gen mark is complete. >>> 337: if (_generation->is_young() && referent_region->is_old()) { >> >> Have you seen this happen? The reference processor for each generation is only supposed to discover references for which the referent is in the collected generation. See `ShenandoahReferenceProcessor::should_discover`: >> >> if (!heap->is_in_active_generation(referent)) { >> log_trace(gc,ref)("Referent outside of active generation: " PTR_FORMAT, p2i(referent)); >> return false; >> } > > Ok, I didn't see this happen in any of the jtreg tests yet. > > Just based on the behavior we saw in old gc, I assumed this could happen. Now I am more curious about the real cause of the crash caused by a reference from old to young. Since we always check whether the referent is in the active generation, that shouldn't have happened if it works as described. My feeling is that there might be something fishy in the places where we use `_active_generation` (the comments say it should be updated only in the STW phases); maybe we should get rid of it. Currently we also use `_gc_generation` directly in many places, and I am not sure whether that could cause inconsistency. > > I'll revert this part and follow up on the questions in separate work. Reverted, thanks!
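(Editorial sketch, not part of the original message.) The idea behind this thread, that a "complete marking context" accessor should refuse to return a context whose marking has not finished, can be illustrated roughly as below. Everything here is an assumption made for illustration: `Generation`, `MarkingContext` and the method names are hypothetical, and the sketch throws an exception where the thread describes a debug assertion, only so that the failure path is observable in a small test.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Hypothetical placeholder for the marking bitmap and related state.
struct MarkingContext {};

// Hypothetical per-generation holder of a completeness flag. The point is
// that the flag is maintained per generation and checked by the accessor.
class Generation {
public:
  void set_mark_complete()   { _mark_complete.store(true, std::memory_order_release); }
  void set_mark_incomplete() { _mark_complete.store(false, std::memory_order_release); }
  bool is_mark_complete() const { return _mark_complete.load(std::memory_order_acquire); }

  // Checked accessor: hands out the context only after marking completed.
  // If the flag were never updated, this check would either always pass or
  // always fail, which is how a stale flag can hide real bugs.
  MarkingContext* complete_marking_context() {
    if (!is_mark_complete()) {
      throw std::logic_error("marking is not complete");
    }
    return &_context;
  }

private:
  std::atomic<bool> _mark_complete{false};
  MarkingContext _context;
};
```

A caller that knows which generation it is working on (for example through a member such as the `_generation` field mentioned in the review) would ask that generation rather than a global accessor.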
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980219344 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980239633 From cslucas at openjdk.org Tue Mar 4 21:47:57 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 4 Mar 2025 21:47:57 GMT Subject: Integrated: 8351081: Off-by-one error in ShenandoahCardCluster In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 04:06:00 GMT, Cesar Soares Lucas wrote: > Given certain values for the variables in [this expression](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp#L173) the result of the computation can be equal to `_ rs->total_cards()` which will lead to segmentation fault, for instance in [starts_object(card_at_end)](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp#L393). The problem happens, though, because the `_object_starts` array doesn't have a [guarding entry](https://github.com/openjdk/jdk/blob/a87dd1a75f78cf872df49bea83ba48af8acfa2fd/src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp#L37) at the end. This pull request adjusts the allocation of `_object_starts` to include an additional entry at the end to account for this situation. > > Tested with JTREG tier 1-4, x86_64 & AArch64 on Linux. This pull request has now been integrated. 
Changeset: 38b4d46c Author: Cesar Soares Lucas Committer: William Kemper URL: https://git.openjdk.org/jdk/commit/38b4d46c1ff3701d75ff8347e5edbb01acd9b512 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8351081: Off-by-one error in ShenandoahCardCluster Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/jdk/pull/23882 From andrew at openjdk.org Tue Mar 4 22:38:25 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Mar 2025 22:38:25 GMT Subject: RFR: Merge jdk8u:master Message-ID: <4cF-jYBLChdNhTp1jOQC3_ssjO14CZ13bGr1oQHDW5o=.85737113-070b-4691-9e16-c4ff6b33ab2f@github.com> Merge jdk8u332-b08 ------------- Commit messages: - Merge jdk8u332-b08 - 8284920: Incorrect Token type causes XPath expression to return empty result - Added tag jdk8u332-b07 for changeset 6d526dbc3432 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/13/files Stats: 10 lines in 4 files changed: 2 ins; 1 del; 7 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/13.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/13/head:pull/13 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/13 From andrew at openjdk.org Tue Mar 4 22:40:56 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Mar 2025 22:40:56 GMT Subject: git: openjdk/shenandoah-jdk8u: master: 3 new changesets Message-ID: <8b817646-49f9-43aa-a4d3-8e45bdd1024d@openjdk.org> Changeset: f1a7de17 Branch: master Author: Andrew John Hughes Date: 2022-04-15 04:34:35 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/f1a7de17268b2278ccd9ed7f757718d21ca085d8 Added tag jdk8u332-b07 for changeset 6d526dbc3432 ! 
.hgtags Changeset: d0b89297 Branch: master Author: Anton Kozlov Date: 2022-04-16 04:22:57 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/d0b8929739120d9f8850a1dffbb5d891acdcd70e 8284920: Incorrect Token type causes XPath expression to return empty result Reviewed-by: andrew ! jaxp/src/com/sun/org/apache/xpath/internal/compiler/Lexer.java ! jaxp/src/com/sun/org/apache/xpath/internal/compiler/Token.java ! jaxp/src/com/sun/org/apache/xpath/internal/compiler/XPathParser.java Changeset: 9bdf1b61 Branch: master Author: Andrew John Hughes Date: 2025-02-19 15:53:41 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/9bdf1b614403327870d04fa23dbba16d4aa68063 Merge jdk8u332-b08 From andrew at openjdk.org Tue Mar 4 22:41:14 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Mar 2025 22:41:14 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u332-b08 for changeset d0b89297 Message-ID: <43f36c20-1cf1-48dd-a684-34f48cfecf51@openjdk.org> Tagged by: Andrew John Hughes Date: 2022-04-16 04:24:00 +0000 Changeset: d0b89297 Author: Anton Kozlov Date: 2022-04-16 04:22:57 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/d0b8929739120d9f8850a1dffbb5d891acdcd70e From andrew at openjdk.org Tue Mar 4 22:41:17 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Mar 2025 22:41:17 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u332-b08 for changeset 9bdf1b61 Message-ID: <594f4f66-c132-445d-bd7b-2fb601e0ccae@openjdk.org> Tagged by: Andrew John Hughes Date: 2025-02-19 19:35:24 +0000 Added tag shenandoah8u332-b08 for changeset 9bdf1b61440 Changeset: 9bdf1b61 Author: Andrew John Hughes Date: 2025-02-19 15:53:41 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/9bdf1b614403327870d04fa23dbba16d4aa68063 From iris at openjdk.org Tue Mar 4 22:42:52 2025 From: iris at openjdk.org (Iris Clark) Date: Tue, 4 Mar 2025 22:42:52 GMT Subject: Withdrawn: Merge jdk8u:master In-Reply-To: 
<4cF-jYBLChdNhTp1jOQC3_ssjO14CZ13bGr1oQHDW5o=.85737113-070b-4691-9e16-c4ff6b33ab2f@github.com> References: <4cF-jYBLChdNhTp1jOQC3_ssjO14CZ13bGr1oQHDW5o=.85737113-070b-4691-9e16-c4ff6b33ab2f@github.com> Message-ID: On Tue, 4 Mar 2025 22:34:19 GMT, Andrew John Hughes wrote: > Merge jdk8u332-b08 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/shenandoah-jdk8u/pull/13 From andrew at openjdk.org Tue Mar 4 22:42:52 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Mar 2025 22:42:52 GMT Subject: RFR: Merge jdk8u:master [v2] In-Reply-To: <4cF-jYBLChdNhTp1jOQC3_ssjO14CZ13bGr1oQHDW5o=.85737113-070b-4691-9e16-c4ff6b33ab2f@github.com> References: <4cF-jYBLChdNhTp1jOQC3_ssjO14CZ13bGr1oQHDW5o=.85737113-070b-4691-9e16-c4ff6b33ab2f@github.com> Message-ID: <1NSi8jQ8G7Uxk_pVhGSZXYdGexBJxoET1J6TJqpG3bk=.be5e0266-946a-46f9-9d25-6c5044133d44@github.com> > Merge jdk8u332-b08 Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk8u/pull/13/files - new: https://git.openjdk.org/shenandoah-jdk8u/pull/13/files/9bdf1b61..9bdf1b61 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=13&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=13&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/13.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/13/head:pull/13 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/13 From cslucas at openjdk.org Wed Mar 5 00:57:54 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 5 Mar 2025 00:57:54 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> References: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> Message-ID: On Tue, 4 Mar 2025 17:53:57 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Revert changes to shared cardTable.hpp >> - Revert changes to shared cardTable.hpp > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 258: > >> 256: if (ShenandoahCardBarrier) { >> 257: ShenandoahThreadLocalData::set_card_table(Thread::current(), bs->card_table()->write_byte_map_base()); >> 258: } > > Er. This sets up card table for VMThread, right? I am surprised we do not need this for other fields in `ShenandoahThreadLocalData`. Yes, that's for the VMThread. That seems like a good question. 
I ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1980492593 From cslucas at openjdk.org Wed Mar 5 01:10:50 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 5 Mar 2025 01:10:50 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v6] In-Reply-To: References: Message-ID: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a completely clean version of the card table. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes, the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address.
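The alternation described above can be modeled in a few lines. This is a minimal single-threaded sketch under stated assumptions, not the patch itself: `DualCardTable`, `mark_dirty`, `clean_read_table`, and the card encodings are hypothetical names chosen for illustration (only `swap_read_and_write_tables` echoes a name mentioned later in this thread). In the real collector the cleaning runs concurrently and the swap happens inside the `init-mark` pause.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Card encodings chosen for illustration only.
constexpr std::uint8_t kCleanCard = 0xFF;
constexpr std::uint8_t kDirtyCard = 0x00;

// Illustrative model; names and layout are assumptions, not HotSpot code.
class DualCardTable {
public:
  explicit DualCardTable(std::size_t cards)
      : _table_a(cards, kCleanCard), _table_b(cards, kCleanCard),
        _write(&_table_a), _read(&_table_b) {}

  // Mutator side: write barriers dirty cards in the current write table.
  void mark_dirty(std::size_t card) { (*_write)[card] = kDirtyCard; }

  // GC side: remembered-set scanning consults the current read table.
  bool is_dirty_in_read_table(std::size_t card) const {
    return (*_read)[card] == kDirtyCard;
  }

  // "reset" phase: clean the current read table ahead of the next cycle.
  void clean_read_table() { _read->assign(_read->size(), kCleanCard); }

  // "init-mark" pause: swap the roles instead of copying and zeroing.
  void swap_read_and_write_tables() { std::swap(_read, _write); }

  // Base address that mutator barriers would have to reload after a swap.
  const std::uint8_t* write_base() const { return _write->data(); }

private:
  std::vector<std::uint8_t> _table_a;
  std::vector<std::uint8_t> _table_b;
  std::vector<std::uint8_t>* _write;
  std::vector<std::uint8_t>* _read;
};
```

In this model, a card dirtied before the swap becomes visible to the GC side afterwards, and `write_base()` changes across the swap, which is the reason JIT-compiled code can no longer embed the write table address as a constant.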
> > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Address PR feedback: formatting. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23170/files - new: https://git.openjdk.org/jdk/pull/23170/files/717b8b44..cbf5aab0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=04-05 Stats: 8 lines in 6 files changed: 4 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From cslucas at openjdk.org Wed Mar 5 01:14:44 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 5 Mar 2025 01:14:44 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v7] In-Reply-To: References: Message-ID: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution.
Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a completely clean version of the card table. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes, the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. > > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions.
I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: - Fix merge conflict - Address PR feedback: formatting. - Revert changes to shared cardTable.hpp - Revert changes to shared cardTable.hpp - Fix merge conflict - Address PR feedback: no changes to shared files. - Merge master - Addressing PR comments: some refactorings, ppc fix, off-by-one fix. - Relocation of Card Tables ------------- Changes: https://git.openjdk.org/jdk/pull/23170/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=06 Stats: 295 lines in 28 files changed: 150 ins; 92 del; 53 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From gziemski at openjdk.org Wed Mar 5 15:32:03 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 5 Mar 2025 15:32:03 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag In-Reply-To: References: Message-ID: On Tue, 25 Feb 2025 09:49:41 GMT, Afshin Zafari wrote: > With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. > Tests: > linux-x64-debug, gtest:NMT* and runtime/NMT* Changes requested by gziemski (Reviewer). Changes requested by gziemski (Reviewer). src/hotspot/share/cds/metaspaceShared.cpp line 1475: > 1473: (address)archive_space_rs.base() == base_address, "Sanity"); > 1474: // Register archive space with NMT.
> 1475: MemTracker::record_virtual_memory_tag(archive_space_rs.base(), archive_space_rs.size(), mtClassShared); The pattern here is: `something.base(), something.base.size()` instead of doing this over and over again, why can't we just pass `something` to MemTracker::record_virtual_memory_tag() and let it figure out `base` and `size` itself? src/hotspot/share/cds/metaspaceShared.cpp line 1548: > 1546: return nullptr; > 1547: } > 1548: // NMT: fix up the space tags What exactly needs to be fixed here? ------------- PR Review: https://git.openjdk.org/jdk/pull/23770#pullrequestreview-2661498707 PR Review: https://git.openjdk.org/jdk/pull/23770#pullrequestreview-2661515550 PR Review Comment: https://git.openjdk.org/jdk/pull/23770#discussion_r1981647511 PR Review Comment: https://git.openjdk.org/jdk/pull/23770#discussion_r1981635746 From shade at openjdk.org Wed Mar 5 16:55:25 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Mar 2025 16:55:25 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port Message-ID: This PR implements JEP 503: Remove the 32-bit x86 Port. The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. 
There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. Additional testing: - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) ------------- Commit messages: - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 - 8345169: Implement JEP 503: Remove the 32-bit x86 Port Changes: https://git.openjdk.org/jdk/pull/23906/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23906&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345169 Stats: 30068 lines in 26 files changed: 4 ins; 30054 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/23906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23906/head:pull/23906 PR: https://git.openjdk.org/jdk/pull/23906 From vlivanov at openjdk.org Wed Mar 5 17:19:13 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 5 Mar 2025 17:19:13 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. 
> > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) Hotspot changes look good to me. I fully support removing x86-32-specific files first and then clean up x86-32-specific code in x86-specific and shared files (e.g., guarded by `#ifndef _LP64`). ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2661836831 From shade at openjdk.org Wed Mar 5 17:48:09 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Mar 2025 17:48:09 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: References: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> Message-ID: On Wed, 5 Mar 2025 00:55:13 GMT, Cesar Soares Lucas wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 258: >> >>> 256: if (ShenandoahCardBarrier) { >>> 257: ShenandoahThreadLocalData::set_card_table(Thread::current(), bs->card_table()->write_byte_map_base()); >>> 258: } >> >> Er. This sets up card table for VMThread, right? 
I am surprised we do not need this for other fields in `ShenandoahThreadLocalData`. > > Yes, that's for the VMThread. That seems like a good question. I Actually, I am wondering why this is needed. It looks to me like the VMThread attaches after heap initialization, and the normal `ShenandoahBarrierSet::on_thread_attach` should handle it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1981887605 From shade at openjdk.org Wed Mar 5 17:48:08 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Mar 2025 17:48:08 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v7] In-Reply-To: References: Message-ID: <_LIv8Ggp3ukK0HmhknyG_Mz2x5OKs63Y-qSXTQo9Gdo=.9efc86f1-6cc4-425b-9319-5e1500eb59da@github.com> On Wed, 5 Mar 2025 01:14:44 GMT, Cesar Soares Lucas wrote: >> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. >> >> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. >> >> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a completely clean version of the card table.
In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes, the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. >> >> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. >> >> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. >> >> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. >> >> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Fix merge conflict > - Address PR feedback: formatting. > - Revert changes to shared cardTable.hpp > - Revert changes to shared cardTable.hpp > - Fix merge conflict > - Address PR feedback: no changes to shared files. > - Merge master > - Addressing PR comments: some refactorings, ppc fix, off-by-one fix.
> - Relocation of Card Tables src/hotspot/os_cpu/linux_arm/javaThread_linux_arm.cpp line 43: > 41: > 42: void JavaThread::cache_global_variables() { > 43: #if INCLUDE_SHENANDOAHGC Sounds like we want to be consistent between C1 and C2 code, so maybe we should inject in adjacent block as: if (bs->is_a(BarrierSet::CardTableBarrierSet) && !bs->is_a(BarrierSet::ShenandoahBarrierSet)) { ... src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp line 57: > 55: _byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); > 56: assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); > 57: assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); It is a bit sad to see these asserts go. Is this because `_byte_map` is now mutable? May I suggest doing something like: _write_byte_map = (CardValue*) write_space.base(); _write_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); ...later... _read_byte_map = (CardValue*) read_space.base(); _read_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); ...later... 
// Set up current byte map _byte_map = _write_byte_map; _byte_map_base = _write_byte_map_base; // Check one side is good assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); swap_read_and_write_tables(); // Check another side is good assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); swap_read_and_write_tables(); src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 638: > 636: CardTable::CardValue* new_ptr; > 637: SwapTLSCardTable(CardTable::CardValue* np) { > 638: this->new_ptr = np; Suggestion: CardTable::CardValue* const _new_ptr; SwapTLSCardTable(CardTable::CardValue* np) : _new_ptr(np) {} ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1981872217 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1981869962 PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1981835070 From ysr at openjdk.org Wed Mar 5 18:02:57 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 5 Mar 2025 18:02:57 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 23:29:18 GMT, Y. Srinivas Ramakrishna wrote: >> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. 
This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should fail an assertion if the global marking context completeness flag is false, but now it always returns the marking context even if marking is not complete, which may hide bugs where we expect the global/generational marking to be completed. >> >> This PR fixes the bug in the global marking context completeness flag and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API. >> >> ### Test >> - [x] hotspot_gc_shenandoah >> - [x] Tier 1 >> - [x] Tier 2 > > src/hotspot/share/gc/shenandoah/shenandoahGeneration.hpp line 206: > >> 204: bool is_mark_complete() { return _is_marking_complete.is_set(); } >> 205: virtual void set_mark_complete(); >> 206: virtual void set_mark_incomplete(); > > Why are these declared virtual? OK, I see that `ShenandoahGlobalGeneration` forces the state of `ShenandoahOldGeneration` and `ShenandoahYoungGeneration`, but is that our intention? I am seeing (see comment elsewhere) that we are always either using global generation's marking context explicitly, or using a region to index into the appropriate containing generation's marking context. If so, can we dispense with the forcing of global context's state into the contexts for the two generations? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980429065 From ysr at openjdk.org Wed Mar 5 18:02:56 2025 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Wed, 5 Mar 2025 18:02:56 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 08:34:16 GMT, Xiaolong Peng wrote: > With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should fail an assertion if the global marking context completeness flag is false, but now it always returns the marking context even if marking is not complete, which may hide bugs where we expect the global/generational marking to be completed. > > This PR fixes the bug in the global marking context completeness flag and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API. > > ### Test > - [x] hotspot_gc_shenandoah > - [x] Tier 1 > - [x] Tier 2 Had a few questions and comments inline. I'll take a closer look once you have responded to those. Thank you for finding this probably long-standing incorrectness/fuzziness and fixing it properly! src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1028: > 1026: > 1027: #ifdef ASSERT > 1028: ShenandoahMarkingContext* const ctx = _heap->marking_context(); Why not this instead? ShenandoahMarkingContext* const ctx = _heap->marking_context(r); src/hotspot/share/gc/shenandoah/shenandoahGeneration.hpp line 206: > 204: bool is_mark_complete() { return _is_marking_complete.is_set(); } > 205: virtual void set_mark_complete(); > 206: virtual void set_mark_incomplete(); Why are these declared virtual?
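The question about the virtual setters is easier to follow with a toy model of per-generation completeness flags. The sketch below is an assumption-laden illustration, not the HotSpot classes: `Generation`, `GlobalGeneration`, `MarkingContext`, and the free function `complete_marking_context` are hypothetical, and an exception stands in for a VM assertion. It assumes, as suggested elsewhere in this thread, that the global generation propagates its state to young and old through the virtual override.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model; class and method names are hypothetical, not HotSpot classes.
class Generation {
public:
  virtual ~Generation() = default;
  bool is_mark_complete() const { return _mark_complete; }
  virtual void set_mark_complete() { _mark_complete = true; }
  virtual void set_mark_incomplete() { _mark_complete = false; }

private:
  bool _mark_complete = false;
};

// The global generation forces its state onto young and old, which is
// the only reason the setters need to be virtual in this model.
class GlobalGeneration : public Generation {
public:
  GlobalGeneration(Generation& young, Generation& old_gen)
      : _young(young), _old(old_gen) {}

  void set_mark_complete() override {
    Generation::set_mark_complete();
    _young.set_mark_complete();
    _old.set_mark_complete();
  }

  void set_mark_incomplete() override {
    Generation::set_mark_incomplete();
    _young.set_mark_incomplete();
    _old.set_mark_incomplete();
  }

private:
  Generation& _young;
  Generation& _old;
};

struct MarkingContext {};

// Hands out the context only if the queried generation has finished
// marking; the exception stands in for an assertion failure.
inline const MarkingContext& complete_marking_context(
    const Generation& gen, const MarkingContext& ctx) {
  if (!gen.is_mark_complete()) {
    throw std::logic_error("marking is not complete");
  }
  return ctx;
}
```

In this model the completeness question is always answered by a generation rather than by the marking context, so a caller that needs a complete context has to name the generation it depends on.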
src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 737: > 735: public: > 736: inline ShenandoahMarkingContext* complete_marking_context(ShenandoahHeapRegion* region) const; > 737: inline ShenandoahMarkingContext* marking_context() const; Should document semantics of both methods, please! src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp line 868: > 866: #ifdef ASSERT > 867: { > 868: // During full gc, heap->complete_marking_context() is not valid, may equal nullptr. Looks like this comment is obsolete? src/hotspot/share/gc/shenandoah/shenandoahMarkingContext.cpp line 103: > 101: > 102: bool ShenandoahMarkingContext::is_complete() { > 103: return ShenandoahHeap::heap()->global_generation()->is_mark_complete(); Do we need this? It seems wrong to me that even though each generation has its own marking context, we ask any marking context to report if that of the Global Generation is complete. I'd explicitly let generations maintain the state of completeness of their marking contexts, and for clients to query the generations for that state rather than having the individual marking contexts respond to that question. Where is this used after your changes? src/hotspot/share/gc/shenandoah/shenandoahMarkingContext.hpp line 88: > 86: bool is_bitmap_range_within_region_clear(const HeapWord* start, const HeapWord* end) const; > 87: > 88: bool is_complete(); Add a 1-line documentation comment for this method. src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337: > 335: // drop the reference. > 336: if (type == REF_PHANTOM) { > 337: return heap->complete_marking_context(referent_region)->is_marked(raw_referent); Doesn't the assert down at line 350 also need `complete_marking_context` ? Same at line 441. May be comb through all of these to determine which we need for proper assertion checking? I'd start by documenting the semantics of the APIs clearly. 
I am not completely clear on that yet (pun not intended :-) ------------- PR Review: https://git.openjdk.org/jdk/pull/23886#pullrequestreview-2659389355 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980523168 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980420417 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980401312 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980403403 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980437298 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980406186 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1981905245 From andrew at openjdk.org Wed Mar 5 18:25:18 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Wed, 5 Mar 2025 18:25:18 GMT Subject: RFR: Merge jdk8u:master Message-ID: Merge jdk8u332-b09 ------------- Commit messages: - Merge jdk8u332-b09 - 8284936: Fix Java 7 bootstrap breakage due to use of Arrays.stream - Added tag jdk8u332-b08 for changeset 95b31159fdfd The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/14/files Stats: 13 lines in 3 files changed: 11 ins; 0 del; 2 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/14.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/14/head:pull/14 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/14 From cslucas at openjdk.org Wed Mar 5 18:49:05 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 5 Mar 2025 18:49:05 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v7] In-Reply-To: <_LIv8Ggp3ukK0HmhknyG_Mz2x5OKs63Y-qSXTQo9Gdo=.9efc86f1-6cc4-425b-9319-5e1500eb59da@github.com> References: <_LIv8Ggp3ukK0HmhknyG_Mz2x5OKs63Y-qSXTQo9Gdo=.9efc86f1-6cc4-425b-9319-5e1500eb59da@github.com> Message-ID: On Wed, 5 Mar 2025 17:32:30 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: >> >> - Fix merge conflict >> - Address PR feedback: formatting. >> - Revert changes to shared cardTable.hpp >> - Revert changes to shared cardTable.hpp >> - Fix merge conflict >> - Address PR feedback: no changes to shared files. >> - Merge master >> - Addressing PR comments: some refactorings, ppc fix, off-by-one fix. >> - Relocation of Card Tables > > src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp line 57: > >> 55: _byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); >> 56: assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); >> 57: assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); > > It is a bit sad to see these asserts go. Is this because `_byte_map` is now mutable? May I suggest doing something like: > > > _write_byte_map = (CardValue*) write_space.base(); > _write_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); > ...later... 
>     _read_byte_map = (CardValue*) read_space.base();
>     _read_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift);
>     ...later...
>
>     // Set up current byte map
>     _byte_map = _write_byte_map;
>     _byte_map_base = _write_byte_map_base;
>
>     // Check one side is good
>     assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map");
>     assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map");
>     swap_read_and_write_tables();
>
>     // Check another side is good
>     assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map");
>     assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map");
>     swap_read_and_write_tables();

Yeah, I didn't like that either. If I recall correctly I had to remove them because part of the expressions ended up calling `byte_map(_base)` which would come from `ThreadLocalData` which wasn't set at the time `initialize()` was being called. Now that we don't have the virtual methods anymore I think I can put back the asserts. I'll try+test that and get back to you.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1981983462

From andrew at openjdk.org Wed Mar 5 18:59:30 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed, 5 Mar 2025 18:59:30 GMT
Subject: git: openjdk/shenandoah-jdk8u: master: 3 new changesets
Message-ID:

Changeset: c7a735dd
Branch: master
Author: Andrew John Hughes
Date: 2022-04-16 04:24:00 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/c7a735ddeb50ed7fb24b5024c5fb11841b0818e0

Added tag jdk8u332-b08 for changeset 95b31159fdfd

! .hgtags

Changeset: 3d2fe9bb
Branch: master
Author: Andrew John Hughes
Date: 2022-04-18 01:32:28 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3d2fe9bbb4c5f704d08982a3b1c4b424a9dd1d37

8284936: Fix Java 7 bootstrap breakage due to use of Arrays.stream

Reviewed-by: mbalao

! jaxp/src/com/sun/java_cup/internal/runtime/lr_parser.java
!
jaxp/src/com/sun/org/apache/xpath/internal/compiler/Token.java

Changeset: 0f3b1805
Branch: master
Author: Andrew John Hughes
Date: 2025-03-04 22:41:27 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/0f3b1805da765b18999dc8b614f29795fb060195

Merge jdk8u332-b09

From andrew at openjdk.org Wed Mar 5 18:59:38 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed, 5 Mar 2025 18:59:38 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u332-b09 for changeset 3d2fe9bb
Message-ID: <66942053-dd81-4670-a7b2-7dda936161d9@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2022-04-18 02:47:59 +0000

Changeset: 3d2fe9bb
Author: Andrew John Hughes
Date: 2022-04-18 01:32:28 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3d2fe9bbb4c5f704d08982a3b1c4b424a9dd1d37

From andrew at openjdk.org Wed Mar 5 18:59:46 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed, 5 Mar 2025 18:59:46 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u332-b09 for changeset 0f3b1805
Message-ID: <1f30b32a-2303-4866-a29d-eabc47ced3e4@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2025-03-05 17:52:40 +0000

Added tag shenandoah8u332-b09 for changeset 0f3b1805da7

Changeset: 0f3b1805
Author: Andrew John Hughes
Date: 2025-03-04 22:41:27 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/0f3b1805da765b18999dc8b614f29795fb060195

From andrew at openjdk.org Wed Mar 5 18:59:48 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed, 5 Mar 2025 18:59:48 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u332-ga for changeset 3d2fe9bb
Message-ID: <8cc9c4e5-5f42-4692-98b4-7e6bf86c78e2@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2022-04-22 16:45:54 +0000

Changeset: 3d2fe9bb
Author: Andrew John Hughes
Date: 2022-04-18 01:32:28 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3d2fe9bbb4c5f704d08982a3b1c4b424a9dd1d37

From andrew at openjdk.org Wed Mar 5 19:02:15 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed,
5 Mar 2025 19:02:15 GMT
Subject: RFR: Merge jdk8u:master [v2]
In-Reply-To:
References:
Message-ID:

> Merge jdk8u332-b09

Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.

-------------

Changes:
  - all: https://git.openjdk.org/shenandoah-jdk8u/pull/14/files
  - new: https://git.openjdk.org/shenandoah-jdk8u/pull/14/files/0f3b1805..0f3b1805

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=14&range=01
 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=14&range=00-01

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/14.diff
  Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/14/head:pull/14

PR: https://git.openjdk.org/shenandoah-jdk8u/pull/14

From andrew at openjdk.org Wed Mar 5 19:02:15 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Wed, 5 Mar 2025 19:02:15 GMT
Subject: RFR: Merge jdk8u:master
In-Reply-To:
References:
Message-ID:

On Wed, 5 Mar 2025 18:20:12 GMT, Andrew John Hughes wrote:

> Merge jdk8u332-b09

GHA builds will not work until [JDK-8284622](https://bugs.openjdk.org/browse/JDK-8284622) is merged in 8u362-b03

-------------

PR Comment: https://git.openjdk.org/shenandoah-jdk8u/pull/14#issuecomment-2701812530

From iris at openjdk.org Wed Mar 5 19:02:15 2025
From: iris at openjdk.org (Iris Clark)
Date: Wed, 5 Mar 2025 19:02:15 GMT
Subject: Withdrawn: Merge jdk8u:master
In-Reply-To:
References:
Message-ID:

On Wed, 5 Mar 2025 18:20:12 GMT, Andrew John Hughes wrote:

> Merge jdk8u332-b09

This pull request has been closed without being integrated.
-------------

PR: https://git.openjdk.org/shenandoah-jdk8u/pull/14

From xpeng at openjdk.org Wed Mar 5 19:11:30 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 5 Mar 2025 19:11:30 GMT
Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v2]
In-Reply-To:
References:
Message-ID:

> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw an assert error if the global marking context completeness flag is false, but now it always returns the marking context even if marking is not complete; this may hide bugs where we expect the global/generational marking to be completed.
>
> This PR fixes the bug in the global marking context completeness flag, and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API.
>
> ### Test
> - [x] hotspot_gc_shenandoah
> - [x] Tier 1
> - [x] Tier 2

Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:

  Always use active_generation()->complete_marking_context() during reference processing

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23886/files
  - new: https://git.openjdk.org/jdk/pull/23886/files/01c6ea66..465deaec

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=00-01

  Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/23886.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886

PR: https://git.openjdk.org/jdk/pull/23886

From kvn at openjdk.org Wed Mar 5 20:10:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 5 Mar 2025 20:10:53 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID:

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86.
> There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

Good. So it will be stacked PRs which you will combine for final integration?

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2662377172

From kvn at openjdk.org Wed Mar 5 20:14:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 5 Mar 2025 20:14:53 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID: <8UpKLmwCBMscNGtKyktL_h1aBYo6uzB3kYJOWeJIugA=.78c737ec-e212-4458-a009-79867ad260e5@github.com>

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port.
> The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

This is confusing. This PR is part of changes so it can't be "Implement JEP 503: Remove the 32-bit x86 Port" and should be subtask of Umbrella RFE. Am I missing something?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2701962563

From xpeng at openjdk.org Wed Mar 5 21:13:53 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 5 Mar 2025 21:13:53 GMT
Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v2]
In-Reply-To:
References:
Message-ID: <18_o9YLk3Ri0MTJscSSdp1Mg1C8c_cLUjoRfnxGL2e4=.ab8937c4-7ba2-4a67-8ecf-248f1c6f5545@github.com>

On Wed, 5 Mar 2025 17:59:36 GMT, Y. Srinivas Ramakrishna wrote:

> Had a few questions and comments inline. I'll take a closer look once you have responded to those.
>
> Thank you for finding this probably long-standing incorrectness/fuzziness and fixing it properly!

Thanks, I'll update PR to address your comments.

> src/hotspot/share/gc/shenandoah/shenandoahMarkingContext.cpp line 103:
>
>> 101:
>> 102: bool ShenandoahMarkingContext::is_complete() {
>> 103:   return ShenandoahHeap::heap()->global_generation()->is_mark_complete();
>
> Do we need this?
> It seems wrong to me that even though each generation has its own marking context, we ask any marking context to report if that of the Global Generation is complete. I'd explicitly let generations maintain the state of completeness of their marking contexts, and for clients to query the generations for that state rather than having the individual marking contexts respond to that question.
>
> Where is this used after your changes?

It may not be needed anymore. I will double-check the usage and remove it if it is not used at all.

> src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337:
>
>> 335:     // drop the reference.
>> 336:     if (type == REF_PHANTOM) {
>> 337:       return heap->complete_marking_context(referent_region)->is_marked(raw_referent);
>
> Doesn't the assert down at line 350 also need `complete_marking_context` ? Same at line 441. Maybe comb through all of these to determine which we need for proper assertion checking?
>
> I'd start by documenting the semantics of the APIs clearly. I am not completely clear on that yet (pun not intended :-)

Yes, the assert at line 350 should use complete_marking_context. I have updated it in the fix for the issue we found in the stress test.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23886#issuecomment-2702079928
PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1982190515
PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1982188076

From xpeng at openjdk.org Wed Mar 5 21:50:09 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 5 Mar 2025 21:50:09 GMT
Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 4 Mar 2025 23:11:20 GMT, Y.
Srinivas Ramakrishna wrote:

>> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Always use active_generation()->complete_marking_context() during reference processing
>
> src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 737:
>
>> 735: public:
>> 736:   inline ShenandoahMarkingContext* complete_marking_context(ShenandoahHeapRegion* region) const;
>> 737:   inline ShenandoahMarkingContext* marking_context() const;
>
> Should document semantics of both methods, please!

I'll add some comments for both. Also, I feel the assert is not enough; we should change the `assert` in complete_marking_context to `guarantee`, something like:

    guarantee(is_mark_complete(), "Marking must be completed.");
    return ShenandoahHeap::heap()->marking_context();

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1982236158

From xpeng at openjdk.org Wed Mar 5 21:54:08 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 5 Mar 2025 21:54:08 GMT
Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3]
In-Reply-To:
References:
Message-ID: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com>

> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore.
> This may cause unexpected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw an assert error if the global marking context completeness flag is false, but now it always returns the marking context even if marking is not complete; this may hide bugs where we expect the global/generational marking to be completed.
>
> This PR fixes the bug in the global marking context completeness flag, and updates all the places using `ShenandoahHeap::complete_marking_context()` to use the proper API.
>
> ### Test
> - [x] hotspot_gc_shenandoah
> - [x] Tier 1
> - [x] Tier 2

Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision:

 - Remove obsolete code comments
 - Address review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23886/files
  - new: https://git.openjdk.org/jdk/pull/23886/files/465deaec..c78f66ee

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=01-02

  Stats: 9 lines in 4 files changed: 2 ins; 7 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/23886.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886

PR: https://git.openjdk.org/jdk/pull/23886

From xpeng at openjdk.org Wed Mar 5 22:02:02 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 5 Mar 2025 22:02:02 GMT
Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Mar 2025 01:33:26 GMT, Y.
Srinivas Ramakrishna wrote:

>> Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Remove obsolete code comments
>>  - Address review comments
>
> src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1028:
>
>> 1026:
>> 1027: #ifdef ASSERT
>> 1028:   ShenandoahMarkingContext* const ctx = _heap->marking_context();
>
> Why not this instead?
>
>     ShenandoahMarkingContext* const ctx = _heap->marking_context(r);

Technically there is only one global marking context for Shenandoah, even in generational mode, so passing the region to marking_context doesn't make any difference. The method `complete_marking_context(r)`, however, checks whether the affiliated generation has completed marking; it is a more convenient version of `complete_marking_context(affiliation)`.

> src/hotspot/share/gc/shenandoah/shenandoahMarkingContext.hpp line 88:
>
>> 86:   bool is_bitmap_range_within_region_clear(const HeapWord* start, const HeapWord* end) const;
>> 87:
>> 88:   bool is_complete();
>
> Add a 1-line documentation comment for this method.

is_complete is not used anywhere; I removed it in the new version.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1982247904
PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1982248805

From vlivanov at openjdk.org Wed Mar 5 23:22:51 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Wed, 5 Mar 2025 23:22:51 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com>

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

There's a wide variety of options to justify the goal of the JEP. A bare minimum would be to just remove x86-32 build support. And on the other side of the spectrum the current patch would be accompanied by all x86-32-specific code and all the features used exclusively by x86-32 port.

During previous round of discussions I expressed my preference as keeping JEP implementation simple and perform all non-trivial cleanups as follow-up RFEs. IMO it enables swift removal (and eliminates the burden to keep x86-32 port alive during ongoing development work) while keeping incremental cleanup activities at comfortable pace.

Proposed patch perfectly justifies my preference.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2702299307

From kvn at openjdk.org Wed Mar 5 23:35:54 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 5 Mar 2025 23:35:54 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID: <5ztalawYQsCNUsfzWyR_b5YVFWbDNzoHVUA4ycRjvRs=.42fd2b02-462f-4803-9d3b-2b907121c5be@github.com>

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

To clarify.
I completely agree with the changes in this PR - I approved it.

My concern is the **Title** of this PR and JBS entry. So I want to understand the steps we do with this PR and the following changes covered by the number of subtasks pointed to by Aleksey.

So what, @iwanowww, you say is that this PR is **indeed** implementation of the JEP.
And all subtasks listed in Umbrella RFE are following up RFEs after we integrated the JEP.
Do I understand that correctly?

Why not do what Ioi did for AOT class loading JEP? I mean, to have depending PRs which are combined into one implementation push.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2702316448

From vlivanov at openjdk.org Thu Mar 6 00:18:52 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 6 Mar 2025 00:18:52 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com>
References: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com>
Message-ID:

On Wed, 5 Mar 2025 23:19:50 GMT, Vladimir Ivanov wrote:

>> This PR implements JEP 503: Remove the 32-bit x86 Port.
>>
>> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>>
>> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>>
>> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port.
>> The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>>
>> Additional testing:
>>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)
>
> There's a wide variety of options to justify the goal of the JEP. A bare minimum would be to just remove x86-32 build support. And on the other side of the spectrum the current patch would be accompanied by all x86-32-specific code and all the features used exclusively by x86-32 port.
>
> During previous round of discussions I expressed my preference as keeping JEP implementation simple and perform all non-trivial cleanups as follow-up RFEs. IMO it enables swift removal (and eliminates the burden to keep x86-32 port alive during ongoing development work) while keeping incremental cleanup activities at comfortable pace.
>
> Proposed patch perfectly justifies my preference.

> So what, @iwanowww, you say is that this PR is indeed implementation of the JEP.
> And all subtasks listed in Umbrella RFE are following up RFEs after we integrated the JEP.
> Do I understand that correctly?

Yes.

> Why not do what Ioi did for AOT class loading JEP? I mean, to have depending PRs which are combined into one implementation push.

It's definitely an option. But, most likely, there'll be some overlooked cases anyway (leading to additional followup RFEs).
And the more convoluted the changes are the harder it is to validate their correctness, thus increasing the risks for product stability and delaying the integration. (I'm not sure how much time Aleksey and other contributors want to volunteer to this project.)

Also, in case of AOT JEP the situation was quite the opposite: it started with a huge patch which was split into multiple mostly independent parts to streamline its review. For x86-32 code removal there's no such patch prepared yet and the complete scope of work is not clear yet.

IMO the crucial part is to get the port officially retired. After that the rest can become a good source of starter tasks :-)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2702376289

From kvn at openjdk.org Thu Mar 6 00:21:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 6 Mar 2025 00:21:53 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID:

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86.
> There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

Okay. Thank you for explaining.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2702380269

From dholmes at openjdk.org Thu Mar 6 04:38:52 2025
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 6 Mar 2025 04:38:52 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID:

On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote:

> This PR implements JEP 503: Remove the 32-bit x86 Port.
>
> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
>
> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks.
>
> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86.
> There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
>
> Additional testing:
>  - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure)
>  - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works)
>  - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works)

I am also a bit puzzled by the JEP/JBS strategy here. I would expect a bunch of dependent PRs that then get integrated together as "The Implementation of JEP 503". I understand things may be missed that require some follow up RFE's but I don't think we should start from that position and have a large chunk of work not be done under the JEP umbrella.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2702781694

From shade at openjdk.org Thu Mar 6 09:52:09 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 6 Mar 2025 09:52:09 GMT
Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
In-Reply-To:
References: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com>
Message-ID:

On Thu, 6 Mar 2025 00:16:12 GMT, Vladimir Ivanov wrote:

>> There's a wide variety of options to justify the goal of the JEP. A bare minimum would be to just remove x86-32 build support. And on the other side of the spectrum the current patch would be accompanied by all x86-32-specific code and all the features used exclusively by x86-32 port.
>>
>> During previous round of discussions I expressed my preference as keeping JEP implementation simple and perform all non-trivial cleanups as follow-up RFEs.
IMO it enables swift removal (and eliminates the burden to keep x86-32 port alive during ongoing development work) while keeping incremental cleanup activities at comfortable pace. >> >> Proposed patch perfectly justifies my preference. > >> So what, @iwanowww, you say is that this PR is indeed implementation of the JEP. >> And all subtasks listed in Umbrella RFE are following up RFEs after we integrated the JEP. >> Do I understand that correctly? > > Yes. > >> Why not do what Ioi did for AOT class loading JEP? I mean, to have depending PRs which are combined into one implementation push. > > It's definitely an option. But, most likely, there'll be some overlooked cases anyway (leading to additional followup RFEs). And the more convoluted the changes are the harder it is to validate their correctness, thus increasing the risks for product stability and delaying the integration. (I'm not sure how much time Aleksey and other contributors want to volunteer to this project.) > > Also, in case of AOT JEP the situation was quite the opposite: it started with a huge patch which was split into multiple mostly independent parts to streamline its review. For x86-32 code removal there's no such patch prepared yet and the complete scope of work is not clear yet. > > IMO the crucial part is to get the port officially retired. After that the rest can become a good source of starter tasks :-) Basically what @iwanowww said: this PR *is* the removal of x86_32 port. After this PR integrates, it is not possible to build x86_32, because the core implementation of it is missing, and build system would refuse to even try building it. So this removes x86_32 port as the feature, atomically, matching the title and intent of the JEP. *Then*, follow-up subtasks RFE would clean up the parts of Hotspot that were added to support various x86_32-specific features, and are no longer needed anymore. I, for one, also believed the complete PR would be more straight-forward. 
I attempted this at https://github.com/openjdk/jdk/pull/22567. After working on that draft PR, and listening to what people said about it, I can conclude that it is not a great way to go with this removal.

The massive drawbacks of complete/stacked PR are now obvious to me:
1. It is hard to review. The complete PR is huge, 210+ files affected. A lot of removals are logically connected across different files, and while they are simple in isolation, it is hard for a reviewer to separate several cleanups in large PRs.
2. It accrues merge conflicts very fast. This happens even when mainline is somewhat idle without large feature integrations. I expect this work to be even harder once we are closer to RDP1.
3. It is hard to reach consensus on. Non-trivial changes require thorough review, and cobbling together multiple non-trivial changes requires polynomially more coordination. I have seen this in Win32 port removal, so for a large PR like that I expect multiple, week-long review and amendment sessions. Which conspires with (1) and (2).
4. It is easy to introduce/overlook bugs. I already did this once in a complete PR when I accidentally removed the wrong part of C1 regalloc code, and it started ever so slightly misbehaving. And it was not obvious, because it was obscured by other changes in the vicinity. Which conspires with (1), (2) and (3).
5. It would introduce a single changeset that would be hard to bisect when things go wrong. And the things would go wrong, because of (1), (4) and partially by new opportunities presented by (2). For the C1 bug I mentioned above, I was able to quickly nail it through the bisection of my stack of atomic commits. That stack would not be available once we squash the commits/PRs before the integration.

So while on the surface it might look more enticing to purge everything at once, the amount of hassle we would endure is hard to justify.
Doing this PR for port removal + multiple post-removal cleanups piecewise lets us reach the same final state without extra work, while doing so at a leisurely pace and maintaining more convenient code history for future bug hunts. Bottom-line: Let's not make our own lives harder unnecessarily. Atomic commits FTW. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2703337731 From jsjolen at openjdk.org Thu Mar 6 10:27:06 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 6 Mar 2025 10:27:06 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag In-Reply-To: References: Message-ID: <0CDxD_JcYtu4Ax1xB8TDyWqLkxNub6OfJRtSmCFONgU=.bd3edae0-3eaf-4ba3-ac9e-2582d1baf151@github.com> On Wed, 5 Mar 2025 15:28:59 GMT, Gerard Ziemski wrote: >> With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. >> Tests: >> linux-x64-debug, gtest:NMT* and runtime/NMT* > > src/hotspot/share/cds/metaspaceShared.cpp line 1475: > >> 1473: (address)archive_space_rs.base() == base_address, "Sanity"); >> 1474: // Register archive space with NMT. >> 1475: MemTracker::record_virtual_memory_tag(archive_space_rs.base(), archive_space_rs.size(), mtClassShared); > > The pattern here is: > > `something.base(), something.base.size()` > > instead of doing this over and over again, why can't we just pass `something` to MemTracker::record_virtual_memory_tag() and let it figure out `base` and `size` itself? We could have an overload for `ReservedSpace`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23770#discussion_r1983093725 From coleenp at openjdk.org Thu Mar 6 12:38:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 6 Mar 2025 12:38:52 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) I agree with @iwanowww's and @shipilev comments. 
I would like to see this be the JEP implementation and the additional cleanups, particularly in the interpreter, handled one by one. I don't see any advantage for one big integration push. It'll be disruptive and for this, there is no scenario where this would be helpful to any future work. When Aleksey sent out the original PR there were cleanups that needed explanation. Finding the explanations in the big PR is a pain for scrolling. And the reviewers for that part of the change were a different set than ones needed for this change. Again for no benefit. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2664309410 From coleenp at openjdk.org Thu Mar 6 12:38:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 6 Mar 2025 12:38:53 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com> Message-ID: On Thu, 6 Mar 2025 09:48:47 GMT, Aleksey Shipilev wrote: >>> So what, @iwanowww, you say is that this PR is indeed implementation of the JEP. >>> And all subtasks listed in Umbrella RFE are following up RFEs after we integrated the JEP. >>> Do I understand that correctly? >> >> Yes. >> >>> Why not do what Ioi did for AOT class loading JEP? I mean, to have depending PRs which are combined into one implementation push. >> >> It's definitely an option. But, most likely, there'll be some overlooked cases anyway (leading to additional followup RFEs). And the more convoluted the changes are the harder it is to validate their correctness, thus increasing the risks for product stability and delaying the integration. (I'm not sure how much time Aleksey and other contributors want to volunteer to this project.) 
>> >> Also, in case of AOT JEP the situation was quite the opposite: it started with a huge patch which was split into multiple mostly independent parts to streamline its review. For x86-32 code removal there's no such patch prepared yet and the complete scope of work is not clear yet. >> >> IMO the crucial part is to get the port officially retired. After that the rest can become a good source of starter tasks :-) > > Basically what @iwanowww said: this PR *is* the removal of x86_32 port. > > After this PR integrates, it is not possible to build x86_32, because the core implementation of it is missing, and build system would refuse to even try building it. So this removes x86_32 port as the feature, atomically, matching the title and intent of the JEP. *Then*, follow-up subtasks RFE would clean up the parts of Hotspot that were added to support various x86_32-specific features, and are no longer needed anymore. > > Honestly, I also believed the complete PR that cleans up every dusty corner at once would be more straight-forward. But then I tried it at https://github.com/openjdk/jdk/pull/22567. After investing a few full days on that draft PR, and listening to what people said about it, I firmly changed my mind, and can conclude that singular PR or series of stacked PRs are not a great way to go with this removal. > > The massive drawbacks of complete/stacked PR are now obvious to me: > 1. It is hard to review. The complete PR is huge, 210+ files affected. A lot of removals are logically connected across different files, and while they are simple in isolation, it is hard for a reviewer to separate several cleanups in large PRs. Stacked PRs would help some, but: > 2. It accrues merge conflicts very fast. This happens even when mainline is somewhat idle without large feature integrations. I did complete PR near New Year holidays, and it was _already_ a headache. I expect this work to be even harder once we are closer to RDP1. 
It would be even more tedious with a chain of 10+ stacked PRs, as I got the preview of this when rebasing the stack of atomic commits in the complete draft PR several times. > 3. It is hard to reach consensus on. Non-trivial changes require thorough review, and cobbling together multiple non-trivial changes require polynomially more coordination. I have seen this in Win32 port removal, so for a large PR like that I expect multiple, week-long review and amendment sessions. Which conspires with (1) and (2). > 4. It is easy to introduce/overlook bugs. I already did this once in a complete PR when I accidentally removed the wrong part of C1 regalloc code, and it started ever so slightly misbehaving. And it was not obvious, because it was obscured by other changes in the vicinity, and it only failed one test in tier4. This conspires with (1), (2) and (3). > 5. It would introduce a single changeset that would be hard to bisect when things go wrong. And the things wo... Also @shipilev I'm jealous of all your code removal. :) Well done getting agreement on this change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2703725960 From azafari at openjdk.org Thu Mar 6 14:22:38 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Mar 2025 14:22:38 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: References: Message-ID: > With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. > Tests: > linux-x64-debug, gtest:NMT* and runtime/NMT* Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: ReservedSpace is accepted as param. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/23770/files - new: https://git.openjdk.org/jdk/pull/23770/files/0a1495bc..1e7853e6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=00-01 Stats: 21 lines in 12 files changed: 4 ins; 1 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/23770.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23770/head:pull/23770 PR: https://git.openjdk.org/jdk/pull/23770 From azafari at openjdk.org Thu Mar 6 14:22:39 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Mar 2025 14:22:39 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: References: Message-ID: On Wed, 5 Mar 2025 15:25:29 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> ReservedSpace is accepted as param. > > src/hotspot/share/cds/metaspaceShared.cpp line 1548: > >> 1546: return nullptr; >> 1547: } >> 1548: // NMT: fix up the space tags > > What exactly needs to be fixed here? Removed. Obsolete comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23770#discussion_r1983442554 From azafari at openjdk.org Thu Mar 6 14:22:39 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Mar 2025 14:22:39 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: <0CDxD_JcYtu4Ax1xB8TDyWqLkxNub6OfJRtSmCFONgU=.bd3edae0-3eaf-4ba3-ac9e-2582d1baf151@github.com> References: <0CDxD_JcYtu4Ax1xB8TDyWqLkxNub6OfJRtSmCFONgU=.bd3edae0-3eaf-4ba3-ac9e-2582d1baf151@github.com> Message-ID: On Thu, 6 Mar 2025 10:23:54 GMT, Johan Sjölen wrote: >> src/hotspot/share/cds/metaspaceShared.cpp line 1475: >> >>> 1473: (address)archive_space_rs.base() == base_address, "Sanity"); >>> 1474: // Register archive space with NMT.
>>> 1475: MemTracker::record_virtual_memory_tag(archive_space_rs.base(), archive_space_rs.size(), mtClassShared); >> >> The pattern here is: >> >> `something.base(), something.base.size()` >> >> instead of doing this over and over again, why can't we just pass `something` to MemTracker::record_virtual_memory_tag() and let it figure out `base` and `size` itself? > > We could have an overload for `ReservedSpace`. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23770#discussion_r1983441505 From gziemski at openjdk.org Thu Mar 6 15:27:04 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 6 Mar 2025 15:27:04 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 14:22:38 GMT, Afshin Zafari wrote: >> With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. >> Tests: >> linux-x64-debug, gtest:NMT* and runtime/NMT* > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > ReservedSpace is accepted as param. LGTM, thank you for fixing this. Need to fix the build errors: /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:31: error: invalid use of incomplete type 'const class ReservedSpace' 224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); | ^~ In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of 'class ReservedSpace' 38 | class ReservedSpace; | ^~~~~~~~~~~~~ In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:30: /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:42: error: invalid use of incomplete type 'const class ReservedSpace'
224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); | ^~ In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of 'class ReservedSpace' ... (rest of output omitted) ------------- Marked as reviewed by gziemski (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23770#pullrequestreview-2664792545 PR Comment: https://git.openjdk.org/jdk/pull/23770#issuecomment-2704168962 From ihse at openjdk.org Thu Mar 6 16:21:54 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 6 Mar 2025 16:21:54 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567.
> > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) make/autoconf/platform.m4 line 669: > 667: AC_ARG_ENABLE(deprecated-ports, [AS_HELP_STRING([--enable-deprecated-ports@<:@=yes/no@:>@], > 668: [Suppress the error when configuring for a deprecated port @<:@no@:>@])]) > 669: # There are no deprecated ports. This option is left to be consistent with future deprecations. Please remove. Old code is always present in git history if you want to reuse it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1983670151 From shade at openjdk.org Thu Mar 6 16:40:58 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Mar 2025 16:40:58 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 16:18:50 GMT, Magnus Ihse Bursie wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. 
The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > make/autoconf/platform.m4 line 669: > >> 667: AC_ARG_ENABLE(deprecated-ports, [AS_HELP_STRING([--enable-deprecated-ports@<:@=yes/no@:>@], >> 668: [Suppress the error when configuring for a deprecated port @<:@no@:>@])]) >> 669: # There are no deprecated ports. This option is left to be consistent with future deprecations. > > Please remove. Old code is always present in git history if you want to reuse it. I don't mind removing it, my concern would be to _remember_ this option was there! I guess it is okay to re-re-invent it later, possibly under a different name, when the next port gets deprecated. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1983704213 From wkemper at openjdk.org Thu Mar 6 17:59:00 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Mar 2025 17:59:00 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3] In-Reply-To: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com> References: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com> Message-ID: On Wed, 5 Mar 2025 21:54:08 GMT, Xiaolong Peng wrote: >> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed. >> >> This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API. >> >> ### Test >> - [x] hotspot_gc_shenandoah >> - [x] Tier 1 >> - [x] Tier 2 > > Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision: > > - Remove obsolete code comments > - Address review comments If we always get the complete marking context directly through the generation, we can delete `ShenandoahHeap::complete_marking_context`. 
src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.cpp line 123: > 121: #ifdef ASSERT > 122: bool reg_live = region->has_live(); > 123: bool bm_live = heap->complete_marking_context(region)->is_marked(cast_to_oop(region->bottom())); Could also use `heap->active_generation()->complete_marking_context()` here. src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: > 170: // contained herein. > 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { > 172: ShenandoahMarkingContext* const marking_context = _heap->complete_marking_context(region); We shouldn't need to look up the generation for this region. It's being promoted so it must be young (in fact, this asserted a few lines down). Perhaps: assert(_heap->young_generation()->is_mark_completed(), "Cannot promote without complete marking for young"); ShenandoahMarkingContext* const marking_context = _heap->marking_context(); ------------- Changes requested by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23886#pullrequestreview-2665222915 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1983818301 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1983812569 From wkemper at openjdk.org Thu Mar 6 17:59:01 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Mar 2025 17:59:01 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3] In-Reply-To: References: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com> Message-ID: On Thu, 6 Mar 2025 17:49:35 GMT, William Kemper wrote: >> Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove obsolete code comments >> - Address review comments > > src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: > >> 170: // contained herein. 
>> 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { >> 172: ShenandoahMarkingContext* const marking_context = _heap->complete_marking_context(region); > > We shouldn't need to look up the generation for this region. It's being promoted so it must be young (in fact, this asserted a few lines down). Perhaps: > > assert(_heap->young_generation()->is_mark_completed(), "Cannot promote without complete marking for young"); > ShenandoahMarkingContext* const marking_context = _heap->marking_context(); or `_heap->young_generation()->complete_marking_context()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1983821706 From cslucas at openjdk.org Thu Mar 6 18:24:34 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Mar 2025 18:24:34 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v8] In-Reply-To: References: Message-ID: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. 
In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a completely clean version of the card table. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that just became the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. > > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. Cesar Soares Lucas has updated the pull request incrementally with two additional commits since the last revision: - Revert changes to shenandoahHeap.cpp - Address PR feedback: moar clean-up.
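For readers following the thread, the table alternation described in this message can be modeled with a small, self-contained sketch. This is not code from the patch: `TwoCardTables`, its members, and the `kClean`/`kDirty` constants are hypothetical names, and a plain `std::vector` stands in for the real card table memory. It only illustrates the ordering described above: clean the current read table during `reset`, then exchange the roles of the two tables at `init-mark`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card states; not the values used by HotSpot.
constexpr std::uint8_t kClean = 0;
constexpr std::uint8_t kDirty = 1;

// Hypothetical, simplified model of two alternating card tables.
// None of these names are taken from HotSpot or from the patch.
class TwoCardTables {
public:
  explicit TwoCardTables(std::size_t cards)
      : _a(cards, kClean), _b(cards, kClean), _write(&_a), _read(&_b) {}

  // The object holds pointers into its own members, so forbid copying.
  TwoCardTables(const TwoCardTables&) = delete;
  TwoCardTables& operator=(const TwoCardTables&) = delete;

  // Mutator side: a store dirties a card in the current write table.
  void mark_dirty(std::size_t card) {
    assert(card < _write->size());
    (*_write)[card] = kDirty;
  }

  // "reset" phase: clean the current read table so that it can take over
  // as the write table after the next swap.
  void clean_read_table() {
    std::fill(_read->begin(), _read->end(), kClean);
  }

  // "init-mark": exchange the roles of the two tables. Only the pointers
  // move; no card data is copied or zeroed here.
  void swap_read_and_write_tables() { std::swap(_read, _write); }

  bool read_is_dirty(std::size_t card) const { return (*_read)[card] == kDirty; }
  bool write_is_dirty(std::size_t card) const { return (*_write)[card] == kDirty; }

private:
  std::vector<std::uint8_t> _a;
  std::vector<std::uint8_t> _b;
  std::vector<std::uint8_t>* _write;  // table mutators currently dirty
  std::vector<std::uint8_t>* _read;   // table the GC currently scans
};
```

In this model, a card dirtied before the swap becomes visible through the read table afterwards, while mutators continue on a table that was already cleaned. The real change additionally has to make JIT-compiled code load the table address dynamically, which this sketch does not attempt to show.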
------------- Changes: - all: https://git.openjdk.org/jdk/pull/23170/files - new: https://git.openjdk.org/jdk/pull/23170/files/046ea8a0..0262b7df Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=06-07 Stats: 29 lines in 4 files changed: 5 ins; 18 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From cslucas at openjdk.org Thu Mar 6 18:24:34 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Mar 2025 18:24:34 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v6] In-Reply-To: <_LIv8Ggp3ukK0HmhknyG_Mz2x5OKs63Y-qSXTQo9Gdo=.9efc86f1-6cc4-425b-9319-5e1500eb59da@github.com> References: <_LIv8Ggp3ukK0HmhknyG_Mz2x5OKs63Y-qSXTQo9Gdo=.9efc86f1-6cc4-425b-9319-5e1500eb59da@github.com> Message-ID: On Wed, 5 Mar 2025 17:32:30 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Address PR feedback: formatting. > > src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp line 57: > >> 55: _byte_map = (CardValue*) write_space.base(); >> 56: _byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); >> 57: > > It is a bit sad to see these asserts go. Is this because `_byte_map` is now mutable? May I suggest doing something like: > > > _write_byte_map = (CardValue*) write_space.base(); > _write_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); > ...later... > _read_byte_map = (CardValue*) read_space.base(); > _read_byte_map_base = _byte_map - (uintptr_t(low_bound) >> _card_shift); > ...later... 
> > // Set up current byte map > _byte_map = _write_byte_map; > _byte_map_base = _write_byte_map_base; > > // Check one side is good > assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); > assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); > swap_read_and_write_tables(); > > // Check another side is good > assert(byte_for(low_bound) == &_byte_map[0], "Checking start of map"); > assert(byte_for(high_bound-1) <= &_byte_map[last_valid_index()], "Checking end of map"); > swap_read_and_write_tables(); @shipilev - I did some tests and the conclusion is that we can put the asserts back. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1983847384 From cslucas at openjdk.org Thu Mar 6 18:24:34 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Mar 2025 18:24:34 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v5] In-Reply-To: References: <2ZFtKLn2EcbzjKQ_USb3yiOWEWQJYocFwj_rk-5h0Jg=.f4eec566-3e0c-4a75-8c27-2cb785b0081a@github.com> Message-ID: On Wed, 5 Mar 2025 17:45:19 GMT, Aleksey Shipilev wrote: >> Yes, that's for the VMThread. That seems like a good question. I > > Actually, I am wondering why this is needed. It looks to me VMThread attaches after heap initialization, and the normal `ShenandoahBarrierSet::on_thread_attach` should handle it. You're right, we didn't need that anymore. I removed + test it and we're good. I pushed a commit removing that code. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1983853294 From ihse at openjdk.org Thu Mar 6 18:25:54 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 6 Mar 2025 18:25:54 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 16:38:13 GMT, Aleksey Shipilev wrote: >> make/autoconf/platform.m4 line 669: >> >>> 667: AC_ARG_ENABLE(deprecated-ports, [AS_HELP_STRING([--enable-deprecated-ports@<:@=yes/no@:>@], >>> 668: [Suppress the error when configuring for a deprecated port @<:@no@:>@])]) >>> 669: # There are no deprecated ports. This option is left to be consistent with future deprecations. >> >> Please remove. Old code is always present in git history if you want to reuse it. > > I don't mind removing it, my concern would be to _remember_ this option was there! I guess it is okay to re-re-invent it later, possibly under a different name, when the next port gets deprecated. It's not that important, no. I'm not sure if previous deprecated ports were handled exactly like this. And you can always run something like `git log | grep -i "remove .* port"` to find the change it was removed in, and look at what it did... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1983855800 From xpeng at openjdk.org Thu Mar 6 18:29:59 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 18:29:59 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3] In-Reply-To: References: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com> Message-ID: <4q52xc9nKJWFe63AT5i4InyJuRu6pTPahZYmmWTJia4=.f7be6d2d-2082-4644-b6e9-dff343b20cdf@github.com> On Thu, 6 Mar 2025 17:55:53 GMT, William Kemper wrote: > If we always get the complete marking context directly through the generation, we can delete `ShenandoahHeap::complete_marking_context`.
True, we don't really need it anymore, I'll update the PR and remove it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23886#issuecomment-2704629400 From xpeng at openjdk.org Thu Mar 6 18:30:00 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 18:30:00 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v3] In-Reply-To: References: <1fKMcwPJFREZry2kJf0Vv3DoY5G4xzbdVJcK4It9hyo=.9a38f089-86c6-4fc9-abeb-a807284be822@github.com> Message-ID: On Thu, 6 Mar 2025 17:56:31 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: >> >>> 170: // contained herein. >>> 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { >>> 172: ShenandoahMarkingContext* const marking_context = _heap->complete_marking_context(region); >> >> We shouldn't need to look up the generation for this region. It's being promoted so it must be young (in fact, this asserted a few lines down). Perhaps: >> >> assert(_heap->young_generation()->is_mark_completed(), "Cannot promote without complete marking for young"); >> ShenandoahMarkingContext* const marking_context = _heap->marking_context(); > > or `_heap->young_generation()->complete_marking_context()`. I think `_heap->young_generation()->complete_marking_context()` is better here, I'll update it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1983859657 From xpeng at openjdk.org Thu Mar 6 18:34:43 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 18:34:43 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: > With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed. > > This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API. > > ### Test > - [x] hotspot_gc_shenandoah > - [x] Tier 1 > - [x] Tier 2 Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) - Revert "complete_marking_context should guarantee mark is complete" This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1. 
- complete_marking_context should guarantee mark is complete ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23886/files - new: https://git.openjdk.org/jdk/pull/23886/files/c78f66ee..952f7ea5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=02-03 Stats: 9 lines in 5 files changed: 0 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886 PR: https://git.openjdk.org/jdk/pull/23886 From wkemper at openjdk.org Thu Mar 6 18:47:58 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Mar 2025 18:47:58 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 18:34:43 GMT, Xiaolong Peng wrote: >> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed. >> >> This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API. 
>> >> ### Test >> - [x] hotspot_gc_shenandoah >> - [x] Tier 1 >> - [x] Tier 2 > > Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: > > - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) > - Revert "complete_marking_context should guarantee mark is complete" > > This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1. > - complete_marking_context should guarantee mark is complete Thanks for cleaning this up. ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23886#pullrequestreview-2665341497 From cslucas at openjdk.org Thu Mar 6 19:45:21 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Mar 2025 19:45:21 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v9] In-Reply-To: References: Message-ID: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. 
In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that just turned the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. > > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix build: no shenandoah on arm32.
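The alternation of read and write card tables described in this PR summary can be illustrated with a minimal, single-threaded sketch. It is not the HotSpot implementation: the class `AlternatingCardTables`, the constants `kCleanCard` and `kDirtyCard`, and the helper `write_table_is_clean` are hypothetical names introduced only for illustration, and everything concurrent (barriers in JIT-compiled code, safepoints, handshakes) is left out. Only `swap_read_and_write_tables` borrows a name that appears in the review thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card values for this sketch only (not HotSpot's CardValue encoding).
constexpr std::uint8_t kCleanCard = 0;
constexpr std::uint8_t kDirtyCard = 1;

// Two equally sized tables whose roles (read vs. write) alternate per GC cycle.
class AlternatingCardTables {
public:
  explicit AlternatingCardTables(std::size_t num_cards)
      : _table_a(num_cards, kCleanCard),
        _table_b(num_cards, kCleanCard),
        _write_table(&_table_a),
        _read_table(&_table_b) {}

  // The role pointers refer to our own members, so copying is disabled.
  AlternatingCardTables(const AlternatingCardTables&) = delete;
  AlternatingCardTables& operator=(const AlternatingCardTables&) = delete;

  // Mutator side: dirty a card in whichever table is currently the write table.
  void mark_dirty(std::size_t card) { (*_write_table)[card] = kDirtyCard; }

  // GC side: inspect whichever table is currently the read table.
  bool is_dirty_in_read_table(std::size_t card) const {
    return (*_read_table)[card] == kDirtyCard;
  }

  // Stand-in for the `reset` phase: clean the current read table ahead of the swap.
  void reset_read_table() {
    std::fill(_read_table->begin(), _read_table->end(), kCleanCard);
  }

  // Stand-in for the `init-mark` step: exchange the roles of the two tables.
  void swap_read_and_write_tables() { std::swap(_read_table, _write_table); }

  // True when no card is dirty in the current write table.
  bool write_table_is_clean() const {
    return std::all_of(_write_table->begin(), _write_table->end(),
                       [](std::uint8_t c) { return c == kCleanCard; });
  }

private:
  std::vector<std::uint8_t> _table_a;
  std::vector<std::uint8_t> _table_b;
  std::vector<std::uint8_t>* _write_table;
  std::vector<std::uint8_t>* _read_table;
};
```

In this sketch, a card dirtied before the swap becomes visible through the read table after `reset_read_table()` followed by `swap_read_and_write_tables()`, while the new write table starts out clean, so nothing has to be copied or zeroed at swap time.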
------------- Changes: - all: https://git.openjdk.org/jdk/pull/23170/files - new: https://git.openjdk.org/jdk/pull/23170/files/0262b7df..0a540c79 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23170&range=07-08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/23170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23170/head:pull/23170 PR: https://git.openjdk.org/jdk/pull/23170 From shade at openjdk.org Thu Mar 6 19:49:58 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Mar 2025 19:49:58 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v9] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 19:45:21 GMT, Cesar Soares Lucas wrote: >> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. >> >> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. >> >> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. 
In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that just turned the new _read_ table. >> >> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. >> >> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. >> >> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. >> >> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix build: no shenandoah on arm32. Looks fine now, thanks! I have not looked deeply at card table lifecycle, so I rely on @kdnilsen and @earthling-amzn reviews here. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23170#pullrequestreview-2665465682 From xpeng at openjdk.org Thu Mar 6 20:23:00 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 20:23:00 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 23:38:47 GMT, Y.
Srinivas Ramakrishna wrote: > If so, can we dispense with the forcing of global context's state into the contexts for the two generations? I think we can do that if we deprecated the classical mode and only support generational Shenandoah, in classical mode, there is only global generation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1983998709 From duke at openjdk.org Thu Mar 6 22:20:04 2025 From: duke at openjdk.org (duke) Date: Thu, 6 Mar 2025 22:20:04 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v9] In-Reply-To: References: Message-ID: <8TjLe-qgfKrkvUOoUUq5rDeYMYXxQt_isizNgOBsiJg=.ebb05e71-9ee6-4024-a12e-5c7ed8bd6b5f@github.com> On Thu, 6 Mar 2025 19:45:21 GMT, Cesar Soares Lucas wrote: >> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. >> >> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. >> >> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. 
When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that just turned the new _read_ table. >> >> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. >> >> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. >> >> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. >> >> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix build: no shenandoah on arm32. @JohnTortugo Your change (at version 0a540c79584f28fe90d128977f5121467f59626b) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23170#issuecomment-2705065887 From ysr at openjdk.org Thu Mar 6 22:57:07 2025 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Thu, 6 Mar 2025 22:57:07 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 18:34:43 GMT, Xiaolong Peng wrote: >> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed. >> >> This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API. >> >> ### Test >> - [x] hotspot_gc_shenandoah >> - [x] Tier 1 >> - [x] Tier 2 > > Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: > > - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) > - Revert "complete_marking_context should guarantee mark is complete" > > This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1. > - complete_marking_context should guarantee mark is complete A few more comments, mostly pertaining to global gen's "complete" marking context semantics and usage, as well as `SH::[*_]marking_context` delegating to its `active_generation()`'s method. This should be my last round of comments. Thank you for your patience... 
src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.cpp line 123: > 121: #ifdef ASSERT > 122: bool reg_live = region->has_live(); > 123: bool bm_live = heap->active_generation()->complete_marking_context()->is_marked(cast_to_oop(region->bottom())); Apropos of another comment, if we really want to keep a delegating method in `ShenandoahHeap`, why not use `heap->complete_marking_context()` as a synonym for `heap->active_generation()->complete_marking_context()` ? src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 1: > 1: /* This all looks good. One thing to think about in general regarding assertions in closures is whether, instead of relying on knowledge of the context in which these closures are used, it may produce more maintainable code to embed the "active_generation" to which the closure is being applied in the closure itself and have the assertions (or other uses of context) use that instead. Nothing to be done now, but something to think about in making more maintainable code. src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: > 170: // contained herein. > 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { > 172: ShenandoahMarkingContext* const marking_context = _heap->young_generation()->complete_marking_context(); For clarity, you might assert the following before line 172: assert(gc_generation() == _heap->young_generation(), "Sanity check"); Even though it might seem somewhat tautological. src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1: > 1: /* See previous suggestion/comments on `SH::[complete_]marking_context()` as delegating to that method of its `gc_generation()`. What you have here sounds fine too, but a uniform usage of either keeping `SH::[complete_]marking_context()` or not at all makes more sense to me, and seems cleaner to me.
src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1283: > 1281: if (_heap->gc_generation()->is_global()) { > 1282: return _heap->marking_context(); > 1283: } Not sure I understand the point of this change in behavior. What purpose does a partial marking context serve? Why not just leave the behavior as was before and return a non-null marking context only when marking is complete and null otherwise. When the client uses the context, it does so to skip over unmarked objects (which are dead if marking is complete), which might end up being too weak if we are still in the midst of marking. I realize that you may not be maintaining a global mark completion so you are returning the marking context irrespective of the state of completion of the marking, but I wonder if that is really the bahavior you want. I would rather, as necessary, we maintain a flag for completion of global marking for the case where we are doing a global gc? ------------- PR Review: https://git.openjdk.org/jdk/pull/23886#pullrequestreview-2665674435 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984134275 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984075136 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984115881 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984140131 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984108547 From ysr at openjdk.org Thu Mar 6 22:57:09 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 6 Mar 2025 22:57:09 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: <7yfWKXewUM1XqWtlnyuPV3nu9bGr5VNJXuXi1aNQGvQ=.4c53d85b-13f3-4bfc-87c3-634d547bb440@github.com> On Wed, 5 Mar 2025 21:58:03 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1028: >> >>> 1026: >>> 1027: #ifdef ASSERT >>> 1028: ShenandoahMarkingContext* const ctx = _heap->marking_context(); >> >> Why not this instead? >> >> ShenandoahMarkingContext* const ctx = _heap->marking_context(r); > > Technically there is only one global marking context for Shenandoah, even in generational mode, passing the region to marking_context doesn't make any difference. > > But in the method `complete_marking_context(r)`, it checks if the affiliated generation has complete marking, it is a more convenient version of `complete_marking_context(affiliation)`. OK, yes, that makes sense. Why not then use both `ShenandoahHeap::[complete_]marking_context()` as synonyms for `ShehandoahHeap::active_generation()->[complete_]marking_context()`. See other related comments in this review round. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984136546 From ysr at openjdk.org Thu Mar 6 22:57:09 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 6 Mar 2025 22:57:09 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: <8w22oUPhZEx0iEIeNQ-GUUjx8jNkjXrTHjfjN_sX4HE=.2c391dd5-227e-4755-ba4d-528a7dcefca3@github.com> On Thu, 6 Mar 2025 20:20:17 GMT, Xiaolong Peng wrote: >> OK, I see that `ShenandoahGlobalGeneration` forces the state of `ShenandoahOdGeneration` and `ShenandoahYoungGeneration`, but is that our intention? 
I am seeing (see comment elsewhere) that we are always either using global generation's marking context explicitly, or using a region to index into the appropriate containing generation's marking context. If so, can we dispense with the forcing of global context's state into the contexts for the two generations? > >> If so, can we dispense with the forcing of global context's state into the contexts for the two generations? > > I think we can do that if we deprecated the classical mode and only support generational Shenandoah, in classical mode, there is only global generation. I am not sure I follow. In the legacy (non-generational mode) we shouldn't care about the marking state of the old and young generations, just that of the GlobalGeneration. In the generational case, we explicitly track the marking state of the old and young generations explicitly. It sounds to me as if forcing the Old and Young marking states to the state of that of the GlobalGeneration must be exactly for the case where we are using Generational Shenandoah, and we are doing a Global collection? Indeed: void ShenandoahGlobalGeneration::set_mark_complete() { ShenandoahGeneration::set_mark_complete(); if (ShenandoahHeap::heap()->mode()->is_generational()) { ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); heap->young_generation()->set_mark_complete(); heap->old_generation()->set_mark_complete(); } } I am saying that each of Old, Young, and Global generations maintain their own mark completion state and use that to determine what they pass back in response to `complete_marking_context()`. This completely localizes all state rather than unnecessarily and confusingly coupling the states of these generations. So, you remove the part in the `if` branch in the code above, which reduces to the default (or rather only) implementation in the base class, not requiring the over-ride of the Global generation's method for the generational case. 
void ShenandoahGeneration::set_mark_complete() { _is_marking_complete.set(); } It is possible that I am still missing the actual structure here that requires the override for GlobalGeneration for the generational case. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984121357 From xpeng at openjdk.org Thu Mar 6 23:12:53 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:12:53 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: <7yfWKXewUM1XqWtlnyuPV3nu9bGr5VNJXuXi1aNQGvQ=.4c53d85b-13f3-4bfc-87c3-634d547bb440@github.com> References: <7yfWKXewUM1XqWtlnyuPV3nu9bGr5VNJXuXi1aNQGvQ=.4c53d85b-13f3-4bfc-87c3-634d547bb440@github.com> Message-ID: On Thu, 6 Mar 2025 22:27:59 GMT, Y. Srinivas Ramakrishna wrote: >> Technically there is only one global marking context for Shenandoah, even in generational mode, passing the region to marking_context doesn't make any difference. >> >> But in the method `complete_marking_context(r)`, it checks if the affiliated generation has complete marking, it is a more convenient version of `complete_marking_context(affiliation)`. > > OK, yes, that makes sense. Why not then use both `ShenandoahHeap::[complete_]marking_context()` as synonyms for `ShehandoahHeap::active_generation()->[complete_]marking_context()`. See other related comments in this review round. I feel that using `ShenandoahHeap::complete_marking_context()` as a synonym for `ShenandoahHeap::active_generation()->[complete_]marking_context()` may cause more confusion: judging from the name alone, it seems to indicate that marking is complete for the whole heap, not just the active generation.
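To make the distinction in this exchange concrete, here is a minimal sketch of one shared marking context combined with a separate completeness flag per generation, where the "complete" accessor is guarded by the flag of the generation it is called on. The types `SketchMarkingContext` and `SketchGeneration` are hypothetical stand-ins, not the HotSpot classes; only the method names mirror those quoted in the thread, and a plain `std::atomic<bool>` is assumed in place of whatever flag type HotSpot uses.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical placeholder for the single marking context (bitmap) shared by all generations.
struct SketchMarkingContext {};

// Hypothetical generation holding only its own mark-completeness state.
class SketchGeneration {
public:
  explicit SketchGeneration(SketchMarkingContext* ctx)
      : _ctx(ctx), _is_marking_complete(false) {}

  void set_mark_complete() { _is_marking_complete.store(true); }
  void set_mark_incomplete() { _is_marking_complete.store(false); }
  bool is_mark_complete() const { return _is_marking_complete.load(); }

  // Unguarded accessor: usable while marking is still in progress.
  SketchMarkingContext* marking_context() const { return _ctx; }

  // Guarded accessor: callers rely on marking being complete for THIS generation.
  SketchMarkingContext* complete_marking_context() const {
    assert(is_mark_complete() && "marking must be complete for this generation");
    return _ctx;
  }

private:
  SketchMarkingContext* const _ctx;
  std::atomic<bool> _is_marking_complete;
};
```

With this shape, completing marking for a young generation says nothing about an old generation that shares the same context, which is the reason a heap-level accessor with "complete" in its name could be read as promising more than it checks.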
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984170738 From xpeng at openjdk.org Thu Mar 6 23:29:55 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:29:55 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: <8w22oUPhZEx0iEIeNQ-GUUjx8jNkjXrTHjfjN_sX4HE=.2c391dd5-227e-4755-ba4d-528a7dcefca3@github.com> References: <8w22oUPhZEx0iEIeNQ-GUUjx8jNkjXrTHjfjN_sX4HE=.2c391dd5-227e-4755-ba4d-528a7dcefca3@github.com> Message-ID: On Thu, 6 Mar 2025 22:11:26 GMT, Y. Srinivas Ramakrishna wrote: >>> If so, can we dispense with the forcing of global context's state into the contexts for the two generations? >> >> I think we can do that if we deprecated the classical mode and only support generational Shenandoah, in classical mode, there is only global generation. > > I am not sure I follow. In the legacy (non-generational mode) we shouldn't care about the marking state of the old and young generations, just that of the GlobalGeneration. In the generational case, we explicitly track the marking state of the old and young generations explicitly. It sounds to me as if forcing the Old and Young marking states to the state of that of the GlobalGeneration must be exactly for the case where we are using Generational Shenandoah, and we are doing a Global collection? Indeed: > > > void ShenandoahGlobalGeneration::set_mark_complete() { > ShenandoahGeneration::set_mark_complete(); > if (ShenandoahHeap::heap()->mode()->is_generational()) { > ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); > heap->young_generation()->set_mark_complete(); > heap->old_generation()->set_mark_complete(); > } > } > > > I am saying that each of Old, Young, and Global generations maintain their own mark completion state and use that to determine what they pass back in response to `complete_marking_context()`. 
This completely localizes all state rather than unnecessarily and confusingly coupling the states of these generations. > > So, you remove the part in the `if` branch in the code above, which reduces to the default (or rather only) implementation in the base class, not requiring the over-ride of the Global generation's method for the generational case. > > > void ShenandoahGeneration::set_mark_complete() { > _is_marking_complete.set(); > } > > > It is possible that I am still missing the actual structure here that requires the override for GlobalGeneration for the generational case. Sorry, I misunderstood your original proposal. I thought you were suggesting removing the flag from ShenandoahGlobalGeneration, so that set_mark_complete/is_mark_complete would act more like a view/delegation layer, for example: void ShenandoahGlobalGeneration::set_mark_complete() { ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); heap->young_generation()->set_mark_complete(); heap->old_generation()->set_mark_complete(); } bool ShenandoahGlobalGeneration::is_mark_complete() { ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); return heap->young_generation()->is_mark_complete() && heap->old_generation()->is_mark_complete(); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984182699 From xpeng at openjdk.org Thu Mar 6 23:36:55 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:36:55 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> On Thu, 6 Mar 2025 22:05:31 GMT, Y.
Srinivas Ramakrishna wrote: >> Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: >> >> - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) >> - Revert "complete_marking_context should guarantee mark is complete" >> >> This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1. >> - complete_marking_context should guarantee mark is complete > > src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: > >> 170: // contained herein. >> 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { >> 172: ShenandoahMarkingContext* const marking_context = _heap->young_generation()->complete_marking_context(); > > For clarity, you might assert the following before line 172: > > assert(gc_generation() == _heap->young_generation(), "Sanity check"); > > > Even though it might seem somewhat tautological. Thanks, I'll add it. > src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1283: > >> 1281: if (_heap->gc_generation()->is_global()) { >> 1282: return _heap->marking_context(); >> 1283: } > > Not sure I understand the point of this change in behavior. What purpose does a partial marking context serve? Why not just leave the behavior as was before and return a non-null marking context only when marking is complete and null otherwise. When the client uses the context, it does so to skip over unmarked objects (which are dead if marking is complete), which might end up being too weak if we are still in the midst of marking. > > I realize that you may not be maintaining a global mark completion so you are returning the marking context irrespective of the state of completion of the marking, but I wonder if that is really the bahavior you want. I would rather, as necessary, we maintain a flag for completion of global marking for the case where we are doing a global gc? 
It is confusing because it looks like a behavior change, but actually there is no behavior change in this method; the change here only keeps the behavior of this method exactly the same as before. The old impl always returned the marking context, regardless of the completeness status of marking, because the old `_heap->complete_marking_context()` always returned w/o an assert error due to the inaccurate marking completeness status in the marking context. We are fixing that issue in this PR, which breaks the old impl of this method. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984188328 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984188121 From xpeng at openjdk.org Thu Mar 6 23:48:53 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:48:53 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: <8w22oUPhZEx0iEIeNQ-GUUjx8jNkjXrTHjfjN_sX4HE=.2c391dd5-227e-4755-ba4d-528a7dcefca3@github.com> Message-ID: On Thu, 6 Mar 2025 23:26:16 GMT, Xiaolong Peng wrote: >> I am not sure I follow. In the legacy (non-generational mode) we shouldn't care about the marking state of the old and young generations, just that of the GlobalGeneration. In the generational case, we explicitly track the marking state of the old and young generations explicitly. It sounds to me as if forcing the Old and Young marking states to the state of that of the GlobalGeneration must be exactly for the case where we are using Generational Shenandoah, and we are doing a Global collection?
Indeed: >> >> >> void ShenandoahGlobalGeneration::set_mark_complete() { >> ShenandoahGeneration::set_mark_complete(); >> if (ShenandoahHeap::heap()->mode()->is_generational()) { >> ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); >> heap->young_generation()->set_mark_complete(); >> heap->old_generation()->set_mark_complete(); >> } >> } >> >> >> I am saying that each of Old, Young, and Global generations maintain their own mark completion state and use that to determine what they pass back in response to `complete_marking_context()`. This completely localizes all state rather than unnecessarily and confusingly coupling the states of these generations. >> >> So, you remove the part in the `if` branch in the code above, which reduces to the default (or rather only) implementation in the base class, not requiring the over-ride of the Global generation's method for the generational case. >> >> >> void ShenandoahGeneration::set_mark_complete() { >> _is_marking_complete.set(); >> } >> >> >> It is possible that I am still missing the actual structure here that requires the override for GlobalGeneration for the generational case. 
> > Sorry I misunderstood your original proposal, I thought you meant to suggest to remove the flag from ShenandoahGlobalGeneration, instead the set_mark_complete/is_mark_complete will more like view/delegation layer like: > > void ShenandoahGlobalGeneration::set_mark_complete() { > ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); > heap->young_generation()->set_mark_complete(); > heap->old_generation()->set_mark_complete(); > } > > bool ShenandoahGlobalGeneration::is_mark_complete() { > ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); > return heap->young_generation()->is_mark_complete() && heap->old_generation()->is_mark_complete(); > } You proposal will make the impl of the set_mark_complete/is_mark_complete of ShenandoahGeneration cleaner, but the thing is it will change current design and behavior, we may have to update the code where there methods is called, e.g. when we call `set_mark_complete` of gc_generation/active_generation, if it is global generation, we may have to explicitly call the same methods of ShenandoahYoungGeneration and ShenandoahOldGeneration to fan out the status. How about I follow up it in a separate task and update the implementation if necessary? I want to limit the changes involved in this PR, and only fix the bug. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984196615 From xpeng at openjdk.org Thu Mar 6 23:54:54 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:54:54 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 21:26:22 GMT, Y. 
Srinivas Ramakrishna wrote: >> Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: >> >> - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) >> - Revert "complete_marking_context should guarantee mark is complete" >> >> This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1. >> - complete_marking_context should guarantee mark is complete > > src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 1: > >> 1: /* > > This all looks good. > One thing to think about in general about assertions in closures is whether instead of making use of knowledge of the context in which these closures are used, whether it may produce more maintainable code to embed the "active_generation" to which the closure is being applied in the closure itself and have the assertions (or other uses of context) use that instead. Nothing to be done now, but something to think about in making more maintainable code. Right, active_generation should be used instead of global_generation to get the complete marking context. In the context of full GC we know that active_generation is the global gen, but it's better not to use global_generation directly, for more maintainable code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984201726 From xpeng at openjdk.org Thu Mar 6 23:57:54 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Mar 2025 23:57:54 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> References: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> Message-ID: On Thu, 6 Mar 2025 23:34:21 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp line 172: >> >>> 170: // contained herein.
>>> 171: void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { >>> 172: ShenandoahMarkingContext* const marking_context = _heap->young_generation()->complete_marking_context(); >> >> For clarity, you might assert the following before line 172: >> >> assert(gc_generation() == _heap->young_generation(), "Sanity check"); >> >> >> Even though it might seem somewhat tautological. > > Thanks, I'll add it. Question: Does Shenandoah promote region in global cycles? the gc_generation might be global if so. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984203547 From xpeng at openjdk.org Fri Mar 7 00:14:10 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Mar 2025 00:14:10 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v5] In-Reply-To: References: Message-ID: > With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed. > > This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API. 
> > ### Test > - [x] hotspot_gc_shenandoah > - [x] Tier 1 > - [x] Tier 2 Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23886/files - new: https://git.openjdk.org/jdk/pull/23886/files/952f7ea5..17bcb358 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23886&range=03-04 Stats: 6 lines in 2 files changed: 1 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/23886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886 PR: https://git.openjdk.org/jdk/pull/23886 From xpeng at openjdk.org Fri Mar 7 00:36:53 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Mar 2025 00:36:53 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 23:52:32 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 1: >> >>> 1: /* >> >> This all looks good. >> One thing to think about in general about assertions in closures is whether instead of making use of knowledge of the context in which these closures are used, whether it may produce more maintainable code to embed the "active_generation" to which the closure is being applied in the closure itself and have the assertions (or other uses of context) use that instead. Nothing to be done now, but something to think about in making more maintainable code. > > Right, active_generation should be used instead of global_generation to get the complete marking context. In the context of full GC we know that active_generation is the global gen, but it's better not to use global_generation directly, for more maintainable code. Updated it to use active_generation.
>> src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1283: >> >>> 1281: if (_heap->gc_generation()->is_global()) { >>> 1282: return _heap->marking_context(); >>> 1283: } >> >> Not sure I understand the point of this change in behavior. What purpose does a partial marking context serve? Why not just leave the behavior as was before and return a non-null marking context only when marking is complete and null otherwise. When the client uses the context, it does so to skip over unmarked objects (which are dead if marking is complete), which might end up being too weak if we are still in the midst of marking. >> >> I realize that you may not be maintaining a global mark completion so you are returning the marking context irrespective of the state of completion of the marking, but I wonder if that is really the behavior you want. I would rather, as necessary, we maintain a flag for completion of global marking for the case where we are doing a global gc? > > It is confusing because it looks like a behavior change, but actually there is no behavior change in this method; the change here only keeps the behavior of this method exactly the same as before. > > The old impl always returned the marking context, regardless of the completeness status of marking, because the old `_heap->complete_marking_context()` always returned w/o an assert error due to the inaccurate marking completeness status in the marking context. We are fixing that issue in this PR, which breaks the old impl of this method. The method get_marking_context_for_old is called at line 1363 in method `verify_rem_set_before_mark`; as the name indicates, it could be called before mark. If the current gc generation is global, the old marking flag should have been set to false when the global flag is set. It is a bit weird; I'm not sure if we should change it now, but I think we will have to correct/update the impl of this method later when we update the design of the completeness flags of the global/young/old generations.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984234928 PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984233610 From xpeng at openjdk.org Fri Mar 7 00:48:56 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Mar 2025 00:48:56 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 22:54:14 GMT, Y. Srinivas Ramakrishna wrote: > A few more comments, mostly pertaining to global gen's "complete" marking context semantics and usage, as well as `SH::[*_]marking_context` delegating to its `active_generation()`'s method. > > This should be my last round of comments. Thank you for your patience... Thanks so much for the reviews. I'll probably not add a method like `SH::complete_marking_context()` delegating to `SH::active_generation()->complete_marking_context()`, because of the other confusion it may cause. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23886#issuecomment-2705261964 From xpeng at openjdk.org Fri Mar 7 01:04:52 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Mar 2025 01:04:52 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> Message-ID: On Fri, 7 Mar 2025 00:58:15 GMT, Y. Srinivas Ramakrishna wrote: >> Question: Does Shenandoah promote region in global cycles? the gc_generation might be global if so. > > Good point. I don't see any reason promotions should be verboten in global cycles. cc @earthling-amzn ? > > If that is indeed the case, a clean separation and maintenance of completeness of marking for global generation, and use of `_heap->gc_generation()` would make sense to me.
Thanks for the confirmation. I added the assert below, since gc_generation could be global: assert(!_heap->gc_generation()->is_old(), "Sanity check"); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984253086 From ysr at openjdk.org Fri Mar 7 01:09:52 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 7 Mar 2025 01:09:52 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Fri, 7 Mar 2025 00:31:53 GMT, Xiaolong Peng wrote: >> It is confusing because it looks like a behavior change, but actually there is no behavior change in this method; the change here only keeps the behavior of this method exactly the same as before. >> >> The old impl always returned the marking context, regardless of the completeness status of marking, because the old `_heap->complete_marking_context()` always returned w/o an assert error due to the inaccurate marking completeness status in the marking context. We are fixing that issue in this PR, which breaks the old impl of this method. > > The method get_marking_context_for_old is called at line 1363 in method `verify_rem_set_before_mark`; as the name indicates, it could be called before mark. > > If the current gc generation is global, the old marking flag should have been set to false when the global flag is set. It is a bit weird; I'm not sure if we should change it now, but I think we will have to correct/update the impl of this method later when we update the design of the completeness flags of the global/young/old generations. Here's my thinking: The clients of this method do not want to use an incomplete marking context. We either want to look at all the objects (when marking information is incomplete) or we want a complete marking context, in which case we will skip over dead objects. Hence my reservation about this change.
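A minimal sketch of the contract described above, assuming simplified stand-in types. `MarkingContext`, `Generation`, `context_for_verification` and `objects_to_visit` are hypothetical names for illustration only and are not the HotSpot classes or the verifier's actual code. It shows a client that either walks every object (marking incomplete, no context handed out) or uses a complete context to skip dead objects:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a marking bitmap: one mark bit per object index.
struct MarkingContext {
  explicit MarkingContext(std::size_t objects) : marked(objects, false) {}
  bool is_marked(std::size_t obj) const { return marked[obj]; }
  std::vector<bool> marked;
};

// Hypothetical stand-in for a generation that owns its own completeness flag.
class Generation {
 public:
  explicit Generation(std::size_t objects) : _ctx(objects) {}

  void mark(std::size_t obj) { _ctx.marked[obj] = true; }
  void set_mark_complete() { _mark_complete = true; }
  bool is_mark_complete() const { return _mark_complete; }

  // Accessor that insists on completeness, in the spirit of the
  // "complete" accessor discussed in this thread (assumption: assert-only).
  const MarkingContext* complete_marking_context() const {
    assert(is_mark_complete() && "marking must be complete");
    return &_ctx;
  }

 private:
  MarkingContext _ctx;
  bool _mark_complete = false;
};

// Hands out a context only when it is safe to trust it; nullptr otherwise.
const MarkingContext* context_for_verification(const Generation& gen) {
  return gen.is_mark_complete() ? gen.complete_marking_context() : nullptr;
}

// Counts the objects a verifier would look at: all of them when there is no
// complete context, only the marked (live) ones when there is.
std::size_t objects_to_visit(const Generation& gen, std::size_t objects) {
  const MarkingContext* ctx = context_for_verification(gen);
  std::size_t visited = 0;
  for (std::size_t obj = 0; obj < objects; ++obj) {
    if (ctx == nullptr || ctx->is_marked(obj)) {
      ++visited;
    }
  }
  return visited;
}
```

With two of four objects marked, `objects_to_visit` covers all four objects until `set_mark_complete()` is called, and only the two marked ones afterwards. An unmarked object is never treated as dead on the basis of a partial bitmap.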
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984255948 From ysr at openjdk.org Fri Mar 7 01:13:02 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 7 Mar 2025 01:13:02 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> Message-ID: On Fri, 7 Mar 2025 01:02:29 GMT, Xiaolong Peng wrote: >> Good point. I don't see any reason promotions should be verboten in global cycles. cc @earthling-amzn ? >> >> If that is indeed the case, a clean separation and maintenance of completeness of marking for global generation, and use of `_heap->gc_generation()` would make sense to me. > > Thanks for the confirmation, I added assert as below since it gc_generation could be global : > > > assert(!_heap->gc_generation()->is_old(), "Sanity check"); The assert may be fine, but the treatment of completeness of the marking context seems very brittle to me and apt to cause problems in the future. I would prefer a cleaner separation of these. May be we can sync up separately to discuss this along with @earthling-amzn . ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1984257756 From dholmes at openjdk.org Fri Mar 7 07:20:58 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Mar 2025 07:20:58 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com> Message-ID: On Thu, 6 Mar 2025 09:48:47 GMT, Aleksey Shipilev wrote: > After this PR integrates, it is not possible to build x86_32 You could add a couple of lines to the build code and it would not be possible to build 32-bit, so that is a necessary but not sufficient condition to claim to implement the JEP IMO. 
I'm not looking for one big PR, I'm looking for multiple PR's as proposed but which all fall under the JEP umbrella. Until the JEP is targeted then nothing can be integrated anyway. This is what, I thought, dependent PR's were designed for. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2705707504 From dholmes at openjdk.org Fri Mar 7 07:23:54 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Mar 2025 07:23:54 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 18:23:24 GMT, Magnus Ihse Bursie wrote: >> I don't mind removing it, my concern would be to _remember_ this option was there! I guess it is okay to re-re-invent it later, possibly under a different name, when the next port gets deprecated. > > It's no that important, no. I'm not sure if previous deprecated ports were handles exactly like this. > > And you can always do like `git log | grep -i "remove .* port"` to find the change it was removed in, and look what it did... I think leaving a comment describing how to deprecate a port is useful. To look it up in history you have to realise there is something to look up. "They who are not reminded of the past will invent a new way to do it in the future." ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1984572816 From azafari at openjdk.org Fri Mar 7 09:36:42 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Mar 2025 09:36:42 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v3] In-Reply-To: References: Message-ID: > With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. > Tests: > linux-x64-debug, gtest:NMT* and runtime/NMT* Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: fixed build problem. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/23770/files - new: https://git.openjdk.org/jdk/pull/23770/files/1e7853e6..87f22f46 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23770.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23770/head:pull/23770 PR: https://git.openjdk.org/jdk/pull/23770 From azafari at openjdk.org Fri Mar 7 10:26:17 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Mar 2025 10:26:17 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v4] In-Reply-To: References: Message-ID: > With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. > Tests: > linux-x64-debug, gtest:NMT* and runtime/NMT* Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: new fix. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23770/files - new: https://git.openjdk.org/jdk/pull/23770/files/87f22f46..3850708c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=02-03 Stats: 2 lines in 2 files changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23770.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23770/head:pull/23770 PR: https://git.openjdk.org/jdk/pull/23770 From ihse at openjdk.org Fri Mar 7 11:29:58 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 7 Mar 2025 11:29:58 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. 
> > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) I agree with David here. Yes, implementing this multiple PRs is the correct approach (I think we all agree on this). However, it seems strange to mark just this single PR as implementing the JEP. Instead, that honor should fall on an umbrella JBS issue, which is dependent on this PR, but also the other planned updates. Before these are done, we can't really say that the JEP is implemented. In practical terms it does not mean much, but the bookkeeping seems better aligned with reality in that way. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2706212136 From ihse at openjdk.org Fri Mar 7 15:10:57 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 7 Mar 2025 15:10:57 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 7 Mar 2025 07:21:43 GMT, David Holmes wrote: >> It's not that important, no. I'm not sure if previous deprecated ports were handled exactly like this. >> >> And you can always do like `git log | grep -i "remove .* port"` to find the change it was removed in, and look what it did... > > I think leaving a comment describing how to deprecate a port is useful. To look it up in history you have to realise there is something to look up. > > "They who are not reminded of the past will invent a new way to do it in the future." The `--enable-deprecated-ports` is still there. All that is removed is an if statement and a print line. I know the make syntax can seem intimidating, but just ask me or any other build team member if you need help to recreate such a thing. It is not like it is a complicated algorithm that can be written in many ways. This is just make's equivalent of: if (some_condition) { println("whatever"); } To me this is just utter nonsense to keep that commented out. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1985229429 From ihse at openjdk.org Fri Mar 7 15:10:59 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 7 Mar 2025 15:10:59 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name.
Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) make/autoconf/platform.m4 line 669: > 667: AC_ARG_ENABLE(deprecated-ports, [AS_HELP_STRING([--enable-deprecated-ports@<:@=yes/no@:>@], > 668: [Suppress the error when configuring for a deprecated port @<:@no@:>@])]) > 669: # There are no deprecated ports. This option is left to be consistent with future deprecations. Also, to be clear, we need to keep the option to not break people's scripts. The alternative would be to deprecate the `--enable-deprecated-ports` arguments, and then remove it in a future release, but I think it is reasonable to keep it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1985232539 From azafari at openjdk.org Fri Mar 7 16:06:32 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Mar 2025 16:06:32 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v5] In-Reply-To: References: Message-ID: <0SlK7ixxGv5N7-LQnC7SwgpcK4Oz_9_H24qnrGPrTpc=.9bfd6434-6a48-4563-9dd6-66cff70dafe7@github.com> > With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. > Tests: > linux-x64-debug, gtest:NMT* and runtime/NMT* Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge remote-tracking branch 'origin/master' into _8350566_size_par_set_tag - new fix. - fixed build problem. - ReservedSpace is accepted as param. - applied also to VMT. - 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag ------------- Changes: https://git.openjdk.org/jdk/pull/23770/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23770&range=04 Stats: 27 lines in 14 files changed: 6 ins; 1 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/23770.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23770/head:pull/23770 PR: https://git.openjdk.org/jdk/pull/23770 From wkemper at openjdk.org Fri Mar 7 18:30:54 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 7 Mar 2025 18:30:54 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: <9nhUQ5sIaBFGlhEh-w5J-TAQMAbp3dWUiSRfMRoK2rY=.9fd2e8bc-6a53-4385-9e7b-1b0d36a91a8d@github.com> Message-ID: On Fri, 7 Mar 2025 01:10:26 GMT, Y. 
Srinivas Ramakrishna wrote: >> Thanks for the confirmation. I added the assert below, since gc_generation could be global: >> >> >> assert(!_heap->gc_generation()->is_old(), "Sanity check"); > > The assert may be fine, but the treatment of completeness of the marking context seems very brittle to me and apt to cause problems in the future. I would prefer a cleaner separation of these. Maybe we can sync up separately to discuss this along with @earthling-amzn . Yes, regions may be promoted during a global cycle. Completing the mark for a global cycle also completes the mark for the young and old generations. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1985512202 From azafari at openjdk.org Fri Mar 7 18:43:54 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Mar 2025 18:43:54 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: References: Message-ID: <6-tmSINEwkIMphxPbnP92QmD_-i3Ui7pU9aLpeQ_PmY=.1760c755-2e70-4152-a273-1ad036c46e2e@github.com> On Thu, 6 Mar 2025 15:24:49 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> ReservedSpace is accepted as param. > > Need to fix the build errors: > > /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:31: error: invalid use of incomplete type ‘const class ReservedSpace’ > 224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of ‘class ReservedSpace’
> 38 | class ReservedSpace; > | ^~~~~~~~~~~~~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:30: > /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:42: error: invalid use of incomplete type 'const class ReservedSpace' > 224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of 'class ReservedSpace' > ... (rest of output omitted) Thank you @gerard-ziemski and @jdksjolen for your reviews. The build failure and merge problems are fixed. The GHA test failures are due to timeouts. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23770#issuecomment-2707162561 From wkemper at openjdk.org Fri Mar 7 19:19:53 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 7 Mar 2025 19:19:53 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v4] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 22:25:29 GMT, Y. Srinivas Ramakrishna wrote: >> Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision: >> >> - Remove ShenandoahHeap::complete_marking_context(ShenandoahHeapRegion* region) >> - Revert "complete_marking_context should guarantee mark is complete" >> >> This reverts commit 2004973965ea0e617cf9e5fc45be24f0e06e90a1.
>> - complete_marking_context should guarantee mark is complete > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.cpp line 123: > >> 121: #ifdef ASSERT >> 122: bool reg_live = region->has_live(); >> 123: bool bm_live = heap->active_generation()->complete_marking_context()->is_marked(cast_to_oop(region->bottom())); > > Apropos of another comment, if we really want to keep a delegating method in `ShenandoahHeap`, why not use `heap->complete_marking_context()` as a synonym for `heap->active_generation()->complete_marking_context()` ? This makes sense to me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1985570114 From wkemper at openjdk.org Fri Mar 7 19:27:54 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 7 Mar 2025 19:27:54 GMT Subject: RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained [v5] In-Reply-To: References: <8w22oUPhZEx0iEIeNQ-GUUjx8jNkjXrTHjfjN_sX4HE=.2c391dd5-227e-4755-ba4d-528a7dcefca3@github.com> Message-ID: On Thu, 6 Mar 2025 23:46:02 GMT, Xiaolong Peng wrote: >> Sorry I misunderstood your original proposal, I thought you meant to suggest to remove the flag from ShenandoahGlobalGeneration, instead the set_mark_complete/is_mark_complete will more like view/delegation layer like: >> >> void ShenandoahGlobalGeneration::set_mark_complete() { >> ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); >> heap->young_generation()->set_mark_complete(); >> heap->old_generation()->set_mark_complete(); >> } >> >> bool ShenandoahGlobalGeneration::is_mark_complete() { >> ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap(); >> return heap->young_generation()->is_mark_complete() && heap->old_generation()->is_mark_complete(); >> } > > You proposal will make the impl of the set_mark_complete/is_mark_complete of ShenandoahGeneration cleaner, but the thing is it will change current design and behavior, we may have to update the 
code where these methods are called, e.g. when we call `set_mark_complete` of gc_generation/active_generation, if it is global generation, we may have to explicitly call the same methods of ShenandoahYoungGeneration and ShenandoahOldGeneration to fan out the status. > > How about I follow up on it in a separate task and update the implementation if necessary? I want to limit the changes involved in this PR, and only fix the bug. The young and old generations are only instantiated in the generational mode, so using them without checking the mode will result in SEGV in non-generational modes. Global collections have a lot of overlap with old collections. I think what Ramki is saying is that if we change all the code that makes assertions about the completion status of young/old marking to use the `active_generation` field instead, then we wouldn't need to update the completion status of young/old during a global collection. The difficulty here is that we need assurances that the old generation mark bitmap is valid in collections subsequent to a global collection. So, I don't think we can rely on the completion status of `active_generation` when it was global, in following collections where it may now refer to young or old. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1985578948 From azafari at openjdk.org Fri Mar 7 20:22:53 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Mar 2025 20:22:53 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v2] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 15:24:49 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> ReservedSpace is accepted as param. > > Need to fix the build errors: > > /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:31: error: invalid use of incomplete type 'const class ReservedSpace'
> 224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of 'class ReservedSpace' > 38 | class ReservedSpace; > | ^~~~~~~~~~~~~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:30: > /home/runner/work/jdk/jdk/src/hotspot/share/nmt/memTracker.hpp:224:42: error: invalid use of incomplete type 'const class ReservedSpace' > 224 | record_virtual_memory_tag(rs.base(), rs.size(), mem_tag); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/memory/allocation.cpp:28: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/metaspace.hpp:38:7: note: forward declaration of 'class ReservedSpace' > ... (rest of output omitted) @gerard-ziemski and @jdksjolen, new reviews are needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23770#issuecomment-2707346222 From wkemper at openjdk.org Fri Mar 7 21:52:09 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 7 Mar 2025 21:52:09 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops Message-ID: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. ------------- Commit messages: - Make concurrent class unloading a little safer - Can't recycle during weak roots, but does the LRB really need to return doomed from space objects? - What happens if we allow trash regions to be recycled during concurrent weak roots?
- Trying to find a test that fails because the LRB won't return a doomed from space object Changes: https://git.openjdk.org/jdk/pull/23951/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23951&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351444 Stats: 24 lines in 3 files changed: 10 ins; 7 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/23951.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23951/head:pull/23951 PR: https://git.openjdk.org/jdk/pull/23951 From cslucas at openjdk.org Sat Mar 8 14:04:03 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Sat, 8 Mar 2025 14:04:03 GMT Subject: Integrated: 8343468: GenShen: Enable relocation of remembered set card tables In-Reply-To: References: Message-ID: On Fri, 17 Jan 2025 05:18:39 GMT, Cesar Soares Lucas wrote: > In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. > > The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. > > The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear.
In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When `init-mark` finishes, the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. > > Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. > > The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. > > Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. > > Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. This pull request has now been integrated.
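The table alternation described in the quoted PR summary can be sketched roughly as below. This is only an illustrative model under stated assumptions, not the HotSpot code: `CardTablePair`, `mark_dirty`, `reset_read_table`, `swap_at_init_mark` and `gc_sees_dirty` are hypothetical names, the card tables are plain byte vectors, and no concurrency or JIT-embedded addresses are modelled. It only shows the ordering: clean the read table in `reset`, swap the roles at `init-mark`, and have mutators load the write table base dynamically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card states for this sketch.
constexpr std::uint8_t kClean = 0;
constexpr std::uint8_t kDirty = 1;

// Hypothetical model: two tables alternate between the "read" role
// (scanned by the GC) and the "write" role (dirtied by mutators).
class CardTablePair {
public:
  explicit CardTablePair(std::size_t cards)
      : table_a_(cards, kClean), table_b_(cards, kClean),
        read_(&table_a_), write_(&table_b_) {}

  // The role pointers refer to members, so copying is disabled.
  CardTablePair(const CardTablePair&) = delete;
  CardTablePair& operator=(const CardTablePair&) = delete;

  // Mutator-side barrier: the write table is looked up on every call
  // rather than being treated as a fixed constant.
  void mark_dirty(std::size_t card) { (*write_)[card] = kDirty; }

  // Models the concurrent `reset` phase: clean the current read table.
  void reset_read_table() { std::fill(read_->begin(), read_->end(), kClean); }

  bool read_table_is_clean() const {
    return std::all_of(read_->begin(), read_->end(),
                       [](std::uint8_t c) { return c == kClean; });
  }

  // Models the `init-mark` pause: swap the roles. Assumption of this
  // sketch: the read table was fully cleaned by the preceding reset.
  void swap_at_init_mark() {
    assert(read_table_is_clean());
    std::swap(read_, write_);
  }

  // GC-side view: what the collector sees in the current read table.
  bool gc_sees_dirty(std::size_t card) const {
    return (*read_)[card] == kDirty;
  }

private:
  std::vector<std::uint8_t> table_a_;
  std::vector<std::uint8_t> table_b_;
  std::vector<std::uint8_t>* read_;
  std::vector<std::uint8_t>* write_;
};
```

In this model, a card dirtied before the swap becomes visible to the GC only after `swap_at_init_mark()`, while cards dirtied afterwards land in the freshly cleaned table; the pause itself does no copying or zeroing, which is the effect the PR summary describes.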
Changeset: 4e1367e3 Author: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/4e1367e34be724a0f84069100854c38333610714 Stats: 271 lines in 25 files changed: 132 ins; 87 del; 52 mod 8343468: GenShen: Enable relocation of remembered set card tables Reviewed-by: shade, kdnilsen, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/23170 From alanb at openjdk.org Sat Mar 8 18:15:06 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 8 Mar 2025 18:15:06 GMT Subject: RFR: 8343468: GenShen: Enable relocation of remembered set card tables [v9] In-Reply-To: References: Message-ID: On Thu, 6 Mar 2025 19:45:21 GMT, Cesar Soares Lucas wrote: >> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. >> >> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase. >> >> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. In the next `init-mark` pause we swap the pointers to the base of the read and write tables.
When `init-mark` finishes, the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that has just become the new _read_ table. >> >> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address. >> >> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake, something we plan to work on after this PR is merged. >> >> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code. >> >> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I'd appreciate it if someone with access to RISC-V (@luhenry ?) and PowerPC (@TheRealMDoerr ?) platforms could review and test the changes for those platforms, as I have limited access to running tests on them. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix build: no shenandoah on arm32. This seems to break Oracle --disable-jvm-feature-shenandoahgc builds on aarch64.
[2025-03-08T14:16:00,338Z] src/hotspot/cpu/aarch64/aarch64.ad:4544:58: error: no member named 'ShenandoahBarrierSet' in 'BarrierSet' [2025-03-08T14:16:00,338Z] !BarrierSet::barrier_set()->is_a(BarrierSet::ShenandoahBarrierSet) && ------------- PR Comment: https://git.openjdk.org/jdk/pull/23170#issuecomment-2708425850 From jsjolen at openjdk.org Mon Mar 10 09:00:00 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 10 Mar 2025 09:00:00 GMT Subject: RFR: 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag [v5] In-Reply-To: <0SlK7ixxGv5N7-LQnC7SwgpcK4Oz_9_H24qnrGPrTpc=.9bfd6434-6a48-4563-9dd6-66cff70dafe7@github.com> References: <0SlK7ixxGv5N7-LQnC7SwgpcK4Oz_9_H24qnrGPrTpc=.9bfd6434-6a48-4563-9dd6-66cff70dafe7@github.com> Message-ID: On Fri, 7 Mar 2025 16:06:32 GMT, Afshin Zafari wrote: >> With the `size` parameter there will be no need to traverse/go through the nodes between the base and end of the region. >> Tests: >> linux-x64-debug, gtest:NMT* and runtime/NMT* > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: > > - Merge remote-tracking branch 'origin/master' into _8350566_size_par_set_tag > - new fix. > - fixed build problem. > - ReservedSpace is accepted as param. > - applied also to VMT. > - 8350566: NMT: add size parameter to MemTracker::record_virtual_memory_tag Please check if this PR is responsible for the test failures before integrating. ------------- Marked as reviewed by jsjolen (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/23770#pullrequestreview-2670204517 From shade at openjdk.org Mon Mar 10 09:41:01 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 09:41:01 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: <5nkWE-TpdoNk-k_5JE7MopX5_KJf6DjjLWMADxWr29k=.ee34fa19-882c-4731-86f6-bdaed2a6e276@github.com> Message-ID: <5PZgChiJciTkkZIUnXtTWZMB4ZxN8DmZHUWBFt9ptBw=.77216c80-a470-4e20-908e-7e419404e607@github.com> On Fri, 7 Mar 2025 07:18:27 GMT, David Holmes wrote: > You could add a couple of lines to the build code and it would not be possible to build 32-bit, so that is a necessary but not sufficient condition to claim to implement the JEP IMO. Agreed. This is why this PR removes the actual implementation of the port as well. Even if you can coerce build system to pass the arch checks, x86_32 would not build, because there is no x86_32 port in the sources anymore. There are only assorted, heavily-intertwined-with-x86-64 leftovers around Hotspot subsystems that were needed to support the port. We will deal with those leftovers at leisurely pace after the port is gone. > @dholmes-ora: I'm not looking for one big PR, I'm looking for multiple PR's as proposed but which all fall under the JEP umbrella. Until the JEP is targeted then nothing can be integrated anyway. This is what, I thought, dependent PR's were designed for. > @magicus Instead, that honor should fall on an umbrella JBS issue, which is dependent on this PR, but also the other planned updates. Before these are done, we can't really say that the JEP is implemented. I believe we are in agreement that we do not want to cobble all removals/cleanups into a singular PR/changeset. We _can_ convert the umbrella RFE for post-JEP cleanups as the implementation task subtasks. I.e. 
do: - JDK-XXXXX: Implement JEP 503: Remove the 32-bit x86-port (<---- this would be an umbrella, without a changeset) - JDK-XXXXX: JEP 503: Remove the x86_32 files and builds support (<---- this would be this PR) - JDK-XXXXX: JEP 503: Remove code blocks that handle UseSSE < 2 - JDK-XXXXX: JEP 503: Remove dead IA32 code blocks ... Then we manually close umbrella issue as "implemented" when subtasks are done. What I dislike about this approach is that we are committing to doing free-standing post- x86-32 cleanups under the JEP umbrella. This runs into several problems: a) some cleanups are very deep, intertwined with x86-64, connected to x86-32-zero, and might even be rejected, like deep cleaning in `MacroAssembler` ([JDK-8351162](https://bugs.openjdk.org/browse/JDK-8351162)); b) some cleanups would only be discovered later, and would require yet another umbrella tasks for post-JEP work anyway. Are you agreeing to this, @dholmes-ora, @magicus? This would create more work for ourselves and our fellow engineers in JDK 25 timeframe. If you are insisting we need to do it this way, can I count on your prompt reviews in these new JEP subtasks? ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2709964337 From shade at openjdk.org Mon Mar 10 09:49:44 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 09:49:44 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. 
The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Drop commented out block from deprecations - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 - 8345169: Implement JEP 503: Remove the 32-bit x86 Port ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23906/files - new: https://git.openjdk.org/jdk/pull/23906/files/b76816cb..0fef97b8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23906&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23906&range=00-01 Stats: 7320 lines in 306 files changed: 3971 ins; 1797 del; 1552 mod Patch: https://git.openjdk.org/jdk/pull/23906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23906/head:pull/23906 PR: https://git.openjdk.org/jdk/pull/23906 From shade at openjdk.org Mon Mar 10 09:49:44 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 09:49:44 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: <3jbKFXHYH2mgyYOQjn2rfGm0IpIwH377DuDrZAY4X7w=.8d0767f3-7190-4396-824a-d55e6a61f479@github.com> On Fri, 7 Mar 2025 15:06:18 GMT, Magnus Ihse Bursie wrote: >> I think leaving a comment describing how to deprecate a port is useful. To look it up in history you have to realise there is something to look up. >> >> "They who are not reminded of the past will invent a new way to do it in the future." > > The `--enable-deprecated-ports` is still there. All that is removed is an if statement and a print line. I know the make syntax can seem intimidating, but just ask me or any other build team member if you need help to recreate such a thing. It is not like it is a complicated algorithm that can be written in many ways. This is just make's equivalent of: > > > if (some_condition) { > println("whatever"); > } > > > To me this is just utter nonsense to keep that commented out. "Utter nonsense" might be a bit harsh.
We do code samples around OpenJDK all the time to leave breadcrumbs for future use. As I said, I don't mind removing it, done so in new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1986943830 From alanb at openjdk.org Mon Mar 10 09:52:54 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 10 Mar 2025 09:52:54 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: <8GSZRPDK4WLn6bHC2D2Ow47a-xd9NzCN6azXs2aDp_g=.47762983-f579-4ea1-b22e-abbd1740e6d3@github.com> On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. 
>> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port JEP 486 (Permanently Disable the Security Manager) updated the API and removed the ability to set a SecurityManager in a first big commit. The JBS issue for that commit was associated with the JEP. There were 150+ follow on issues, some removed essentially dead code, others fixed or removed tests that were excluded by the first commit. It wasn't initially clear if all cleanups and code removal could be done in the same release (JDK 24) but almost all did happen as only a few remaining cleanups to APIs docs spilled over into JDK 25. Anyway, just pointing out this JEP as an example that may be useful to look at when considering the approach for the 32-bit x86 port removal. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2709998166 From ihse at openjdk.org Mon Mar 10 10:46:55 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 10 Mar 2025 10:46:55 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. 
>> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port I don't have a super strong opinion on this. 
If you want to call this the implementation of JEP 503, I'm fine with that. I guess it all depends a bit on where you want to draw the line between "removal" and "subsequent cleanups that have now become possible". The latter part almost never ends in a codebase as large as the JDK; I still find Solaris remnants in the code to this day, so getting rid of *all* code that is no longer necessary cannot reasonably be a criterion for finishing a removal. I guess I just viewed the intertwined ifdef:ed code as more part of the actual removal, but then again, it's Hotspot code and that's strictly really not my business. :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2710148614 From shade at openjdk.org Mon Mar 10 11:59:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 11:59:53 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops In-Reply-To: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: <3AO8SncuFl0-pj94X6S1GHNXi01EoOTZU1lnhrmtsKo=.85990912-b326-40f1-9dda-594b05b1f694@github.com> On Fri, 7 Mar 2025 21:47:31 GMT, William Kemper wrote: > Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. This looks fine as the first step to this sequencing problem. I do think we still have a conceptual problem of accessing the oops in _trash_ regions. This patch blocks the `trash` -> `empty` transition by delaying cleanup. This likely works well in release builds. I would expect debug builds to still complain we are touching the oop in `trash` region.
At class unloading, we can only have trash regions from the immediate trashing during region selection. So, in addition to this, I think we really need to move immediate trashing somewhere after class unloading as well. This would likely require more fiddling with heuristics: we do immediate trash there to see if we can take a shortcut cycle. Changes requested by shade (Reviewer). src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 124: > 122: } > 123: > 124: // Allow resurrection of unreachable objects that are visited during concurrent class-unloading. Let's not call it "Allow resurrection", which somewhat implies the object has full privileges to exist, i.e. can be inserted back into the object graph. But it really can't. We only do this because we were asked with `AS_NO_KEEPALIVE`. Something like: "Allow runtime to see unreachable objects that are visited during concurrent class unloading" src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 153: > 151: } > 152: > 153: assert(heap->is_concurrent_weak_root_in_progress(), "Must be doing weak roots now"); This does not ring true when final mark is cancelled, or am I missing something? src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 163: > 161: // We cannot recycle regions because weak roots need to know what is marked in trashed regions. > 162: entry_weak_refs(); > 163: entry_weak_roots(); Same as above: should probably be still protected by GC state checks to cover cancelled cases. ------------- Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/23951#pullrequestreview-2670666176 PR Review: https://git.openjdk.org/jdk/pull/23951#pullrequestreview-2670700570 PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987125159 PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987139867 PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987143228 From coleenp at openjdk.org Mon Mar 10 12:30:59 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 10 Mar 2025 12:30:59 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. 
>> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port Marked as reviewed by coleenp (Reviewer). I do have a strong opinion on this. The security manager removal is a good model to follow. Since this change removes the capability and 50K LOC, I think it's sufficient to say it implements the JEP. The other removals are cleanups and don't need to have to be tied up in the process, and can happen when they're ready and reviewed. There's no technical or practical reason to make this more difficult. ------------- PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2670782075 PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2710418458 From ihse at openjdk.org Mon Mar 10 12:57:57 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 10 Mar 2025 12:57:57 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. 
>> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port Marked as reviewed by ihse (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2670852740 From ihse at openjdk.org Mon Mar 10 13:52:56 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 10 Mar 2025 13:52:56 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: <3jbKFXHYH2mgyYOQjn2rfGm0IpIwH377DuDrZAY4X7w=.8d0767f3-7190-4396-824a-d55e6a61f479@github.com> References: <3jbKFXHYH2mgyYOQjn2rfGm0IpIwH377DuDrZAY4X7w=.8d0767f3-7190-4396-824a-d55e6a61f479@github.com> Message-ID: On Mon, 10 Mar 2025 09:46:38 GMT, Aleksey Shipilev wrote: >> The `--enable-deprecated-ports` is still there. All that is removed is an if statement and a print line. I know the make syntax can seem intimidating, but just ask me or any other build team member if you need help to recreate such a thing. It is not like it is a complicated algorithm that can be written in many ways. This is just make's equivalent of: >> >> >> if (some_condition) { >> println("whatever"); >> } >> >> >> To me this is just utter nonsense to keep that commented out. > > "Utter nonsense" might be a bit harsh. We do code samples around OpenJDK all the time to leave breadcrumbs for future use. As I said, I don't mind removing it, done so in new commit. Yes, you are right. That did not sound good. I apologize. (And thanks for removing it!) 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23906#discussion_r1987327814 From wkemper at openjdk.org Mon Mar 10 16:37:01 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 16:37:01 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops In-Reply-To: <3AO8SncuFl0-pj94X6S1GHNXi01EoOTZU1lnhrmtsKo=.85990912-b326-40f1-9dda-594b05b1f694@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> <3AO8SncuFl0-pj94X6S1GHNXi01EoOTZU1lnhrmtsKo=.85990912-b326-40f1-9dda-594b05b1f694@github.com> Message-ID: On Mon, 10 Mar 2025 11:54:50 GMT, Aleksey Shipilev wrote: >> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 153: > >> 151: } >> 152: >> 153: assert(heap->is_concurrent_weak_root_in_progress(), "Must be doing weak roots now"); > > This does not ring true when final mark is cancelled, or am I missing something? There is a cancellation check just above this line. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987650487 From wkemper at openjdk.org Mon Mar 10 17:31:53 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 17:31:53 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops In-Reply-To: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Fri, 7 Mar 2025 21:47:31 GMT, William Kemper wrote: > Unloading classes may require a walk of unreachable oops. 
For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. It isn't just about trash regions. This path through the barrier will also allow access to objects in the collection set (without evacuating them). We also choose the collection set during the final mark safepoint. Alternatively, we could soften the constraint for `ShenandoahHeap::is_in` to allow access to trash regions if concurrent weak roots is in progress. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23951#issuecomment-2711330793 From wkemper at openjdk.org Mon Mar 10 18:55:51 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 18:55:51 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v2] In-Reply-To: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: > Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. 
William Kemper has updated the pull request incrementally with two additional commits since the last revision: - Consider trash regions to be in the heap during concurrent weak roots - Better comment for LRB when accessing unreachable oops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23951/files - new: https://git.openjdk.org/jdk/pull/23951/files/b231d4fe..a5db7360 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23951&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23951&range=00-01 Stats: 27 lines in 2 files changed: 16 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/23951.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23951/head:pull/23951 PR: https://git.openjdk.org/jdk/pull/23951 From shade at openjdk.org Mon Mar 10 19:47:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 19:47:53 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Mon, 10 Mar 2025 17:28:59 GMT, William Kemper wrote: > It isn't just about trash regions. This path through the barrier will also allow access to objects in the collection set (without evacuating them). We also choose the collection set during the final mark safepoint. Alternatively, we could soften the constraint for `ShenandoahHeap::is_in` to allow access to trash regions if concurrent weak roots is in progress. Yes, also `cset`. My point is that conceptually, `trash` means trash, and we should not be accessing it. So if something is not trash yet (including references from weak roots), it should not be labeled `trash` then. With this patch, we are kinda stretching the definition. Which is fine for a local patch, but this over-stretching should be resolved, so it does not haunt us in future. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23951#issuecomment-2711652598 From shade at openjdk.org Mon Mar 10 19:47:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 19:47:55 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v2] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Mon, 10 Mar 2025 18:55:51 GMT, William Kemper wrote: >> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Consider trash regions to be in the heap during concurrent weak roots > - Better comment for LRB when accessing unreachable oops src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 127: > 125: // Note that this may also interfere with the DeadCounterClosure when visiting weak oop storage, > 126: // but it does not seem to be a problem in practice because the dead count callbacks do not care > 127: // about the precise number of dead objects (only that there are dead objects). The last 3 lines feel too specific, really :) I think that paragraph should be in the related bug report. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987915941 From shade at openjdk.org Mon Mar 10 19:47:56 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Mar 2025 19:47:56 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v2] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> <3AO8SncuFl0-pj94X6S1GHNXi01EoOTZU1lnhrmtsKo=.85990912-b326-40f1-9dda-594b05b1f694@github.com> Message-ID: On Mon, 10 Mar 2025 16:34:22 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 153: >> >>> 151: } >>> 152: >>> 153: assert(heap->is_concurrent_weak_root_in_progress(), "Must be doing weak roots now"); >> >> This does not ring true when final mark is cancelled, or am I missing something? > > There is a cancellation check just above this line. Ah. Back when we wrote this code originally, we used GC state flags to figure out whether we really need to go into particular phases. But I guess cancellation check is OK too. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1987909613 From wkemper at openjdk.org Mon Mar 10 21:12:57 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 21:12:57 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v2] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Mon, 10 Mar 2025 19:42:30 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request incrementally with two additional commits since the last revision: >> >> - Consider trash regions to be in the heap during concurrent weak roots >> - Better comment for LRB when accessing unreachable oops > > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 127: > >> 125: // Note that this may also interfere with the DeadCounterClosure when visiting weak oop storage, >> 126: // but it does not seem to be a problem in practice because the dead count callbacks do not care >> 127: // about the precise number of dead objects (only that there are dead objects). > > The last 3 lines feel too specific, really :) I think that paragraph should be in the related bug report. Okay. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1988028969 From wkemper at openjdk.org Mon Mar 10 21:25:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 21:25:06 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3] In-Reply-To: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: > Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. 
This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Trim extraneous comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23951/files - new: https://git.openjdk.org/jdk/pull/23951/files/a5db7360..1c73a85a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23951&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23951&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23951.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23951/head:pull/23951 PR: https://git.openjdk.org/jdk/pull/23951 From wkemper at openjdk.org Mon Mar 10 21:25:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 21:25:07 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v2] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: <6WGwFtG_zwWr9SJw1zebfJPlro0LdRcOzOlE2TWB2N0=.471c1dab-65f7-4fd9-9d5e-d0dcf0142cdb@github.com> On Mon, 10 Mar 2025 18:55:51 GMT, William Kemper wrote: >> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Consider trash regions to be in the heap during concurrent weak roots > - Better comment for LRB when accessing unreachable oops If it is just the name of the region state here, we could call it `pending_recycle` or something that communicates our intent to recycle it, but that we still need to use it. 
Moving `cset` and `immediate trash` selection out of `final mark` would probably require a new safepoint. I think we would still need a means to express the 'region cannot be used for allocations' concept between final mark and class unloading. I also modified `ShenandoahHeap::is_in` to match the same constraints we impose on the freeset for allocations during concurrent weak roots (https://github.com/openjdk/jdk/pull/23951/commits/a5db7360610691833ecb4204af2861c77c8b7858). ------------- PR Comment: https://git.openjdk.org/jdk/pull/23951#issuecomment-2711869412 From wkemper at openjdk.org Mon Mar 10 22:58:00 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Mar 2025 22:58:00 GMT Subject: Withdrawn: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 21:36:37 GMT, William Kemper wrote: > When the capacity of a trashed region is transferred from the young to old generation, we must first recycle the region to break its affiliation with the young generation. Failing to do this may violate the constraint that the capacity of a generation is always equal to or greater than the capacity of its affiliated regions. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/23282 From shade at openjdk.org Tue Mar 11 15:02:15 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 11 Mar 2025 15:02:15 GMT Subject: RFR: 8351656: Problemlist gc/TestAllocHumongousFragment#generational Message-ID: Causes noise in GHA testing, so we need to problemlist it. 
Additional testing: - [x] Checked the test is skipped locally ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/23982/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23982&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351656 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23982/head:pull/23982 PR: https://git.openjdk.org/jdk/pull/23982 From shade at openjdk.org Tue Mar 11 16:03:07 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 11 Mar 2025 16:03:07 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Mon, 10 Mar 2025 21:25:06 GMT, William Kemper wrote: >> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Trim extraneous comment Looks okay to me. @rkennke, @zhengyu123 might have an opinion here as well. src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 844: > 842: // during weak roots. Concurrent class unloading may access unmarked oops > 843: // in trash regions. > 844: return r->is_trash() && is_concurrent_weak_root_in_progress(); Pity to do this, but I understand the reason for it. We should investigate if this window is unnecessarily large. I see currently we drop `WEAK_ROOTS` gc state in `ShenandoahHeap::concurrent_prepare_for_update_refs`. Should we drop the flag sooner, somewhere after concurrent class unloading? Can be done separately, if it snowballs into something more complicated. 
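For illustration only, the relaxed membership check quoted above (`r->is_trash() && is_concurrent_weak_root_in_progress()`) can be sketched in isolation. This is a minimal sketch, assuming hypothetical `Region`, `RegionState` and `GcState` stand-ins; it is not the HotSpot code, and the early return for active regions is an assumption about the surrounding logic rather than something shown in the quoted diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not the HotSpot types.
enum class RegionState { Regular, Trash, Empty };

struct Region {
  RegionState state;
  bool is_active() const { return state == RegionState::Regular; }
  bool is_trash() const { return state == RegionState::Trash; }
};

struct GcState {
  bool concurrent_weak_roots_in_progress;
};

// A region counts as "in the heap" when it is active (assumed), or when it is
// trash that cannot be recycled yet because weak roots processing, which
// includes concurrent class unloading, may still read unmarked objects in it.
inline bool region_is_in_heap(const Region& r, const GcState& gc) {
  if (r.is_active()) {
    return true;
  }
  return r.is_trash() && gc.concurrent_weak_roots_in_progress;
}

// Small self-check exercising the three interesting cases.
inline void region_is_in_heap_demo() {
  const GcState weak_roots{true};
  const GcState idle{false};
  assert(region_is_in_heap(Region{RegionState::Regular}, idle));
  assert(region_is_in_heap(Region{RegionState::Trash}, weak_roots));
  assert(!region_is_in_heap(Region{RegionState::Trash}, idle));
}
```

The point of the sketch is only that trash regions are treated as valid heap memory for exactly as long as the weak roots flag is set, which is why the size of that window matters.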
------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23951#pullrequestreview-2675209101 PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1989630728 From xpeng at openjdk.org Tue Mar 11 16:20:58 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 11 Mar 2025 16:20:58 GMT Subject: RFR: 8351656: Problemlist gc/TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 13:22:27 GMT, Aleksey Shipilev wrote: > Causes noise in GHA testing, so we need to problemlist it. > > Additional testing: > - [x] Checked the test is skipped locally Marked as reviewed by xpeng (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/23982#pullrequestreview-2675284703 From wkemper at openjdk.org Tue Mar 11 19:03:59 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 19:03:59 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Tue, 11 Mar 2025 15:58:23 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Trim extraneous comment > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 844: > >> 842: // during weak roots. Concurrent class unloading may access unmarked oops >> 843: // in trash regions. >> 844: return r->is_trash() && is_concurrent_weak_root_in_progress(); > > Pity to do this, but I understand the reason for it. > > We should investigate if this window is unnecessarily large. I see currently we drop `WEAK_ROOTS` gc state in `ShenandoahHeap::concurrent_prepare_for_update_refs`. Should we drop the flag sooner, somewhere after concurrent class unloading? Can be done separately, if it snowballs into something more complicated. 
Class unloading is the last thing we do before recycling trash regions. A region will be usable for allocation as soon as it is recycled, so, in a sense, this has the same effect as turning off the weak roots flag immediately after class unloading. Also, the weak roots phase itself cannot have regions recycled because it relies on accurate mark information (recycling clears live data and resets the TAMS). We _could_ work around this by preserving the mark data (perhaps decoupling TAMS reset from region recycling). But changing the `gc_state` currently requires either a safepoint or a handshake (while holding the `Thread_lock`). I haven't thought all the way through this, but something like this (pseudo-code) might be possible:

```C++
vmop_entry_final_mark();

// Complete class unloading, since it actually _needs_ the oops (still need to forbid trash recycling here).
entry_class_unloading();

// Recycle trash, but do not reset TAMS (weak roots needs TAMS to decide reachability of referents).
entry_cleanup_early();

// Complete weak roots. There are no more trash regions and we don't have to change gc_state
entry_weak_refs();
entry_weak_roots();
```

What do you think? This would be a separate PR of course, but do you see any reason something like this wouldn't work? I'd expect some asserts to break if we allocate into a new region with TAMS > bottom. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1989959925 From wkemper at openjdk.org Tue Mar 11 19:37:15 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 19:37:15 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational Message-ID: Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. 
This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. # Testing Ran TestAllocHumongousFragment#generational 6,500 times without failures. ------------- Commit messages: - Track number of threads waiting on allocation failures for notification Changes: https://git.openjdk.org/jdk/pull/23997/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351464 Stats: 12 lines in 3 files changed: 9 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23997.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23997/head:pull/23997 PR: https://git.openjdk.org/jdk/pull/23997 From dnsimon at openjdk.org Tue Mar 11 19:41:18 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 11 Mar 2025 19:41:18 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null Message-ID: All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. 
------------- Commit messages: - nmethod entry barriers are no longer optional Changes: https://git.openjdk.org/jdk/pull/23996/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23996&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351700 Stats: 171 lines in 27 files changed: 5 ins; 103 del; 63 mod Patch: https://git.openjdk.org/jdk/pull/23996.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23996/head:pull/23996 PR: https://git.openjdk.org/jdk/pull/23996 From eosterlund at openjdk.org Tue Mar 11 19:41:18 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 11 Mar 2025 19:41:18 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 19:29:05 GMT, Doug Simon wrote: > All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. Nice! Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23996#pullrequestreview-2675894137 From ysr at openjdk.org Tue Mar 11 19:42:03 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 11 Mar 2025 19:42:03 GMT Subject: RFR: 8351656: Problemlist gc/TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 13:22:27 GMT, Aleksey Shipilev wrote: > Causes noise in GHA testing, so we need to problemlist it. > > Additional testing: > - [x] Checked the test is skipped locally Marked as reviewed by ysr (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23982#pullrequestreview-2675901068 From wkemper at openjdk.org Tue Mar 11 19:42:02 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 19:42:02 GMT Subject: RFR: 8351656: Problemlist gc/TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 13:22:27 GMT, Aleksey Shipilev wrote: > Causes noise in GHA testing, so we need to problemlist it. > > Additional testing: > - [x] Checked the test is skipped locally Thank you. Expect to un-problem list once https://github.com/openjdk/jdk/pull/23997 has been vetted. ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23982#pullrequestreview-2675896072 From shade at openjdk.org Tue Mar 11 19:42:03 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 11 Mar 2025 19:42:03 GMT Subject: Integrated: 8351656: Problemlist gc/TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 13:22:27 GMT, Aleksey Shipilev wrote: > Causes noise in GHA testing, so we need to problemlist it. > > Additional testing: > - [x] Checked the test is skipped locally This pull request has now been integrated. 
Changeset: cef36931 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/cef369317570f95ac70aac6ceea88a0042ca2b45 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8351656: Problemlist gc/TestAllocHumongousFragment#generational Reviewed-by: xpeng, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/23982 From never at openjdk.org Tue Mar 11 19:53:00 2025 From: never at openjdk.org (Tom Rodriguez) Date: Tue, 11 Mar 2025 19:53:00 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 19:29:05 GMT, Doug Simon wrote: > All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 6549: > 6547: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); > 6548: if (bs_nm != nullptr) { > 6549: StubRoutines::_method_entry_barrier = generate_method_entry_barrier(); Shouldn't you have kept this line? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23996#discussion_r1990025685 From wkemper at openjdk.org Tue Mar 11 19:59:29 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 19:59:29 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # Message-ID: Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. 
This change covers this case. ------------- Commit messages: - Do not enforce size constraints on generations - Don't allocate in regions that cannot be flipped to old gc - Do not allocate from mutator if young gen cannot spare the region Changes: https://git.openjdk.org/jdk/pull/23998/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23998&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348400 Stats: 66 lines in 3 files changed: 42 ins; 13 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/23998.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23998/head:pull/23998 PR: https://git.openjdk.org/jdk/pull/23998 From dnsimon at openjdk.org Tue Mar 11 20:00:59 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 11 Mar 2025 20:00:59 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v2] In-Reply-To: References: Message-ID: > All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: revived accidentally deleted code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23996/files - new: https://git.openjdk.org/jdk/pull/23996/files/b958ee43..b3d4721d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23996&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23996&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23996.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23996/head:pull/23996 PR: https://git.openjdk.org/jdk/pull/23996 From dnsimon at openjdk.org Tue Mar 11 20:01:00 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 11 Mar 2025 20:01:00 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v2] In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 19:50:18 GMT, Tom Rodriguez wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> revived accidentally deleted code > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 6549: > >> 6547: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); >> 6548: if (bs_nm != nullptr) { >> 6549: StubRoutines::_method_entry_barrier = generate_method_entry_barrier(); > > Shouldn't you have kept this line? Absolutely! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23996#discussion_r1990039724 From kdnilsen at openjdk.org Tue Mar 11 21:01:53 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 11 Mar 2025 21:01:53 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # In-Reply-To: References: Message-ID: <1TI7zry8_JLLMVwxDq0Yd65TrgkSYafDOEn8zOFS7z0=.0517105a-520a-4686-83eb-a2446ee72a8a@github.com> On Tue, 11 Mar 2025 19:54:20 GMT, William Kemper wrote: > Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. src/hotspot/share/gc/shenandoah/shenandoahGenerationSizer.cpp line 127: > 125: } > 126: > 127: if (dst->max_capacity() + bytes_to_transfer > max_size_for(dst)) { Do we need to edit the descriptions of ShenandoahMinYoungPercentage and ShenandoahMaxYoungPercentage? Do we need to remove these options entirely from shenandoah_globals? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r1990119862 From wkemper at openjdk.org Tue Mar 11 21:49:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 21:49:06 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v4] In-Reply-To: References: Message-ID: > This PR converts the final roots safepoint operation into a handshake. 
The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots - Clarify which thread local buffers in comment - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots - Fix comments - Add whitespace at end of file - More detail for init update refs event message - Use timing tracker for timing verification - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots - WIP: Fix up phase timings for newly concurrent final roots and init update refs - WIP: Combine satb transfer with state propagation, restore phase timing data - ... and 2 more: https://git.openjdk.org/jdk/compare/1dd9cf10...a3575f1e ------------- Changes: https://git.openjdk.org/jdk/pull/23830/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=03 Stats: 291 lines in 14 files changed: 194 ins; 47 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/23830.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23830/head:pull/23830 PR: https://git.openjdk.org/jdk/pull/23830 From never at openjdk.org Tue Mar 11 21:53:55 2025 From: never at openjdk.org (Tom Rodriguez) Date: Tue, 11 Mar 2025 21:53:55 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v2] In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 20:00:59 GMT, Doug Simon wrote: >> All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. 
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>
>   revived accidentally deleted code

Marked as reviewed by never (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/23996#pullrequestreview-2676195527

From xpeng at openjdk.org Tue Mar 11 21:54:56 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Tue, 11 Mar 2025 21:54:56 GMT
Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational
In-Reply-To: References: Message-ID:

On Tue, 11 Mar 2025 19:31:47 GMT, William Kemper wrote:

> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state.
>
> # Testing
> Ran TestAllocHumongousFragment#generational 6,500 times without failures.

The bug should also exist in classic (non-generational) Shenandoah. I think `ShenandoahControlThread` also needs to be updated to fix it, even though it does not seem to occur with `ShenandoahControlThread` in the jtreg test.

src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 281:

> 279:
> 280:   {
> 281:     MonitorLocker ml(&_alloc_failure_waiters_lock);

Should the notification code be encapsulated in a method `notify_alloc_failure_waiters()`?
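
The encapsulation asked about above can be sketched in standalone C++. This is only an illustration under stated assumptions: `AllocFailureWaiters`, `wait_for_notification()`, `waiting()` and `wake_all_demo()` are hypothetical names, and `std::mutex` with `std::condition_variable` stands in for HotSpot's `Monitor`/`MonitorLocker`. It is not the code from the pull request.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the control thread's waiter bookkeeping;
// std::mutex/std::condition_variable play the role of HotSpot's Monitor.
class AllocFailureWaiters {
 public:
  // Called by a mutator whose allocation failed: block until the next notification.
  void wait_for_notification() {
    std::unique_lock<std::mutex> ml(_lock);
    const unsigned long seen = _epoch;
    ++_waiting;
    // The epoch predicate guards against spurious wake-ups.
    _cv.wait(ml, [&] { return _epoch != seen; });
    --_waiting;
  }

  // The one place that wakes waiters, so every call site shares the same locking.
  void notify_alloc_failure_waiters() {
    std::lock_guard<std::mutex> ml(_lock);
    ++_epoch;
    _cv.notify_all();
  }

  // Number of threads currently parked in wait_for_notification().
  int waiting() {
    std::lock_guard<std::mutex> ml(_lock);
    return _waiting;
  }

 private:
  std::mutex _lock;
  std::condition_variable _cv;
  unsigned long _epoch = 0;
  int _waiting = 0;
};

// Parks n threads, issues one notification, and reports how many woke up.
int wake_all_demo(int n) {
  AllocFailureWaiters waiters;
  std::atomic<int> woken{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&] {
      waiters.wait_for_notification();
      ++woken;
    });
  }
  while (waiters.waiting() != n) {
    std::this_thread::yield();  // spin until every thread is parked
  }
  waiters.notify_alloc_failure_waiters();
  for (auto& t : threads) {
    t.join();
  }
  return woken.load();
}
```

Calling `wake_all_demo(4)` parks four threads and wakes all of them with a single call. Keeping the lock acquisition and the `notify_all` inside one method means a caller cannot notify without holding the lock, which is presumably the point of the suggestion.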
------------- PR Review: https://git.openjdk.org/jdk/pull/23997#pullrequestreview-2676193059 PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1990174134 From kdnilsen at openjdk.org Tue Mar 11 22:27:53 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 11 Mar 2025 22:27:53 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v4] In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 21:49:06 GMT, William Kemper wrote: >> This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - Clarify which thread local buffers in comment > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - Fix comments > - Add whitespace at end of file > - More detail for init update refs event message > - Use timing tracker for timing verification > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - WIP: Fix up phase timings for newly concurrent final roots and init update refs > - WIP: Combine satb transfer with state propagation, restore phase timing data > - ... and 2 more: https://git.openjdk.org/jdk/compare/1dd9cf10...a3575f1e Marked as reviewed by kdnilsen (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23830#pullrequestreview-2676245948 From wkemper at openjdk.org Tue Mar 11 22:32:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 22:32:52 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 21:52:00 GMT, Xiaolong Peng wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. > > src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 281: > >> 279: >> 280: { >> 281: MonitorLocker ml(&_alloc_failure_waiters_lock); > > Should the notification code be encapsulated in method `notify_alloc_failure_waiters()`? Yes, will do this when I take out the waiter counts. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1990208348 From wkemper at openjdk.org Tue Mar 11 22:35:58 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 22:35:58 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 19:31:47 GMT, William Kemper wrote: > Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. 
This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. > > # Testing > Ran TestAllocHumongousFragment#generational 6,500 times without failures. Not sure I want to change `ShenandoahControlThread.` It uses a different mechanism to track whether or not to notify. It only notifies when it services the alloc failure request (it doesn't depend on the shared `cancelled_gc` state the same way the generational mode does). In the scenario that leads to this live lock for the generational mode, the default mode would _not_ notify the waiters upon successful completion of the concurrent cycle. It would notify them after the subsequent degenerated cycle. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23997#issuecomment-2715851851 From wkemper at openjdk.org Tue Mar 11 22:59:45 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 22:59:45 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v2] In-Reply-To: References: Message-ID: > Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. > > # Testing > Ran TestAllocHumongousFragment#generational 6,500 times without failures. 
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Notify alloc waiters when GC completes without cancellation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23997/files - new: https://git.openjdk.org/jdk/pull/23997/files/d0168ca9..cb9cd72f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=00-01 Stats: 12 lines in 3 files changed: 0 ins; 9 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23997.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23997/head:pull/23997 PR: https://git.openjdk.org/jdk/pull/23997 From xpeng at openjdk.org Tue Mar 11 23:30:56 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 11 Mar 2025 23:30:56 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 22:33:44 GMT, William Kemper wrote: > Not sure I want to change `ShenandoahControlThread.` It uses a different mechanism to track whether or not to notify. It only notifies when it services the alloc failure request (it doesn't depend on the shared `cancelled_gc` state the same way the generational mode does). In the scenario that leads to this live lock for the generational mode, the default mode would _not_ notify the waiters upon successful completion of the concurrent cycle. It would notify them after the subsequent degenerated cycle. It does use the shared cancelled_cause, see the code here https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp#L68 and at [line 171](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp#L171), `ShenandoahControlThread` does have same problem, alloc_failure_pending is evaluated using shared cancelled_cause before starting a cycle. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23997#issuecomment-2715950210 From wkemper at openjdk.org Tue Mar 11 23:40:24 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 23:40:24 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime Message-ID: When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. ------------- Commit messages: - Can only make assertions about reference strength for stores outside of the heap - Merge remote-tracking branch 'jdk/master' into satb-ignore-weak-store - Do not enqueue weak stores Changes: https://git.openjdk.org/jdk/pull/24001/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24001&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8350905 Stats: 7 lines in 1 file changed: 4 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24001/head:pull/24001 PR: https://git.openjdk.org/jdk/pull/24001 From wkemper at openjdk.org Tue Mar 11 23:51:35 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 23:51:35 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v3] In-Reply-To: References: Message-ID: > Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. 
When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. > > # Testing > Ran TestAllocHumongousFragment#generational 6,500 times without failures. William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Unproblem list test that found this issue - Merge remote-tracking branch 'jdk/master' into fix-alloc-waiters-missed-notify - Notify alloc waiters when GC completes without cancellation - Track number of threads waiting on allocation failures for notification ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23997/files - new: https://git.openjdk.org/jdk/pull/23997/files/cb9cd72f..3a16131b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=01-02 Stats: 17915 lines in 138 files changed: 6689 ins; 10288 del; 938 mod Patch: https://git.openjdk.org/jdk/pull/23997.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23997/head:pull/23997 PR: https://git.openjdk.org/jdk/pull/23997 From wkemper at openjdk.org Tue Mar 11 23:53:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 11 Mar 2025 23:53:52 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational In-Reply-To: References: Message-ID: <9LfJ2F0nrNM7VPPnLGwsu_jPi5UPuZwy8hsbjkcmPII=.15ea5668-6541-43a4-b012-723dd28d9efb@github.com> On Tue, 11 Mar 2025 23:28:32 GMT, Xiaolong Peng wrote: >> Not sure I want to change `ShenandoahControlThread.` It uses a different mechanism to track whether or not to notify. 
It only notifies when it services the alloc failure request (it doesn't depend on the shared `cancelled_gc` state the same way the generational mode does). In the scenario that leads to this live lock for the generational mode, the default mode would _not_ notify the waiters upon successful completion of the concurrent cycle. It would notify them after the subsequent degenerated cycle. > >> Not sure I want to change `ShenandoahControlThread.` It uses a different mechanism to track whether or not to notify. It only notifies when it services the alloc failure request (it doesn't depend on the shared `cancelled_gc` state the same way the generational mode does). In the scenario that leads to this live lock for the generational mode, the default mode would _not_ notify the waiters upon successful completion of the concurrent cycle. It would notify them after the subsequent degenerated cycle. > > It does use the shared cancelled_cause, see the code here https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp#L68 and at [line 171](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp#L171), `ShenandoahControlThread` does have same problem, alloc_failure_pending is evaluated using shared cancelled_cause before starting a cycle. @pengxiaolong , yes - I agree. Good catch. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23997#issuecomment-2715984419 From wkemper at openjdk.org Wed Mar 12 00:05:05 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 00:05:05 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: > Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. 
When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. > > # Testing > Ran TestAllocHumongousFragment#generational 6,500 times without failures. William Kemper has updated the pull request incrementally with one additional commit since the last revision: The non-generational modes may also fail to notify waiters ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23997/files - new: https://git.openjdk.org/jdk/pull/23997/files/3a16131b..f72e71c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23997&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/23997.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23997/head:pull/23997 PR: https://git.openjdk.org/jdk/pull/23997 From xpeng at openjdk.org Wed Mar 12 00:29:53 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 12 Mar 2025 00:29:53 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:05:05 GMT, William Kemper wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. 
>
> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
>
>   The non-generational modes may also fail to notify waiters

src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 171:

> 169:
> 170:     // If this cycle completed without being cancelled, notify waiters about it
> 171:     if (!heap->cancelled_gc()) {

I feel we should remove the test `!heap->cancelled_gc()` here. It is fine if there is a single mutator thread, but in most cases there are multiple mutator threads, and then the following can happen:
1. **Mutator A** tries to cancel the GC and notifies the control thread. It then waits on `_alloc_failure_waiters_lock`, and `_cancelled_cause` is set to `_allocation_failure`.
2. The concurrent GC clears `_cancelled_cause`, setting it to `_no_gc` in `op_final_update_refs`.
3. **Mutator B** tries to cancel the GC and successfully sets `_cancelled_cause` to `_allocation_failure` again.
4. The concurrent GC finishes.
5. The control thread checks `!heap->cancelled_gc()`, which is false, and does not wake up the mutators.

In this case, the wake-up of mutators A and B is delayed until the next cycle.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1990344257

From fyang at openjdk.org Wed Mar 12 00:32:53 2025
From: fyang at openjdk.org (Fei Yang)
Date: Wed, 12 Mar 2025 00:32:53 GMT
Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v2]
In-Reply-To: References: Message-ID:

On Tue, 11 Mar 2025 20:00:59 GMT, Doug Simon wrote:

>> All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers.
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > revived accidentally deleted code src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 9903: > 9901: generate_arraycopy_stubs(); > 9902: > 9903: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); Drive-by comment: `bs_nm` seems not used any more. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23996#discussion_r1990347462 From eirbjo at openjdk.org Wed Mar 12 05:51:58 2025 From: eirbjo at openjdk.org (Eirik =?UTF-8?B?QmrDuHJzbsO4cw==?=) Date: Wed, 12 Mar 2025 05:51:58 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. 
>> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port > JEP 486 (Permanently Disable the Security Manager) updated the API and removed the ability to set a SecurityManager in a first big commit. [..] There were 150+ follow on issues Observation: These JEP 486 follow-on issues served as a nice way for non-experts to contribute with something useful and also to get acquainted with various parts of the OpenJDK code base. Most cleanups followed a predictable pattern, so the implementation work could be distributed also to people not intimately familiar with the particular area without too much risk. Not sure how well this conveys to JEP 503, but I imagine something similar should be possible. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2716580939 From dnsimon at openjdk.org Wed Mar 12 09:16:44 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Mar 2025 09:16:44 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v3] In-Reply-To: References: Message-ID: > All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: removed unused code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23996/files - new: https://git.openjdk.org/jdk/pull/23996/files/b3d4721d..95da3c2f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23996&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23996&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23996.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23996/head:pull/23996 PR: https://git.openjdk.org/jdk/pull/23996 From shade at openjdk.org Wed Mar 12 09:46:01 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Mar 2025 09:46:01 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v3] In-Reply-To: References: Message-ID: <1stcVqx5LbF9cnNm4gb4YXqoHBbBBigH5fpYlBqRttI=.79261377-2b11-49eb-802d-b579fd23a9ff@github.com> On Wed, 12 Mar 2025 09:16:44 GMT, Doug Simon wrote: >> All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > removed unused code Looks fine, thanks. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23996#pullrequestreview-2677747978 From shade at openjdk.org Wed Mar 12 10:38:02 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Mar 2025 10:38:02 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime In-Reply-To: References: Message-ID: <2Hcn5dvmiq7DKbSwecZyrIeTxwbGVU2ISKgyoFdS-sk=.be164c1b-b4b9-47cc-941b-ecd9d25d5fb1@github.com> On Tue, 11 Mar 2025 23:35:24 GMT, William Kemper wrote: > When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. 
For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer.

Some nits:

src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 159:

> 157:       HasDecorator::value ||
> 158:       HasDecorator::value ||
> 159:       HasDecorator::value) {

Suggest to split it into two things, with comments:

    // Uninitialized and no-keepalive stores do not need barrier.
    if (HasDecorator::value ||
        HasDecorator::value) {
      return;
    }

    // Stores to weak/phantom require no barrier. The old references would
    // have been resurrected by load barrier if they were needed.
    if (HasDecorator::value ||
        HasDecorator::value) {
      return;
    }

(I think I caught the reason why we are safe to skip SATB here, maybe comment can be expanded)

src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 279:

> 277:     oop_store_common(addr, value);
> 278:     if (ShenandoahCardBarrier) {
> 279:       barrier_set()->write_ref_field_post(addr);

Unnecessary change?

-------------

Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24001#pullrequestreview-2677892517 PR Review Comment: https://git.openjdk.org/jdk/pull/24001#discussion_r1991165939 PR Review Comment: https://git.openjdk.org/jdk/pull/24001#discussion_r1991163625 From dnsimon at openjdk.org Wed Mar 12 12:21:57 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Mar 2025 12:21:57 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v3] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 09:16:44 GMT, Doug Simon wrote: >> All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > removed unused code `gc/TestAllocHumongousFragment.java#generational` is failing on Windows: https://github.com/dougxc/jdk/actions/runs/13807682996/job/38625487569#step:9:630 I don't think it can be caused by this PR. Are you able to confirm that @shipilev ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/23996#issuecomment-2717699848 From shade at openjdk.org Wed Mar 12 12:34:03 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Mar 2025 12:34:03 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v3] In-Reply-To: <1stcVqx5LbF9cnNm4gb4YXqoHBbBBigH5fpYlBqRttI=.79261377-2b11-49eb-802d-b579fd23a9ff@github.com> References: <1stcVqx5LbF9cnNm4gb4YXqoHBbBBigH5fpYlBqRttI=.79261377-2b11-49eb-802d-b579fd23a9ff@github.com> Message-ID: On Wed, 12 Mar 2025 09:43:21 GMT, Aleksey Shipilev wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> removed unused code > > Looks fine, thanks. > `gc/TestAllocHumongousFragment.java#generational` is failing on Windows: https://github.com/dougxc/jdk/actions/runs/13807682996/job/38625487569#step:9:630 I don't think it can be caused by this PR. 
Are you able to confirm that @shipilev ? It was problemlisted by #23982 yesterday. You can ignore it, or merge with recent master to get clean GHA runs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23996#issuecomment-2717727270 From dnsimon at openjdk.org Wed Mar 12 12:34:04 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Mar 2025 12:34:04 GMT Subject: RFR: 8351700: Remove code conditional on BarrierSetNMethod being null [v3] In-Reply-To: References: Message-ID: <-urz_l6_Sa21e9SspzfanN4VGdOFZJxOv6E79Npfv5A=.baeb6814-351b-4711-b7fe-4d87e0700532@github.com> On Wed, 12 Mar 2025 09:16:44 GMT, Doug Simon wrote: >> All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > removed unused code I'll ignore it. Thanks for pointing out the problem listing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23996#issuecomment-2717730379 From dnsimon at openjdk.org Wed Mar 12 12:34:05 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Mar 2025 12:34:05 GMT Subject: Integrated: 8351700: Remove code conditional on BarrierSetNMethod being null In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 19:29:05 GMT, Doug Simon wrote: > All GCs started needing nmethod entry barriers as of loom so there's no longer any need to test for null nmethod entry barriers. This pull request has now been integrated. 
Changeset: 95b66d5a Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/95b66d5a43a77b257a097afe5df369f92769abd2 Stats: 171 lines in 27 files changed: 5 ins; 102 del; 64 mod 8351700: Remove code conditional on BarrierSetNMethod being null Reviewed-by: shade, eosterlund, never ------------- PR: https://git.openjdk.org/jdk/pull/23996 From shade at openjdk.org Wed Mar 12 13:39:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Mar 2025 13:39:53 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Tue, 11 Mar 2025 19:01:31 GMT, William Kemper wrote: > A region will be usable for allocation as soon as it is recycled, so, in a sense, this has the same effect as turning off the weak roots flag immediately after class unloading. Right. This answers my original question. > What do you think? This would be a separate PR of course, but do you see any reason something like this wouldn't work? It looks to me as stretching the definition of "trash" even further? I think it would be conceptually cleaner to never turn regular regions into trash until after weak roots are done. So accesses to "dead" weak roots are still possible like a regular access to "regular" region. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1991522731 From wkemper at openjdk.org Wed Mar 12 16:42:59 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 16:42:59 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:24:46 GMT, Xiaolong Peng wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> The non-generational modes may also fail to notify waiters > > src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 171: > >> 169: >> 170: // If this cycle completed without being cancelled, notify waiters about it >> 171: if (!heap->cancelled_gc()) { > > I feel we should remove the test `!heap->cancelled_gc()` here, if is fine if there is single mutator thread, but in most cases there are mutator threads, then the following case could happen: > 1. **Mutator A** try to cancel GC and notify control thread, it will wait with `_alloc_failure_waiters_lock`, `_cancelled_cause` is set to `_allocation_failure` > 2. Concurrent GC clear `_cancelled_cause` and set it to `_no_gc` in op_final_update_refs > 3. **Mutator B** try to cancel GC and successfully set `_cancelled_cause` to `_allocation_failure` again. > 4. Concurrent GC finishes. > 5. Control thread check `!heap->cancelled_gc()` which is false, and won't wake up mutators. > > In this case, it will delay the wake up for mutator A & B to next cycle. I believe that is the correct behavior. The mutators are waiting until there is memory available. If mutator B cannot allocate, there is no reason to believe mutator A would be able to allocate. In this case, it is fine for both mutators to wait (even if it means A has to wait an extra cycle). 
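The five-step interleaving quoted above can be replayed in a small single-threaded toy model. This is only a sketch under stated assumptions, not HotSpot code: `ToyHeap`, `ToyControlThread`, `ToyOutcome` and `replay_race` are hypothetical names, the end-of-cycle check is modeled on the quoted `!heap->cancelled_gc()` condition, and the follow-up cycle that clears the cancellation is an assumption made for illustration.

```cpp
// Toy, single-threaded replay of the interleaving described in the review.
// Not HotSpot code: every type and function name below is hypothetical.
#include <cassert>
#include <cstddef>

enum class ToyCause { no_gc, allocation_failure };

struct ToyHeap {
  ToyCause cancelled_cause = ToyCause::no_gc;

  bool cancelled_gc() const { return cancelled_cause != ToyCause::no_gc; }

  // Returns true only for the caller that actually flips the state.
  bool try_cancel_gc() {
    if (cancelled_gc()) return false;
    cancelled_cause = ToyCause::allocation_failure;
    return true;
  }

  void clear_cancelled_gc() { cancelled_cause = ToyCause::no_gc; }
};

struct ToyControlThread {
  std::size_t waiters = 0;  // mutators blocked after an allocation failure

  // A mutator fails to allocate: it cancels the GC and then waits.
  void mutator_alloc_failure(ToyHeap& heap) {
    heap.try_cancel_gc();
    ++waiters;
  }

  // End-of-cycle check modeled on the quoted condition:
  // waiters are released only if the cycle was not cancelled.
  void end_of_cycle(const ToyHeap& heap) {
    if (!heap.cancelled_gc()) {
      waiters = 0;
    }
  }
};

struct ToyOutcome {
  std::size_t waiting_after_first_cycle;
  std::size_t waiting_after_next_cycle;
};

inline ToyOutcome replay_race() {
  ToyHeap heap;
  ToyControlThread control;

  control.mutator_alloc_failure(heap);  // 1. Mutator A cancels and waits
  heap.clear_cancelled_gc();            // 2. GC clears the cancellation
  control.mutator_alloc_failure(heap);  // 3. Mutator B cancels again and waits
  assert(heap.cancelled_gc());          // 4. the cycle finishes with the flag set
  control.end_of_cycle(heap);           // 5. control thread skips the wake-up
  const std::size_t after_first = control.waiters;

  // Assumed follow-up: a later cycle clears the cancellation and completes.
  heap.clear_cancelled_gc();
  control.end_of_cycle(heap);
  return ToyOutcome{after_first, control.waiters};
}
```

In this model both mutators are still waiting after the first cycle and are released only once a later cycle completes without a pending cancellation, which is the behavior the reply above argues is acceptable.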
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1991893985 From wkemper at openjdk.org Wed Mar 12 16:46:55 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 16:46:55 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime In-Reply-To: <2Hcn5dvmiq7DKbSwecZyrIeTxwbGVU2ISKgyoFdS-sk=.be164c1b-b4b9-47cc-941b-ecd9d25d5fb1@github.com> References: <2Hcn5dvmiq7DKbSwecZyrIeTxwbGVU2ISKgyoFdS-sk=.be164c1b-b4b9-47cc-941b-ecd9d25d5fb1@github.com> Message-ID: On Wed, 12 Mar 2025 10:30:58 GMT, Aleksey Shipilev wrote: >> When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. > > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 159: > >> 157: HasDecorator::value || >> 158: HasDecorator::value || >> 159: HasDecorator::value) { > > Suggest to split it into two things, with comments: > > > // Uninitialized and no-keepalive stores do not need barrier. > if (HasDecorator::value || > HasDecorator::value) { > return; > } > > // Stores to weak/phantom require no barrier. The old references would > // have been resurrected by load barrier if they were needed. > if (HasDecorator::value || > HasDecorator::value) { > return; > } > > > (I think I caught the reason why we are safe to skip SATB here, maybe comment can be expanded) Ha. I had it that way originally - I'll put it back. 
> src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 279:
> 
>> 277:   oop_store_common(addr, value);
>> 278:   if (ShenandoahCardBarrier) {
>> 279:     barrier_set()->write_ref_field_post(addr);
> 
> Unnecessary change?

Yes, just idly fixing warnings in my editor. I'll revert it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24001#discussion_r1991898007
PR Review Comment: https://git.openjdk.org/jdk/pull/24001#discussion_r1991899265

From wkemper at openjdk.org Wed Mar 12 16:55:52 2025
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 12 Mar 2025 16:55:52 GMT
Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3]
In-Reply-To: 
References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com>
Message-ID: <9WVKbKiBylo3hsIAazsKHpj13TC9q9yzSj-YujSDoWY=.2b50746e-cc0b-4eaa-9976-1ed58d959c83@github.com>

On Wed, 12 Mar 2025 13:37:15 GMT, Aleksey Shipilev wrote:

>> Class unloading is the last thing we do before recycling trash regions. A region will be usable for allocation as soon as it is recycled, so, in a sense, this has the same effect as turning off the weak roots flag immediately after class unloading.
>> 
>> Also, the weak roots phase itself cannot have regions recycled because it relies on accurate mark information (recycling clears live data and resets the TAMS). We _could_ work around this by preserving the mark data (perhaps decoupling TAMS reset from region recycling). But changing the `gc_state` currently requires either a safepoint or a handshake (while holding the `Thread_lock`). I haven't thought all the way through this, but something like this (pseudo-code) might be possible:
>> 
>> ```C++
>> vmop_entry_final_mark();
>> 
>> // Complete class unloading, since it actually _needs_ the oops (still need to forbid trash recycling here).
>> entry_class_unloading();
>> 
>> // Recycle trash, but do not reset TAMS (weak roots needs TAMS to decide reachability of referents).
>> entry_cleanup_early();
>> 
>> // Complete weak roots. There are no more trash regions and we don't have to change gc_state
>> entry_weak_refs();
>> entry_weak_roots();
>> ```
>> 
>> What do you think? This would be a separate PR of course, but do you see any reason something like this wouldn't work? I'd expect some asserts to break if we allocate into a new region with TAMS > bottom.
> 
>> A region will be usable for allocation as soon as it is recycled, so, in a sense, this has the same effect as turning off the weak roots flag immediately after class unloading.
> 
> Right. This answers my original question.
> 
>> What do you think? This would be a separate PR of course, but do you see any reason something like this wouldn't work?
> 
> It looks to me as stretching the definition of "trash" even further? I think it would be conceptually cleaner to never turn regular regions into trash until after weak roots are done. So accesses to "dead" weak roots are still possible like a regular access to "regular" region.

The advantage with the scheme I proposed is that it makes immediate trash regions available for allocations earlier in the cycle. I don't think it changes the way "trash" is treated during concurrent class unloading, but it would mean that weak roots/refs wouldn't see "trash" regions any more.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23951#discussion_r1991914955 From xpeng at openjdk.org Wed Mar 12 17:27:55 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 12 Mar 2025 17:27:55 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:05:05 GMT, William Kemper wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > The non-generational modes may also fail to notify waiters Looks good to me. ------------- Marked as reviewed by xpeng (Author). 
PR Review: https://git.openjdk.org/jdk/pull/23997#pullrequestreview-2679311781 From xpeng at openjdk.org Wed Mar 12 17:27:55 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 12 Mar 2025 17:27:55 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: <8HGl6b056y3lTi7An0UsJ896JOy-7Ij8SMcc2MULj0I=.26ca6193-f829-449e-afbe-4d068b8533ab@github.com> On Wed, 12 Mar 2025 16:40:40 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 171: >> >>> 169: >>> 170: // If this cycle completed without being cancelled, notify waiters about it >>> 171: if (!heap->cancelled_gc()) { >> >> I feel we should remove the test `!heap->cancelled_gc()` here, if is fine if there is single mutator thread, but in most cases there are mutator threads, then the following case could happen: >> 1. **Mutator A** try to cancel GC and notify control thread, it will wait with `_alloc_failure_waiters_lock`, `_cancelled_cause` is set to `_allocation_failure` >> 2. Concurrent GC clear `_cancelled_cause` and set it to `_no_gc` in op_final_update_refs >> 3. **Mutator B** try to cancel GC and successfully set `_cancelled_cause` to `_allocation_failure` again. >> 4. Concurrent GC finishes. >> 5. Control thread check `!heap->cancelled_gc()` which is false, and won't wake up mutators. >> >> In this case, it will delay the wake up for mutator A & B to next cycle. > > I believe that is the correct behavior. The mutators are waiting until there is memory available. If mutator B cannot allocate, there is no reason to believe mutator A would be able to allocate. In this case, it is fine for both mutators to wait (even if it means A has to wait an extra cycle). 
Thanks for the explanation. After re-reading the relevant code, I think it makes sense: if Mutator B fails to allocate while the concurrent GC is in `op_final_update_refs`, it is very unlikely that there is enough space for Mutator A. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1991968117 From wkemper at openjdk.org Wed Mar 12 18:55:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 18:55:08 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime [v2] In-Reply-To: References: Message-ID: > When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer.
William Kemper has updated the pull request incrementally with two additional commits since the last revision: - Split out and comment on weak/phantom stores separately - Revert unnecessary change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24001/files - new: https://git.openjdk.org/jdk/pull/24001/files/a742874e..929bf043 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24001&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24001&range=00-01 Stats: 10 lines in 1 file changed: 7 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24001/head:pull/24001 PR: https://git.openjdk.org/jdk/pull/24001 From shade at openjdk.org Wed Mar 12 19:24:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Mar 2025 19:24:54 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime [v2] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 18:55:08 GMT, William Kemper wrote: >> When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Split out and comment on weak/phantom stores separately > - Revert unnecessary change Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24001#pullrequestreview-2679647613 From rkennke at openjdk.org Wed Mar 12 19:41:54 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Mar 2025 19:41:54 GMT Subject: RFR: 8351444: Shenandoah: Class Unloading may encounter recycled oops [v3] In-Reply-To: References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Mon, 10 Mar 2025 21:25:06 GMT, William Kemper wrote: >> Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Trim extraneous comment Looks ok to me. Thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23951#pullrequestreview-2679682909 From wkemper at openjdk.org Wed Mar 12 20:16:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 20:16:07 GMT Subject: Integrated: 8351444: Shenandoah: Class Unloading may encounter recycled oops In-Reply-To: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> References: <78jaUyUnMnfncp8I4k6yvHqFaxxJ1BrvqkIelqK6aDc=.a1e2c417-3df2-45cf-befa-d60ff514533f@github.com> Message-ID: On Fri, 7 Mar 2025 21:47:31 GMT, William Kemper wrote: > Unloading classes may require a walk of unreachable oops. For this reason, it is not safe to recycle memory before class unloading is complete. This complements existing code to prevent mutators from recycling trash regions while weak roots is in progress. This pull request has now been integrated. 
Changeset: cdf7632f Author: William Kemper URL: https://git.openjdk.org/jdk/commit/cdf7632f8a85611077a27c91ad928ed8ea116f95 Stats: 47 lines in 4 files changed: 23 ins; 7 del; 17 mod 8351444: Shenandoah: Class Unloading may encounter recycled oops Reviewed-by: shade, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/23951 From ysr at openjdk.org Wed Mar 12 20:28:55 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 12 Mar 2025 20:28:55 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime [v2] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 18:55:08 GMT, William Kemper wrote: >> When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Split out and comment on weak/phantom stores separately > - Revert unnecessary change Looks good to me. I'm curious if this made any difference to SPECjbb performance w/GenShen. ------------- Marked as reviewed by ysr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24001#pullrequestreview-2679782378 From wkemper at openjdk.org Wed Mar 12 20:45:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 20:45:08 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime [v2] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 18:55:08 GMT, William Kemper wrote: >> When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. > > William Kemper has updated the pull request incrementally with two additional commits since the last revision: > > - Split out and comment on weak/phantom stores separately > - Revert unnecessary change I don't see any performance difference on Specjbb. The issue there is with getting cleared weak references processed by the Java thread that queues the weak references. References processing also uses `RawAccess` to null out the referent, so it doesn't go through this barrier. 
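The store-barrier change described in the PR summary amounts to a compile-time check on the store's decorators before anything is pushed to the SATB buffer. The following is a minimal sketch of that idea, assuming a plain bitmask decorator set. The `Toy*` names, the bit values, and the vector-backed queue are hypothetical and do not reproduce HotSpot's actual `DecoratorSet`, `HasDecorator`, or SATB queue definitions.

```cpp
// Toy model of a decorator-filtered SATB pre-write barrier.
// Not HotSpot code: names and bit values are hypothetical.
#include <cstddef>
#include <cstdint>
#include <vector>

using ToyDecoratorSet = std::uint64_t;

constexpr ToyDecoratorSet TOY_STRONG                = 0;
constexpr ToyDecoratorSet TOY_IS_DEST_UNINITIALIZED = ToyDecoratorSet{1} << 0;
constexpr ToyDecoratorSet TOY_AS_NO_KEEPALIVE       = ToyDecoratorSet{1} << 1;
constexpr ToyDecoratorSet TOY_ON_WEAK_OOP_REF       = ToyDecoratorSet{1} << 2;
constexpr ToyDecoratorSet TOY_ON_PHANTOM_OOP_REF    = ToyDecoratorSet{1} << 3;

template <ToyDecoratorSet decorators, ToyDecoratorSet decorator>
struct ToyHasDecorator {
  static constexpr bool value = (decorators & decorator) != 0;
};

// Stand-in for a thread-local SATB buffer.
struct ToySatbQueue {
  std::vector<const void*> entries;
  void enqueue(const void* previous) {
    if (previous != nullptr) entries.push_back(previous);
  }
};

template <ToyDecoratorSet decorators>
void toy_satb_barrier(ToySatbQueue& queue, const void* previous_value) {
  // Uninitialized and no-keepalive stores do not need the barrier.
  if (ToyHasDecorator<decorators, TOY_IS_DEST_UNINITIALIZED>::value ||
      ToyHasDecorator<decorators, TOY_AS_NO_KEEPALIVE>::value) {
    return;
  }
  // Weak/phantom stores are skipped so that clearing a handle
  // does not keep the old referent alive.
  if (ToyHasDecorator<decorators, TOY_ON_WEAK_OOP_REF>::value ||
      ToyHasDecorator<decorators, TOY_ON_PHANTOM_OOP_REF>::value) {
    return;
  }
  queue.enqueue(previous_value);
}

template <ToyDecoratorSet decorators>
void toy_oop_store(ToySatbQueue& queue, const void** field, const void* new_value) {
  toy_satb_barrier<decorators>(queue, *field);  // record the old value first
  *field = new_value;
}

// Clears a field that points at a dummy object and reports how many
// entries ended up in the toy SATB queue.
template <ToyDecoratorSet decorators>
std::size_t toy_enqueued_after_clear() {
  int referent = 0;
  const void* field = &referent;
  ToySatbQueue queue;
  toy_oop_store<decorators>(queue, &field, nullptr);
  return queue.entries.size();
}
```

With these definitions a strong store that clears the field records the previous referent, while the same store tagged weak or phantom records nothing, so the referent is not marked on behalf of the handle being released.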
------------- PR Comment: https://git.openjdk.org/jdk/pull/24001#issuecomment-2719079655 From wkemper at openjdk.org Wed Mar 12 20:45:09 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 20:45:09 GMT Subject: Integrated: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime In-Reply-To: References: Message-ID: <1FWeSQKJ3IFVHgk7roq8VUgXut5mXckVdhDMrjAP6bk=.28958377-21bc-4869-a9ca-327b95810eb6@github.com> On Tue, 11 Mar 2025 23:35:24 GMT, William Kemper wrote: > When weak handles are cleared, the `nullptr` is stored with the `ON_PHANTOM_OOP_REF` decorator. For concurrent collectors using a SATB barrier like Shenandoah, this may cause the referent to be enqueued and marked when it would be otherwise unreachable. The problem is especially acute for Shenandoah's generational mode, in which a young region holding the otherwise unreachable referent, may become trash after the referent is enqueued for old marking. Shenandoah's store barrier should be modified to not enqueue WEAK or PHANTOM stores in the SATB buffer. This pull request has now been integrated. Changeset: a347ecde Author: William Kemper URL: https://git.openjdk.org/jdk/commit/a347ecdedc098bd23598ba6acf28d77db01be066 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime Reviewed-by: shade, ysr ------------- PR: https://git.openjdk.org/jdk/pull/24001 From wkemper at openjdk.org Wed Mar 12 20:55:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 20:55:07 GMT Subject: RFR: 8350905: Shenandoah: Releasing a WeakHandle's referent may extend its lifetime [v2] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 20:26:37 GMT, Y. 
Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with two additional commits since the last revision: >> >> - Split out and comment on weak/phantom stores separately >> - Revert unnecessary change > > Looks good to me. > > I'm curious if this made any difference to SPECjbb performance w/GenShen. @ysramakrishna , The issue description could make it more clear, but, in addition to the issue described in the title, this PR fixes the specific problem of the old generation SATB asserting when it tries to decode a narrow oop. The sequence of events for the assertion failure is: 1. SATB is on for old marking 2. Young collection transitions some regions to `trash` 3. Young weak root processing nulls out a referent that points into a `trash` region 4. Old gen SATB barrier tries to decode the narrow oop, but asserts out because the oop is not in the heap (because it is in a trash region) This also relates to: https://github.com/openjdk/jdk/pull/23951 ------------- PR Comment: https://git.openjdk.org/jdk/pull/24001#issuecomment-2719098857 From wkemper at openjdk.org Wed Mar 12 22:41:53 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 22:41:53 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # In-Reply-To: <1TI7zry8_JLLMVwxDq0Yd65TrgkSYafDOEn8zOFS7z0=.0517105a-520a-4686-83eb-a2446ee72a8a@github.com> References: <1TI7zry8_JLLMVwxDq0Yd65TrgkSYafDOEn8zOFS7z0=.0517105a-520a-4686-83eb-a2446ee72a8a@github.com> Message-ID: On Tue, 11 Mar 2025 20:59:10 GMT, Kelvin Nilsen wrote: >> Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. 
When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. > > src/hotspot/share/gc/shenandoah/shenandoahGenerationSizer.cpp line 127: > >> 125: } >> 126: >> 127: if (dst->max_capacity() + bytes_to_transfer > max_size_for(dst)) { > > Do we need to edit the descriptions of ShenandoahMinYoungPercentage and ShenandoahMaxYoungPercentage? Do we need to remove these options entirely from shenandoah_globals? Yes, we should probably remove one of them and name the other `ShenandoahInitYoungPercentage`. I think I will back out this change to `shGenerationSizer` in this PR, and open a different PR for removing this constraint and renaming the options. It's a bit outside the scope of this bug fix. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r1992390432 From wkemper at openjdk.org Wed Mar 12 23:17:44 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 23:17:44 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v2] In-Reply-To: References: Message-ID: <-73CoqTBA5dJPEwr7bxSvDmMFC9g_LZpW-q7XSjjtrE=.4966fa3b-e98f-4a50-9492-22bf99eecf1f@github.com> > Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. 
However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Revert "Do not enforce size constraints on generations" This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23998/files - new: https://git.openjdk.org/jdk/pull/23998/files/11ff0677..a42efe5a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23998&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23998&range=00-01 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23998.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23998/head:pull/23998 PR: https://git.openjdk.org/jdk/pull/23998 From wkemper at openjdk.org Wed Mar 12 23:17:44 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 12 Mar 2025 23:17:44 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v2] In-Reply-To: References: <1TI7zry8_JLLMVwxDq0Yd65TrgkSYafDOEn8zOFS7z0=.0517105a-520a-4686-83eb-a2446ee72a8a@github.com> Message-ID: On Wed, 12 Mar 2025 22:39:45 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahGenerationSizer.cpp line 127: >> >>> 125: } >>> 126: >>> 127: if (dst->max_capacity() + bytes_to_transfer > max_size_for(dst)) { >> >> Do we need to edit the descriptions of ShenandoahMinYoungPercentage and ShenandoahMaxYoungPercentage? Do we need to remove these options entirely from shenandoah_globals? > > Yes, we should probably remove one of them and name the other `ShenandoahInitYoungPercentage`. I think I will back out this change to `shGenerationSizer` in this PR, and open a different PR for removing this constraint and renaming the options. 
It's a bit outside the scope of this bug fix. Filed: https://bugs.openjdk.org/browse/JDK-8351892 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r1992416245 From cslucas at openjdk.org Thu Mar 13 18:12:00 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 13 Mar 2025 18:12:00 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v4] In-Reply-To: References: Message-ID: On Tue, 11 Mar 2025 21:49:06 GMT, William Kemper wrote: >> This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - Clarify which thread local buffers in comment > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - Fix comments > - Add whitespace at end of file > - More detail for init update refs event message > - Use timing tracker for timing verification > - Merge remote-tracking branch 'jdk/master' into eliminate-final-roots > - WIP: Fix up phase timings for newly concurrent final roots and init update refs > - WIP: Combine satb transfer with state propagation, restore phase timing data > - ... and 2 more: https://git.openjdk.org/jdk/compare/1dd9cf10...a3575f1e Marked as reviewed by cslucas (Author). src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.cpp line 129: > 127: ShenandoahSATBMarkQueueSet& _satb_queues; > 128: ShenandoahObjToScanQueueSet* _mark_queues; > 129: volatile size_t _trashed_oops; NIT: perhaps a comment about why this needs to be volatile? 
------------- PR Review: https://git.openjdk.org/jdk/pull/23830#pullrequestreview-2682970505 PR Review Comment: https://git.openjdk.org/jdk/pull/23830#discussion_r1994072609 From wkemper at openjdk.org Thu Mar 13 20:43:05 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 13 Mar 2025 20:43:05 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v5] In-Reply-To: References: Message-ID: > This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). William Kemper has updated the pull request incrementally with one additional commit since the last revision: Add comment explaining use of _trashed_oops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23830/files - new: https://git.openjdk.org/jdk/pull/23830/files/a3575f1e..cd6c6e44 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23830&range=03-04 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23830.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23830/head:pull/23830 PR: https://git.openjdk.org/jdk/pull/23830 From wkemper at openjdk.org Fri Mar 14 21:56:33 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 14 Mar 2025 21:56:33 GMT Subject: RFR: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak Message-ID: Clean backport, fixes memory leak. 
------------- Commit messages: - Backport dfaa89162a35acd20b1ed35e147f9626a181510a Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/158/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=158&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346569 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/158.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/158/head:pull/158 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/158 From wkemper at openjdk.org Fri Mar 14 23:33:21 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 14 Mar 2025 23:33:21 GMT Subject: Integrated: 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak In-Reply-To: References: Message-ID: <07UYDYd49IZEEkxrxAju2Slr8BXMEmYeIUvpUMQL1yA=.c2ba7d0f-4464-4a7f-86c5-104f6ba0983a@github.com> On Fri, 14 Mar 2025 21:51:18 GMT, William Kemper wrote: > Clean backport, fixes memory leak. This pull request has now been integrated. Changeset: 49bdb3a1 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/49bdb3a16da7bd3ce1521123870515bbe1213eba Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod 8346569: Shenandoah: Worker initializes ShenandoahThreadLocalData twice results in memory leak Backport-of: dfaa89162a35acd20b1ed35e147f9626a181510a ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/158 From wkemper at openjdk.org Fri Mar 14 23:50:25 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 14 Mar 2025 23:50:25 GMT Subject: RFR: 8352091: GenShen: assert(!(request.generation->is_old() && _heap->old_generation()->is_doing_mixed_evacuations())) failed: Old heuristic should not request cycles while it waits for mixed evacuation Message-ID: Consider the following: 1. Regulator thread sees that control thread is `idle` and requests an old cycle 2. Regulator thread waits until control thread is not `idle` 3. 
Control thread starts old cycle and notifies the Regulator thread (as expected) 4. Regulator thread stays off CPU for a _long_ time 5. Control thread _completes_ old marking and returns to `idle` state 6. Regulator thread finally wakes up and sees that Control thread is _still_ idle 7. In fact, the control thread has completed old marking and the regulator thread should not request another cycle ------------- Commit messages: - Fix ABA issue that could have regulator thread request unexpected old cycles Changes: https://git.openjdk.org/jdk/pull/24069/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24069&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352091 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24069.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24069/head:pull/24069 PR: https://git.openjdk.org/jdk/pull/24069 From dholmes at openjdk.org Mon Mar 17 05:39:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Mar 2025 05:39:02 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v2] In-Reply-To: References: Message-ID: On Mon, 10 Mar 2025 09:49:44 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. 
It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port Apologies for the silence but I was out-of-action for several days and am still trying to catch up. Obviously different JEPs have modeled things differently and there is no one-right-way. A lot of follow-up tasks have been identified and no doubt there will be even more after that. I personally would have liked to see more of the known tasks counted as part of the JEP. Hopefully a bunch of them may be ready by the time the JEP is ready anyway. Hitting the Approve button to show "general consensus". ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2689161146 From wkemper at openjdk.org Mon Mar 17 21:10:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 17 Mar 2025 21:10:52 GMT Subject: RFR: 8347620: Shenandoah: Use 'free' tag for free set related logging Message-ID: <6fMUeIRZ_11cr_Lg0LhezCPw93o3BjluX9bO9ioYNtI=.93cd77d9-3a14-4e60-a25c-cf3916bb99b7@github.com> Conflicts related to removal of SSIZE_FORMAT macro. ------------- Commit messages: - Backport 9782bfdd27da95c3bab9da6d46d695e717f465d8 Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/159/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=159&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347620 Stats: 78 lines in 1 file changed: 7 ins; 7 del; 64 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/159.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/159/head:pull/159 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/159 From wkemper at openjdk.org Mon Mar 17 21:43:22 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 17 Mar 2025 21:43:22 GMT Subject: RFR: 8352181: Shenandoah: Evacuate thread roots after early cleanup In-Reply-To: <99wc8_4LoODnc8E0fwS3VV3NTfdPJ3soau-_jaiLrGU=.ef48e18a-03f2-4863-b610-513b52e539a5@github.com> References: <99wc8_4LoODnc8E0fwS3VV3NTfdPJ3soau-_jaiLrGU=.ef48e18a-03f2-4863-b610-513b52e539a5@github.com> Message-ID: On Mon, 17 Mar 2025 21:37:14 GMT, William Kemper wrote: > Moving the evacuation of thread roots after early cleanup allows Shenandoah to recycle immediate garbage a bit sooner in the cycle. @rkennke , this is a small change that allows immediate garbage to be recycled sooner. Wasn't sure if there was a specific reason to have thread roots evacuated before weak refs/roots and class unloading. Testing didn't show any problems. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24090#issuecomment-2730994869 From wkemper at openjdk.org Mon Mar 17 21:43:22 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 17 Mar 2025 21:43:22 GMT Subject: RFR: 8352181: Shenandoah: Evacuate thread roots after early cleanup Message-ID: <99wc8_4LoODnc8E0fwS3VV3NTfdPJ3soau-_jaiLrGU=.ef48e18a-03f2-4863-b610-513b52e539a5@github.com> Moving the evacuation of thread roots after early cleanup allows Shenandoah to recycle immediate garbage a bit sooner in the cycle. ------------- Commit messages: - Merge remote-tracking branch 'jdk/master' into investigate-root-evacuation - What happens if we evacuate roots after weak roots and class unloading? Changes: https://git.openjdk.org/jdk/pull/24090/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24090&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352181 Stats: 8 lines in 1 file changed: 3 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24090.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24090/head:pull/24090 PR: https://git.openjdk.org/jdk/pull/24090 From shade at openjdk.org Tue Mar 18 09:06:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 18 Mar 2025 09:06:14 GMT Subject: RFR: 8352181: Shenandoah: Evacuate thread roots after early cleanup In-Reply-To: <99wc8_4LoODnc8E0fwS3VV3NTfdPJ3soau-_jaiLrGU=.ef48e18a-03f2-4863-b610-513b52e539a5@github.com> References: <99wc8_4LoODnc8E0fwS3VV3NTfdPJ3soau-_jaiLrGU=.ef48e18a-03f2-4863-b610-513b52e539a5@github.com> Message-ID: On Mon, 17 Mar 2025 21:37:14 GMT, William Kemper wrote: > Moving the evacuation of thread roots after early cleanup allows Shenandoah to recycle immediate garbage a bit sooner in the cycle. I believe the reason we do thread roots earlier is to do the bulk of the stack processing before mutator sees it. If mutator does it by itself, it will go through armed nmethod barriers, which might be introducing extra latency. 
So we need to think if the benefit of doing the immediate cleanup earlier is worth accepting more active nmethod barriers in mutator. ------------- PR Review: https://git.openjdk.org/jdk/pull/24090#pullrequestreview-2693567054 From rkennke at openjdk.org Tue Mar 18 18:18:20 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 18 Mar 2025 18:18:20 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v5] In-Reply-To: References: Message-ID: <01CgWS5bjZ6prTge9OW7tOkS8g4w1FZ1zIJG1A9_798=.6afb2edc-d075-4f2d-b560-c75c195613d4@github.com> On Thu, 13 Mar 2025 20:43:05 GMT, William Kemper wrote: >> This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Add comment explaining use of _trashed_oops It looks good to me. I only have a small nit, up to you if you want to change that or not. Thank you! src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp line 59: > 57: // > 58: > 59: class ShenandoahFlushSATBHandshakeClosure : public HandshakeClosure { Maybe place the closure somewhere in shenandoahConcurrentGC.cpp, where it is used? Or is there a need to expose it on shenandoahClosures.hpp? ------------- Marked as reviewed by rkennke (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23830#pullrequestreview-2695514909 PR Review Comment: https://git.openjdk.org/jdk/pull/23830#discussion_r2001568319 From kdnilsen at openjdk.org Tue Mar 18 18:24:37 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 18 Mar 2025 18:24:37 GMT Subject: RFR: 8347620: Shenandoah: Use 'free' tag for free set related logging In-Reply-To: <6fMUeIRZ_11cr_Lg0LhezCPw93o3BjluX9bO9ioYNtI=.93cd77d9-3a14-4e60-a25c-cf3916bb99b7@github.com> References: <6fMUeIRZ_11cr_Lg0LhezCPw93o3BjluX9bO9ioYNtI=.93cd77d9-3a14-4e60-a25c-cf3916bb99b7@github.com> Message-ID: On Mon, 17 Mar 2025 21:04:56 GMT, William Kemper wrote: > Conflicts related to removal of SSIZE_FORMAT macro. Thanks. ------------- Marked as reviewed by kdnilsen (Committer). PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/159#pullrequestreview-2695794528 From wkemper at openjdk.org Tue Mar 18 18:28:30 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 18:28:30 GMT Subject: Integrated: 8347620: Shenandoah: Use 'free' tag for free set related logging In-Reply-To: <6fMUeIRZ_11cr_Lg0LhezCPw93o3BjluX9bO9ioYNtI=.93cd77d9-3a14-4e60-a25c-cf3916bb99b7@github.com> References: <6fMUeIRZ_11cr_Lg0LhezCPw93o3BjluX9bO9ioYNtI=.93cd77d9-3a14-4e60-a25c-cf3916bb99b7@github.com> Message-ID: On Mon, 17 Mar 2025 21:04:56 GMT, William Kemper wrote: > Conflicts related to removal of SSIZE_FORMAT macro. This pull request has now been integrated. 
Changeset: 6123d57b Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/6123d57b47adc2cbf06956e94fdc734bcc3abd47 Stats: 78 lines in 1 file changed: 7 ins; 7 del; 64 mod 8347620: Shenandoah: Use 'free' tag for free set related logging Reviewed-by: kdnilsen Backport-of: 9782bfdd27da95c3bab9da6d46d695e717f465d8 ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/159 From wkemper at openjdk.org Tue Mar 18 20:34:26 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 20:34:26 GMT Subject: RFR: 8350898: Shenandoah: Eliminate final roots safepoint [v5] In-Reply-To: <01CgWS5bjZ6prTge9OW7tOkS8g4w1FZ1zIJG1A9_798=.6afb2edc-d075-4f2d-b560-c75c195613d4@github.com> References: <01CgWS5bjZ6prTge9OW7tOkS8g4w1FZ1zIJG1A9_798=.6afb2edc-d075-4f2d-b560-c75c195613d4@github.com> Message-ID: On Tue, 18 Mar 2025 17:19:56 GMT, Roman Kennke wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comment explaining use of _trashed_oops > > src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp line 59: > >> 57: // >> 58: >> 59: class ShenandoahFlushSATBHandshakeClosure : public HandshakeClosure { > > Maybe place the closure somewhere in shenandoahConcurrentGC.cpp, where it is used? Or is there a need to expose it on shenandoahClosures.hpp? Ah, it is also used in `shenandoahConcurrentMark.cpp`: https://github.com/openjdk/jdk/pull/23830/files#diff-d5228ec0709dbd663da93db4cf13eac3b28015d90d0c4ef206a68b008dc1d429L215 (in fact, this is where I took it from). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23830#discussion_r2001949346 From wkemper at openjdk.org Tue Mar 18 21:55:32 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 21:55:32 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled Message-ID: The sequence of events that creates this state: 1.
An old collection is trying to finish marking by flushing SATB buffers with a Handshake 2. The regulator thread cancels old marking to start a young collection 3. A mutator thread shortly follows and attempts to cancel the nascent young collection 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. ------------- Commit messages: - Allow young cycles that interrupt old cycles to be cancelled Changes: https://git.openjdk.org/jdk/pull/24105/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352299 Stats: 9 lines in 2 files changed: 7 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24105.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24105/head:pull/24105 PR: https://git.openjdk.org/jdk/pull/24105 From xpeng at openjdk.org Tue Mar 18 22:18:44 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 18 Mar 2025 22:18:44 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification Message-ID: There are some scenarios in which GenShen may have improper remembered set verification logic: 1. 
Concurrent young cycles following a Full GC: In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification ShenandoahVerifier ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { shenandoah_assert_generations_reconciled(); if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { return _heap->complete_marking_context(); } return nullptr; } For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. ### Test - [x] `make test TEST=hotspot_gc_shenandoah` ------------- Commit messages: - Clean and rebuild rem-set in global gc - Set mark incomplete after ShenandoahMCResetCompleteBitmapTask - Only clean rem-set read table in young gc; not verify rem-set in concurrent global GC in generational mode - Always swap card table in generational mode so the table can be properly rebuilt through marking. 
- Initial works Changes: https://git.openjdk.org/jdk/pull/24092/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352185 Stats: 42 lines in 4 files changed: 17 ins; 16 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From kdnilsen at openjdk.org Tue Mar 18 22:18:44 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 18 Mar 2025 22:18:44 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 00:19:35 GMT, Xiaolong Peng wrote: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. 
ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 648: > 646: assert(!heap->has_forwarded_objects(), "No forwarded objects on this path"); > 647: > 648: if (heap->mode()->is_generational()) { I think we do not want to change this code. We only swap remembered set for young-gen because gen will scan the remset and reconstruct it with more updated information. For a global GC, we do not scan the remset and do not reconstruct it. If we swap here, we will lose the information that is currently within the remset. src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 953: > 951: // pinned regions. > 952: if (!r->is_pinned()) { > 953: _heap->marking_context()->reset_top_at_mark_start(r); Here, and below, I think we want to keep complete_marking_context() rather than changing to marking_context() src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1281: > 1279: ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > 1280: shenandoah_assert_generations_reconciled(); > 1281: if (_heap->old_generation()->is_mark_complete()) { In the case that this is an global GC, we know that old-generation()->is_mark_complete() by virtue of the current program counter, I assume. (We would only ask for the old marking context if global marking were already finished.) In the case that we are doing a global GC cycle, I'm guessing that we do not set is-mark-complete for the old generation. So that's why I believe you need to keep the condition as originally written. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r1999966128 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r1999982949 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r1999975507 From wkemper at openjdk.org Tue Mar 18 22:18:44 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 22:18:44 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 00:19:35 GMT, Xiaolong Peng wrote: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. 
> > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1389: > 1387: shenandoah_assert_safepoint(); > 1388: shenandoah_assert_generational(); > 1389: ShenandoahMarkingContext* ctx = get_marking_context_for_old(); This should always be `nullptr` after a full GC, right? The marking context is no longer valid after compaction. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2001580562 From xpeng at openjdk.org Tue Mar 18 22:18:44 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 18 Mar 2025 22:18:44 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> On Tue, 18 Mar 2025 01:25:11 GMT, Kelvin Nilsen wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. 
For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. >> >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 648: > >> 646: assert(!heap->has_forwarded_objects(), "No forwarded objects on this path"); >> 647: >> 648: if (heap->mode()->is_generational()) { > > I think we do not want to change this code. We only swap remembered set for young-gen because gen will scan the remset and reconstruct it with more updated information. For a global GC, we do not scan the remset and do not reconstruct it. If we swap here, we will lose the information that is currently within the remset. Thanks for the explanation. I have been reading and trying to understand how the remembered set works in GenShen, and I wasn't sure whether this is actually right. In generational mode, if the GC cycle is global, the read table has already been cleaned during the reset phase, so remembered set verification from `verify_before_concmark` and `verify_before_update_refs` cannot work properly. I think remembered set verification before mark and before update references should be disabled in that case; what do you think? Also, there is no need to clean the read table during a global cycle in generational mode. > src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 953: > >> 951: // pinned regions.
>> 952: if (!r->is_pinned()) { >> 953: _heap->marking_context()->reset_top_at_mark_start(r); > > Here, and below, I think we want to keep complete_marking_context() rather than changing to marking_context() The marking context is no longer complete after ShenandoahMCResetCompleteBitmapTask, but ShenandoahMCResetCompleteBitmapTask only resets bitmaps for regions without pinned objects. If we use complete_marking_context here, the call to `set_mark_incomplete()` needs to move to some place after ShenandoahPostCompactClosure has been executed. > src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1281: > >> 1279: ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> 1280: shenandoah_assert_generations_reconciled(); >> 1281: if (_heap->old_generation()->is_mark_complete()) { > > In the case that this is an global GC, we know that old-generation()->is_mark_complete() by virtue of the current program counter, I assume. (We would only ask for the old marking context if global marking were already finished.) In the case that we are doing a global GC cycle, I'm guessing that we do not set is-mark-complete for the old generation. So that's why I believe you need to keep the condition as originally written. For a global GC in generational mode, old-generation()->is_mark_complete() is always false after reset and before mark, because the bitmaps of the entire heap, including old gen, have been reset during the concurrent reset phase; old marking is therefore not finished when verify_before_concmark is called. The marking context returned by this method is only used for remembered set verification, but as I pointed out in my first comment, we shouldn't do remembered set verification in that case because the rem-set read table is already cleaned/stale.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000354419 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000361095 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000371950 From xpeng at openjdk.org Tue Mar 18 22:18:44 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 18 Mar 2025 22:18:44 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 17:24:54 GMT, William Kemper wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. 
>> >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` > > src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1389: > >> 1387: shenandoah_assert_safepoint(); >> 1388: shenandoah_assert_generational(); >> 1389: ShenandoahMarkingContext* ctx = get_marking_context_for_old(); > > This should always be `nullptr` after a full GC, right? The marking context is no longer valid after compaction. Yes, get_marking_context_for_old always returns `nullptr` after a full GC: mark completeness is set to false when we reset the marking bitmaps after a full GC. I think the method get_marking_context_for_old and the ctx arg of the helper function can be removed; I'll do that in the next update. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2001815702 From shade at openjdk.org Tue Mar 18 22:26:07 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 18 Mar 2025 22:26:07 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 21:51:34 GMT, William Kemper wrote: > The sequence of events that creates this state: > 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake > 2. The regulator thread cancels old marking to start a young collection > 3. A mutator thread shortly follows and attempts to cancel the nascent young collection > 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` > 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` > 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`.
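[Editorial sketch, not part of the original thread] Steps 3 and 4 of the quoted sequence can be illustrated with a minimal, self-contained C++ model. It assumes the cancellation reason lives in a single atomic byte and that the failing path is a compare-and-swap that only succeeds from the "not cancelled" state; the thread does not spell this out, so treat it as one plausible shape of the bug. The names `CancelState`, `CancelFlag`, `try_cancel_cas` and `cancel_xchg` are hypothetical and do not come from the HotSpot sources. The sketch uses `std::atomic` rather than HotSpot's `Atomic` class.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical cancellation reasons; illustrative only, not HotSpot's names.
enum CancelState : uint8_t {
  NOT_CANCELLED = 0,
  CANCELLED_BY_REGULATOR = 1,   // e.g. old marking interrupted for a young cycle
  CANCELLED_ALLOC_FAILURE = 2   // a mutator failed to allocate
};

class CancelFlag {
  std::atomic<uint8_t> _state{NOT_CANCELLED};

public:
  // CAS-only variant: succeeds only when nobody has cancelled yet.
  // A second caller with a different reason gets 'false' and the stored
  // reason stays unchanged (the situation described in step 4).
  bool try_cancel_cas(CancelState reason) {
    assert(reason != NOT_CANCELLED);
    uint8_t expected = NOT_CANCELLED;
    return _state.compare_exchange_strong(expected, reason);
  }

  // Exchange variant: unconditionally installs the new reason and returns
  // the previous one, so a later cancellation can override an earlier one.
  CancelState cancel_xchg(CancelState reason) {
    assert(reason != NOT_CANCELLED);
    return static_cast<CancelState>(_state.exchange(reason));
  }

  CancelState state() const {
    return static_cast<CancelState>(_state.load());
  }
};
```

In this model, once the regulator has cancelled, a mutator calling `try_cancel_cas(CANCELLED_ALLOC_FAILURE)` fails and the reason stays `CANCELLED_BY_REGULATOR`, whereas `cancel_xchg(CANCELLED_ALLOC_FAILURE)` replaces the reason and reports the previous one. Whether an unconditional exchange is the right fix for the real code is exactly what the review below goes on to discuss.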
(too tired to do a full review, just mentioning a thing, so we look at it tomorrow) src/hotspot/share/gc/shenandoah/shenandoahSharedVariables.hpp line 243: > 241: assert (new_value < (sizeof(ShenandoahSharedValue) * CHAR_MAX), "sanity"); > 242: // Hmm, no platform template specialization defined for exchanging one byte... (up cast to intptr is workaround). > 243: return (T)Atomic::xchg((intptr_t*)&value, (intptr_t)new_value); That... likely gets awkward on different endianness. See the complicated dance `Atomic::CmpxchgByteUsingInt` has to do to handle it. Not to mention we are likely writing to adjacent memory location. Which is _currently_ innocuous, since we hit padding, but it is not very reliable. ------------- Changes requested by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24105#pullrequestreview-2696449916 PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2002110190 From ysr at openjdk.org Tue Mar 18 22:38:08 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 18 Mar 2025 22:38:08 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:05:05 GMT, William Kemper wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > The non-generational modes may also fail to notify waiters Is the description in the PR still valid? 
> This change directly tracks the number of threads waiting due to an allocation failure, rather than indirectly tracking them through the cancelled gc state. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23997#issuecomment-2734885815 From kdnilsen at openjdk.org Tue Mar 18 22:40:41 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 18 Mar 2025 22:40:41 GMT Subject: RFR: 8350889: GenShen: Break out of infinite loop of old GC cycles part2 Message-ID: A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. This new PR handles one other case related to the original problem. ------------- Commit messages: - Cancel old GC triggers when old GC starts/resumes - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - ... 
and 20 more: https://git.openjdk.org/jdk/compare/4a02de82...4ebb3aaf Changes: https://git.openjdk.org/jdk/pull/24106/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24106&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8350889 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24106.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24106/head:pull/24106 PR: https://git.openjdk.org/jdk/pull/24106 From wkemper at openjdk.org Tue Mar 18 23:01:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 23:01:08 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled In-Reply-To: References: Message-ID: <7PFHErLXXCsFeCjx55B_u8JisUcDGX9VFLa5azzsCso=.92f7d81d-8989-4aff-b57e-d2128403e01f@github.com> On Tue, 18 Mar 2025 22:23:23 GMT, Aleksey Shipilev wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > src/hotspot/share/gc/shenandoah/shenandoahSharedVariables.hpp line 243: > >> 241: assert (new_value < (sizeof(ShenandoahSharedValue) * CHAR_MAX), "sanity"); >> 242: // Hmm, no platform template specialization defined for exchanging one byte... (up cast to intptr is workaround). >> 243: return (T)Atomic::xchg((intptr_t*)&value, (intptr_t)new_value); > > That... likely gets awkward on different endianness. See the complicated dance `Atomic::CmpxchgByteUsingInt` has to do to handle it. 
> > Not to mention we are likely writing to adjacent memory location. Which is _currently_ innocuous, since we hit padding, but it is not very reliable. `PlatformCmpxchg` has specializations on aarch64 and x86 for `sizeof(T) == 1`. Should we also add platform specializations for `PlatformXchg` for `sizeof(T) == 1`? (It has them for `4` and `8`). Could also do what `XchgUsingCmpxchg` does... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2002137199 From wkemper at openjdk.org Tue Mar 18 23:04:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 23:04:07 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:05:05 GMT, William Kemper wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change sees allocation waiters notified any time a GC completes without being cancelled. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > The non-generational modes may also fail to notify waiters Fixed the description. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23997#issuecomment-2734917437 From wkemper at openjdk.org Tue Mar 18 23:08:09 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 18 Mar 2025 23:08:09 GMT Subject: RFR: 8350889: GenShen: Break out of infinite loop of old GC cycles part2 In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 22:36:49 GMT, Kelvin Nilsen wrote: > A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. 
This new PR handles one other case related to the original problem. LGTM ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24106#pullrequestreview-2696496570 From ysr at openjdk.org Tue Mar 18 23:47:08 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 18 Mar 2025 23:47:08 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: References: Message-ID: On Wed, 12 Mar 2025 00:05:05 GMT, William Kemper wrote: >> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change sees allocation waiters notified any time a GC completes without being cancelled. >> >> # Testing >> Ran TestAllocHumongousFragment#generational 6,500 times without failures. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > The non-generational modes may also fail to notify waiters LGTM! ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23997#pullrequestreview-2696532375 From kdnilsen at openjdk.org Wed Mar 19 00:15:10 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 19 Mar 2025 00:15:10 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 00:19:35 GMT, Xiaolong Peng wrote: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. 
Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Can we confirm that this addresses JBS issue with further testing before integration? src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 660: > 658: > 659: // Verify before mark is done before swapping card tables, > 660: // therefore the write card table will be verified before being taken snapshot. Not a big deal, but this is two sentences. "... swapping card tables. Therefore, the write card table is verified before we swap read and write card tables." ------------- Marked as reviewed by kdnilsen (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2696536061 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2002170767 From kdnilsen at openjdk.org Wed Mar 19 00:15:11 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 19 Mar 2025 00:15:11 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> Message-ID: On Tue, 18 Mar 2025 07:14:23 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 648: >> >>> 646: assert(!heap->has_forwarded_objects(), "No forwarded objects on this path"); >>> 647: >>> 648: if (heap->mode()->is_generational()) { >> >> I think we do not want to change this code. We only swap remembered set for young-gen because gen will scan the remset and reconstruct it with more updated information. For a global GC, we do not scan the remset and do not reconstruct it. If we swap here, we will lose the information that is currently within the remset. > > Thanks for for explanation, I have been reading and trying the understand how the remembered set works in GenShen. I wasn't sure whether this is actually right. > > In generational mode, if the GC cycle is global, the read table is already cleaned during reset phase, so remembered set verification from `verify_before_concmark` and `verify_before_update_refs` shouldn't work properly, I think the remembered set verification before mark and update references should be disabled, what do you think? Meanwhile, there is no need to clean read table during global cycle in generational mode. Ok. So we will always swap card tables, but we'll do it after verify-before-mark. 
To clarify the intention: after we swap card tables, the write table is all clean, and the read table holds whatever had been gathered prior to the start of GC. Young and bootstrap collections will update the write card table as a side effect of remembered set scanning. Global collection will update the card table as a side effect of global marking of old objects.

>> src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 953:
>>
>>> 951: // pinned regions.
>>> 952: if (!r->is_pinned()) {
>>> 953: _heap->marking_context()->reset_top_at_mark_start(r);
>>
>> Here, and below, I think we want to keep complete_marking_context() rather than changing to marking_context()
>
> The marking context is not complete anymore after ShenandoahMCResetCompleteBitmapTask, but ShenandoahMCResetCompleteBitmapTask only reset bitmaps for the regions w/o pinned objects, the place calling `set_mark_incomplete()` need to moved to some place after ShenandoahPostCompactClosure being executed if use complete_marking_context here.

Can we move heap_region_iterate(&post_compact) and post_compact.update_generation_usage() before heap->workers()->run_task(ShenandoahMCResetCompletedBitmaptask) so that we can use complete_marking_context here? I'm a bit uncomfortable using an incomplete marking context as if it is complete. (I understand "why it works" in this case, but this looks like an "accident waiting to happen" when someone comes back to modify this code in the future.)
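
Going back to the card table swap described at the top of this comment, the intended effect can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyCardTables`, `mark_dirty` and `clean_and_swap` are hypothetical names made up for illustration, and the two vectors stand in for the real read and write card tables, which are not implemented this way in GenShen.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model only: a hypothetical stand-in for the read/write card tables.
struct ToyCardTables {
  std::vector<uint8_t> read_table;   // snapshot consulted during the cycle
  std::vector<uint8_t> write_table;  // receives dirty marks from mutators/marking

  explicit ToyCardTables(std::size_t cards)
      : read_table(cards, 0), write_table(cards, 0) {}

  // Record that a card holds an interesting (old-to-young) pointer.
  void mark_dirty(std::size_t card) { write_table[card] = 1; }

  // Clean the read table, then swap roles. Afterwards the write table is
  // all clean and the read table holds what was gathered before the swap.
  void clean_and_swap() {
    read_table.assign(read_table.size(), 0);
    std::swap(read_table, write_table);
  }
};
```

In this model, a card dirtied before `clean_and_swap()` shows up in `read_table` afterwards, while `write_table` starts the cycle empty and is refilled as a side effect of scanning or marking.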
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2002169842 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2002179864 From xpeng at openjdk.org Wed Mar 19 00:19:09 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 19 Mar 2025 00:19:09 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification In-Reply-To: References: Message-ID: <-t-I-pGhv43DJxTgXO3bMPX0G5eMYqsO3LPjLCq9XNg=.682e2b60-8ca6-413f-8b8d-86a44e25a37a@github.com> On Tue, 18 Mar 2025 23:50:24 GMT, Kelvin Nilsen wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. 
ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. >> >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 660: > >> 658: >> 659: // Verify before mark is done before swapping card tables, >> 660: // therefore the write card table will be verified before being taken snapshot. > > Not a big deal, but this is two sentences. "... swapping card tables. Therefore, the write card table is verified before we swap read and write card tables." Thanks, I'll fix it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2002187325 From ysr at openjdk.org Wed Mar 19 00:25:07 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 19 Mar 2025 00:25:07 GMT Subject: RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4] In-Reply-To: <8HGl6b056y3lTi7An0UsJ896JOy-7Ij8SMcc2MULj0I=.26ca6193-f829-449e-afbe-4d068b8533ab@github.com> References: <8HGl6b056y3lTi7An0UsJ896JOy-7Ij8SMcc2MULj0I=.26ca6193-f829-449e-afbe-4d068b8533ab@github.com> Message-ID: On Wed, 12 Mar 2025 17:25:20 GMT, Xiaolong Peng wrote: >> I believe that is the correct behavior. The mutators are waiting until there is memory available. If mutator B cannot allocate, there is no reason to believe mutator A would be able to allocate. In this case, it is fine for both mutators to wait (even if it means A has to wait an extra cycle). > > Thanks for the explanation, re-read the relevant codes I think it make sense, when Mutator B fails to allocate when Concurrent GC is at `op_final_update_refs`, very unlikely there is enough space for Mutator A. 
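
As a rough illustration of the waiting scheme debated in this sub-thread, mutators that fail to allocate can be grouped by the GC epoch in which the failure happened and released once a later cycle completes. The sketch below is an assumption-laden toy, not Shenandoah code: `GcEpochWaiters` and its methods are hypothetical names, and it uses standard C++ primitives rather than HotSpot monitors.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical sketch: waiters form cohorts behind a monotonically
// increasing GC epoch counter.
class GcEpochWaiters {
 public:
  // Epoch observed by a mutator at the time its allocation failed.
  uint64_t current_epoch() {
    std::lock_guard<std::mutex> lock(_mutex);
    return _epoch;
  }

  // Block until at least one GC cycle has completed after failed_epoch.
  void wait_for_gc_after(uint64_t failed_epoch) {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [&] { return _epoch > failed_epoch; });
  }

  // Called by the collector when a cycle completes: advance the epoch
  // and wake every waiter from earlier epochs.
  void notify_gc_complete() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      ++_epoch;
    }
    _cv.notify_all();
  }

 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  uint64_t _epoch = 0;
};
```

Whether notifying whole cohorts like this pays off would still need the empirical data mentioned in this thread; the sketch only shows the mechanism.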
In the case of the stop-the-world collectors, the waiters would form cohorts behind GC count epochs, the idea being that if your failure to allocate happened during a specific epoch, you didn't have sufficient headroom to allocate, which would then require at least a new GC. Depending on how we think of the allocation failures interacting with potential freeing of memory by a concurrent collector and the sizes of the allocations being attempted, I could see this going either way. I do realize that a large number of notifications when space is exhausted might exact a cost, but if we are allocating and collecting concurrently, I can imagine that some notion of a monotonically increasing count and notifying all of the early waiters might yield some benefit. I assume we would need to collect the distribution of the allocation failure sizes and the space available to really tell if it makes a difference. A benchmark such as SPECjbb might be able to tell the difference but I am not sure. Intuition can sometimes mislead in these scenarios, so empirical data might help.

Can probably be tackled/investigated in the fullness of time, but I thought it was worth leaving my thoughts here anyway.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r2002191328

From wkemper at openjdk.org Wed Mar 19 00:33:14 2025
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 19 Mar 2025 00:33:14 GMT
Subject: Integrated: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational
In-Reply-To:
References:
Message-ID:

On Tue, 11 Mar 2025 19:31:47 GMT, William Kemper wrote:

> Failed allocations may race to cancel the GC with the collector who is working to clear the cancelled GC. When the GC wins this race, it will fail to notify threads that are waiting for the failed GC cycle to complete. This change sees allocation waiters notified any time a GC completes without being cancelled.
> > # Testing > Ran TestAllocHumongousFragment#generational 6,500 times without failures. This pull request has now been integrated. Changeset: 20d4fe3a Author: William Kemper URL: https://git.openjdk.org/jdk/commit/20d4fe3a574a33784dc02e7cc653cdb248b697a2 Stats: 5 lines in 3 files changed: 0 ins; 1 del; 4 mod 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational Reviewed-by: xpeng, ysr ------------- PR: https://git.openjdk.org/jdk/pull/23997 From xpeng at openjdk.org Wed Mar 19 00:35:29 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 19 Mar 2025 00:35:29 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v2] In-Reply-To: References: Message-ID: <0tjufPvihcze6ELUIAybBhxFDp3tZk2qgaD0XPHFUjw=.9148d06f-95ba-449b-af11-2ba86ee40c7a@github.com> > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. 
For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/5a94b141..021f2fef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=00-01 Stats: 8 lines in 2 files changed: 3 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Wed Mar 19 00:35:29 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 19 Mar 2025 00:35:29 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v2] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> Message-ID: <2U6kJoymX-uKOSUoR54QvEQtJ54DyKgttgItf96SYzI=.69334cab-f9ca-4609-93ba-23197c76f430@github.com> On Wed, 19 Mar 2025 00:04:07 GMT, Kelvin Nilsen wrote: >> The marking context is not complete anymore after ShenandoahMCResetCompleteBitmapTask, but ShenandoahMCResetCompleteBitmapTask only reset bitmaps for the regions w/o pinned objects, the place calling `set_mark_incomplete()` need to moved to 
some place after ShenandoahPostCompactClosure being executed if use complete_marking_context here.
>
> Can we move heap_region_iterate(&post_compact) and post_compact.update_generation_usage() before heap->workers()->run_task(ShenandoahMCResetCompletedBitmaptask) so that we can use complete_marking_context here? I'm a bit uncomfortable using an incomplete marking context as if it is complete. (I understand "why it works" in this case, but this looks like an "accident waiting to happen" when someone comes back to modify this code in the future.

I'm not comfortable changing the order here, since I am not sure whether there is a dependency on the execution order (even though it should work), but I can move set_mark_incomplete() to a place close to the end of phase5_epilog.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2002196180

From ysr at openjdk.org Wed Mar 19 00:38:08 2025
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Wed, 19 Mar 2025 00:38:08 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v2]
In-Reply-To: <0tjufPvihcze6ELUIAybBhxFDp3tZk2qgaD0XPHFUjw=.9148d06f-95ba-449b-af11-2ba86ee40c7a@github.com>
References: <0tjufPvihcze6ELUIAybBhxFDp3tZk2qgaD0XPHFUjw=.9148d06f-95ba-449b-af11-2ba86ee40c7a@github.com>
Message-ID:

On Wed, 19 Mar 2025 00:35:29 GMT, Xiaolong Peng wrote:

>> There are some scenarios in which GenShen may have improper remembered set verification logic:
>>
>> 1.
Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. >> >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Can you sync w/master so GHA (& problem lists) is more uptodate. Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24092#issuecomment-2735021061 From xpeng at openjdk.org Wed Mar 19 00:48:31 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 19 Mar 2025 00:48:31 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v3] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision:

- Merge branch 'openjdk:master' into JDK-8345399-v3
- Address review comments
- Clean and rebuild rem-set in global gc
- Set mark incomplete after ShenandoahMCResetCompleteBitmapTask
- Only clean rem-set read table in young gc; not verify rem-set in concurrent global GC in generational mode
- Always swap card table in generational mode so the table can be properly rebuilt through marking.
- Initial works

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/24092/files
- new: https://git.openjdk.org/jdk/pull/24092/files/021f2fef..4726b876

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=01-02

Stats: 54221 lines in 805 files changed: 27210 ins; 17485 del; 9526 mod
Patch: https://git.openjdk.org/jdk/pull/24092.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092

PR: https://git.openjdk.org/jdk/pull/24092

From xpeng at openjdk.org Wed Mar 19 00:48:31 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 19 Mar 2025 00:48:31 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v2]
In-Reply-To:
References: <0tjufPvihcze6ELUIAybBhxFDp3tZk2qgaD0XPHFUjw=.9148d06f-95ba-449b-af11-2ba86ee40c7a@github.com>
Message-ID:

On Wed, 19 Mar 2025 00:35:39 GMT, Y. Srinivas Ramakrishna wrote:

> Can you sync w/master so GHA (& problem lists) is more uptodate. Thanks!

Done, now waiting for GHA to rerun, thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24092#issuecomment-2735030715 From shade at openjdk.org Wed Mar 19 09:31:09 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 19 Mar 2025 09:31:09 GMT Subject: RFR: 8350889: GenShen: Break out of infinite loop of old GC cycles In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 22:36:49 GMT, Kelvin Nilsen wrote: > A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. This new PR handles one other case related to the original problem. This is part2 of the fix. I don't understand the bug mapping here. JDK-8350889 is already resolved. I think you need to file a follow-up bug and reference that new bug in this PR. ------------- Changes requested by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24106#pullrequestreview-2697641876 From andrew at openjdk.org Wed Mar 19 15:40:29 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Wed, 19 Mar 2025 15:40:29 GMT Subject: RFR: Merge jdk8u:master Message-ID: Merge jdk8u342-b01 ------------- Commit messages: - Merge jdk8u342-b01 - Merge - 8282458: Update .jcheck/conf file for 8u move to git - Added tag jdk8u332-ga for changeset 37aca7715d13 - 8285445: cannot open file "NUL:" - 8284772: 8u GHA: Use GCC Major Version Dependencies Only - Merge - Added tag jdk8u332-b09 for changeset 37aca7715d13 - 8190753: (zipfs): Accessing a large entry (> 2^31 bytes) leads to a negative initial size for ByteArrayOutputStream - 8261107: ArrayIndexOutOfBoundsException in the ICC_Profile.getInstance(InputStream) - ... 
and 15 more: https://git.openjdk.org/shenandoah-jdk8u/compare/0f3b1805...988c585a The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=15&range=00.0 - jdk8u:master: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=15&range=00.1 Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/15/files Stats: 3675 lines in 73 files changed: 2478 ins; 927 del; 270 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/15.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/15/head:pull/15 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/15 From wkemper at openjdk.org Wed Mar 19 16:59:16 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 19 Mar 2025 16:59:16 GMT Subject: Integrated: 8350898: Shenandoah: Eliminate final roots safepoint In-Reply-To: References: Message-ID: On Thu, 27 Feb 2025 19:51:24 GMT, William Kemper wrote: > This PR converts the final roots safepoint operation into a handshake. The safepoint operation still exists, but is only executed when `ShenandoahVerify` is enabled. In addition to this change, this PR also improves the logging for the concurrent preparation for update references from [PR 22688](https://github.com/openjdk/jdk/pull/22688). This pull request has now been integrated. 
Changeset: 8a1c85ea Author: William Kemper URL: https://git.openjdk.org/jdk/commit/8a1c85eaa902500d49ca82c67b6838d39cb5b24f Stats: 295 lines in 14 files changed: 198 ins; 47 del; 50 mod 8350898: Shenandoah: Eliminate final roots safepoint Reviewed-by: rkennke, kdnilsen, cslucas ------------- PR: https://git.openjdk.org/jdk/pull/23830 From andrew at openjdk.org Wed Mar 19 17:03:39 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Wed, 19 Mar 2025 17:03:39 GMT Subject: RFR: Merge jdk8u:master In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 15:35:57 GMT, Andrew John Hughes wrote: > Merge jdk8u342-b01 GHA builds will not work until [JDK-8284622](https://bugs.openjdk.org/browse/JDK-8284622) is merged in 8u362-b03 ------------- PR Comment: https://git.openjdk.org/shenandoah-jdk8u/pull/15#issuecomment-2737407679 From kdnilsen at openjdk.org Wed Mar 19 17:38:17 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 19 Mar 2025 17:38:17 GMT Subject: RFR: 8352428: GenShen: Old-gen cycles are still looping In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 09:28:53 GMT, Aleksey Shipilev wrote: >> A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. This new PR handles one other case related to the original problem. This is part2 of the fix. > > I don't understand the bug mapping here. JDK-8350889 is already resolved. I think you need to file a follow-up bug and reference that new bug in this PR. Thanks @shipilev for your suggestion. I've opened another JBS issue and linked this new PR to that. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24106#issuecomment-2737503085

From shade at openjdk.org Wed Mar 19 18:00:09 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 19 Mar 2025 18:00:09 GMT
Subject: RFR: 8352428: GenShen: Old-gen cycles are still looping
In-Reply-To:
References:
Message-ID:

On Tue, 18 Mar 2025 22:36:49 GMT, Kelvin Nilsen wrote:

> A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. This new PR handles one other case related to the original problem. This is part 2 of the fix.

Looks fine to me.

Note there are useful "Caused by" and "Related" links in JBS, which you should really use to track the dependencies between the tickets. I added some, see how it looks. Also, these are either "Enhancement" or "Bug". "Task" is usually about something that is not code-related: https://openjdk.org/guide/#types-of-issues

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24106#pullrequestreview-2699539104

From wkemper at openjdk.org Wed Mar 19 18:33:50 2025
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 19 Mar 2025 18:33:50 GMT
Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v2]
In-Reply-To:
References:
Message-ID:

> The sequence of events that creates this state:
> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake
> 2. The regulator thread cancels old marking to start a young collection
> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection
> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure`
> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting`
> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`.
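
One of the commits in this revision is titled "Emulate single byte xchg with cmpxchg". As a rough illustration only (this is not the code from the PR), an unconditional one-byte exchange can be built from a compare-and-swap loop. The sketch assumes standard C++ `std::atomic` rather than HotSpot's own atomics, and the `CancelReason` names and the `xchg_via_cmpxchg` helper are hypothetical:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical cancellation reasons; illustrative names, not HotSpot's.
enum CancelReason : uint8_t {
  NOT_CANCELLED      = 0,
  CONCURRENT_PREEMPT = 1,
  ALLOCATION_FAILURE = 2
};

// Emulates an atomic single-byte exchange using only compare-and-swap.
// Returns the value that was stored in the cell before the swap.
inline uint8_t xchg_via_cmpxchg(std::atomic<uint8_t>& cell, uint8_t desired) {
  uint8_t observed = cell.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak refreshes 'observed' with the current
  // value, so the loop simply retries until the store succeeds.
  while (!cell.compare_exchange_weak(observed, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_relaxed)) {
  }
  return observed;
}
```

With an exchange of this kind, a caller can install a new reason regardless of the value already present and still learn what the previous value was, whereas a single compare-and-swap from `NOT_CANCELLED` would fail once any other reason has been stored. Whether the actual fix uses the primitive this way is an assumption here.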
William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Emulate single byte xchg with cmpxchg
 - Merge remote-tracking branch 'jdk/master' into fix-uncancellable-young-gc
 - Allow young cycles that interrupt old cycles to be cancelled

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24105/files
  - new: https://git.openjdk.org/jdk/pull/24105/files/9b0faf0c..adcb999b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=00-01

  Stats: 2244 lines in 56 files changed: 964 ins; 745 del; 535 mod
  Patch: https://git.openjdk.org/jdk/pull/24105.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24105/head:pull/24105

PR: https://git.openjdk.org/jdk/pull/24105

From kdnilsen at openjdk.org Thu Mar 20 00:56:19 2025
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Thu, 20 Mar 2025 00:56:19 GMT
Subject: Integrated: 8352428: GenShen: Old-gen cycles are still looping
In-Reply-To:
References:
Message-ID:

On Tue, 18 Mar 2025 22:36:49 GMT, Kelvin Nilsen wrote:

> A recent commit failed to address all paths by which an infinite loop of old GC cycles might occur. This new PR handles one other case related to the original problem. This is part 2 of the fix.

This pull request has now been integrated.
Changeset: 74df384a Author: Kelvin Nilsen URL: https://git.openjdk.org/jdk/commit/74df384a9870431efb184158bba032c79c35356e Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod 8352428: GenShen: Old-gen cycles are still looping Reviewed-by: wkemper, shade ------------- PR: https://git.openjdk.org/jdk/pull/24106 From andrew at openjdk.org Thu Mar 20 01:06:43 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 20 Mar 2025 01:06:43 GMT Subject: git: openjdk/shenandoah-jdk8u: master: 25 new changesets Message-ID: <1c9fa0d4-de46-4971-8158-75193739f5e1@openjdk.org> Changeset: 4a19c1a6 Branch: master Author: Andrew John Hughes Date: 2022-03-02 19:10:35 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/4a19c1a65107202800bf8df51684f7255d6ef027 8282458: Update .jcheck/conf file for 8u move to git Reviewed-by: sgehwolf ! .jcheck/conf Changeset: 6f01b534 Branch: master Author: Severin Gehwolf Date: 2022-03-03 09:44:11 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/6f01b5341956285a9f246a9228586d8c000603dc 8282552: Bump update version of OpenJDK: 8u342 Reviewed-by: erikj, andrew ! common/autoconf/version-numbers Changeset: b5bcf6c2 Branch: master Author: Zdenek Zambersky Committer: Severin Gehwolf Date: 2022-03-15 12:50:04 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/b5bcf6c272cdd0397555bcf0d9887081016bc9a4 8279669: test/jdk/com/sun/jdi/TestScaffold.java uses wrong condition Reviewed-by: phh, sgehwolf Backport-of: 4c52eb39431c2479b0d140907bdcc0311d30f871 ! 
jdk/test/com/sun/jdi/TestScaffold.java Changeset: 94cb2ef9 Branch: master Author: Alexey Bakhtin Date: 2022-03-17 16:05:07 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/94cb2ef9307e1da317b4c17c65be25a724155876 8076190: Customizing the generation of a PKCS12 keystore Reviewed-by: mbalao Backport-of: 9136c7d1d0e1247ea1ac95a6577acbb789169031 + jdk/src/share/classes/com/sun/crypto/provider/HmacPKCS12PBECore.java - jdk/src/share/classes/com/sun/crypto/provider/HmacPKCS12PBESHA1.java ! jdk/src/share/classes/com/sun/crypto/provider/PBES2Parameters.java ! jdk/src/share/classes/com/sun/crypto/provider/SunJCE.java ! jdk/src/share/classes/sun/security/pkcs12/PKCS12KeyStore.java ! jdk/src/share/classes/sun/security/tools/keytool/Main.java ! jdk/src/share/classes/sun/security/x509/AlgorithmId.java ! jdk/src/share/lib/security/java.security-aix ! jdk/src/share/lib/security/java.security-linux ! jdk/src/share/lib/security/java.security-macosx ! jdk/src/share/lib/security/java.security-solaris ! jdk/src/share/lib/security/java.security-windows + jdk/test/sun/security/pkcs12/ParamsPreferences.java + jdk/test/sun/security/pkcs12/ParamsTest.java + jdk/test/sun/security/pkcs12/params/README + jdk/test/sun/security/pkcs12/params/kandc + jdk/test/sun/security/pkcs12/params/ks + jdk/test/sun/security/pkcs12/params/os2 + jdk/test/sun/security/pkcs12/params/os3 + jdk/test/sun/security/pkcs12/params/os4 + jdk/test/sun/security/pkcs12/params/os5 Changeset: 5fcfb7ac Branch: master Author: Dongbo He Committer: Fei Yang Date: 2022-03-21 06:14:50 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/5fcfb7acbe340c8347bd47413a36dd0b1e267dc1 8194154: System property user.dir should not be changed Cached user.dir so getCanonicalPath uses the cached value. Reviewed-by: sgehwolf Backport-of: 4ea684bf31fc4e3cdee2ae51c0000a7b3e914151 ! jdk/src/solaris/classes/java/io/UnixFileSystem.java ! 
jdk/src/windows/classes/java/io/WinNTFileSystem.java + jdk/test/java/io/File/UserDirChangedTest.java Changeset: 25693fa8 Branch: master Author: Dongbo He Committer: Fei Yang Date: 2022-03-22 06:33:31 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/25693fa8c5dbdc23ead1c7d837050adb4af1b705 8223396: [TESTBUG] several jfr tests do not clean up files created in /tmp Using test utils to create temp files and directories Reviewed-by: andrew Backport-of: 7d3aebccc0b90aa2ca2f656c683fa5931fd0ed3a ! jdk/test/jdk/jfr/event/io/EvilInstrument.java ! jdk/test/jdk/jfr/event/io/TestDisabledEvents.java ! jdk/test/jdk/jfr/event/io/TestFileChannelEvents.java ! jdk/test/jdk/jfr/event/io/TestFileReadOnly.java ! jdk/test/jdk/jfr/event/io/TestFileStreamEvents.java ! jdk/test/jdk/jfr/event/io/TestRandomAccessFileEvents.java ! jdk/test/jdk/jfr/event/io/TestRandomAccessFileThread.java ! jdk/test/jdk/jfr/jcmd/TestJcmdConfigure.java ! jdk/test/jdk/jfr/jmx/JmxHelper.java ! jdk/test/jdk/jfr/jvm/TestJavaEvent.java ! jdk/test/lib/jdk/test/lib/Utils.java Changeset: 2bbec15b Branch: master Author: Dongbo He Committer: Fei Yang Date: 2022-03-22 06:47:59 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/2bbec15b3e571b2b7ead4f7911f965f2b72c245a 8230865: [TESTBUG] jdk/jfr/event/io/EvilInstrument.java fails at-run shell MakeJAR.sh target Prebuilding the test class before adding it into a jar file Backport-of: 725da985e170d72c3ca3dc2dfbb3d7e083b5371a ! 
jdk/test/jdk/jfr/event/io/EvilInstrument.java Changeset: 969f31e3 Branch: master Author: Andrew John Hughes Date: 2022-03-22 17:53:25 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/969f31e3c4a66df9ae6ee03301116dd582648c46 Merge Changeset: bb69732e Branch: master Author: Alex Kasko Date: 2022-03-22 19:40:21 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/bb69732e30a2f7ba68438787d4820d43787c7244 8129572: Cleanup usage of getResourceAsStream in jaxp Reviewed-by: andrew Backport-of: 4ebbfc918f558e73c05f471cfd3ab2b11dcf5a75 ! jaxp/src/com/sun/org/apache/bcel/internal/util/SecuritySupport.java ! jaxp/src/com/sun/org/apache/xalan/internal/utils/SecuritySupport.java ! jaxp/src/com/sun/org/apache/xerces/internal/utils/SecuritySupport.java ! jaxp/src/com/sun/org/apache/xerces/internal/xinclude/SecuritySupport.java ! jaxp/src/com/sun/org/apache/xml/internal/serialize/SecuritySupport.java ! jaxp/src/com/sun/org/apache/xpath/internal/functions/FuncSystemProperty.java ! jaxp/src/javax/xml/datatype/SecuritySupport.java ! jaxp/src/javax/xml/parsers/SecuritySupport.java ! jaxp/src/javax/xml/stream/SecuritySupport.java ! jaxp/src/javax/xml/transform/SecuritySupport.java ! jaxp/src/javax/xml/validation/SecuritySupport.java ! jaxp/src/javax/xml/xpath/SecuritySupport.java ! jaxp/src/org/xml/sax/helpers/SecuritySupport.java Changeset: 57d8d290 Branch: master Author: Alex Kasko Date: 2022-03-22 19:45:40 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/57d8d29080414dd038fffff1b97d97f3ab3f1570 8132256: jaxp: Investigate removal of com/sun/org/apache/bcel/internal/util/ClassPath.java Com/sun/org/apache/bcel/internal/util/ClassPath.java removed Reviewed-by: andrew Backport-of: 6e586e8a3b8807652218c4caf97ef501f42d7f36 ! jaxp/src/com/sun/org/apache/bcel/internal/Repository.java - jaxp/src/com/sun/org/apache/bcel/internal/util/ClassPath.java ! 
jaxp/src/com/sun/org/apache/bcel/internal/util/SyntheticRepository.java Changeset: f8a6695a Branch: master Author: Andrew John Hughes Date: 2022-03-29 15:18:21 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/f8a6695a4ca79d922d357a971af183deda41e799 8274658: ISO 4217 Amendment 170 Update Reviewed-by: sgehwolf Backport-of: f2404d60de2b58c590bf885f5cce50c289073673 ! jdk/src/share/classes/java/util/CurrencyData.properties ! jdk/src/share/classes/sun/util/resources/CurrencyNames.properties ! jdk/test/java/util/Currency/ValidateISO4217.java ! jdk/test/java/util/Currency/tablea1.txt ! jdk/test/sun/text/resources/LocaleData ! jdk/test/sun/text/resources/LocaleDataTest.java Changeset: e403fd51 Branch: master Author: Sergey Bylokhov Date: 2022-03-30 01:00:33 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/e403fd517ae3d76ec75ba3409c4478e467d8aa12 8274751: Drag And Drop hangs on Windows Backport-of: 7a0a6c95a53c6cb3340328d6543a97807320b740 ! jdk/src/windows/native/sun/windows/awt_DnDDS.cpp ! jdk/src/windows/native/sun/windows/awt_DnDDT.cpp ! jdk/src/windows/native/sun/windows/awt_Toolkit.cpp ! jdk/src/windows/native/sun/windows/awt_Toolkit.h Changeset: c5ca29fd Branch: master Author: Takakuri Committer: Andrew John Hughes Date: 2022-04-02 15:06:49 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/c5ca29fd4a582fd390595c1c50771dc02ba0f9ba 8255239: The timezone of the hs_err_pid log file is corrupted in Japanese locale Reviewed-by: andrew Backport-of: b46d73bee808af7891b699df30a5a6dec3f5139f ! hotspot/src/share/vm/runtime/os.cpp Changeset: 62bbb3e6 Branch: master Author: Dongbo He Committer: Fei Yang Date: 2022-04-06 12:15:42 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/62bbb3e6415d02173eaf443713f8ca3540c0126a 8281814: Debuginfo.diz contains redundant build path after backport JDK-8025936 8u backport of JDK-8035134 Reviewed-by: sgehwolf ! 
make/common/NativeCompilation.gmk Changeset: 10029f78 Branch: master Author: Andrew John Hughes Date: 2022-04-06 19:47:14 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/10029f784ef7be458a7b6ff3cc21649ff0abb6f3 8253424: Add support for running pre-submit testing using GitHub Actions 8253865: Pre-submit testing using GitHub Actions does not detect failures reliably 8254054: Pre-submit testing using GitHub Actions should not use the deprecated set-env command 8254173: Add Zero, Minimal hotspot targets to submit workflow 8254175: Build no-pch configuration in debug mode for submit checks 8254282: Add Linux x86_32 builds to submit workflow 8255373: Submit workflow artifact name is always "test-results_.zip" 8255895: Submit workflow artifacts miss hs_errs/replays due to ZIP include mismatch 8256127: Add cross-compiled foreign architectures builds to submit workflow 8256277: Github Action build on macOS should define OS and Xcode versions 8256354: Github Action build on Windows should define OS and MSVC versions 8256414: add optimized build to submit workflow 8256393: Github Actions build on Linux should define OS and GCC versions 8256747: GitHub Actions: decouple the hotspot build-only jobs from Linux x64 testing 8257056: Submit workflow should apt-get update to avoid package installation errors 8259924: GitHub actions fail on Linux x86_32 with "Could not configure libc6:i386" 8260460: GitHub actions still fail on Linux x86_32 with "Could not configure libc6:i386" 8263667: Avoid running GitHub actions on branches named pr/* 8255305: Add Linux x86_32 tier1 to submit workflow 8255352: Archive important test outputs in submit workflow 8282225: GHA: Allow one concurrent run per PR only Co-authored-by: Alex Kasko Co-authored-by: Zden?k ?ambersk? 
Reviewed-by: sgehwolf Backport-of: 1faefed218051c324bdb5c7c10369050d6c9dd44 + .github/workflows/freetype.vcxproj + .github/workflows/submit.yml + make/conf/test-dependencies Changeset: b2491599 Branch: master Author: Sergey Bylokhov Date: 2022-04-07 21:47:35 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/b2491599ca4de259682d140983ca645a70ea5723 8261107: ArrayIndexOutOfBoundsException in the ICC_Profile.getInstance(InputStream) Reviewed-by: phh Backport-of: 06b33a0ad78d1577711af22020cf5fdf25112523 ! jdk/src/share/classes/java/awt/color/ICC_Profile.java + jdk/test/java/awt/Color/ICC_Profile/GetInstanceBrokenStream.java Changeset: 3647d987 Branch: master Author: Alexey Pavlyutkin Committer: Yuri Nesterenko Date: 2022-04-08 06:57:33 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3647d987abad91eceef72900c71f8f4f55c8e92f 8190753: (zipfs): Accessing a large entry (> 2^31 bytes) leads to a negative initial size for ByteArrayOutputStream Reviewed-by: phh, andrew Backport-of: 8a9cda2d84513ab49a54e1d2a7b530f0bae05c61 ! jdk/src/share/demo/nio/zipfs/src/com/sun/nio/zipfs/ZipFileSystem.java + jdk/test/demo/zipfs/LargeCompressedEntrySizeTest.java + jdk/test/demo/zipfs/ZipFSOutputStreamTest.java Changeset: cc541e91 Branch: master Author: Andrew John Hughes Date: 2022-04-18 02:47:59 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/cc541e91bea75d05b20cc887a47ce936fb693f6f Added tag jdk8u332-b09 for changeset 37aca7715d13 ! .hgtags Changeset: 65c4a5d4 Branch: master Author: Andrew John Hughes Date: 2022-04-28 09:29:39 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/65c4a5d42e05135754774406389b903e2146758c Merge ! jdk/src/share/classes/sun/security/tools/keytool/Main.java ! jdk/src/solaris/classes/java/io/UnixFileSystem.java ! jdk/src/windows/classes/java/io/WinNTFileSystem.java ! jdk/src/share/classes/sun/security/tools/keytool/Main.java ! jdk/src/solaris/classes/java/io/UnixFileSystem.java ! 
jdk/src/windows/classes/java/io/WinNTFileSystem.java Changeset: 62defc3d Branch: master Author: Andrew John Hughes Date: 2022-04-28 14:41:43 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/62defc3dfc4b9ba5adfe3189f34fe8b3f59b94a0 8284772: 8u GHA: Use GCC Major Version Dependencies Only Reviewed-by: serb ! .github/workflows/submit.yml Changeset: 607b14e2 Branch: master Author: Sergey Bylokhov Date: 2022-04-29 23:13:00 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/607b14e2ba79668abfc1af8e96c47adf176e77af 8285445: cannot open file "NUL:" Backport-of: 03cbb48e6a1d806f204a39bbdbb4bc9be9e57a41 ! jdk/src/windows/classes/java/io/WinNTFileSystem.java + jdk/test/java/io/FileOutputStream/OpenNUL.java Changeset: d91ee59b Branch: master Author: Andrew John Hughes Date: 2022-04-22 16:45:54 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/d91ee59b3c8cd76b945b517336351f496ab3ff56 Added tag jdk8u332-ga for changeset 37aca7715d13 ! .hgtags Changeset: ee82a7d9 Branch: master Author: Andrew John Hughes Date: 2022-03-02 19:10:35 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/ee82a7d98f8d01a59a2140ae04c3a52959dcdf47 8282458: Update .jcheck/conf file for 8u move to git Reviewed-by: sgehwolf ! .jcheck/conf Changeset: 1bc3be25 Branch: master Author: Andrew John Hughes Date: 2022-05-02 01:17:47 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/1bc3be259a1367d0b671ee0e8a85e314d7d05637 Merge Changeset: 988c585a Branch: master Author: Andrew John Hughes Date: 2025-03-06 01:16:07 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/988c585a753cd6dd88a86237e72cb57623f61239 Merge jdk8u342-b01 ! .jcheck/conf ! 
.jcheck/conf From andrew at openjdk.org Thu Mar 20 01:07:04 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 20 Mar 2025 01:07:04 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u342-b01 for changeset 1bc3be25 Message-ID: Tagged by: Andrew John Hughes Date: 2022-05-02 02:45:06 +0000 Added tag jdk8u342-b01 for changeset 1bc3be259a Changeset: 1bc3be25 Author: Andrew John Hughes Date: 2022-05-02 01:17:47 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/1bc3be259a1367d0b671ee0e8a85e314d7d05637 From andrew at openjdk.org Thu Mar 20 01:07:08 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 20 Mar 2025 01:07:08 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u342-b01 for changeset 988c585a Message-ID: Tagged by: Andrew John Hughes Date: 2025-03-06 16:54:24 +0000 Added tag shenandoah8u342-b01 for changeset 988c585a753 Changeset: 988c585a Author: Andrew John Hughes Date: 2025-03-06 01:16:07 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/988c585a753cd6dd88a86237e72cb57623f61239 From andrew at openjdk.org Thu Mar 20 01:07:08 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 20 Mar 2025 01:07:08 GMT Subject: RFR: Merge jdk8u:master [v2] In-Reply-To: References: Message-ID: > Merge jdk8u342-b01 Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk8u/pull/15/files - new: https://git.openjdk.org/shenandoah-jdk8u/pull/15/files/988c585a..988c585a Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=15&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=15&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/15.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/15/head:pull/15 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/15 From iris at openjdk.org Thu Mar 20 01:07:09 2025 From: iris at openjdk.org (Iris Clark) Date: Thu, 20 Mar 2025 01:07:09 GMT Subject: Withdrawn: Merge jdk8u:master In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 15:35:57 GMT, Andrew John Hughes wrote: > Merge jdk8u342-b01 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/shenandoah-jdk8u/pull/15 From xpeng at openjdk.org Thu Mar 20 02:36:56 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 02:36:56 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v4] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. 
Concurrent young cycles following a Full GC:
>
> At the end of ShenandoahFullGC, bitmaps are reset for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>
>
> ShenandoahVerifier
> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>   shenandoah_assert_generations_reconciled();
>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>     return _heap->complete_marking_context();
>   }
>   return nullptr;
> }
>
>
> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>
> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for global GC, but the marking bitmaps are already reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>
> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is masked by issue 2.
>
>
> ### Test
> - [x] `make test TEST=hotspot_gc_shenandoah`

Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:

  remembered set can't be verified w/o complete old marking or parsable old generation.
-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24092/files
  - new: https://git.openjdk.org/jdk/pull/24092/files/4726b876..f16dd729

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=02-03

  Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24092.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092

PR: https://git.openjdk.org/jdk/pull/24092

From xpeng at openjdk.org Thu Mar 20 02:45:34 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 20 Mar 2025 02:45:34 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v5]
In-Reply-To:
References:
Message-ID:

> There are some scenarios in which GenShen may have improper remembered set verification logic:
>
> 1. Concurrent young cycles following a Full GC:
>
> At the end of ShenandoahFullGC, bitmaps are reset for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>
>
> ShenandoahVerifier
> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>   shenandoah_assert_generations_reconciled();
>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>     return _heap->complete_marking_context();
>   }
>   return nullptr;
> }
>
>
> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>
> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for global GC, but the marking bitmaps are already reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>
> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is masked by issue 2.
>
>
> ### Test
> - [x] `make test TEST=hotspot_gc_shenandoah`

Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits:

 - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8345399-v3
 - remembered set can't be verified w/o complete old marking or parsable old generation.
 - Merge branch 'openjdk:master' into JDK-8345399-v3
 - Address review comments
 - Clean and rebuild rem-set in global gc
 - Set mark incomplete after ShenandoahMCResetCompleteBitmapTask
 - Only clean rem-set read table in young gc; not verify rem-set in concurrent global GC in generational mode
 - Always swap card table in generational mode so the table can be properly rebuilt through marking.
 - Initial works

-------------

Changes: https://git.openjdk.org/jdk/pull/24092/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=04
  Stats: 49 lines in 4 files changed: 24 ins; 16 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/24092.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092

PR: https://git.openjdk.org/jdk/pull/24092

From xpeng at openjdk.org Thu Mar 20 03:48:49 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 20 Mar 2025 03:48:49 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v6]
In-Reply-To:
References:
Message-ID:

> There are some scenarios in which GenShen may have improper remembered set verification logic:
>
> 1. Concurrent young cycles following a Full GC:
>
> At the end of ShenandoahFullGC, bitmaps are reset for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>
>
> ShenandoahVerifier
> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>   shenandoah_assert_generations_reconciled();
>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>     return _heap->complete_marking_context();
>   }
>   return nullptr;
> }
>
>
> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>
> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for global GC, but the marking bitmaps are already reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>
> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is masked by issue 2.
>
>
> ### Test
> - [x] `make test TEST=hotspot_gc_shenandoah`

Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:

  Fix test failure

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24092/files
  - new: https://git.openjdk.org/jdk/pull/24092/files/6c420c4a..3947f36d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=04-05

  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24092.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092

PR: https://git.openjdk.org/jdk/pull/24092

From xpeng at openjdk.org Thu Mar 20 07:24:42 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 20 Mar 2025 07:24:42 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v7]
In-Reply-To:
References:
Message-ID: <_DhsSyxboYzJQHLs_pzwb-IPixh2jdXkxxO6p36Z-n8=.66db88ee-f15f-47cc-9eae-3579e81af6b6@github.com>

> There are some scenarios in which GenShen may have improper remembered set verification logic:
>
> 1. Concurrent young cycles following a Full GC:
>
> At the end of ShenandoahFullGC, bitmaps are reset for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>
>
> ShenandoahVerifier
> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>   shenandoah_assert_generations_reconciled();
>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>     return _heap->complete_marking_context();
>   }
>   return nullptr;
> }
>
>
> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>
> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for global GC, but the marking bitmaps are already reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>
> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is masked by issue 2.
>
>
> ### Test
> - [x] `make test TEST=hotspot_gc_shenandoah`

Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:

  set old gen parsable to false when complete mixed evacuations

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24092/files
  - new: https://git.openjdk.org/jdk/pull/24092/files/3947f36d..f0e8d694

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=05-06

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24092.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092

PR: https://git.openjdk.org/jdk/pull/24092

From xpeng at openjdk.org Thu Mar 20 08:47:46 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 20 Mar 2025 08:47:46 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v8]
In-Reply-To:
References:
Message-ID:

> There are some scenarios in which GenShen may have improper remembered set verification logic:
>
> 1. Concurrent young cycles following a Full GC:
>
> At the end of ShenandoahFullGC, bitmaps are reset for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>
>
> ShenandoahVerifier
> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>   shenandoah_assert_generations_reconciled();
>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>     return _heap->complete_marking_context();
>   }
>   return nullptr;
> }
>
>
> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>
> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for global GC, but the marking bitmaps are already reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>
> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is masked by issue 2.
>
>
> ### Test
> - [x] `make test TEST=hotspot_gc_shenandoah`

Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision:

 - Revert "set old gen parsable to false when complete mixed evacuations"

   This reverts commit f0e8d694f58b0b2a513b3ff3206a9eea1c998868.
- Not verify rem-set before init-mark in global gc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/f0e8d694..e3327245 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=06-07 Stats: 12 lines in 3 files changed: 0 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Thu Mar 20 09:20:31 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 09:20:31 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v9] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. 
ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Use read table for verify_rem_set_before_mark ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/e3327245..2b538632 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=07-08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From wkemper at openjdk.org Thu Mar 20 17:06:53 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 20 Mar 2025 17:06:53 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode Message-ID: Not a clean backport. # Testing GHA, Dacapo, Extremem, Hyperalloc, Specjbb2015, Specjvm2008, Diluvian (with and without stress flags). 
------------- Commit messages: - 8336685: Shenandoah: Remove experimental incremental update mode Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/160/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=160&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336685 Stats: 1709 lines in 70 files changed: 4 ins; 1668 del; 37 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/160.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/160/head:pull/160 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/160 From xpeng at openjdk.org Thu Mar 20 20:02:52 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 20:02:52 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v10] In-Reply-To: References: Message-ID: <21_ZHL-2drP3fW6JDDSIk0dEiZ0VzXxJht0ayo_0Vco=.f8205d1e-ebdb-4ec5-a9cc-e9b6982499b8@github.com> > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. 
For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision: - Not verify remembered set w/o parseable old gen when old mark is incomplete - Use decode_raw instead of decode_not_null ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/2b538632..feceef3e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=08-09 Stats: 25 lines in 2 files changed: 18 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Thu Mar 20 20:06:58 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 20:06:58 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v11] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. 
Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. 
> > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Fix wrong comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/feceef3e..39707d15 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=09-10 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From wkemper at openjdk.org Thu Mar 20 20:20:10 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 20 Mar 2025 20:20:10 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v11] In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 20:06:58 GMT, Xiaolong Peng wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. 
For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. >> >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Fix wrong comments Couple of nits. src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 711: > 709: } > 710: > 711: if (ShenandoahVerify && heap->mode()->is_generational()) { Are we calling `verify_before_concmark` twice in generational mode? Should we delete this second call here? src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1377: > 1375: log_debug(gc)("Verifying remembered set at %s mark", old_generation->is_doing_mixed_evacuations() ? "mixed" : "young"); > 1376: > 1377: ShenandoahWriteTableScanner scanner(ShenandoahGenerationalHeap::heap()->old_generation()->card_scan()); Can use existing local variable: `old_generation`? ------------- Changes requested by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2703995429 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006389044 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006383550 From ysr at openjdk.org Thu Mar 20 22:21:10 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 20 Mar 2025 22:21:10 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v11] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> Message-ID: <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com>

On Tue, 18 Mar 2025 23:48:53 GMT, Kelvin Nilsen wrote:

>> Thanks for the explanation. I have been reading and trying to understand how the remembered set works in GenShen, and I wasn't sure whether this is actually right.
>>
>> In generational mode, if the GC cycle is global, the read table is already cleaned during the reset phase, so remembered set verification from `verify_before_concmark` and `verify_before_update_refs` shouldn't work properly. I think the remembered set verification before mark and before update references should be disabled; what do you think? Meanwhile, there is no need to clean the read table during a global cycle in generational mode.
>
> Ok. So we will always swap card tables, but we'll do it after verify-before-mark. To clarify the intention: after we swap card tables, the write table is all clean, and the read table holds whatever had been gathered prior to the start of GC. Young and bootstrap collections will update the write card table as a side effect of remembered set scanning. Global collection will update the card table as a side effect of global marking of old objects.

I'd leave a comment to this effect (along the lines of Kelvin's last comment) here. Did we measure the impact of this change on performance? In particular, it would seem that the number of dirty old cards might now be lower after a global GC than it was before this change.

Ideally, this would be a change that would go in on its own. (There is no impact on correctness, since in the absence of this change, the dirty card set is an over-approximation.)
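
For readers following the card table discussion, the invariant Kelvin describes can be pictured with a small standalone model. This is only a sketch under stated assumptions: `CardTablePair`, `Card`, and every method name below are hypothetical and are not the Shenandoah classes, and the real collector cleans and swaps its tables in separate phases rather than in one call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a read/write card table pair.
enum class Card : std::uint8_t { Clean, Dirty };

class CardTablePair {
public:
  explicit CardTablePair(std::size_t cards)
    : _read(cards, Card::Clean), _write(cards, Card::Clean) {}

  // Record a possible old-to-young pointer in the write table.
  void mark_dirty(std::size_t index) {
    assert(index < _write.size());
    _write[index] = Card::Dirty;
  }

  // Assumed model of the swap at the start of a cycle: what was gathered
  // so far becomes the read table, and the write table starts out clean.
  void swap_for_new_cycle() {
    std::swap(_read, _write);
    std::fill(_write.begin(), _write.end(), Card::Clean);
  }

  bool read_is_dirty(std::size_t index) const { return _read.at(index) == Card::Dirty; }
  bool write_is_dirty(std::size_t index) const { return _write.at(index) == Card::Dirty; }

  // Number of dirty cards a scan of the read table would have to visit.
  std::size_t dirty_in_read() const {
    return static_cast<std::size_t>(std::count(_read.begin(), _read.end(), Card::Dirty));
  }

private:
  std::vector<Card> _read;
  std::vector<Card> _write;
};
```

In this model, cards dirtied before the swap are visible only through the read table afterwards, which is why a verification step that needs the pre-cycle remembered set has to consult the read table once the swap has happened.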
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006539947 From xpeng at openjdk.org Thu Mar 20 22:39:24 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 22:39:24 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v12] In-Reply-To: References: Message-ID: <6tFCTl-s2bUS0Tu3oqK0kMPx45J1JEru-tf0Ec0WMZc=.35a8b864-df0e-4adb-ab7c-f511ffa07e0b@github.com> > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. 
> > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision: - Not validate remembered set w/o complete old marking - Use decode_raw_not_null instead of decode_raw ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/39707d15..73e95e8a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=10-11 Stats: 25 lines in 2 files changed: 1 ins; 14 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Thu Mar 20 22:48:24 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 22:48:24 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. > > 2. 
For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. > > 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. > > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: tide up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/73e95e8a..16494d48 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Thu Mar 20 23:20:09 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 23:20:09 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v11] In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 20:16:14 GMT, William Kemper wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix wrong comments > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 711: > >> 709: } >> 710: >> 711: if (ShenandoahVerify && heap->mode()->is_generational()) { > > Are we calling `verify_before_concmark` twice in generational mode? Should we delete this second call here? Thanks, the condition here is wrong. 
I have updated the code to verify only after swapping the read/write tables.

> src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1377:
>
>> 1375: log_debug(gc)("Verifying remembered set at %s mark", old_generation->is_doing_mixed_evacuations() ? "mixed" : "young");
>> 1376:
>> 1377: ShenandoahWriteTableScanner scanner(ShenandoahGenerationalHeap::heap()->old_generation()->card_scan());
>
> Can use existing local variable: `old_generation`?

Fixed.

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006592728 PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006590066 From xpeng at openjdk.org Thu Mar 20 23:24:10 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 20 Mar 2025 23:24:10 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> Message-ID:

On Thu, 20 Mar 2025 22:18:43 GMT, Y. Srinivas Ramakrishna wrote:

>> Ok. So we will always swap card tables, but we'll do it after verify-before-mark. To clarify the intention, after we swap card table, the write-table is all clean, and the read table holds whatever had been gathered prior to the start of GC. Young and bootstrap collection will update the write card table as a side effect of remembered set scanning. Global collection will update the card table as a side effect of global marking of old objects.
>
> I'd leave a comment to this effect (along the lines of Kelvin's last comment) here. Did we measure the impact of this change on performance? In particular it would seem that the number of dirty old cards might now reduce after a global gc compared to before this change.
> > Ideally, this would be a change that would go in on its own. (There is no impact on correctness, since in the absence of this change, the dirty card set is an over-approximation.) It is a bit hard to measure the impact on performance I think, but given the rem-set is more accurate, there shouldn't be any performance regression. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006595141 From xpeng at openjdk.org Fri Mar 21 00:25:14 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 00:25:14 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> Message-ID: <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> On Thu, 20 Mar 2025 23:21:37 GMT, Xiaolong Peng wrote: >> I'd leave a comment to this effect (along the lines of Kelvin's last comment) here. Did we measure the impact of this change on performance? In particular it would seem that the number of dirty old cards might now reduce after a global gc compared to before this change. >> >> Ideally, this would be a change that would go in on its own. (There is no impact on correctness, since in the absence of this change, the dirty card set is an over-approximation.) > > It is a bit hard to measure the impact on performance I think, but given the rem-set is more accurate, there shouldn't be any performance regression. I'll add comment here as you are suggesting. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006650222 From ysr at openjdk.org Fri Mar 21 04:03:17 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Fri, 21 Mar 2025 04:03:17 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> Message-ID: On Fri, 21 Mar 2025 00:22:46 GMT, Xiaolong Peng wrote: >> It is a bit hard to measure the impact on performance I think, but given the rem-set is more accurate, there shouldn't be any performance regression. > > I'll add comment here as you are suggesting. I was suggesting looking to see if normal perf measures showed any improvements. E.g. if you ran say SPECjbb and compared the remset scan times for the minor GC's that followed global collections. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2006806542 From xpeng at openjdk.org Fri Mar 21 15:11:19 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 15:11:19 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> Message-ID: On Fri, 21 Mar 2025 04:00:17 GMT, Y. Srinivas Ramakrishna wrote: >> I'll add comment here as you are suggesting. > > I was suggesting looking to see if normal perf measures showed any improvements. E.g. if you ran say SPECjbb and compared the remset scan times for the minor GC's that followed global collections. 
I ran the h2 benchmark. Here are the remembered set scan times after a global GC; the change does seem to improve remembered set scan time in this case:

PR version:

[2025-03-21T07:35:41.801+0000][10.292s][19715][info ][gc ] GC(6) Concurrent remembered set scanning 13.069ms
[2025-03-21T07:35:48.088+0000][16.579s][19715][info ][gc ] GC(9) Concurrent remembered set scanning 5.537ms
[2025-03-21T07:35:56.610+0000][25.101s][19715][info ][gc ] GC(14) Concurrent remembered set scanning 6.186ms
[2025-03-21T07:36:03.967+0000][32.459s][19715][info ][gc ] GC(18) Concurrent remembered set scanning 9.562ms
[2025-03-21T07:36:11.234+0000][39.725s][19715][info ][gc ] GC(22) Concurrent remembered set scanning 2.591ms
[2025-03-21T07:36:17.303+0000][45.794s][19715][info ][gc ] GC(25) Concurrent remembered set scanning 0.999ms
[2025-03-21T07:36:25.647+0000][54.139s][19715][info ][gc ] GC(30) Concurrent remembered set scanning 1.665ms
[2025-03-21T07:36:32.790+0000][61.281s][19715][info ][gc ] GC(33) Concurrent remembered set scanning 2.851ms
[2025-03-21T07:36:40.241+0000][68.732s][19715][info ][gc ] GC(36) Concurrent remembered set scanning 0.716ms
[2025-03-21T07:36:47.440+0000][75.931s][19715][info ][gc ] GC(39) Concurrent remembered set scanning 1.932ms

master:

[2025-03-21T07:34:04.978+0000][10.765s][17923][info ][gc ] GC(6) Concurrent remembered set scanning 22.813ms
[2025-03-21T07:34:11.250+0000][17.038s][17923][info ][gc ] GC(9) Concurrent remembered set scanning 14.457ms
[2025-03-21T07:34:18.692+0000][24.480s][17923][info ][gc ] GC(14) Concurrent remembered set scanning 4.972ms
[2025-03-21T07:34:26.033+0000][31.820s][17923][info ][gc ] GC(18) Concurrent remembered set scanning 9.134ms
[2025-03-21T07:34:34.416+0000][40.203s][17923][info ][gc ] GC(22) Concurrent remembered set scanning 3.655ms
[2025-03-21T07:34:42.180+0000][47.967s][17923][info ][gc ] GC(26) Concurrent remembered set scanning 3.253ms
[2025-03-21T07:34:49.371+0000][55.168s][17923][info ][gc ] GC(29) Concurrent remembered set scanning 1.615ms
[2025-03-21T07:34:56.592+0000][62.396s][17923][info ][gc ] GC(32) Concurrent remembered set scanning 1.570ms
[2025-03-21T07:35:03.766+0000][69.575s][17923][info ][gc ] GC(35) Concurrent remembered set scanning 1.040ms
[2025-03-21T07:35:10.941+0000][76.753s][17923][info ][gc ] GC(38) Concurrent remembered set scanning 1.947ms

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2007788818 From rkennke at openjdk.org Fri Mar 21 15:56:21 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 21 Mar 2025 15:56:21 GMT Subject: RFR: 8352091: GenShen: assert(!(request.generation->is_old() && _heap->old_generation()->is_doing_mixed_evacuations())) failed: Old heuristic should not request cycles while it waits for mixed evacuation In-Reply-To: References: Message-ID:

On Fri, 14 Mar 2025 23:45:28 GMT, William Kemper wrote:

> Consider the following:
> 1. Regulator thread sees that control thread is `idle` and requests an old cycle
> 2. Regulator thread waits until control thread is not `idle`
> 3. Control thread starts old cycle and notifies the Regulator thread (as expected)
> 4. Regulator thread stays off CPU for a _long_ time
> 5. Control thread _completes_ old marking and returns to `idle` state
> 6. Regulator thread finally wakes up and sees that Control thread is _still_ idle
> 7. In fact, the control thread has completed old marking and the regulator thread should not request another cycle

Looks reasonable.

------------- Marked as reviewed by rkennke (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24069#pullrequestreview-2706448918 From wkemper at openjdk.org Fri Mar 21 16:07:20 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 21 Mar 2025 16:07:20 GMT Subject: Integrated: 8352091: GenShen: assert(!(request.generation->is_old() && _heap->old_generation()->is_doing_mixed_evacuations())) failed: Old heuristic should not request cycles while it waits for mixed evacuation In-Reply-To: References: Message-ID: <1XBrk_Rxi-fGi6YyFJ2xvw9Gaq-5y3pcVidNHeeTDgE=.927d9df1-8788-40fe-b92d-6457da3e3ca1@github.com> On Fri, 14 Mar 2025 23:45:28 GMT, William Kemper wrote: > Consider the following: > 1. Regulator thread sees that control thread is `idle` and requests an old cycle > 2. Regulator thread waits until control thread is not `idle` > 3. Control thread starts old cycle and notifies the Regulator thread (as expected) > 4. Regulator thread stays off CPU for a _long_ time > 5. Control thread _completes_ old marking and returns to `idle` state > 6. Regulator thread finally wakes up and sees that Control thread is _still_ idle > 7. In fact, the control thread has completed old marking and the regulator thread should not request another cycle This pull request has now been integrated. 
Changeset: 52c6ce6c Author: William Kemper URL: https://git.openjdk.org/jdk/commit/52c6ce6c73194762970fd9521121333713495fa3 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8352091: GenShen: assert(!(request.generation->is_old() && _heap->old_generation()->is_doing_mixed_evacuations())) failed: Old heuristic should not request cycles while it waits for mixed evacuation Reviewed-by: rkennke ------------- PR: https://git.openjdk.org/jdk/pull/24069 From xpeng at openjdk.org Fri Mar 21 22:09:32 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 22:09:32 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId In-Reply-To: References: Message-ID: <8wQiOKS3dt30v5KKmBI-YFk0KRsFSdakjaJfpvHU8ow=.9702993f-d73f-4c49-ba7c-57c9dcae87d0@github.com> On Fri, 21 Mar 2025 19:09:46 GMT, Xiaolong Peng wrote: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in JFR. 
> > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. > > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [ ] TEST=hotspot_gc_shenandoah src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: > 135: // GC is starting, bump the internal gc count and set GCIdMark > 136: update_gc_count(); > 137: GCIdMark gc_id_mark(static_cast(get_gc_id())); static cast from size_t to uint here since GCIdMark use uint. If needed, I can change the data type of gc id and count in ShenandoahController to uint, but need to update more files. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2008389233 From xpeng at openjdk.org Fri Mar 21 22:09:32 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 22:09:32 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId Message-ID: ### Root cause Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. 
Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in JFR. ### Solution it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. ### Test - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" - [ ] TEST=hotspot_gc_shenandoah ------------- Commit messages: - Add static cast from size_t to uint - Rename _gc_id of ShenandoahController to _gc_count, and gc id will be derived from _gc_count - tide up - gc_id should start from 0 - GenShen: Enabling JFR asserts when getting GCId Changes: https://git.openjdk.org/jdk/pull/24166/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352588 Stats: 37 lines in 4 files changed: 11 ins; 7 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From wkemper at openjdk.org Fri Mar 21 22:31:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 21 Mar 2025 22:31:07 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId In-Reply-To: References: Message-ID: On Fri, 21 Mar 2025 19:09:46 GMT, Xiaolong Peng wrote: > ### Root cause > Shenandoah has its own way to generate gc 
id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. > > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah Changes requested by wkemper (Reviewer). src/hotspot/share/gc/shenandoah/shenandoahController.cpp line 50: > 48: } > 49: > 50: size_t ShenandoahController::get_gc_id() { Do we need to keep this method? Can't everything just use `get_gc_count` now? 
------------- PR Review: https://git.openjdk.org/jdk/pull/24166#pullrequestreview-2707440082 PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2008418183 From wkemper at openjdk.org Fri Mar 21 22:31:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 21 Mar 2025 22:31:08 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId In-Reply-To: <8wQiOKS3dt30v5KKmBI-YFk0KRsFSdakjaJfpvHU8ow=.9702993f-d73f-4c49-ba7c-57c9dcae87d0@github.com> References: <8wQiOKS3dt30v5KKmBI-YFk0KRsFSdakjaJfpvHU8ow=.9702993f-d73f-4c49-ba7c-57c9dcae87d0@github.com> Message-ID: On Fri, 21 Mar 2025 22:01:40 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
>> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah > > src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: > >> 135: // GC is starting, bump the internal gc count and set GCIdMark >> 136: update_gc_count(); >> 137: GCIdMark gc_id_mark(static_cast<uint>(get_gc_id())); > > static cast from size_t to uint here since GCIdMark use uint. > If needed, I can change the data type of gc id and count in ShenandoahController to uint, but need to update more files. `static_cast` is fine, but `checked_cast` would be better. 
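For context on the suggestion above: a plain `static_cast` from `size_t` to `uint` truncates silently when the value does not fit, whereas a checked narrowing conversion asserts that the value survives the round trip. The following is only a minimal sketch of that idea. `checked_narrow` is a hypothetical name chosen for illustration; it is not the HotSpot `checked_cast` implementation, and the assertion here uses the standard `assert` macro rather than HotSpot's own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (illustration only): narrow `value` to type To and
// assert, in debug builds, that converting back yields the original value.
template <typename To, typename From>
inline To checked_narrow(From value) {
  const To result = static_cast<To>(value);
  // If information was lost, the round trip does not reproduce `value`.
  assert(static_cast<From>(result) == value && "narrowing lost information");
  return result;
}

// Example use, mirroring the size_t -> unsigned conversion discussed above.
inline unsigned gc_count_as_id(std::size_t gc_count) {
  return checked_narrow<unsigned>(gc_count);
}
```

With `NDEBUG` defined the check compiles away, so a helper like this costs nothing in release builds while catching overflow in debug builds.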
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2008416303 From xpeng at openjdk.org Fri Mar 21 23:13:07 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 23:13:07 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId In-Reply-To: References: Message-ID: On Fri, 21 Mar 2025 22:27:52 GMT, William Kemper wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
>> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah > > src/hotspot/share/gc/shenandoah/shenandoahController.cpp line 50: > >> 48: } >> 49: >> 50: size_t ShenandoahController::get_gc_id() { > > Do we need to keep this method? Can't everything just use `get_gc_count` now? We don't have to keep it, though removing it needs a few more changes. I think it is better to remove it to avoid confusion with the gc_id() method. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2008461020 From xpeng at openjdk.org Fri Mar 21 23:26:46 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 23:26:46 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v2] In-Reply-To: References: Message-ID: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
> > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. > > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Remove get_gc_id() method ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24166/files - new: https://git.openjdk.org/jdk/pull/24166/files/13cea142..b091159e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=00-01 Stats: 32 lines in 7 files changed: 0 ins; 8 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From xpeng at openjdk.org Fri Mar 21 23:26:46 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 23:26:46 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v2] In-Reply-To: References: Message-ID: On Fri, 21 Mar 2025 23:10:48 GMT, Xiaolong Peng wrote: >> src/hotspot/share/gc/shenandoah/shenandoahController.cpp line 50: >> >>> 48: } >>> 49: >>> 50: size_t ShenandoahController::get_gc_id() { >> >> Do we need to keep this method? Can't everything just use `get_gc_count` now? > > We don't have to keep it, it needs a bit more changes to touch up. 
I think it better to remove it to avoid the confusion with the gc_id() method, I have removed method ShenandoahController::get_gc_id() in the update. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2008466559 From xpeng at openjdk.org Fri Mar 21 23:29:22 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 21 Mar 2025 23:29:22 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v3] In-Reply-To: References: Message-ID: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
> > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: touch up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24166/files - new: https://git.openjdk.org/jdk/pull/24166/files/b091159e..1920cf09 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From andrew at openjdk.org Sat Mar 22 00:24:29 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Sat, 22 Mar 2025 00:24:29 GMT Subject: RFR: Merge jdk8u:master Message-ID: Merge jdk8u342-b02 ------------- Commit messages: - Merge jdk8u342-b02 - 8221988: add possibility to build with Visual Studio 2019 - 8170530: bash configure output contains a typo in a suggested library name - 8235211: serviceability/attach/RemovingUnixDomainSocketTest.java fails with AttachNotSupportedException: Unable to open socket file The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/16/files Stats: 300 lines in 9 files changed: 260 ins; 2 del; 38 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/16.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/16/head:pull/16 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/16 From andrew at openjdk.org Sat Mar 22 02:34:29 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Sat, 22 Mar 2025 02:34:29 GMT Subject: RFR: Merge jdk8u:master In-Reply-To: References: Message-ID: <5kY4fxqZWaHHH5T-2MzlaKk5oT6cHwORQj-4xm8c-94=.f9592a4b-7925-4efd-97ae-5a1935e028f4@github.com> On Sat, 22 Mar 2025 00:20:37 GMT, Andrew John Hughes wrote: > Merge jdk8u342-b02 GHA builds will not work until [JDK-8284622](https://bugs.openjdk.org/browse/JDK-8284622) is merged in 8u362-b03 ------------- PR Comment: https://git.openjdk.org/shenandoah-jdk8u/pull/16#issuecomment-2744906569 From kdnilsen at openjdk.org Mon Mar 24 14:41:43 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 24 Mar 2025 14:41:43 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 16:57:18 GMT, William Kemper wrote: > Not a clean backport. > > # Testing > GHA, Dacapo, Extremem, Hyperalloc, Specjbb2015, Specjvm2008, Diluvian (with and without stress flags). Lots of code here. Thanks for sorting through the conflicts. ------------- Marked as reviewed by kdnilsen (Committer). 
PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/160#pullrequestreview-2710631398 From xpeng at openjdk.org Mon Mar 24 15:18:25 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 15:18:25 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: References: Message-ID: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
> > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah > - [x] GHA Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: No need to calculate gc_id using gc_count ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24166/files - new: https://git.openjdk.org/jdk/pull/24166/files/1920cf09..4c8c8136 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=02-03 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From adinn at openjdk.org Mon Mar 24 15:27:14 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 24 Mar 2025 15:27:14 GMT Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v4] In-Reply-To: References: <2jI87up85vKeQq7xy6WoI987MOuqTqA6I8G75VvC74g=.e8ef9f9c-b8b3-496d-9b48-28c83dc1fb64@github.com> Message-ID: On Fri, 28 Feb 2025 20:36:23 GMT, Dean Long wrote: >> Refreshing my memory, isn't the real problem with trying to fix this with a minimum codecache size is that some of these stubs are not allocated during initial single-threaded JVM startup, but later when the first compiler threads start, and that allows other code blobs to fill up the codecache? > >> Even so, it might be a good idea to additionally increase the minimum code cache anyway. @dean-long do you think it would make sense to file an RFE for that? > > Sure, if it's still an issue. @dean-long You are right that the root of the problem is delayed stub init when running in tiered mode. It's actually a race between the C1 and C2 threads to consume code cache space. 
Neither C1 nor C2 can generate compiled method code before they have initialized their respective stubs. However, initialization of C2 stubs may potentially occur after a C1 compiler thread has initialized its own stubs and generated some number of C1-compiled methods. Clearly that can't happen the other way around because in tiered mode a C2 thread will only end up compiling a method after 1) a prior C1-compiled method exists in the runtime and 2) its own stubs have been generated. An implication of 1) is that the C1 stubs have already been generated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23630#issuecomment-2748518267 From dlong at openjdk.org Mon Mar 24 15:30:20 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 24 Mar 2025 15:30:20 GMT Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v5] In-Reply-To: References: Message-ID: On Mon, 3 Mar 2025 12:54:26 GMT, Damon Fenacci wrote: >> # Issue >> The test `src/hotspot/share/opto/c2compiler.cpp` fails intermittently due to a crash that happens when trying to allocate code cache space for C1 and C2 in `RuntimeStub::new_runtime_stub` and `SingletonBlob::operator new`. >> >> # Causes >> There are a few call paths during the initialization of C1 and C2 that can lead to the code cache allocations in `RuntimeStub::new_runtime_stub` (through `RuntimeStub::operator new`) and `SingletonBlob::operator new` triggering a fatal error if there is no more space. The paths in question are: >> 1. `Compiler::init_c1_runtime` -> `Runtime1::initialize` -> `Runtime1::generate_blob_for` -> `Runtime1::generate_blob` -> `RuntimeStub::new_runtime_stub` >> 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_stub` -> `Compile::Compile` -> `Compile::Code_Gen` -> `PhaseOutput::install` -> `PhaseOutput::install_stub` -> `RuntimeStub::new_runtime_stub` >> 1. 
`C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_uncommon_trap_blob` -> `UncommonTrapBlob::create` -> `new UncommonTrapBlob` >> 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_exception_blob` -> `ExceptionBlob::create` -> `new ExceptionBlob` >> >> # Solution >> Instead of fatally crashing the we can use the `alloc_fail_is_fatal` flag of `RuntimeStub::new_runtime_stub` to avoid crashing in cases 1 and 2 and add a similar flag to `SingletonBlob::operator new` for cases 3 and 4. In the latter case we need to adjust all calls accordingly. >> >> Note: In [JDK-8326615](https://bugs.openjdk.org/browse/JDK-8326615) it was argued that increasing the minimum code cache size would solve the issue but that wasn't entirely accurate: doing so possibly decreases the chances of a failed allocation in these 4 places but doesn't totally avoid it. >> >> # Testing >> The original failing regression test in `test/hotspot/jtreg/compiler/startup/StartupOutput.java` has been modified to run multiple times with randomized values (within the original failing range) to increase the chances of hitting the fatal assertion. >> >> Tests: Tier 1-4 (windows-x64, linux-x64/aarch64, and macosx-x64/aarch64; release and debug mode) > > Damon Fenacci has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8347406: move assert into else clause Looks good! ------------- Marked as reviewed by dlong (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23630#pullrequestreview-2710792841 From adinn at openjdk.org Mon Mar 24 16:02:33 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 24 Mar 2025 16:02:33 GMT Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v5] In-Reply-To: References: Message-ID: <4f76zEvW4-bub4zcI8vb3DizrE2NLmxDHrYMN9uoXlo=.b39a0d20-da8c-4e31-8f18-79bf4fa95b2a@github.com> On Mon, 3 Mar 2025 12:54:26 GMT, Damon Fenacci wrote: >> # Issue >> The test `src/hotspot/share/opto/c2compiler.cpp` fails intermittently due to a crash that happens when trying to allocate code cache space for C1 and C2 in `RuntimeStub::new_runtime_stub` and `SingletonBlob::operator new`. >> >> # Causes >> There are a few call paths during the initialization of C1 and C2 that can lead to the code cache allocations in `RuntimeStub::new_runtime_stub` (through `RuntimeStub::operator new`) and `SingletonBlob::operator new` triggering a fatal error if there is no more space. The paths in question are: >> 1. `Compiler::init_c1_runtime` -> `Runtime1::initialize` -> `Runtime1::generate_blob_for` -> `Runtime1::generate_blob` -> `RuntimeStub::new_runtime_stub` >> 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_stub` -> `Compile::Compile` -> `Compile::Code_Gen` -> `PhaseOutput::install` -> `PhaseOutput::install_stub` -> `RuntimeStub::new_runtime_stub` >> 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_uncommon_trap_blob` -> `UncommonTrapBlob::create` -> `new UncommonTrapBlob` >> 1. 
`C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_exception_blob` -> `ExceptionBlob::create` -> `new ExceptionBlob` >> >> # Solution >> Instead of fatally crashing the we can use the `alloc_fail_is_fatal` flag of `RuntimeStub::new_runtime_stub` to avoid crashing in cases 1 and 2 and add a similar flag to `SingletonBlob::operator new` for cases 3 and 4. In the latter case we need to adjust all calls accordingly. >> >> Note: In [JDK-8326615](https://bugs.openjdk.org/browse/JDK-8326615) it was argued that increasing the minimum code cache size would solve the issue but that wasn't entirely accurate: doing so possibly decreases the chances of a failed allocation in these 4 places but doesn't totally avoid it. >> >> # Testing >> The original failing regression test in `test/hotspot/jtreg/compiler/startup/StartupOutput.java` has been modified to run multiple times with randomized values (within the original failing range) to increase the chances of hitting the fatal assertion. >> >> Tests: Tier 1-4 (windows-x64, linux-x64/aarch64, and macosx-x64/aarch64; release and debug mode) > > Damon Fenacci has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8347406: move assert into else clause Changes look good to me too. ------------- Marked as reviewed by adinn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23630#pullrequestreview-2710904030 From kdnilsen at openjdk.org Mon Mar 24 18:19:09 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 24 Mar 2025 18:19:09 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v2] In-Reply-To: <7PFHErLXXCsFeCjx55B_u8JisUcDGX9VFLa5azzsCso=.92f7d81d-8989-4aff-b57e-d2128403e01f@github.com> References: <7PFHErLXXCsFeCjx55B_u8JisUcDGX9VFLa5azzsCso=.92f7d81d-8989-4aff-b57e-d2128403e01f@github.com> Message-ID: On Tue, 18 Mar 2025 22:58:12 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahSharedVariables.hpp line 243: >> >>> 241: assert (new_value < (sizeof(ShenandoahSharedValue) * CHAR_MAX), "sanity"); >>> 242: // Hmm, no platform template specialization defined for exchanging one byte... (up cast to intptr is workaround). >>> 243: return (T)Atomic::xchg((intptr_t*)&value, (intptr_t)new_value); >> >> That... likely gets awkward on different endianness. See the complicated dance `Atomic::CmpxchgByteUsingInt` has to do to handle it. >> >> Not to mention we are likely writing to adjacent memory location. Which is _currently_ innocuous, since we hit padding, but it is not very reliable. > > `PlatformCmpxchg` has specializations on aarch64 and x86 for `sizeof(T) == 1`. Should we also add platform specializations for `PlatformXchg` for `sizeof(T) == 1`? (It has them for `4` and `8`). Could also do what `XchgUsingCmpxchg` does... Maybe it is easiest/safest to change declaration of value to intptr_t. 
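The alternative mentioned in this thread, emulating a one-byte exchange with compare-and-exchange instead of up-casting the pointer to `intptr_t*`, can be sketched as below. This is an illustration under stated assumptions: it uses `std::atomic` from the C++ standard library rather than HotSpot's `Atomic` API, and `xchg_via_cmpxchg` is a hypothetical name. The point is that the loop only ever touches the single byte, so neither endianness nor adjacent memory comes into play.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (illustration only): atomically store new_value into
// target and return the value it replaced, using only compare-exchange.
inline std::uint8_t xchg_via_cmpxchg(std::atomic<std::uint8_t>& target,
                                     std::uint8_t new_value) {
  std::uint8_t observed = target.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak writes the current value into
  // `observed`, so the loop simply retries with the refreshed expectation.
  while (!target.compare_exchange_weak(observed, new_value,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  // On success `observed` still holds the value that was replaced.
  return observed;
}
```

Changing the declared type of the shared value to a width the platform exchange already supports, as suggested above, avoids the retry loop entirely at the cost of a larger field.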
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2010707057 From kdnilsen at openjdk.org Mon Mar 24 18:24:08 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 24 Mar 2025 18:24:08 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v2] In-Reply-To: References: Message-ID: On Wed, 19 Mar 2025 18:33:50 GMT, William Kemper wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Emulate single byte xchg with cmpxchg > - Merge remote-tracking branch 'jdk/master' into fix-uncancellable-young-gc > - Allow young cycles that interrupt old cycles to be cancelled src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2143: > 2141: > 2142: bool ShenandoahHeap::try_cancel_gc(GCCause::Cause cause) { > 2143: const jbyte prev = _cancelled_gc.xchg(cause); I guess maybe we want cause and prev to be integer type. Then the template will expand into a type that is known to that Atomic::xchg operation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2010713586 From wkemper at openjdk.org Mon Mar 24 18:25:23 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 24 Mar 2025 18:25:23 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: References: Message-ID: > Not a clean backport. > > # Testing > GHA, Dacapo, Extremem, Hyperalloc, Specjbb2015, Specjvm2008, Diluvian (with and without stress flags). William Kemper has updated the pull request incrementally with one additional commit since the last revision: Remove iu mode tests from ProblemList.txt ------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk21u/pull/160/files - new: https://git.openjdk.org/shenandoah-jdk21u/pull/160/files/e4ee4511..c7894ea3 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=160&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=160&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/160.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/160/head:pull/160 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/160 From wkemper at openjdk.org Mon Mar 24 18:33:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 24 Mar 2025 18:33:08 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: References: Message-ID: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> On Mon, 24 Mar 2025 15:18:25 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the 
default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > No need to calculate gc_id using gc_count Changes requested by wkemper (Reviewer). src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: > 135: // GC is starting, bump the internal gc count and set GCIdMark > 136: update_gc_count(); > 137: GCIdMark gc_id_mark; Can we still set the `GCIdMark` with our internal counter? I'd prefer they stay in sync explicitly. src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 576: > 574: "At end of Concurrent Young GC"; > 575: if (_heap->collection_set()->has_old_regions()) { > 576: mmu_tracker->record_mixed(gc_id()); Should these be `get_gc_count` now? 
------------- PR Review: https://git.openjdk.org/jdk/pull/24166#pullrequestreview-2711348336 PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2010726624 PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2010725090 From xpeng at openjdk.org Mon Mar 24 18:48:15 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 18:48:15 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> References: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> Message-ID: On Mon, 24 Mar 2025 18:30:23 GMT, William Kemper wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> No need to calculate gc_id using gc_count > > src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: > >> 135: // GC is starting, bump the internal gc count and set GCIdMark >> 136: update_gc_count(); >> 137: GCIdMark gc_id_mark; > > Can we still set the `GCIdMark` with our internal counter? I'd prefer they stay in sync explicitly. `GCIdMark` uses [GCId::_next_id](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcId.cpp#L31) here to generate the GC id. If we set the mark from our internal counter instead, `GCId::_next_id` will remain 0, which is the behavior change I was concerned about. We can take another approach to keep both counters in sync explicitly: `GCIdMark gc_id_mark; update_gc_count(gc_id() + 1)` What do you think?
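The proposal above (let the scoped mark draw the id, then derive the internal count from it) can be modeled outside HotSpot with a small RAII sketch. Everything in it is a simplified stand-in chosen for illustration: `IdSource`, `ThreadState`, `ScopedIdMark`, `Controller` and `run_cycle` are hypothetical names, not the HotSpot `GCId`, `NamedThread`, `GCIdMark` or `ShenandoahController` classes, and the sketch ignores concurrency entirely.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for illustration only; these are NOT the HotSpot classes.
static const unsigned kUndefinedId = static_cast<unsigned>(-1);

struct IdSource {      // models a global, 0-based "next id" counter
  unsigned next_id = 0;
};

struct ThreadState {   // models the per-thread "current id" slot
  unsigned current_id = kUndefinedId;
};

// Scoped mark: installs an id for the lifetime of the scope and restores
// the previous value (here: undefined) when the scope ends.
class ScopedIdMark {
 public:
  // Draws a fresh id from the shared source.
  ScopedIdMark(IdSource& src, ThreadState& t)
      : _thread(t), _previous(t.current_id) {
    t.current_id = src.next_id++;
  }
  // Installs a caller-supplied id, e.g. one copied from another thread.
  ScopedIdMark(ThreadState& t, unsigned id)
      : _thread(t), _previous(t.current_id) {
    t.current_id = id;
  }
  ~ScopedIdMark() { _thread.current_id = _previous; }

  ScopedIdMark(const ScopedIdMark&) = delete;
  ScopedIdMark& operator=(const ScopedIdMark&) = delete;

 private:
  ThreadState& _thread;
  unsigned _previous;
};

struct Controller {    // models a collector-internal cycle counter
  std::size_t gc_count = 0;
};

// One cycle: the mark supplies the id, and the internal count is derived
// from it so the two cannot drift apart (count == id + 1).
unsigned run_cycle(IdSource& src, ThreadState& t, Controller& c) {
  ScopedIdMark mark(src, t);
  c.gc_count = static_cast<std::size_t>(t.current_id) + 1;
  return t.current_id;  // id that labels the work done in this scope
}
```

In this model the first cycle is labeled 0 and leaves the count at 1, and the per-thread id reverts to undefined once the scope exits, which is the window in which reading the id would be invalid.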
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2010743091 From xpeng at openjdk.org Mon Mar 24 18:53:07 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 18:53:07 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> References: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> Message-ID: On Mon, 24 Mar 2025 18:29:27 GMT, William Kemper wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> No need to calculate gc_id using gc_count > > src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 576: > >> 574: "At end of Concurrent Young GC"; >> 575: if (_heap->collection_set()->has_old_regions()) { >> 576: mmu_tracker->record_mixed(gc_id()); > > Should these be `get_gc_count` now? Shouldn't we always use gc id for MMUTracker? Although the internal gc counter of Shenandoah is also fine here. I'm ok to change it back to get_gc_count, but will also update the declaration of the relevant methods like below to make them consistent: void record_global(size_t gc_count) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2010754634 From wkemper at openjdk.org Mon Mar 24 19:00:45 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 24 Mar 2025 19:00:45 GMT Subject: Integrated: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 16:57:18 GMT, William Kemper wrote: > Not a clean backport. > > # Testing > GHA, Dacapo, Extremem, Hyperalloc, Specjbb2015, Specjvm2008, Diluvian (with and without stress flags). This pull request has now been integrated. 
Changeset: 84c8e271 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/84c8e2719e1dd32db3524638855f4405666a54c5 Stats: 1712 lines in 71 files changed: 4 ins; 1671 del; 37 mod 8336685: Shenandoah: Remove experimental incremental update mode Reviewed-by: kdnilsen Backport-of: 0584af23255b6b8f49190eaf2618f3bcc299adfe ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/160 From xpeng at openjdk.org Mon Mar 24 19:21:23 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 19:21:23 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v5] In-Reply-To: References: Message-ID: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
> > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah > - [x] GHA Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Keep gc id and shenandoah internal gc count in sync ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24166/files - new: https://git.openjdk.org/jdk/pull/24166/files/4c8c8136..f4844848 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=03-04 Stats: 9 lines in 4 files changed: 3 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From wkemper at openjdk.org Mon Mar 24 21:45:21 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 24 Mar 2025 21:45:21 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v3] In-Reply-To: References: Message-ID: > The sequence of events that creates this state: > 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake > 2. The regulator thread cancels old marking to start a young collection > 3. A mutator thread shortly follows and attempts to cancel the nascent young collection > 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` > 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` > 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. 
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Widen type of shared enum value to unlock platform support of atomic xchg ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24105/files - new: https://git.openjdk.org/jdk/pull/24105/files/adcb999b..ca45ff02 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=01-02 Stats: 9 lines in 1 file changed: 1 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24105.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24105/head:pull/24105 PR: https://git.openjdk.org/jdk/pull/24105 From xpeng at openjdk.org Mon Mar 24 22:27:45 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 22:27:45 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: > ### Root cause > Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
> > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. > > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah > - [x] GHA Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Revert ShenandoahController::_gc_count related refactor ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24166/files - new: https://git.openjdk.org/jdk/pull/24166/files/f4844848..57c43ef3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24166&range=04-05 Stats: 50 lines in 7 files changed: 3 ins; 3 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/24166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24166/head:pull/24166 PR: https://git.openjdk.org/jdk/pull/24166 From xpeng at openjdk.org Mon Mar 24 22:27:45 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 22:27:45 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v5] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 19:21:23 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a 
specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Keep gc id and shenandoah internal gc count in sync Removed all code related to the refactor of ShenandoahController::_gc_id; the change should now be a pure fix for the bug. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2749534672 From ysr at openjdk.org Mon Mar 24 23:00:09 2025 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Mon, 24 Mar 2025 23:00:09 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: <7g5yci-7XKxmgaKSWma2-EQraeVr7cjABnKH9ifMZU4=.2b2ff9d2-2735-488c-bfe4-4d15f2990577@github.com> On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
>> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor I haven't started reviewing, but in cases where we have a "mark" (a thread-local, stack-scoped constant variable, such as used for logging etc.) and an underlying "true value", the expectation is that the "mark" is a snapshot of the "true value", and represents a label for the work being done in that specific scope. Once you keep this model/idiom in mind, the code should become clean, and the same 0-based conventions should cleanly apply. I hope to review the code soon-ish. Sorry for the delay. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2749579634 From ysr at openjdk.org Mon Mar 24 23:11:09 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 24 Mar 2025 23:11:09 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> References: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> Message-ID: On Mon, 24 Mar 2025 18:30:23 GMT, William Kemper wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> No need to calculate gc_id using gc_count > > src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: > >> 135: // GC is starting, bump the internal gc count and set GCIdMark >> 136: update_gc_count(); >> 137: GCIdMark gc_id_mark; > > Can we still set the `GCIdMark` with our internal counter? I'd prefer they stay in sync explicitly. @earthling-amzn : Is your concern that GC count is incremented concurrently by two different callers?
If so, I'd have the atomic increment return the pre- or post-increment value as the case may be and have the caller use that in their mark label. (Question: do we have different Id's for young and a concurrent/interrupted old? -- I would imagine so, with the old carrying an older id, and each subsequent young getting a newer id). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2011041651 From ysr at openjdk.org Mon Mar 24 23:17:06 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 24 Mar 2025 23:17:06 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
>> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor > Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is undefined, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. This would be by design and, as you discovered, was because a suitable GCIdMark scope was missing which would have supplied the correct ID. It is important that the JFR event issues from the intended scope for the corresponding ID for which the metrics/event are being generated. In particular, if there are multiple concurrent GC ID's in progress, with a common pool of worker threads that multiplex this work, any appropriate event metrics should be correctly attributed to the right ID in question. I am making general comments here without knowledge of the specific details, sorry! 
:-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2749602926 From wkemper at openjdk.org Mon Mar 24 23:28:02 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 24 Mar 2025 23:28:02 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint Message-ID: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com> Not clean, has two follow up fixes in this PR. ------------- Commit messages: - Fix merge conflict - 8348092: Shenandoah: assert(nk >= _lowest_valid_narrow_klass_id && nk <= _highest_valid_narrow_klass_id) failed: narrowKlass ID out of range (3131947710) - Backport: 8c09d40d6c3 - Backport: 764d70b7df18e288582e616c62b0d7078f1ff3aa Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/161/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=161&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8344049 Stats: 303 lines in 12 files changed: 172 ins; 84 del; 47 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/161.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/161/head:pull/161 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/161 From xpeng at openjdk.org Mon Mar 24 23:41:11 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 24 Mar 2025 23:41:11 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 23:14:36 GMT, Y. Srinivas Ramakrishna wrote: > > Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is undefined, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > This would be by design and, as you discovered, was because a suitable GCIdMark scope was missing which would have supplied the correct ID. 
It is important that the JFR event issues from the intended scope for the corresponding ID for which the metrics/event are being generated. In particular, if there are multiple concurrent GC ID's in progress, with a common pool of worker threads that multiplex this work, any appropriate event metrics should be correctly attributed to the right ID in question. > > I am making general comments here without knowledge of the specific details, sorry! :-) Thank you @ysramakrishna for reviewing the PR, appreciate it! Yes, it is a simple bug related to the GCIdMark scope, so the fix is to make sure the GCIdMark scope is correct. For a common pool of worker threads, each thread should copy the gc_id into its own thread-local state with the constructor GCIdMark(gc_id); there are some existing examples doing this in HotSpot, e.g. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/workerThread.cpp#L68 ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2749637430 From dfenacci at openjdk.org Tue Mar 25 07:14:13 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Tue, 25 Mar 2025 07:14:13 GMT Subject: RFR: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) [v4] In-Reply-To: References: <2jI87up85vKeQq7xy6WoI987MOuqTqA6I8G75VvC74g=.e8ef9f9c-b8b3-496d-9b48-28c83dc1fb64@github.com> Message-ID: <0Ck3LigYC74nHGVrxvZOlWJ2m5Jsxp1zaMjmES4pA_g=.313cf7fc-46ee-4b8c-94bd-519dab3e4aba@github.com> On Fri, 28 Feb 2025 20:36:23 GMT, Dean Long wrote: >> Refreshing my memory, isn't the real problem with trying to fix this with a minimum codecache size is that some of these stubs are not allocated during initial single-threaded JVM startup, but later when the first compiler threads start, and that allows other code blobs to fill up the codecache? > >> Even so, it might be a good idea to additionally increase the minimum code cache anyway.
@dean-long do you think it would make sense to file an RFE for that? > > Sure, if it's still an issue. Thank you very much for your reviews @dean-long and @adinn! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23630#issuecomment-2750298159 From dfenacci at openjdk.org Tue Mar 25 07:14:14 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Tue, 25 Mar 2025 07:14:14 GMT Subject: Integrated: 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) In-Reply-To: References: Message-ID: <1FYW5izBcd8fJ5zo507OispjDjN6EMRJR96PIlo9-Rs=.92b2883e-9ae5-4a5b-930e-16b6e3ff56c3@github.com> On Fri, 14 Feb 2025 11:04:20 GMT, Damon Fenacci wrote: > # Issue > The test `src/hotspot/share/opto/c2compiler.cpp` fails intermittently due to a crash that happens when trying to allocate code cache space for C1 and C2 in `RuntimeStub::new_runtime_stub` and `SingletonBlob::operator new`. > > # Causes > There are a few call paths during the initialization of C1 and C2 that can lead to the code cache allocations in `RuntimeStub::new_runtime_stub` (through `RuntimeStub::operator new`) and `SingletonBlob::operator new` triggering a fatal error if there is no more space. The paths in question are: > 1. `Compiler::init_c1_runtime` -> `Runtime1::initialize` -> `Runtime1::generate_blob_for` -> `Runtime1::generate_blob` -> `RuntimeStub::new_runtime_stub` > 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_stub` -> `Compile::Compile` -> `Compile::Code_Gen` -> `PhaseOutput::install` -> `PhaseOutput::install_stub` -> `RuntimeStub::new_runtime_stub` > 1. `C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_uncommon_trap_blob` -> `UncommonTrapBlob::create` -> `new UncommonTrapBlob` > 1. 
`C2Compiler::initialize` -> `C2Compiler::init_c2_runtime` -> `OptoRuntime::generate` -> `OptoRuntime::generate_exception_blob` -> `ExceptionBlob::create` -> `new ExceptionBlob` > > # Solution > Instead of fatally crashing, we can use the `alloc_fail_is_fatal` flag of `RuntimeStub::new_runtime_stub` to avoid crashing in cases 1 and 2 and add a similar flag to `SingletonBlob::operator new` for cases 3 and 4. In the latter case we need to adjust all calls accordingly. > > Note: In [JDK-8326615](https://bugs.openjdk.org/browse/JDK-8326615) it was argued that increasing the minimum code cache size would solve the issue but that wasn't entirely accurate: doing so possibly decreases the chances of a failed allocation in these 4 places but doesn't totally avoid it. > > # Testing > The original failing regression test in `test/hotspot/jtreg/compiler/startup/StartupOutput.java` has been modified to run multiple times with randomized values (within the original failing range) to increase the chances of hitting the fatal assertion. > > Tests: Tier 1-4 (windows-x64, linux-x64/aarch64, and macosx-x64/aarch64; release and debug mode) This pull request has now been integrated. Changeset: 48fac662 Author: Damon Fenacci URL: https://git.openjdk.org/jdk/commit/48fac6626c605f4679544e3dd24d5ad70561494a Stats: 139 lines in 27 files changed: 55 ins; 4 del; 80 mod 8347406: [REDO] C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) Reviewed-by: dlong, adinn ------------- PR: https://git.openjdk.org/jdk/pull/23630 From shade at openjdk.org Tue Mar 25 10:43:32 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 10:43:32 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v3] In-Reply-To: References: Message-ID: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile.
> > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone - Drop commented out block from deprecations - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 - 8345169: Implement JEP 503: Remove the 32-bit x86 Port ------------- Changes: https://git.openjdk.org/jdk/pull/23906/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23906&range=02 Stats: 29733 lines in 25 files changed: 4 ins; 29728 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23906/head:pull/23906 PR: https://git.openjdk.org/jdk/pull/23906 From shade at openjdk.org Tue Mar 25 11:13:24 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 11:13:24 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v3] In-Reply-To: References: Message-ID: <7WjwCHk4uVXhc0eAyxzIrplCMu0DLQm1U_thb56D0as=.d24099dc-498c-45c5-9cc6-1bffa34a5c05@github.com> On Mon, 24 Mar 2025 18:21:14 GMT, Kelvin Nilsen wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Widen type of shared enum value to unlock platform support of atomic xchg > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2143: > >> 2141: >> 2142: bool ShenandoahHeap::try_cancel_gc(GCCause::Cause cause) { >> 2143: const jbyte prev = _cancelled_gc.xchg(cause); > > I guess maybe we want cause and prev to be integer type. Then the template will expand into a type that is known to that Atomic::xchg operation. So this thing is no longer `jbyte`, so implicit cast to `jbyte` is no longer safe. I think we should really be casting to `GCCause::Cause`. 
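The typing point in this review comment (once the shared value is widened, the result of the exchange should be cast back to the enum rather than to a byte type) can be illustrated with standard C++ alone. This is a minimal sketch under stated assumptions: it uses `std::atomic<int>` instead of HotSpot's `Atomic::xchg`, the `Cause` enum with its three values and the `CancelFlag` and `try_cancel` names are hypothetical, and the "first canceller wins" rule is a placeholder, not the cancellation policy implemented in the PR.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical cause enum; the real GCCause::Cause has many more values.
enum class Cause : int { NoGC = 0, AllocationFailure = 1, ConcurrentGC = 2 };

// Shared flag stored as a full-width integer so that an atomic exchange is
// available on every platform, then exposed to callers as the enum type.
class CancelFlag {
 public:
  // Atomically installs `cause` and returns the previous value, cast back
  // to the enum rather than to a narrower integer type.
  Cause exchange(Cause cause) {
    const int prev = _value.exchange(static_cast<int>(cause));
    return static_cast<Cause>(prev);
  }

  Cause load() const { return static_cast<Cause>(_value.load()); }

 private:
  std::atomic<int> _value{static_cast<int>(Cause::NoGC)};
};

// Placeholder policy: report success only to the caller that moved the flag
// out of the NoGC state. Later callers still overwrite the stored cause.
bool try_cancel(CancelFlag& flag, Cause cause) {
  return flag.exchange(cause) == Cause::NoGC;
}
```

Keeping both casts inside the wrapper means callers only ever see `Cause`, so no implicit narrowing conversion can creep in at a call site.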
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2011861121 From shade at openjdk.org Tue Mar 25 13:29:30 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 13:29:30 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v3] In-Reply-To: References: Message-ID: <6aH6pqUjOOhfguuCXDjuRPNpieiu2rzJ7XxnTFQ2D4w=.fe2d0cd3-181d-4189-a3cb-6637bf85d89c@github.com> On Tue, 25 Mar 2025 10:43:32 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. 
>> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port JEP is now targeted to JDK 25. I remerged from master, resolved a few easy conflicts in files that are removed by this PR anyway, and did some light testing. Everything looks green. I only miss the re-review after the merge. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2751258762 From mdoerr at openjdk.org Tue Mar 25 15:22:22 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 25 Mar 2025 15:22:22 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v3] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 10:43:32 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. 
>> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2714187303 From ihse at openjdk.org Tue Mar 25 15:27:14 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 25 Mar 2025 15:27:14 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v3] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 10:43:32 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. 
Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port Marked as reviewed by ihse (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23906#pullrequestreview-2714207496 From wkemper at openjdk.org Tue Mar 25 17:51:41 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 17:51:41 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: > The sequence of events that creates this state: > 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake > 2. The regulator thread cancels old marking to start a young collection > 3. A mutator thread shortly follows and attempts to cancel the nascent young collection > 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` > 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` > 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Stop casting GCCause to jbyte ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24105/files - new: https://git.openjdk.org/jdk/pull/24105/files/ca45ff02..abce0381 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24105&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24105.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24105/head:pull/24105 PR: https://git.openjdk.org/jdk/pull/24105 From wkemper at openjdk.org Tue Mar 25 17:51:41 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 17:51:41 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: <7WjwCHk4uVXhc0eAyxzIrplCMu0DLQm1U_thb56D0as=.d24099dc-498c-45c5-9cc6-1bffa34a5c05@github.com> References: 
<7WjwCHk4uVXhc0eAyxzIrplCMu0DLQm1U_thb56D0as=.d24099dc-498c-45c5-9cc6-1bffa34a5c05@github.com> Message-ID: On Tue, 25 Mar 2025 11:08:58 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2143: >> >>> 2141: >>> 2142: bool ShenandoahHeap::try_cancel_gc(GCCause::Cause cause) { >>> 2143: const jbyte prev = _cancelled_gc.xchg(cause); >> >> I guess maybe we want cause and prev to be integer type. Then the template will expand into a type that is known to that Atomic::xchg operation. > > So this thing is no longer `jbyte`, so implicit cast to `jbyte` is no longer safe. I think we should really be casting to `GCCause::Cause`. Yes! Good catch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24105#discussion_r2012624868 From shade at openjdk.org Tue Mar 25 18:11:12 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 18:11:12 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 17:51:41 GMT, William Kemper wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Stop casting GCCause to jbyte I think we want to remove `addr_of` and related methods for `ShenandoahSharedEnumFlag` to avoid accidents. 
`ShenandoahSharedValue` was defined specifically to stick to `jbyte` for the sake of generated code. If generated code is not expected to access this flag, we should remove the APIs that allow it. Going forward, I think we should consider redefining `ShenandoahSharedValue` to `uint32_t` to begin with. This would require fiddling with barrier sets that might read them. ------------- PR Review: https://git.openjdk.org/jdk/pull/24105#pullrequestreview-2714735454 From wkemper at openjdk.org Tue Mar 25 19:03:07 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 19:03:07 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 17:51:41 GMT, William Kemper wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Stop casting GCCause to jbyte `ShenandoahSharedEnumFlag` is only used for this one variable in `ShenandoahHeap`, do you want to remove it entirely and just have a plain `volatile GCCause::Cause _gc_cancelled` member?
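A minimal sketch of the typing concern raised in this thread, not the actual HotSpot code. `Cause`, `SharedEnumFlag`, `survives_signed_byte` and `cancel_flag_demo` are hypothetical stand-ins (loosely modelled on `GCCause::Cause` and `ShenandoahSharedEnumFlag`), the enumerator values are made up, and `std::atomic` is used in place of HotSpot's own atomic API. It shows an exchange that keeps the enum type end to end, and why routing a cause through a signed byte is only safe while every value fits in that byte.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GCCause::Cause; the values are illustrative only.
enum class Cause : std::uint32_t { no_gc = 0, concurrent_gc = 1, allocation_failure = 200 };

// Hypothetical wrapper: every access goes through the enum type,
// so no caller has to cast to or from a byte.
template <typename E>
class SharedEnumFlag {
public:
  explicit SharedEnumFlag(E initial) : _value(initial) {}
  E get() const { return _value.load(std::memory_order_acquire); }
  // Stores `next` and returns the previous value, still typed as E.
  E xchg(E next) { return _value.exchange(next, std::memory_order_acq_rel); }
private:
  std::atomic<E> _value;
};

// Returns true if the cause keeps its numeric value after a round trip
// through a signed 8-bit integer.
inline bool survives_signed_byte(Cause c) {
  const auto wide = static_cast<std::uint32_t>(c);
  const auto narrow = static_cast<std::int8_t>(wide);
  return static_cast<std::int64_t>(narrow) == static_cast<std::int64_t>(wide);
}

inline void cancel_flag_demo() {
  SharedEnumFlag<Cause> cancelled(Cause::no_gc);
  const Cause prev = cancelled.xchg(Cause::allocation_failure);
  assert(prev == Cause::no_gc);
  assert(cancelled.get() == Cause::allocation_failure);
  assert(survives_signed_byte(Cause::concurrent_gc));        // small value fits
  assert(!survives_signed_byte(Cause::allocation_failure));  // 200 does not fit in int8_t
}
```

Keeping the wrapper, rather than a bare `volatile` member, leaves a single place where the memory ordering and any padding decisions can live.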
------------- PR Comment: https://git.openjdk.org/jdk/pull/24105#issuecomment-2752251839 From shade at openjdk.org Tue Mar 25 19:31:11 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 19:31:11 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 19:00:53 GMT, William Kemper wrote: > `ShenandoahSharedEnumFlag` is only used for this one variable in `ShenandoahHeap`, do you want to remove it entirely and just have a plain `volatile GCCause::Cause _gc_cancelled` member? Maybe? I was suspecting we want to have padding around the field to make sure we do not accidentally false-share it with anything. But that might not be a real issue. I think we should keep wrapping shared variables in `ShenandoahShared*` to clearly capture which fields are normally accessed by multiple threads, as to encapsulate all the atomic ops. Actually, leave `addr_of` alone, but file a RFE to redefine `ShenandoahSharedValue` to `uint32_t`, which would eliminate the deviation for the underlying `ShenandoahSharedEnumFlag`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24105#issuecomment-2752314412 From shade at openjdk.org Tue Mar 25 19:37:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Mar 2025 19:37:17 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 17:51:41 GMT, William Kemper wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. 
The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Stop casting GCCause to jbyte Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24105#pullrequestreview-2714976353 From wkemper at openjdk.org Tue Mar 25 19:52:18 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 19:52:18 GMT Subject: Integrated: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled In-Reply-To: References: Message-ID: On Tue, 18 Mar 2025 21:51:34 GMT, William Kemper wrote: > The sequence of events that creates this state: > 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake > 2. The regulator thread cancels old marking to start a young collection > 3. A mutator thread shortly follows and attempts to cancel the nascent young collection > 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` > 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` > 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. This pull request has now been integrated. 
Changeset: dbc620fb Author: William Kemper URL: https://git.openjdk.org/jdk/commit/dbc620fb1f754ca84f2a07abfdfbd4c5fcb55087 Stats: 15 lines in 2 files changed: 7 ins; 0 del; 8 mod 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/24105 From wkemper at openjdk.org Tue Mar 25 19:52:18 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 19:52:18 GMT Subject: RFR: 8352299: GenShen: Young cycles that interrupt old cycles cannot be cancelled [v4] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 17:51:41 GMT, William Kemper wrote: >> The sequence of events that creates this state: >> 1. An old collection is trying to finish marking by flushing SATB buffers with a Handshake >> 2. The regulator thread cancels old marking to start a young collection >> 3. A mutator thread shortly follows and attempts to cancel the nascent young collection >> 4. Step `3` fails (because of this bug) and cancellation reason does _not_ become `allocation failure` >> 5. The mutator thread enters a tight loop in which it retries allocations without `waiting` >> 6. The mutator thread remains in the `thread_in_vm` state and prevents the VM thread from completing step `1`. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Stop casting GCCause to jbyte Okay, filed: https://bugs.openjdk.org/browse/JDK-8352914. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24105#issuecomment-2752357291 From ysr at openjdk.org Tue Mar 25 19:57:15 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Tue, 25 Mar 2025 19:57:15 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v2] In-Reply-To: <-73CoqTBA5dJPEwr7bxSvDmMFC9g_LZpW-q7XSjjtrE=.4966fa3b-e98f-4a50-9492-22bf99eecf1f@github.com> References: <-73CoqTBA5dJPEwr7bxSvDmMFC9g_LZpW-q7XSjjtrE=.4966fa3b-e98f-4a50-9492-22bf99eecf1f@github.com> Message-ID: On Wed, 12 Mar 2025 23:17:44 GMT, William Kemper wrote: >> Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Do not enforce size constraints on generations" > > This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. Generally looks right, but a few comments for your consideration. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1293: > 1291: > 1292: ShenandoahGenerationalHeap* gen_heap = ShenandoahGenerationalHeap::heap(); > 1293: const size_t region_capacity = alloc_capacity(r); A general note on terminology. We have generally used "capacity" to mean the total space, including that which has been allocated, and "used" for the space that has been allocated and isn't available to allocate. I'd use "free" here and avoid the extra arithmetic. I notice that the method actually uses "used", rather than "free". 
I think the interface for _partitions `move_from_...` is unnecessarily fat. Since we send the region idx to the `move_from_...` method, why not let that method get the amount free, rather than passing it as an additional parameter? I see that we essentially use this value only at line 1300 to correct the evacuation reserve figure. (Side question: Why don't we do that when we do the swap after line 1327?) src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1321: > 1319: > 1320: if (unusable_trash != -1) { > 1321: // 2. Move it to the mutator partition // Move the unusable trash region we found to the mutator partition. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1324: > 1322: _partitions.move_from_partition_to_partition(unusable_trash, > 1323: ShenandoahFreeSetPartitionId::OldCollector, > 1324: ShenandoahFreeSetPartitionId::Mutator, region_capacity); Shouldn't `region_capacity` argument be the free space in the unusable trash region? Wouldn't that be 0 (else why "unusable"?) src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 327: > 325: // hold evacuated objects. If this occurs and memory is still available in the Mutator's free set, we will flip a region from > 326: // the Mutator free set into the Collector or OldCollector free set. > 327: void flip_to_gc(ShenandoahHeapRegion* r); It seems as if (the current implementation of) `flip_to_gc()` always succeeds in flipping. I'd add that to its spec comment. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 329: > 327: void flip_to_gc(ShenandoahHeapRegion* r); > 328: > 329: bool flip_to_old_gc(ShenandoahHeapRegion* r); // Return true if and only if successfully flipped to old partition. 
------------- PR Review: https://git.openjdk.org/jdk/pull/23998#pullrequestreview-2714934231 PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012841573 PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012830401 PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012835985 PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012789624 PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012790543 From wkemper at openjdk.org Tue Mar 25 20:50:14 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 20:50:14 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v2] In-Reply-To: References: <-73CoqTBA5dJPEwr7bxSvDmMFC9g_LZpW-q7XSjjtrE=.4966fa3b-e98f-4a50-9492-22bf99eecf1f@github.com> Message-ID: On Tue, 25 Mar 2025 19:50:38 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Do not enforce size constraints on generations" >> >> This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. > > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1324: > >> 1322: _partitions.move_from_partition_to_partition(unusable_trash, >> 1323: ShenandoahFreeSetPartitionId::OldCollector, >> 1324: ShenandoahFreeSetPartitionId::Mutator, region_capacity); > > Shouldn't `region_capacity` argument be the free space in the unusable trash region? Wouldn't that be 0 (else why "unusable"?) Yes, good catch. However, it won't be `0` because this region is only _temporarily_ unusable while concurrent weak roots is in progress. Elsewhere, when the freeset is rebuilt, the `alloc_capacity` of trash regions is considered equal to the region size (regardless if weak roots is in progress). 
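The accounting question discussed here can be illustrated with a toy model. This is only a sketch under stated assumptions: `Region`, `FreeSet`, `kRegionSize`, `move` and `swap` are hypothetical names, there are just two partitions, and `alloc_capacity` mirrors only the property mentioned above (a trash region counts as a full region of capacity even while it is temporarily unusable). It is not the `ShenandoahFreeSet` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class Partition { Mutator = 0, OldCollector = 1 };

constexpr std::size_t kRegionSize = 1024;  // illustrative region size in bytes

struct Region {
  Partition partition;
  bool is_trash;     // immediate garbage that has not been recycled yet
  std::size_t used;  // bytes allocated; ignored while the region is trash
};

// Toy counterpart of alloc_capacity(): trash is counted as fully available.
inline std::size_t alloc_capacity(const Region& r) {
  return r.is_trash ? kRegionSize : kRegionSize - r.used;
}

class FreeSet {
public:
  explicit FreeSet(std::vector<Region> regions) : _regions(std::move(regions)) {
    for (const Region& r : _regions) {
      _available[index(r.partition)] += alloc_capacity(r);
    }
  }

  std::size_t available(Partition p) const { return _available[index(p)]; }
  Partition partition_of(std::size_t idx) const { return _regions[idx].partition; }

  // Moves one region and transfers that region's own capacity with it.
  void move(std::size_t idx, Partition to) {
    Region& r = _regions[idx];
    const std::size_t capacity = alloc_capacity(r);
    _available[index(r.partition)] -= capacity;
    _available[index(to)] += capacity;
    r.partition = to;
  }

  // Swaps an unusable trash region in the old reserve for a usable mutator region.
  void swap(std::size_t trash_idx, std::size_t usable_idx) {
    move(trash_idx, Partition::Mutator);
    move(usable_idx, Partition::OldCollector);
  }

private:
  static std::size_t index(Partition p) { return static_cast<std::size_t>(p); }
  std::vector<Region> _regions;
  std::size_t _available[2] = {0, 0};
};
```

Because each move transfers the capacity of the region actually being moved, the two partitions stay balanced after a swap and the total available space is unchanged.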
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012916069 From wkemper at openjdk.org Tue Mar 25 21:03:11 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 21:03:11 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v2] In-Reply-To: References: <-73CoqTBA5dJPEwr7bxSvDmMFC9g_LZpW-q7XSjjtrE=.4966fa3b-e98f-4a50-9492-22bf99eecf1f@github.com> Message-ID: <4koDTG-c84SvK4641HlEpHJ-ICUze2za6BnZkchYdIA=.a6b39699-53da-4514-b48f-82f990d85b59@github.com> On Tue, 25 Mar 2025 19:53:35 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Do not enforce size constraints on generations" >> >> This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. > > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1293: > >> 1291: >> 1292: ShenandoahGenerationalHeap* gen_heap = ShenandoahGenerationalHeap::heap(); >> 1293: const size_t region_capacity = alloc_capacity(r); > > A general note on terminology. We have generally used "capacity" to mean the total space, including that which has been allocated, and "used" for the space that has been allocated and isn't available to allocate. I'd use "free" here and avoid the extra arithmetic. > > I notice that the method actually uses "used", rather than "free". > > I think the interface for _partitions `move_from_...` is unnecessarily fat. Since we send the region idx to the `move_from_...` method, why not let that method get the amount free, rather than passing it as an additional parameter? > > I see that we essentially use this value only at line 1300 to correct the evacuation reserve figure. (Side question: Why don't we do that when we do the swap after line 1327?) 
I see your point, but there are cases where the business logic depends on the allocation capacity before adding the region to the freeset. In those cases, we'd compute allocation capacity twice. We could have an overload to compute and forward allocation capacity for the other cases? That's a good question. We should probably subtract the capacity of the unusable trash region from the reserve, and add the capacity of the usable region back in. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23998#discussion_r2012933965 From wkemper at openjdk.org Tue Mar 25 21:16:06 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 21:16:06 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v3] In-Reply-To: References: Message-ID: > Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Update evac reserve when swapping trash region for non-trash region - Use capacity of transferred region - Improve comments - Merge remote-tracking branch 'jdk/master' into fix-flip-to-old-reserve - Revert "Do not enforce size constraints on generations" This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. - Do not enforce size constraints on generations This will make it easier for the old generation collector to take regions from the mutator when necessary - Don't allocate in regions that cannot be flipped to old gc - Do not allocate from mutator if young gen cannot spare the region ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23998/files - new: https://git.openjdk.org/jdk/pull/23998/files/a42efe5a..1807b2ac Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23998&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23998&range=01-02 Stats: 121111 lines in 3547 files changed: 60844 ins; 38996 del; 21271 mod Patch: https://git.openjdk.org/jdk/pull/23998.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23998/head:pull/23998 PR: https://git.openjdk.org/jdk/pull/23998 From ysr at openjdk.org Tue Mar 25 21:28:10 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 25 Mar 2025 21:28:10 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v3] In-Reply-To: References: Message-ID: <6ibXMioOPZ80OpozxJbnv9WsWQO2aKxiRIxWrteaDxs=.685aaf7e-9e56-4ab5-8737-b10e9c0d784a@github.com> On Tue, 25 Mar 2025 21:16:06 GMT, William Kemper wrote: >> Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. 
When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Update evac reserve when swapping trash region for non-trash region > - Use capacity of transferred region > - Improve comments > - Merge remote-tracking branch 'jdk/master' into fix-flip-to-old-reserve > - Revert "Do not enforce size constraints on generations" > > This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. > - Do not enforce size constraints on generations > > This will make it easier for the old generation collector to take regions from the mutator when necessary > - Don't allocate in regions that cannot be flipped to old gc > - Do not allocate from mutator if young gen cannot spare the region Thanks for the more careful arithmetic in the adjustments. Let's rerun GHA and testing to make sure this doesn't have any knock-on effects that trigger other checks elsewhere. Looks good otherwise. ------------- Marked as reviewed by ysr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23998#pullrequestreview-2715240382 From kdnilsen at openjdk.org Tue Mar 25 23:20:39 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 25 Mar 2025 23:20:39 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 22:48:24 GMT, Xiaolong Peng wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> At the end of ShenandoahFullGC, it resets bitmaps for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification: >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results. >> >> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for a global GC, but the marking bitmaps have already been reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case. >> >> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is hidden by issue 2. >> >> 4.
After a concurrent young cycle evacuates objects from a young region, it updates refs using the marking bitmaps from the marking context, so it won't update references held by dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below TAMS). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers. >> >> ### Solution >> * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps. >> * Because there could be dead pointers in old gen that were not updated to point to the new address after evacuation and refs update, we should disable rem-set validation before init-mark and update-refs if the old marking context is incomplete. >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > tide up Thanks for the refinements. LGTM. ------------- Marked as reviewed by kdnilsen (Committer). PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2715403856 From wkemper at openjdk.org Tue Mar 25 23:20:39 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 23:20:39 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 22:48:24 GMT, Xiaolong Peng wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1.
Concurrent young cycles following a Full GC: >> >> At the end of ShenandoahFullGC, it resets bitmaps for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification: >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results. >> >> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` shown above always returns a marking context for a global GC, but the marking bitmaps have already been reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case. >> >> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the GC generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but the problem is hidden by issue 2. >> >> 4. After a concurrent young cycle evacuates objects from a young region, it updates refs using the marking bitmaps from the marking context, so it won't update references held by dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below TAMS). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers. >> >> ### Solution >> * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps.
>> * Because there could be dead pointers in old gen were not updated to point to new address after evacuation and refs update, we should disable rem-set validation before init-mark&update-refs if old marking context is incomplete. >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > tide up Do we think this will fix https://bugs.openjdk.org/browse/JDK-8345399, should we add it as an issue to this PR? ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2715405105 From wkemper at openjdk.org Tue Mar 25 23:23:13 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 23:23:13 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: References: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> Message-ID: On Mon, 24 Mar 2025 23:08:18 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 137: >> >>> 135: // GC is starting, bump the internal gc count and set GCIdMark >>> 136: update_gc_count(); >>> 137: GCIdMark gc_id_mark; >> >> Can we still set the `GCIdMark` with our internal counter? I'd prefer they stay in sync explicitly. > > @earthling-amzn : Is your concern that GC count is incremented concurrently by two different callers? If so, I'd have the atomic increment return the pre- or post-increment value as the case may be and have the caller use that in their mark label. (Question: do we have different Id's for young and a concurrent/interrupted old? -- I would imagine so, with the old carrying an older id, and each subsequent young getting a newer id). 
My concern was more that `ShenandoahController::_gc_id` hides a field in its base class `NamedThread::_gc_id`, but `ShenandoahController::_gc_id` starts from `1`, while `NamedThread::_gc_id` starts from `0`. I think this will be addressed in a separate PR. This PR has been simplified to only fix the root cause of the assertion failure. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2013061689 From wkemper at openjdk.org Tue Mar 25 23:23:12 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 25 Mar 2025 23:23:12 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
>> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor LGTM ------------- Marked as reviewed by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24166#pullrequestreview-2715408360 From ysr at openjdk.org Tue Mar 25 23:32:09 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 25 Mar 2025 23:32:09 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. 
Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor ? ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24166#pullrequestreview-2715416184 From ysr at openjdk.org Tue Mar 25 23:32:10 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 25 Mar 2025 23:32:10 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v4] In-Reply-To: References: <__h_W-Ubi-14v0aDUciY2v5VuQnFHJOlabA7ZWIQcQM=.367913e0-84c3-46c5-86de-981509367951@github.com> Message-ID: On Tue, 25 Mar 2025 23:20:28 GMT, William Kemper wrote: >> @earthling-amzn : Is your concern that GC count is incremented concurrently by two different callers? If so, I'd have the atomic increment return the pre- or post-increment value as the case may be and have the caller use that in their mark label. (Question: do we have different Id's for young and a concurrent/interrupted old? 
-- I would imagine so, with the old carrying an older id, and each subsequent young getting a newer id).
>
> My concern was more that `ShenandoahController::_gc_id` hides a field in its base class `NamedThread::_gc_id`, but `ShenandoahController::_gc_id` starts from `1`, while `NamedThread::_gc_id` starts from `0`. I think this will be addressed in a separate PR. This PR has been simplified to only fix the root cause of the assertion failure.

Ah, I see. Good point.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24166#discussion_r2013067152

From xpeng at openjdk.org Tue Mar 25 23:32:21 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Tue, 25 Mar 2025 23:32:21 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13]
In-Reply-To:
References:
Message-ID:

On Tue, 25 Mar 2025 23:17:06 GMT, William Kemper wrote:

> Do we think this will fix https://bugs.openjdk.org/browse/JDK-8345399, should we add it as an issue to this PR?

It will likely fix JDK-8345399; I mentioned it in JBS. I will see if I can get ppc64le hardware to verify it this week.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24092#issuecomment-2752762251 From duke at openjdk.org Tue Mar 25 23:53:08 2025 From: duke at openjdk.org (duke) Date: Tue, 25 Mar 2025 23:53:08 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. >> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. 
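To make the quoted Solution concrete, here is a minimal standalone C++ model of the idea: a scope guard that tags the current thread with a GC id and restores the previous tag on exit, and a controller that injects its own counter value instead of letting the guard draw a fresh one. This is not HotSpot code. `ScopedGcId`, `Controller`, `current_id`, `next_default_id`, and `kUndefinedId` are hypothetical names chosen for illustration, and the real `GCIdMark` and control-thread classes may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Standalone model, not HotSpot code: every name below is hypothetical.
namespace sketch {

// Sentinel meaning "this thread is not inside any GC cycle".
constexpr uint32_t kUndefinedId = std::numeric_limits<uint32_t>::max();

// Id the current thread is tagged with; logging or event code would read this.
inline uint32_t& current_id() {
  static thread_local uint32_t id = kUndefinedId;
  return id;
}

// Shared generator used when no explicit id is supplied (starts at 0).
inline uint32_t next_default_id() {
  static uint32_t next = 0;
  return next++;
}

// Scope guard: tags the thread on entry, restores the previous tag on exit.
class ScopedGcId {
 public:
  ScopedGcId() : ScopedGcId(next_default_id()) {}
  explicit ScopedGcId(uint32_t id) : _previous(current_id()) { current_id() = id; }
  ~ScopedGcId() { current_id() = _previous; }
  ScopedGcId(const ScopedGcId&) = delete;
  ScopedGcId& operator=(const ScopedGcId&) = delete;

 private:
  uint32_t _previous;
};

// A controller that keeps its own cycle counter. It returns the id that
// would be visible to logging while the cycle is running.
struct Controller {
  uint32_t gc_count = 0;

  uint32_t run_cycle_injecting_own_id() {
    uint32_t my_id = gc_count++;  // the controller's id for this cycle
    ScopedGcId mark(my_id);       // inject it instead of drawing a new one
    assert(current_id() == my_id);
    return current_id();
  }
};

}  // namespace sketch
```

With this shape, the id visible inside a cycle is always the controller's own, and the tag returns to the undefined sentinel once the guard goes out of scope.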
>> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor @pengxiaolong Your change (at version 57c43ef3417dec14a80e3f4dcab7d3666e93b033) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2752787229 From xpeng at openjdk.org Tue Mar 25 23:53:07 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 25 Mar 2025 23:53:07 GMT Subject: RFR: 8352588: GenShen: Enabling JFR asserts when getting GCId [v6] In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 22:27:45 GMT, Xiaolong Peng wrote: >> ### Root cause >> Shenandoah has its own way to generate gc id([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still use the default GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. 
>> >> ### Solution >> it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` >> >> In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. >> >> ### Test >> - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" >> - [x] TEST=hotspot_gc_shenandoah >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Revert ShenandoahController::_gc_count related refactor Thanks for the reviews and suggestions! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24166#issuecomment-2752786931 From ysr at openjdk.org Wed Mar 26 00:20:11 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 26 Mar 2025 00:20:11 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> Message-ID: On Fri, 21 Mar 2025 15:08:22 GMT, Xiaolong Peng wrote: >> I was suggesting looking to see if normal perf measures showed any improvements. E.g. if you ran say SPECjbb and compared the remset scan times for the minor GC's that followed global collections. 
> > I have run h2 benchmark, here is the remembered set scan times after a global GC, it does seem to improve remembered set scan time in this case: > > PR version: > > [2025-03-21T07:35:41.801+0000][10.292s][19715][info ][gc ] GC(6) Concurrent remembered set scanning 13.069ms > [2025-03-21T07:35:48.088+0000][16.579s][19715][info ][gc ] GC(9) Concurrent remembered set scanning 5.537ms > [2025-03-21T07:35:56.610+0000][25.101s][19715][info ][gc ] GC(14) Concurrent remembered set scanning 6.186ms > [2025-03-21T07:36:03.967+0000][32.459s][19715][info ][gc ] GC(18) Concurrent remembered set scanning 9.562ms > [2025-03-21T07:36:11.234+0000][39.725s][19715][info ][gc ] GC(22) Concurrent remembered set scanning 2.591ms > [2025-03-21T07:36:17.303+0000][45.794s][19715][info ][gc ] GC(25) Concurrent remembered set scanning 0.999ms > [2025-03-21T07:36:25.647+0000][54.139s][19715][info ][gc ] GC(30) Concurrent remembered set scanning 1.665ms > [2025-03-21T07:36:32.790+0000][61.281s][19715][info ][gc ] GC(33) Concurrent remembered set scanning 2.851ms > [2025-03-21T07:36:40.241+0000][68.732s][19715][info ][gc ] GC(36) Concurrent remembered set scanning 0.716ms > [2025-03-21T07:36:47.440+0000][75.931s][19715][info ][gc ] GC(39) Concurrent remembered set scanning 1.932ms > > > master: > > [2025-03-21T07:34:04.978+0000][10.765s][17923][info ][gc ] GC(6) Concurrent remembered set scanning 22.813ms > [2025-03-21T07:34:11.250+0000][17.038s][17923][info ][gc ] GC(9) Concurrent remembered set scanning 14.457ms > [2025-03-21T07:34:18.692+0000][24.480s][17923][info ][gc ] GC(14) Concurrent remembered set scanning 4.972ms > [2025-03-21T07:34:26.033+0000][31.820s][17923][info ][gc ] GC(18) Concurrent remembered set scanning 9.134ms > [2025-03-21T07:34:34.416+0000][40.203s][17923][info ][gc ] GC(22) Concurrent remembered set scanning 3.655ms > [2025-03-21T07:34:42.180+0000][47.967s][17923][info ][gc ] GC(26) Concurrent remembered set scanning 3.253ms > 
[2025-03-21T07:34:49.371+0000][55.168s][17923][info ][gc ] GC(29) Concurrent remembered set scanning 1.615ms > [2025-03-21T07:34:56.592+0000][62.396s][17923][info ][gc ] GC(32) Concurrent remembered set scanning 1.570ms > [2025-03-21T07:35:03.766+0000][69.575s][17923][info ][gc ] GC(35) Concurrent remembered set scanning 1.040ms > [2025-03-21T07:35:10.941+0000][... very cool! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2013099621 From ysr at openjdk.org Wed Mar 26 00:56:20 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 26 Mar 2025 00:56:20 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: Message-ID: On Thu, 20 Mar 2025 22:48:24 GMT, Xiaolong Peng wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. 
ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the gc generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2.
>>
>> 4. After a concurrent young cycle evacuates objects from a young region, it updates refs using marking bitmaps from the marking context, therefore it won't update references of dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below tams). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers.
>>
>> ### Solution
>> * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps.
>> * Because there could be dead pointers in old gen that were not updated to point to the new address after evacuation and refs update, we should disable rem-set validation before init-mark & update-refs if the old marking context is incomplete.
>>
>> ### Test
>> - [x] `make test TEST=hotspot_gc_shenandoah`
>> - [x] GHA
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
> tide up

I think the change can be pushed as is, but I am not convinced that the verification can't be tightened when old marking information is missing, as long as we have a valid TAMS and there are no unparsable objects (which should only happen when coalesce-&-fill has been interrupted, leaving dead objects with x-gen pointers that would cause false positives, or upon class unloading, when dead objects may end up being unparsable). The current condition of skipping verification when old bitmaps are cleared seems to miss verification opportunities that would be valid after a completed coalesce-&-fill (C&F). Left some related comments, but I won't hold back this PR further. The tightening can be done subsequently (and I am happy to pick that up afterwards as needed).
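The difference between the skip condition in the PR and the stronger one suggested in this review can be sketched as two predicates. This is an illustrative model only, assuming a hypothetical `OldGenState` struct; it does not use the real Shenandoah old-generation API, and the `parsable` field merely stands in for the `is_parsable()` query named in the inline review comment.

```cpp
#include <cassert>

// Illustrative model only; these are not the HotSpot classes.
struct OldGenState {
  bool mark_complete;  // old marking bitmaps are valid
  bool parsable;       // no interrupted coalesce-and-fill, no unparsable dead objects
};

// Condition as reviewed in the PR: skip whenever old marking info is missing.
constexpr bool skip_verification_current(bool generational, OldGenState old_gen) {
  return generational && !old_gen.mark_complete;
}

// Stronger condition proposed in the review: skip only when marking info is
// missing AND the old generation may contain unparsable or stale dead objects.
constexpr bool skip_verification_proposed(bool generational, OldGenState old_gen) {
  return generational && !old_gen.mark_complete && !old_gen.parsable;
}

// The two predicates disagree in exactly one situation: marking is incomplete
// but the old generation is parsable (for example after a completed C&F).
inline void check_examples() {
  constexpr OldGenState incomplete_but_parsable{false, true};
  assert(skip_verification_current(true, incomplete_but_parsable));
  assert(!skip_verification_proposed(true, incomplete_but_parsable));

  constexpr OldGenState incomplete_and_unparsable{false, false};
  assert(skip_verification_current(true, incomplete_and_unparsable));
  assert(skip_verification_proposed(true, incomplete_and_unparsable));
}
```

Under these assumptions the proposed predicate never skips where the current one verifies; it only recovers the "incomplete but parsable" case for verification.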
Thanks for your patience with my tardy and long-winded reviews! :-)

src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1060:

> 1058: VerifyRememberedSet verify_remembered_set = _verify_remembered_before_marking;
> 1059: if (_heap->mode()->is_generational() &&
> 1060: !_heap->old_generation()->is_mark_complete()) {

Why not the following stronger condition to skip verification? My sense is that the only case we cannot verify is if we do not have marking info _and_ old gen has been left "unparsable" (because of an incomplete/interrupted C&F which may have us look at dead objects -- that are either unparsable because of class unloading, or are parsable but hold cross-gen pointers). In all other cases, we can do a safe and complete verification.

    is_generational() && !old_gen->is_mark_complete() && !old_gen->is_parsable()

src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1125:

> 1123: VerifyRememberedSet verify_remembered_set = _verify_remembered_before_updating_references;
> 1124: if (_heap->mode()->is_generational() &&
> 1125: !_heap->old_generation()->is_mark_complete()) {

Same comment re stronger condition as previous one above.

-------------

Marked as reviewed by ysr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2715476210
PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2013133891
PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2013142910

From ysr at openjdk.org Wed Mar 26 00:56:21 2025
From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Wed, 26 Mar 2025 00:56:21 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> Message-ID: On Wed, 26 Mar 2025 00:17:44 GMT, Y. Srinivas Ramakrishna wrote: >> I have run h2 benchmark, here is the remembered set scan times after a global GC, it does seem to improve remembered set scan time in this case: >> >> PR version: >> >> [2025-03-21T07:35:41.801+0000][10.292s][19715][info ][gc ] GC(6) Concurrent remembered set scanning 13.069ms >> [2025-03-21T07:35:48.088+0000][16.579s][19715][info ][gc ] GC(9) Concurrent remembered set scanning 5.537ms >> [2025-03-21T07:35:56.610+0000][25.101s][19715][info ][gc ] GC(14) Concurrent remembered set scanning 6.186ms >> [2025-03-21T07:36:03.967+0000][32.459s][19715][info ][gc ] GC(18) Concurrent remembered set scanning 9.562ms >> [2025-03-21T07:36:11.234+0000][39.725s][19715][info ][gc ] GC(22) Concurrent remembered set scanning 2.591ms >> [2025-03-21T07:36:17.303+0000][45.794s][19715][info ][gc ] GC(25) Concurrent remembered set scanning 0.999ms >> [2025-03-21T07:36:25.647+0000][54.139s][19715][info ][gc ] GC(30) Concurrent remembered set scanning 1.665ms >> [2025-03-21T07:36:32.790+0000][61.281s][19715][info ][gc ] GC(33) Concurrent remembered set scanning 2.851ms >> [2025-03-21T07:36:40.241+0000][68.732s][19715][info ][gc ] GC(36) Concurrent remembered set scanning 0.716ms >> [2025-03-21T07:36:47.440+0000][75.931s][19715][info ][gc ] GC(39) Concurrent remembered set scanning 1.932ms >> >> >> master: >> >> [2025-03-21T07:34:04.978+0000][10.765s][17923][info ][gc ] GC(6) Concurrent remembered set scanning 22.813ms >> 
[2025-03-21T07:34:11.250+0000][17.038s][17923][info ][gc ] GC(9) Concurrent remembered set scanning 14.457ms >> [2025-03-21T07:34:18.692+0000][24.480s][17923][info ][gc ] GC(14) Concurrent remembered set scanning 4.972ms >> [2025-03-21T07:34:26.033+0000][31.820s][17923][info ][gc ] GC(18) Concurrent remembered set scanning 9.134ms >> [2025-03-21T07:34:34.416+0000][40.203s][17923][info ][gc ] GC(22) Concurrent remembered set scanning 3.655ms >> [2025-03-21T07:34:42.180+0000][47.967s][17923][info ][gc ] GC(26) Concurrent remembered set scanning 3.253ms >> [2025-03-21T07:34:49.371+0000][55.168s][17923][info ][gc ] GC(29) Concurrent remembered set scanning 1.615ms >> [2025-03-21T07:34:56.592+0000][62.396s][17923][info ][gc ] GC(32) Concurrent remembered set scanning 1.570ms >> [2025-03-21T07:35:03.766+0000][69.575s][17923][info ][gc ] GC(35) Concurrent remembere... > > very cool! May be as intended earlier, leave a documentation comment between lines 697 & 698 along the lines of Kelvin's comment: // After we swap card table below, the write-table is all clean, and the read table holds // cards dirty prior to the start of GC. Young and bootstrap collection will update // the write card table as a side effect of remembered set scanning. Global collection will // update the card table as a side effect of global marking of old objects. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2013107472 From shade at openjdk.org Wed Mar 26 09:26:22 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Mar 2025 09:26:22 GMT Subject: RFR: 8345169: Implement JEP 503: Remove the 32-bit x86 Port [v3] In-Reply-To: References: Message-ID: <38zw9WI_zW70F66Y44GWS6c5fXWHY0tBXmrnUqo7g3k=.e5d35577-fd80-44db-88bf-523e9f982ffc@github.com> On Tue, 25 Mar 2025 10:43:32 GMT, Aleksey Shipilev wrote: >> This PR implements JEP 503: Remove the 32-bit x86 Port. >> >> The JEP is proposed to target 25, we would not integrate until JEP is ready. 
Reviews are appreciated meanwhile. >> >> This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. >> >> The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. >> >> Additional testing: >> - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) >> - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) >> - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Drop commented out block from deprecations > - Merge branch 'master' into JDK-8345169-32bit-x86-be-gone > - Generic 32-bit x86 configure error supercedes Windows 32-bit x86 > - 8345169: Implement JEP 503: Remove the 32-bit x86 Port There we go! Thanks all! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23906#issuecomment-2753715713 From shade at openjdk.org Wed Mar 26 09:26:23 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Mar 2025 09:26:23 GMT Subject: Integrated: 8345169: Implement JEP 503: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 4 Mar 2025 16:52:16 GMT, Aleksey Shipilev wrote: > This PR implements JEP 503: Remove the 32-bit x86 Port. > > The JEP is proposed to target 25, we would not integrate until JEP is ready. Reviews are appreciated meanwhile. > > This is only the removal of obvious 32-bit x86 parts, mostly files with `x86_32` in their name. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The approach for removing x86_32 files only also makes this PR borderline trivial, and requires no additional testing beyond normal pre-integration checks. > > The rest of the code is quite heavily intertwined with x86_64 and/or Zero, and would require accurate untangling. It would be much easier to review and test once we purge the free-standing parts of 32-bit x86 port, which is also a bulk of the port. The tangling with 32-bit x86 Zero is also why I did not touch most of the build system paths that handle x86. There is [JDK-8351148](https://bugs.openjdk.org/browse/JDK-8351148) umbrella that tracks further cleanup work. One can peek the final state that can be reached with all the cleanups in my earlier exploratory https://github.com/openjdk/jdk/pull/22567. > > Additional testing: > - [x] Linux x86_32 Server fastdebug, `make bootcycle-images` (now fails configure) > - [x] Linux x86_64 Server fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_32 Zero fastdebug, `make bootcycle-images` (still works) > - [x] Linux x86_64 Zero fastdebug, `make bootcycle-images` (still works) This pull request has now been integrated. 
Changeset: ee710fec
Author: Aleksey Shipilev
URL: https://git.openjdk.org/jdk/commit/ee710fec21c4e886769576c17ad6db2ab91a84b4
Stats: 29733 lines in 25 files changed: 4 ins; 29728 del; 1 mod

8345169: Implement JEP 503: Remove the 32-bit x86 Port

Reviewed-by: ihse, mdoerr, vlivanov, kvn, coleenp, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/23906

From duke at openjdk.org Wed Mar 26 15:23:47 2025
From: duke at openjdk.org (Zihao Lin)
Date: Wed, 26 Mar 2025 15:23:47 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make
Message-ID:

This patch removes the slice parameter from LoadNode::make.

Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805

Hi team, I am new here and would appreciate any guidance. Thanks a lot!

-------------

Commit messages:
 - 8344116: C2: remove slice parameter from LoadNode::make

Changes: https://git.openjdk.org/jdk/pull/24258/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8344116
Stats: 54 lines in 13 files changed: 3 ins; 14 del; 37 mod
Patch: https://git.openjdk.org/jdk/pull/24258.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258

PR: https://git.openjdk.org/jdk/pull/24258

From xpeng at openjdk.org Wed Mar 26 15:41:25 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 26 Mar 2025 15:41:25 GMT
Subject: Integrated: 8352588: GenShen: Enabling JFR asserts when getting GCId
In-Reply-To:
References:
Message-ID:

On Fri, 21 Mar 2025 19:09:46 GMT, Xiaolong Peng wrote:

> ### Root cause
> Shenandoah has its own way to generate gc id ([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L234), [link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahController.hpp#L43)), but when it runs a specific GC cycle, it still uses the default
GCIdMark([link](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp#L389)) to generate a gc id and set it to NamedThread::_gc_id. Once the specific GC cycle finishes, the NamedThread::_gc_id is restored to the original value which is `undefined`, which causes the asserts when Enabling JFR, in release build it should cause invalid GC id in some of JFR events. > > ### Solution > it is confusing that Shenandoah generates its own gc id but not use it for GC logging and JFR, the solution is fairly simple, the control thread just need inject gc id with GCIdMark(gc_id) it generates in `ShenandoahControlThread::run_service` and `ShenandoahGenerationalControlThread::run_gc_cycle` > > In the test, I also noticed the value of gc_id generated by Shenandoah control thread starts from 1, which is different from the default behavior of GCIdMark which generates id starting from 0, this PR will also fix it. > > ### Test > - [x] TEST=gc/shenandoah/TestWithLogLevel.java TEST_VM_OPTS="-XX:StartFlightRecording" > - [x] TEST=hotspot_gc_shenandoah > - [x] GHA This pull request has now been integrated. 
Changeset: a2a64dac Author: Xiaolong Peng Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/a2a64dac1680e97dd9eb511ead951bf1be8121c6 Stats: 11 lines in 2 files changed: 4 ins; 7 del; 0 mod 8352588: GenShen: Enabling JFR asserts when getting GCId Reviewed-by: wkemper, ysr ------------- PR: https://git.openjdk.org/jdk/pull/24166 From duke at openjdk.org Wed Mar 26 15:43:30 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 26 Mar 2025 15:43:30 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v2] In-Reply-To: References: Message-ID: <6NXNfV1dqzZxpogva4dsv0kxkAQtJlgmLnSHvgZm5YA=.461d9a09-1e23-4acd-8230-0840348183ef@github.com> > This patch remove slice parameter from LoadNode::make > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'openjdk:master' into 8344116 - 8344116: C2: remove slice parameter from LoadNode::make ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/27df4a01..f4ef46dc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=00-01 Stats: 34071 lines in 1200 files changed: 1990 ins; 30272 del; 1809 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From wkemper at openjdk.org Wed Mar 26 17:36:21 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 26 Mar 2025 17:36:21 GMT Subject: RFR: 8348400: GenShen: assert(ShenandoahHeap::heap()->is_full_gc_in_progress() || (used_regions_size() <= _max_capacity)) failed: Cannot use more than capacity # [v3] In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 21:16:06 GMT, William Kemper wrote: >> Shenandoah cannot recycle immediate trash regions during the concurrent weak roots phase, however some of these regions may be assigned to the old generation collector's reserve. When an evacuation/promotion tries to allocate in such a region, it will fail (as expected) and try to 'steal' a region from the mutator's partition of the free set. There are cases when this cannot be allowed due to capacity constraints. However, in some of these cases it will be possible to 'swap' a region between the old reserve and the mutator's partition. This change covers this case. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Update evac reserve when swapping trash region for non-trash region > - Use capacity of transferred region > - Improve comments > - Merge remote-tracking branch 'jdk/master' into fix-flip-to-old-reserve > - Revert "Do not enforce size constraints on generations" > > This reverts commit 11ff0677449fa6749df8830f4a03f1c7861ba314. > - Do not enforce size constraints on generations > > This will make it easier for the old generation collector to take regions from the mutator when necessary > - Don't allocate in regions that cannot be flipped to old gc > - Do not allocate from mutator if young gen cannot spare the region No assertions after running `TestPauseNotifications` 40,000 times, no failures in performance/stress test pipelines. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23998#issuecomment-2755193068 From shade at openjdk.org Wed Mar 26 18:38:33 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Mar 2025 18:38:33 GMT Subject: RFR: 8351157: Clean up x86 GC barriers after 32-bit x86 removal Message-ID: Assembler GC barriers have quite a bit of coding to support 32-bit x86. As 32-bit x86 is removed, we can clean up those parts. We can eliminate `!LP64` blocks quite easily. We can also prune passing around `thread` argument, and just trust that `r15_thread` is always available. 
Additional testing: - [x] Linux x86_64 server fastdebug, `tier1` - [ ] Linux x86_64 server fastdebug, `all` ------------- Commit messages: - Also do tlab_allocate - Rely on R15 to be a thread register - Work Changes: https://git.openjdk.org/jdk/pull/24253/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24253&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351157 Stats: 546 lines in 20 files changed: 1 ins; 429 del; 116 mod Patch: https://git.openjdk.org/jdk/pull/24253.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24253/head:pull/24253 PR: https://git.openjdk.org/jdk/pull/24253 From wkemper at openjdk.org Wed Mar 26 19:09:04 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 26 Mar 2025 19:09:04 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v2] In-Reply-To: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com> References: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com> Message-ID: > Not clean, has two follow up fixes in this PR. 
William Kemper has updated the pull request incrementally with three additional commits since the last revision: - Allow UPDATE_REFS in gc state when verifying before update refs - Update gc state resetter for verifier - Remove code block missed when IU mode was removed ------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk21u/pull/161/files - new: https://git.openjdk.org/shenandoah-jdk21u/pull/161/files/fcb58bb7..219c4ef2 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=161&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=161&range=00-01 Stats: 25 lines in 5 files changed: 14 ins; 8 del; 3 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/161.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/161/head:pull/161 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/161 From wkemper at openjdk.org Wed Mar 26 19:21:17 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 26 Mar 2025 19:21:17 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended Message-ID: When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) 
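For readers following the description above, here is a minimal sketch of the pattern at issue. It assumes a hypothetical `Heap` that keeps a heap-wide `gc_state`, a `gc_state_changed` flag, and a per-thread cached copy that the barriers consult; none of these types are the actual HotSpot classes, and the real change may differ. The point it illustrates is that clearing the heap-wide field alone is not enough: the changed flag also has to be raised so the cleared value reaches the threads.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not HotSpot classes.
struct MutatorThread {
  uint8_t cached_gc_state = 0;  // per-thread copy that the barriers consult
  bool barriers_active() const { return cached_gc_state != 0; }
};

struct Heap {
  uint8_t gc_state = 0;           // heap-wide state
  bool gc_state_changed = false;  // true while threads may hold a stale copy
  std::vector<MutatorThread> threads;

  // Copy the heap-wide state into every thread, but only if it has changed.
  void propagate_gc_state_to_threads() {
    if (!gc_state_changed) return;
    for (MutatorThread& t : threads) t.cached_gc_state = gc_state;
    gc_state_changed = false;
  }
};

// RAII helper: clears the state for the lifetime of the object, makes sure the
// threads actually observe the cleared value, then restores both saved fields.
class GCStateResetter {
  Heap& _heap;
  const uint8_t _saved_gc_state;
  const bool _saved_gc_state_changed;

  void publish(uint8_t state) {
    _heap.gc_state = state;
    _heap.gc_state_changed = true;  // without this, threads keep the old copy
    _heap.propagate_gc_state_to_threads();
  }

public:
  explicit GCStateResetter(Heap& heap)
      : _heap(heap),
        _saved_gc_state(heap.gc_state),
        _saved_gc_state_changed(heap.gc_state_changed) {
    publish(0);
  }

  ~GCStateResetter() {
    publish(_saved_gc_state);
    _heap.gc_state_changed = _saved_gc_state_changed;
  }

  GCStateResetter(const GCStateResetter&) = delete;
  GCStateResetter& operator=(const GCStateResetter&) = delete;
};
```

If `publish` skipped the line that sets `gc_state_changed`, the propagate call would return early and the threads would keep their barriers on for the whole verification, which is the symptom described above.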
------------- Commit messages: - Fix verifier's gc_state resettter Changes: https://git.openjdk.org/jdk/pull/24264/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24264&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352918 Stats: 8 lines in 2 files changed: 7 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24264/head:pull/24264 PR: https://git.openjdk.org/jdk/pull/24264 From kbarrett at openjdk.org Wed Mar 26 19:22:15 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 26 Mar 2025 19:22:15 GMT Subject: RFR: 8351157: Clean up x86 GC barriers after 32-bit x86 removal In-Reply-To: References: Message-ID: <1rIX0wehaIIaJsnvIoAGshNeioyVi-E6JiPW3lleQ00=.22f8c01f-bcde-4ad8-8ca1-518b727796af@github.com> On Wed, 26 Mar 2025 12:48:13 GMT, Aleksey Shipilev wrote: > Assembler GC barriers have quite a bit of coding to support 32-bit x86. As 32-bit x86 is removed, we can clean up those parts. > > We can eliminate `!LP64` blocks quite easily. We can also prune passing around `thread` argument, and just trust that `r15_thread` is always available. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24253#pullrequestreview-2718385753 From kdnilsen at openjdk.org Wed Mar 26 20:04:13 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 26 Mar 2025 20:04:13 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 19:17:33 GMT, William Kemper wrote: > When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. 
However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) Marked as reviewed by kdnilsen (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24264#pullrequestreview-2718492011 From shade at openjdk.org Wed Mar 26 20:09:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Mar 2025 20:09:14 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 19:17:33 GMT, William Kemper wrote: > When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) Ouch. Took me a while to understand which field is updated, since fields are named the same in `ShenandoahHeap` and here. I suggest renaming fields in `ShenandoahGCStateResetter` to `_saved_gc_state` and `_saved_gc_state_changed`. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24264#pullrequestreview-2718502261 From xpeng at openjdk.org Wed Mar 26 20:37:59 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 26 Mar 2025 20:37:59 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v14] In-Reply-To: References: Message-ID: > There are some scenarios in which GenShen may have improper remembered set verification logic: > > 1. Concurrent young cycles following a Full GC: > > At the end of ShenandoahFullGC, it resets the bitmaps for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the one below to get a complete old marking context for remembered set verification > > > ShenandoahVerifier > ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { > shenandoah_assert_generations_reconciled(); > if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { > return _heap->complete_marking_context(); > } > return nullptr; > } > > > For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results. > > 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` mentioned above always returns a marking context for a global GC, but the marking bitmaps have already been reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case. > > 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the gc generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but it is masked by issue 2. > > 4.
After a concurrent young cycle evacuates objects from a young region, it updates refs using the marking bitmaps from the marking context, so it won't update references held by dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below TAMS). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers. > > ### Solution > * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps. > * Because there could be dead pointers in old gen that were not updated to point to the new address after evacuation and refs update, we should disable rem-set validation before init-mark and update-refs if the old marking context is incomplete. > > ### Test > - [x] `make test TEST=hotspot_gc_shenandoah` > - [x] GHA Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Add comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24092/files - new: https://git.openjdk.org/jdk/pull/24092/files/16494d48..e11c6fc3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24092&range=12-13 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24092/head:pull/24092 PR: https://git.openjdk.org/jdk/pull/24092 From xpeng at openjdk.org Wed Mar 26 20:37:59 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 26 Mar 2025 20:37:59 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v14] In-Reply-To: References: <_GG5htdXFZ2Jv3qTAyG6djSrvXDtGx-jTLGoA2JbEXU=.b8588ac1-e51f-4ddf-afda-c64e6a789440@github.com> <7pNU1UWNVucen0QwESfFkOiKIP59gBVZNF5gCHveOQ0=.144c0152-20a4-41f4-8929-b906a284be7f@github.com> <2jgKqoBrD8WfxKs9cLqfzWa5AMS__muV4O0IxTiWFbA=.b41179e5-5252-476f-8bc9-c42e2a6a507b@github.com> Message-ID:
<0lv1sy_XF3zy1Xr2JTfNOZFk5lQDf7uifcfY8HF2mYw=.2590df3d-49a2-4fca-afa0-859ebc1cf44e@github.com> On Wed, 26 Mar 2025 00:30:55 GMT, Y. Srinivas Ramakrishna wrote: >> very cool! > > May be as intended earlier, leave a documentation comment between lines 697 & 698 along the lines of Kelvin's comment: > > > // After we swap card table below, the write-table is all clean, and the read table holds > // cards dirty prior to the start of GC. Young and bootstrap collection will update > // the write card table as a side effect of remembered set scanning. Global collection will > // update the card table as a side effect of global marking of old objects. Sorry, I added comments as you suggested but forgot to push. The comment has now been added. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2014959494 From wkemper at openjdk.org Wed Mar 26 20:46:32 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 26 Mar 2025 20:46:32 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended [v2] In-Reply-To: References: Message-ID: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> > When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.)
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Rename saved fields so they are easier to distinguish from the heap's fields ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24264/files - new: https://git.openjdk.org/jdk/pull/24264/files/64403639..2c86db47 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24264&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24264&range=00-01 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24264/head:pull/24264 PR: https://git.openjdk.org/jdk/pull/24264 From xpeng at openjdk.org Wed Mar 26 20:49:17 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 26 Mar 2025 20:49:17 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v13] In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 00:44:18 GMT, Y. Srinivas Ramakrishna wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> tide up > > src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1060: > >> 1058: VerifyRememberedSet verify_remembered_set = _verify_remembered_before_marking; >> 1059: if (_heap->mode()->is_generational() && >> 1060: !_heap->old_generation()->is_mark_complete()) { > > Why not the following stronger condition to skip verification? My sense is that the only case we cannot verify is if we do not have marking info _and_ old gen has been left "unparsable" (because of an incomplete/interrupted C&F which may have us look at dead objects -- that are either unparsable because of class unloading, or are parsable but hold cross-gen pointers). In all other cases, we can do a safe and complete verification. 
> > > is_generational() && !old_gen->is_mark_complete() && !old_gen->is_parsable() We may not need to worry about it: old_gen becomes unparsable in the class unloading phase of a global concurrent GC, at which point marking is already done for the global generation, including old gen, so there should always be a complete marking for old when old gen is not parsable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2014974376 From kdnilsen at openjdk.org Wed Mar 26 21:19:14 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 26 Mar 2025 21:19:14 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended [v2] In-Reply-To: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> References: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> Message-ID: On Wed, 26 Mar 2025 20:46:32 GMT, William Kemper wrote: >> When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Rename saved fields so they are easier to distinguish from the heap's fields Marked as reviewed by kdnilsen (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24264#pullrequestreview-2718638195 From ysr at openjdk.org Wed Mar 26 22:07:21 2025 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Wed, 26 Mar 2025 22:07:21 GMT Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v14] In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 20:37:59 GMT, Xiaolong Peng wrote: >> There are some scenarios in which GenShen may have improper remembered set verification logic: >> >> 1. Concurrent young cycles following a Full GC: >> >> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification >> >> >> ShenandoahVerifier >> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() { >> shenandoah_assert_generations_reconciled(); >> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) { >> return _heap->complete_marking_context(); >> } >> return nullptr; >> } >> >> >> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result. >> >> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case. >> >> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2. >> >> 4. After concurrent young cycle evacuates objects from a young region, it update refs using marking bitmaps from marking context, therefore it won't update references of dead old objects(is_marked(obj) is false: obj is not marking strong/weak and it is below tams). 
In this case, if the next cycle if global concurrent GC, remembered set can't be verified before init-mark because of the dead pointers. >> >> ### Solution >> * After a full GC, always set marking completeness flag to false after reseting the marking bitmaps. >> * Because there could be dead pointers in old gen were not updated to point to new address after evacuation and refs update, we should disable rem-set validation before init-mark&update-refs if old marking context is incomplete. >> >> ### Test >> - [x] `make test TEST=hotspot_gc_shenandoah` >> - [x] GHA > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add comments ? ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24092#pullrequestreview-2718713521 From shade at openjdk.org Wed Mar 26 22:17:13 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Mar 2025 22:17:13 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended [v2] In-Reply-To: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> References: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> Message-ID: On Wed, 26 Mar 2025 20:46:32 GMT, William Kemper wrote: >> When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) 
> > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Rename saved fields so they are easier to distinguish from the heap's fields Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24264#pullrequestreview-2718728122 From wkemper at openjdk.org Wed Mar 26 23:55:44 2025 From: wkemper at openjdk.org (William Kemper) Date: Wed, 26 Mar 2025 23:55:44 GMT Subject: RFR: 8351892: GenShen: Remove enforcement of generation sizes Message-ID: * The option to configure minimum and maximum sizes for the young generation have been combined into `ShenandoahInitYoungPercentage`. * The remaining functionality in `shGenerationSizer` wasn't enough to warrant being its own class, so the functionality was rolled into `shGenerationalHeap`. ------------- Commit messages: - Stop enforcing young/old generation sizes. Changes: https://git.openjdk.org/jdk/pull/24268/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24268&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351892 Stats: 395 lines in 11 files changed: 57 ins; 315 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/24268.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24268/head:pull/24268 PR: https://git.openjdk.org/jdk/pull/24268 From ysr at openjdk.org Thu Mar 27 01:01:20 2025 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 27 Mar 2025 01:01:20 GMT Subject: RFR: 8352918: Shenandoah: Verifier does not deactivate barriers as intended [v2] In-Reply-To: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> References: <8Df_0rWlNraMfHLDM9nkevd6bgJE6yTN65MuMXiC458=.4a00283d-b6a4-408f-b143-8c9057b3fd21@github.com> Message-ID: On Wed, 26 Mar 2025 20:46:32 GMT, William Kemper wrote: >> When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. 
However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Rename saved fields so they are easier to distinguish from the heap's fields ? ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24264#pullrequestreview-2718945983 From shade at openjdk.org Thu Mar 27 12:31:21 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Mar 2025 12:31:21 GMT Subject: RFR: 8351157: Clean up x86 GC barriers after 32-bit x86 removal [v2] In-Reply-To: References: Message-ID: <6aXRsWRRGrrJdkmNcZHPw8JBD5piGr6UrmjOdnHjlMY=.3dde2c28-bdfc-4eb1-8d1d-7a4c85d3234f@github.com> > Assembler GC barriers have quite a bit of coding to support 32-bit x86. As 32-bit x86 is removed, we can clean up those parts. > > We can eliminate `!LP64` blocks quite easily. We can also prune passing around `thread` argument, and just trust that `r15_thread` is always available. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains four commits: - Merge branch 'master' into JDK-8351157-x86-gc-barriers - Also do tlab_allocate - Rely on R15 to be a thread register - Work ------------- Changes: https://git.openjdk.org/jdk/pull/24253/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24253&range=01 Stats: 543 lines in 20 files changed: 1 ins; 426 del; 116 mod Patch: https://git.openjdk.org/jdk/pull/24253.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24253/head:pull/24253 PR: https://git.openjdk.org/jdk/pull/24253 From duke at openjdk.org Thu Mar 27 12:40:29 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 27 Mar 2025 12:40:29 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v3] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - 8344116: C2: remove slice parameter from LoadNode::make ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/f4ef46dc..08c1a382 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=01-02 Stats: 3892 lines in 94 files changed: 1545 ins; 2033 del; 314 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From wkemper at openjdk.org Thu Mar 27 14:27:41 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 27 Mar 2025 14:27:41 GMT Subject: RFR: Merge openjdk/jdk21u:master Message-ID: Merges tag jdk-21.0.7+5 ------------- Commit messages: - 8352097: (tz) zone.tab update missed in 2025a backport The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/162/files Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/162.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/162/head:pull/162 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/162 From wkemper at openjdk.org Thu Mar 27 16:37:20 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 27 Mar 2025 16:37:20 GMT Subject: Integrated: 8352918: Shenandoah: Verifier does not deactivate barriers as intended In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 19:17:33 GMT, William Kemper wrote: > When verifying reachable objects, Shenandoah's verifier clears the `_gc_state` with the intention of deactivating barriers. 
However, the mechanism for this is a `friend` of the heap and does not toggle the flag to cause threads to use the value set on the verifier's safepoint. The net effect here is that the barriers are _not_ deactivated during verification. Leaving the barriers on while the verifier traverses the heap may have unintended consequences (cards marked, objects evacuated, etc.) This pull request has now been integrated. Changeset: 1bd0ce1f Author: William Kemper URL: https://git.openjdk.org/jdk/commit/1bd0ce1f51760d2e57e94b19b83d3ee0fa4aebcd Stats: 11 lines in 2 files changed: 7 ins; 0 del; 4 mod 8352918: Shenandoah: Verifier does not deactivate barriers as intended Reviewed-by: kdnilsen, shade, ysr ------------- PR: https://git.openjdk.org/jdk/pull/24264 From wkemper at openjdk.org Thu Mar 27 16:37:32 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 27 Mar 2025 16:37:32 GMT Subject: RFR: Merge openjdk/jdk21u:master [v2] In-Reply-To: References: Message-ID: > Merges tag jdk-21.0.7+5 William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk21u/pull/162/files - new: https://git.openjdk.org/shenandoah-jdk21u/pull/162/files/4d3a3c0e..4d3a3c0e Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=162&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=162&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/162.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/162/head:pull/162 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/162 From wkemper at openjdk.org Thu Mar 27 16:37:33 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 27 Mar 2025 16:37:33 GMT Subject: Integrated: Merge openjdk/jdk21u:master In-Reply-To: References: Message-ID: On Thu, 27 Mar 2025 14:21:24 GMT, William Kemper wrote: > Merges tag jdk-21.0.7+5 This pull request has now been integrated. Changeset: 14ba6af6 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/14ba6af62b22730afd3a5fcae53df5956cacb890 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Merge ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/162 From kdnilsen at openjdk.org Thu Mar 27 19:35:49 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 27 Mar 2025 19:35:49 GMT Subject: RFR: 8344049: Shenandoah: Eliminate init-update-refs safepoint [v2] In-Reply-To: References: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com> Message-ID: On Wed, 26 Mar 2025 19:09:04 GMT, William Kemper wrote: >> Not clean, has two follow up fixes in this PR. > > William Kemper has updated the pull request incrementally with three additional commits since the last revision: > > - Allow UPDATE_REFS in gc state when verifying before update refs > - Update gc state resetter for verifier > - Remove code block missed when IU mode was removed Marked as reviewed by kdnilsen (Committer). 
-------------

PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/161#pullrequestreview-2723198393

From gziemski at openjdk.org Thu Mar 27 19:43:13 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Thu, 27 Mar 2025 19:43:13 GMT
Subject: RFR: 8344883: Do not use mtNone if we know the tag type
Message-ID:

This is a follow-up to #21843. Here we are focusing on removing the mem tag parameter with default value of mtNone, to force everyone to provide mem tag, if known.

I tried to fill in tag, when I was pretty certain that I had the right type.

At least one more follow-up will be needed after this, to change the remaining mtNone to valid values.

-------------

Commit messages:
 - work

Changes: https://git.openjdk.org/jdk/pull/24282/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24282&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8344883
Stats: 145 lines in 47 files changed: 19 ins; 0 del; 126 mod
Patch: https://git.openjdk.org/jdk/pull/24282.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24282/head:pull/24282

PR: https://git.openjdk.org/jdk/pull/24282

From gziemski at openjdk.org Thu Mar 27 19:47:56 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Thu, 27 Mar 2025 19:47:56 GMT
Subject: RFR: 8344883: Do not use mtNone if we know the tag type [v2]
In-Reply-To:
References:
Message-ID:

> This is a follow-up to #21843. Here we are focusing on removing the mem tag parameter with default value of mtNone, to force everyone to provide mem tag, if known.
>
> I tried to fill in tag, when I was pretty certain that I had the right type.
>
> At least one more follow-up will be needed after this, to change the remaining mtNone to valid values.
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:

  work

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/24282/files
 - new: https://git.openjdk.org/jdk/pull/24282/files/a749ee60..582b1860

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24282&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24282&range=00-01

Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/24282.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24282/head:pull/24282

PR: https://git.openjdk.org/jdk/pull/24282

From andrew at openjdk.org Thu Mar 27 20:03:44 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Thu, 27 Mar 2025 20:03:44 GMT
Subject: git: openjdk/shenandoah-jdk8u: master: 4 new changesets
Message-ID:

Changeset: 1067c545
Branch: master
Author: Xin Liu
Committer: Paul Hohensee
Date: 2022-05-09 16:38:10 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/1067c545658ff0643ad625073f4fddc2f7a0daf0

8235211: serviceability/attach/RemovingUnixDomainSocketTest.java fails with AttachNotSupportedException: Unable to open socket file
8244973: serviceability/attach/RemovingUnixDomainSocketTest.java fails "stderr was not empty"

Reviewed-by: phh, andrew
Backport-of: 073e095e6053550b17b1daf33df2be4f4c4b40ad

! hotspot/src/os/aix/vm/attachListener_aix.cpp
! hotspot/src/os/bsd/vm/attachListener_bsd.cpp
! hotspot/src/os/linux/vm/attachListener_linux.cpp
! hotspot/test/serviceability/attach/RemovingUnixDomainSocketTest.java
! jdk/test/lib/jdk/test/lib/apps/LingeredApp.java

Changeset: 1b3e8ea8
Branch: master
Author: Dongbo He
Committer: Fei Yang
Date: 2022-05-10 01:13:38 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/1b3e8ea8ec14b462e86603d3ec6f35173d066541

8170530: bash configure output contains a typo in a suggested library name

Reviewed-by: sgehwolf
Backport-of: 5c5ffe13e3dced0962ce2b137c1c3b30f2f3a924

! common/autoconf/generated-configure.sh
! common/autoconf/help.m4

Changeset: 51f69d91
Branch: master
Author: Alexey Pavlyutkin
Committer: Yuri Nesterenko
Date: 2022-05-12 05:27:31 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/51f69d9125e72adaf05fbd04a5ac17a9d2f6c6a0

8221988: add possibility to build with Visual Studio 2019

Reviewed-by: sgehwolf, andrew
Backport-of: f2240cc5b1f6e82c3d36551f36cd22916c1772bd

! common/autoconf/toolchain_windows.m4
+ make/devkit/createWindowsDevkit2019.sh

Changeset: 4471d67a
Branch: master
Author: Andrew John Hughes
Date: 2025-03-20 18:10:02 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/4471d67a5047bf662e7b5928491d571337e3b300

Merge jdk8u342-b02

From andrew at openjdk.org Thu Mar 27 20:03:58 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Thu, 27 Mar 2025 20:03:58 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u342-b00 for changeset bdc2203a
Message-ID: <4dff569f-f136-4064-852e-c1c3b813fcae@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2025-03-22 00:19:22 +0000

Added tag shenandoah8u342-b00 for changeset bdc2203a44d

Changeset: bdc2203a
Author: Andrew John Hughes
Date: 2023-02-20 21:45:34 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/bdc2203a44df159d94ecd0e04a230e65cb84297e

From andrew at openjdk.org Thu Mar 27 20:04:01 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Thu, 27 Mar 2025 20:04:01 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u342-b02 for changeset 51f69d91
Message-ID: <64194cfd-365a-46b2-a1e0-c7833f04fe38@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2022-05-16 14:00:19 +0000

Added tag jdk8u342-b02 for changeset 51f69d9125

Changeset: 51f69d91
Author: Alexey Pavlyutkin
Committer: Yuri Nesterenko
Date: 2022-05-12 05:27:31 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/51f69d9125e72adaf05fbd04a5ac17a9d2f6c6a0

From andrew at openjdk.org Thu Mar 27 20:04:12 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Thu, 27 Mar 2025 20:04:12 GMT
Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u342-b02 for changeset 4471d67a
Message-ID: <28c77177-0a1e-4094-b255-2d91eb917b8f@openjdk.org>

Tagged by: Andrew John Hughes
Date: 2025-03-20 18:49:32 +0000

Added tag shenandoah8u342-b02 for changeset 4471d67a504

Changeset: 4471d67a
Author: Andrew John Hughes
Date: 2025-03-20 18:10:02 +0000
URL: https://git.openjdk.org/shenandoah-jdk8u/commit/4471d67a5047bf662e7b5928491d571337e3b300

From andrew at openjdk.org Thu Mar 27 20:05:56 2025
From: andrew at openjdk.org (Andrew John Hughes)
Date: Thu, 27 Mar 2025 20:05:56 GMT
Subject: RFR: Merge jdk8u:master [v2]
In-Reply-To:
References:
Message-ID:

> Merge jdk8u342-b02

Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.

-------------

Changes:
 - all: https://git.openjdk.org/shenandoah-jdk8u/pull/16/files
 - new: https://git.openjdk.org/shenandoah-jdk8u/pull/16/files/4471d67a..4471d67a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=16&range=01
 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=16&range=00-01

Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/16.diff
Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/16/head:pull/16

PR: https://git.openjdk.org/shenandoah-jdk8u/pull/16

From iris at openjdk.org Thu Mar 27 20:05:56 2025
From: iris at openjdk.org (Iris Clark)
Date: Thu, 27 Mar 2025 20:05:56 GMT
Subject: Withdrawn: Merge jdk8u:master
In-Reply-To:
References:
Message-ID:

On Sat, 22 Mar 2025 00:20:37 GMT, Andrew John Hughes wrote:

> Merge jdk8u342-b02

This pull request has been closed without being integrated.
-------------

PR: https://git.openjdk.org/shenandoah-jdk8u/pull/16

From wkemper at openjdk.org Thu Mar 27 22:09:05 2025
From: wkemper at openjdk.org (William Kemper)
Date: Thu, 27 Mar 2025 22:09:05 GMT
Subject: RFR: 8351892: GenShen: Remove enforcement of generation sizes [v2]
In-Reply-To:
References:
Message-ID: <-BEi4FpPLjKx07-J7ix9fHkKVhkcYylA0ojI-a1zrJs=.a3c073d3-7e52-46fd-8e2a-1ea601bd2074@github.com>

> * The options to configure minimum and maximum sizes for the young generation have been combined into `ShenandoahInitYoungPercentage`.
> * The remaining functionality in `shGenerationSizer` wasn't enough to warrant being its own class, so the functionality was rolled into `shGenerationalHeap`.

William Kemper has updated the pull request incrementally with one additional commit since the last revision:

  Don't let old have the entire heap

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/24268/files
 - new: https://git.openjdk.org/jdk/pull/24268/files/e32ed37c..bc171089

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24268&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24268&range=00-01

Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/24268.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24268/head:pull/24268

PR: https://git.openjdk.org/jdk/pull/24268

From xpeng at openjdk.org Fri Mar 28 00:54:25 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 28 Mar 2025 00:54:25 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v14]
In-Reply-To:
References:
Message-ID:

On Wed, 26 Mar 2025 20:37:59 GMT, Xiaolong Peng wrote:

>> There are some scenarios in which GenShen may have improper remembered set verification logic:
>>
>> 1. Concurrent young cycles following a Full GC:
>>
>> At the end of ShenandoahFullGC, it resets the bitmaps for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>>
>> ShenandoahVerifier
>> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>>   shenandoah_assert_generations_reconciled();
>>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>>     return _heap->complete_marking_context();
>>   }
>>   return nullptr;
>> }
>>
>> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>>
>> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` mentioned above always returns a marking context for global GC, but the marking bitmaps have already been reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>>
>> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the gc generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2.
>>
>> 4. After a concurrent young cycle evacuates objects from a young region, it updates refs using the marking bitmaps from the marking context, so it won't update references of dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below TAMS). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers.
>>
>> ### Solution
>> * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps.
>> * Because there could be dead pointers in old gen that were not updated to point to the new address after evacuation and refs update, we should disable rem-set validation before init-mark and update-refs if the old marking context is incomplete.
>>
>> ### Test
>> - [x] `make test TEST=hotspot_gc_shenandoah`
>> - [x] GHA
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add comments

I have reproduced the bug https://bugs.openjdk.org/browse/JDK-8345399 on ppc64le hardware with tip. The crash happens in a young cycle after a full GC, which is one of the problems I'm trying to fix in this PR:

[13.990s][info][gc,start ] GC(101) Pause Full
[13.990s][info][gc,task ] GC(101) Using 4 of 4 workers for full gc
[13.990s][info][gc,start ] GC(101) Verify Before Full GC, Level 4
[13.998s][info][gc ] GC(101) Verify Before Full GC, Level 4 (22772 reachable, 0 marked)
[13.998s][info][gc,phases,start] GC(101) Phase 1: Mark live objects
[14.003s][info][gc,ref ] GC(101) Clearing All SoftReferences
[14.003s][info][gc,ref ] GC(101) Clearing All SoftReferences
[14.009s][info][gc,ref ] GC(101) Encountered references: Soft: 49, Weak: 101, Final: 0, Phantom: 8
[14.009s][info][gc,ref ] GC(101) Discovered references: Soft: 31, Weak: 39, Final: 0, Phantom: 8
[14.009s][info][gc,ref ] GC(101) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
[14.012s][info][gc,phases ] GC(101) Phase 1: Mark live objects 13.674ms
[14.012s][info][gc,phases,start] GC(101) Phase 2: Compute new object addresses
[14.026s][info][gc,phases ] GC(101) Phase 2: Compute new object addresses 14.166ms
[14.026s][info][gc,phases,start] GC(101) Phase 3: Adjust pointers
[14.030s][info][gc,phases ] GC(101) Phase 3: Adjust pointers 3.626ms
[14.030s][info][gc,phases,start] GC(101) Phase 4: Move objects
[14.128s][info][gc,phases ] GC(101) Phase 4: Move objects 98.264ms
[14.128s][info][gc,phases,start] GC(101) Phase 5: Full GC epilog
[14.146s][info][gc,ergo ] GC(101) Transfer 234 region(s) from Old to Young, yielding increased size: 790M
[14.146s][info][gc,ergo ] GC(101) FullGC done: young usage: 450M, old usage: 231M
[14.146s][info][gc,free ] Free: 296M, Max: 512K regular, 296M humongous, Frag: 0% external, 0% internal; Used: 0B, Mutator Free: 592 Collector Reserve: 40959K, Max: 512K; Used: 16B Old Collector Reserve: 1307K, Max: 511K; Used: 740K
[14.146s][info][gc,ergo ] GC(101) After Full GC, successfully transferred 0 regions to none to prepare for next gc, old available: 1307K, young_available: 296M
[14.146s][info][gc,barrier ] GC(101) Cleaned read_table from 0x0000754a50290000 to 0x0000754a5048ffff
[14.146s][info][gc,barrier ] GC(101) Current write_card_table: 0x0000754a4fc90000
[14.148s][info][gc,phases ] GC(101) Phase 5: Full GC epilog 20.265ms
[14.148s][info][gc,start ] GC(101) Verify After Full GC, Level 4
[14.182s][info][gc ] GC(101) Verify After Full GC, Level 4 (22664 reachable, 125 marked)
[14.182s][info][gc,ergo ] GC(101) At end of Full GC: GCU: 6.9%, MU: 9.9% during period of 0.261s
[14.182s][info][gc,ergo ] GC(101) At end of Full GC: Young generation used: 450M, used regions: 454M, humongous waste: 3532K, soft capacity: 1024M, max capacity: 790M, available: 296M
[14.182s][info][gc,ergo ] GC(101) At end of Full GC: Old generation used: 231M, used regions: 234M, humongous waste: 1654K, soft capacity: 0B, max capacity: 234M, available: 1307K
[14.182s][info][gc,ergo ] GC(101) Good progress for free space: 296M, need 10485K
[14.182s][info][gc,ergo ] GC(101) Good progress for used space: 148M, need 512K
[14.182s][info][gc ] GC(101) Pause Full 829M->681M(1024M) 192.311ms
...
[14.196s][info][gc ] Trigger (Young): Free (65536K) is below minimum threshold (80895K)
[14.196s][info][gc,free ] Free: 65536K, Max: 512K regular, 65536K humongous, Frag: 0% external, 0% internal; Used: 0B, Mutator Free: 128 Collector Reserve: 40959K, Max: 512K; Used: 16B Old Collector Reserve: 1307K, Max: 511K; Used: 740K
[14.196s][info][gc,ergo ] GC(102) Start GC cycle (Young)
[14.196s][info][gc,start ] GC(102) Concurrent reset (Young)
[14.196s][info][gc,task ] GC(102) Using 2 of 4 workers for Concurrent reset (Young)
[14.196s][info][gc,ergo ] GC(102) Pacer for Reset. Non-Taxable: 1024M
Allocated: 732 Mb
Allocated: 699 Mb
Allocated: 715 Mb
[14.200s][info][gc,thread ] Cancelling GC: unknown GCCause
[14.200s][info][gc ] Failed to allocate Shared, 61709K
[14.202s][info][gc ] GC(102) Concurrent reset (Young) 6.371ms
[14.203s][info][gc,barrier ] GC(102) Cleaned read_table from 0x0000754a50080000 to 0x0000754a5027ffff
[14.203s][info][gc,start ] GC(102) Pause Init Mark (Young)
[14.203s][info][gc,task ] GC(102) Using 4 of 4 workers for init marking
[14.205s][info][gc,barrier ] GC(102) Current write_card_table: 0x0000754a4fa80000
[14.205s][info][gc,start ] GC(102) Verify Before Mark, Level 4
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/xlpeng/repos/jdk/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp:1270), pid=2167519, tid=2167538
# Error: Verify init-mark remembered set violation; clean card, it should be dirty.

Referenced from:
  interior location: 0x00000000c00c2bfc
  inside Java heap
    not in collection set
  region: | 1|R |O|BTE c0080000, c00c2c78, c0100000|TAMS c0080000|UWM c00c2c78|U 267K|T 0B|G 0B|P 0B|S 267K|L 267K|CP 0

Object:
  0x00000000e8c00000 - klass 0x000001df001abfa0 [I
    not allocated after mark start
    not after update watermark
    not marked strong
    not marked weak
    not in collection set
  age: 0
  mark: mark(is_unlocked no_hash age=0)
  region: | 1304|H |Y|BTE e8c00000, e8c80000, e8c80000|TAMS e8c80000|UWM e8c80000|U 512K|T 0B|G 0B|P 0B|S 512K|L 0B|CP 0

Forwardee:
  (the object itself)

I'll run the same test to confirm whether this PR fixes the bug.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24092#issuecomment-2759904598

From ysr at openjdk.org Fri Mar 28 01:56:53 2025
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Fri, 28 Mar 2025 01:56:53 GMT
Subject: Withdrawn: 8342570: gc.*.gcUntilOld() assumes that full gc always promotes objects to old generation in a generational heap
In-Reply-To:
References:
Message-ID:

On Mon, 21 Oct 2024 15:42:07 GMT, Y. Srinivas Ramakrishna wrote:

> Some gc's such as Generational Shenandoah may run in modes where we choose not to (ever) promote objects to the old generation. The tests do not need the assumption of eventual promotion of an object into the old generation in such cases and can be relaxed. A new WhiteBox API is queried by the tests to check for this.
>
> **Testing**: Ran the tests with all the existing collectors. For existing collectors, there is no change in the tests. For Generational Shenandoah, being the only collector that does not necessarily promote objects to the old generation upon a full gc, we relax that requirement from the tests.

This pull request has been closed without being integrated.
-------------

PR: https://git.openjdk.org/shenandoah/pull/521

From stefank at openjdk.org Fri Mar 28 08:28:23 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 28 Mar 2025 08:28:23 GMT
Subject: RFR: 8344883: Do not use mtNone if we know the tag type [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 27 Mar 2025 19:47:56 GMT, Gerard Ziemski wrote:

>> This is a follow-up to #21843. Here we are focusing on removing the mem tag parameter with default value of mtNone, to force everyone to provide mem tag, if known.
>>
>> I tried to fill in tag, when I was pretty certain that I had the right type.
>>
>> At least one more follow-up will be needed after this, to change the remaining mtNone to valid values.
>
> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>
>   work

I went over the patch and added suggestions for places where I think you're using the wrong tag, or where I think it is obvious that there's a better tag than mtNone. I've also suggested removal of the now redundant 'executable' argument, which I want to see as little of as possible given that it is a wart on the memory reservation APIs (IMHO).

src/hotspot/os/bsd/gc/z/zPhysicalMemoryBacking_bsd.cpp line 81:

> 79:
> 80: // Reserve address space for backing memory
> 81: _base = (uintptr_t)os::reserve_memory(max_capacity, mtJavaHeap, false);

Suggestion:

    _base = (uintptr_t)os::reserve_memory(max_capacity, mtJavaHeap);

src/hotspot/os/windows/os_windows.cpp line 3261:

> 3259: assert(extra_size >= size, "overflow, size is too large to allow alignment");
> 3260:
> 3261:

Suggestion:


src/hotspot/os/windows/os_windows.cpp line 3267:

> 3265: for (int attempt = 0; attempt < max_attempts && aligned_base == nullptr; attempt ++) {
> 3266: char* extra_base = file_desc != -1 ? os::map_memory_to_file(extra_size, file_desc, mem_tag) :
> 3267: os::reserve_memory(extra_size, mem_tag, false);

Suggestion:

    os::reserve_memory(extra_size, mem_tag);

src/hotspot/os/windows/os_windows.cpp line 3284:

> 3282: // Which may fail, hence the loop.
> 3283: aligned_base = file_desc != -1 ? os::attempt_map_memory_to_file_at(aligned_base, size, file_desc, mem_tag) :
> 3284: os::attempt_reserve_memory_at(aligned_base, size, mem_tag, false);

Suggestion:

    os::attempt_reserve_memory_at(aligned_base, size, mem_tag);

src/hotspot/os/windows/perfMemory_windows.cpp line 57:

> 55:
> 56: // allocate an aligned chuck of memory
> 57: char* mapAddress = os::reserve_memory(size, mtNone);

To match with the other platforms:

Suggestion:

    char* mapAddress = os::reserve_memory(size, mtInternal);

src/hotspot/share/classfile/compactHashtable.cpp line 229:

> 227: quit("Unable to open hashtable dump file", filename);
> 228: }
> 229: _base = os::map_memory(_fd, filename, 0, nullptr, _size, mtSymbol, true, false);

This seems to be used to read Symbols *OR* String. This probably needs to be something else. I suggest to revert to mtNone and figure out the appropriate tag later.
Suggestion:

    _base = os::map_memory(_fd, filename, 0, nullptr, _size, mtNone, true, false);

src/hotspot/share/gc/parallel/parMarkBitMap.cpp line 52:

> 50: rs_align,
> 51: page_sz,
> 52: mtNone);

Suggestion:

    mtGC);

src/hotspot/share/gc/shenandoah/shenandoahCardTable.cpp line 62:

> 60: _write_byte_map_base = _byte_map_base;
> 61:
> 62: ReservedSpace read_space = MemoryReserver::reserve(_byte_map_size, rs_align, _page_size, mtNone);

Suggestion:

    ReservedSpace read_space = MemoryReserver::reserve(_byte_map_size, rs_align, _page_size, mtGC);

src/hotspot/share/memory/allocation.inline.hpp line 61:

> 59: size_t size = size_for(length);
> 60:
> 61: char* addr = os::reserve_memory(size, mem_tag, !ExecMem);

Suggestion:

    char* addr = os::reserve_memory(size, mem_tag);

src/hotspot/share/memory/allocation.inline.hpp line 78:

> 76: size_t size = size_for(length);
> 77:
> 78: char* addr = os::reserve_memory(size, mem_tag, !ExecMem);

Suggestion:

    char* addr = os::reserve_memory(size, mem_tag);

src/hotspot/share/memory/metaspace/testHelpers.cpp line 85:

> 83: if (reserve_limit > 0) {
> 84: // have reserve limit -> non-expandable context
> 85: _rs = MemoryReserver::reserve(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtNone);

Suggestion:

    _rs = MemoryReserver::reserve(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtTest);

src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 259:

> 257: ReservedSpace rs = MemoryReserver::reserve(word_size * BytesPerWord,
> 258: Settings::virtual_space_node_reserve_alignment_words() * BytesPerWord,
> 259: os::vm_page_size(), mtNone);

Suggestion:

    os::vm_page_size(), mtMetaspace);

src/hotspot/share/prims/jni.cpp line 2403:

> 2401: if (bad_address == nullptr) {
> 2402: size_t size = os::vm_allocation_granularity();
> 2403: bad_address = os::reserve_memory(size, mtInternal, false);

Suggestion:

    bad_address = os::reserve_memory(size, mtInternal);

src/hotspot/share/prims/whitebox.cpp line 720:

> 718:
> 719: WB_ENTRY(jlong, WB_NMTReserveMemory(JNIEnv* env, jobject o, jlong size))
> 720: return (jlong)(uintptr_t)os::reserve_memory(size, mtTest, false);

Suggestion:

    return (jlong)(uintptr_t)os::reserve_memory(size, mtTest);

src/hotspot/share/prims/whitebox.cpp line 724:

> 722:
> 723: WB_ENTRY(jlong, WB_NMTAttemptReserveMemoryAt(JNIEnv* env, jobject o, jlong addr, jlong size))
> 724: return (jlong)(uintptr_t)os::attempt_reserve_memory_at((char*)(uintptr_t)addr, (size_t)size, mtTest, false);

Suggestion:

    return (jlong)(uintptr_t)os::attempt_reserve_memory_at((char*)(uintptr_t)addr, (size_t)size, mtTest);

src/hotspot/share/prims/whitebox.cpp line 1515:

> 1513: static volatile char* p;
> 1514:
> 1515: p = os::reserve_memory(os::vm_allocation_granularity(), mtSymbol);

Suggestion:

    p = os::reserve_memory(os::vm_allocation_granularity(), mtTest);

src/hotspot/share/runtime/os.cpp line 2130:

> 2128: log_trace(os, map)(ERRFMT, ERRFMTARGS);
> 2129: log_debug(os, map)("successfully attached at " PTR_FORMAT, p2i(result));
> 2130: MemTracker::record_virtual_memory_reserve((address)result, bytes, CALLER_PC, mtNone);

I think attempt_reserve_memory_between should provide the correct mem tag.

src/hotspot/share/runtime/os.cpp line 2336:

> 2334: if (result != nullptr) {
> 2335: // The memory is committed
> 2336: MemTracker::record_virtual_memory_reserve_and_commit((address)result, size, CALLER_PC, mtNone);

reserve_memory_special should take a mem tag, but I guess you intend to do that as a follow-up RFE?
src/hotspot/share/runtime/safepointMechanism.cpp line 60:

> 58: const size_t page_size = os::vm_page_size();
> 59: const size_t allocation_size = 2 * page_size;
> 60: char* polling_page = os::reserve_memory(allocation_size, mtSafepoint, !ExecMem);

Suggestion:

    char* polling_page = os::reserve_memory(allocation_size, mtSafepoint);

src/hotspot/share/utilities/debug.cpp line 715:

> 713: #ifdef CAN_SHOW_REGISTERS_ON_ASSERT
> 714: void initialize_assert_poison() {
> 715: char* page = os::reserve_memory(os::vm_page_size(), mtInternal, !ExecMem);

Suggestion:

    char* page = os::reserve_memory(os::vm_page_size(), mtInternal);

test/hotspot/gtest/gc/g1/test_stressCommitUncommit.cpp line 86:

> 84: os::vm_allocation_granularity(),
> 85: os::vm_page_size(),
> 86: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/gc/g1/test_stressCommitUncommit.cpp line 113:

> 111: os::vm_allocation_granularity(),
> 112: os::vm_page_size(),
> 113: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 62:

> 60: ASSERT_PRED2(is_size_aligned, size, os::vm_allocation_granularity());
> 61:
> 62: ReservedSpace rs = MemoryReserver::reserve(size, mtNone);

Why did you do this change?

Suggestion:

    ReservedSpace rs = MemoryReserver::reserve(size, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 76:

> 74: ASSERT_PRED2(is_size_aligned, size, alignment) << "Incorrect input parameters";
> 75: size_t page_size = UseLargePages ? os::large_page_size() : os::vm_page_size();
> 76: ReservedSpace rs = MemoryReserver::reserve(size, alignment, page_size, mtNone);

Suggestion:

    ReservedSpace rs = MemoryReserver::reserve(size, alignment, page_size, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 104:

> 102: size_t page_size = large ? os::large_page_size() : os::vm_page_size();
> 103:
> 104: ReservedSpace rs = MemoryReserver::reserve(size, alignment, page_size, mtNone);

Suggestion:

    ReservedSpace rs = MemoryReserver::reserve(size, alignment, page_size, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 215:

> 213: case Default:
> 214: case Reserve:
> 215: return MemoryReserver::reserve(reserve_size_aligned, mtNone);

Suggestion:

    return MemoryReserver::reserve(reserve_size_aligned, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 221:

> 219: os::vm_allocation_granularity(),
> 220: os::vm_page_size(),
> 221: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 300:

> 298: size_t large_page_size = os::large_page_size();
> 299:
> 300: ReservedSpace reserved = MemoryReserver::reserve(large_page_size, large_page_size, large_page_size, mtNone);

Suggestion:

    ReservedSpace reserved = MemoryReserver::reserve(large_page_size, large_page_size, large_page_size, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 370:

> 368: alignment,
> 369: page_size,
> 370: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 388:

> 386: ASSERT_TRUE(is_aligned(size, os::vm_allocation_granularity())) << "Must be at least AG aligned";
> 387:
> 388: ReservedSpace rs = MemoryReserver::reserve(size, mtNone);

Suggestion:

    ReservedSpace rs = MemoryReserver::reserve(size, mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 416:

> 414: alignment,
> 415: page_size,
> 416: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 521:

> 519: case Reserve:
> 520: return MemoryReserver::reserve(reserve_size_aligned,
> 521: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 527:

> 525: os::vm_allocation_granularity(),
> 526: os::vm_page_size(),
> 527: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/memory/test_virtualspace.cpp line 585:

> 583: large_page_size,
> 584: large_page_size,
> 585: mtNone);

Suggestion:

    mtTest);

test/hotspot/gtest/nmt/test_nmt_locationprinting.cpp line 116:

> 114:
> 115: static void test_for_mmap(size_t sz, ssize_t offset) {
> 116: char* addr = os::reserve_memory(sz, mtTest, false);

Suggestion:

    char* addr = os::reserve_memory(sz, mtTest);

test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp line 94:

> 92: const size_t page_sz = os::vm_page_size();
> 93: const size_t size = num_pages * page_sz;
> 94: char* base = os::reserve_memory(size, mtThreadStack, !ExecMem);

Suggestion:

    char* base = os::reserve_memory(size, mtThreadStack);

test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp line 162:

> 160: const size_t num_pages = 4;
> 161: const size_t size = num_pages * page_sz;
> 162: char* base = os::reserve_memory(size, mtTest, !ExecMem);

Suggestion:

    char* base = os::reserve_memory(size, mtTest);

test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp line 208:

> 206: const size_t size = num_pages * page_sz;
> 207:
> 208: char* base = os::reserve_memory(size, mtTest, !ExecMem);

Suggestion:

    char* base = os::reserve_memory(size, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 261:

> 259: // two pages, first one protected.
> 260: const size_t ps = os::vm_page_size();
> 261: char* two_pages = os::reserve_memory(ps * 2, mtTest, false);

Suggestion:

    char* two_pages = os::reserve_memory(ps * 2, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 533:

> 531: size_t total_range_len = num_stripes * stripe_len;
> 532: // Reserve a large contiguous area to get the address space...
> 533: p = (address)os::reserve_memory(total_range_len, mtNone);

Suggestion:

    p = (address)os::reserve_memory(total_range_len, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 547:

> 545: const bool executable = stripe % 2 == 0;
> 546: #endif
> 547: q = (address)os::attempt_reserve_memory_at((char*)q, stripe_len, mtNone, executable);

Suggestion:

    q = (address)os::attempt_reserve_memory_at((char*)q, stripe_len, mtTest, executable);

test/hotspot/gtest/runtime/test_os.cpp line 567:

> 565: assert(is_aligned(stripe_len, os::vm_allocation_granularity()), "Sanity");
> 566: size_t total_range_len = num_stripes * stripe_len;
> 567: address p = (address)os::reserve_memory(total_range_len, mtNone);

Suggestion:

    address p = (address)os::reserve_memory(total_range_len, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 634:

> 632:
> 633: // ...re-reserve the middle stripes. This should work unless release silently failed.
> 634: address p2 = (address)os::attempt_reserve_memory_at((char*)p_middle_stripes, middle_stripe_len, mtNone);

Suggestion:

    address p2 = (address)os::attempt_reserve_memory_at((char*)p_middle_stripes, middle_stripe_len, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 657:

> 655: TEST_VM(os, release_bad_ranges) {
> 656: #endif
> 657: char* p = os::reserve_memory(4 * M, mtNone);

Suggestion:

    char* p = os::reserve_memory(4 * M, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 692:

> 690: // // make things even more difficult by trying to reserve at the border of the region
> 691: address border = p + num_stripes * stripe_len;
> 692: address p2 = (address)os::attempt_reserve_memory_at((char*)border, stripe_len, mtNone);

Suggestion:

    address p2 = (address)os::attempt_reserve_memory_at((char*)border, stripe_len, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 733:

> 731: // Reserve a small range and fill it with a marker string, should show up
> 732: // on implementations displaying range snippets
> 733: char* p = os::reserve_memory(1 * M, mtInternal, false);

Suggestion:

    char* p = os::reserve_memory(1 * M, mtInternal);

test/hotspot/gtest/runtime/test_os.cpp line 757:

> 755: // A simple allocation
> 756: {
> 757: address p = (address)os::reserve_memory(total_range_len, mtNone);

Suggestion:

    address p = (address)os::reserve_memory(total_range_len, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 1062:

> 1060:
> 1061: TEST_VM(os, reserve_at_wish_address_shall_not_replace_mappings_smallpages) {
> 1062: char* p1 = os::reserve_memory(M, mtTest, false);

Suggestion:

    char* p1 = os::reserve_memory(M, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 1072:

> 1070: if (UseLargePages && !os::can_commit_large_page_memory()) { // aka special
> 1071: const size_t lpsz = os::large_page_size();
> 1072: char* p1 = os::reserve_memory_aligned(lpsz, lpsz, mtTest, false);

Suggestion:

    char* p1 = os::reserve_memory_aligned(lpsz, lpsz, mtTest);

test/hotspot/gtest/runtime/test_os.cpp line 1098:

> 1096: const size_t size = pages * page_sz;
> 1097:
> 1098: char* base = os::reserve_memory(size, mtTest, false);

Suggestion:

    char* base = os::reserve_memory(size, mtTest);

test/hotspot/gtest/runtime/test_os_linux.cpp line 357:

> 355: const bool useThp = UseTransparentHugePages;
> 356: UseTransparentHugePages = true;
> 357: char* const heap = os::reserve_memory(size, mtInternal, false);

Suggestion:

    char* const heap = os::reserve_memory(size, mtInternal);

-------------

Changes requested by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24282#pullrequestreview-2724588390
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018111000
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018112894
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018113093
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018113247
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018114051
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018121233
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018121891
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018122312
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018122564
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018122715
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018123591
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018124074
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018125155
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018125568
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018125876
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018126443
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018128479
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018129358
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018098166
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018098617
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018098956
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018099157
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018100108
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018100521
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018100717
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018100870
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101049
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101200
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101373
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101541
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101692
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018101911
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018102064
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018102298
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018102633
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018103596
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018103912
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018104093
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018104376
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018104743
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105030
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105216
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105410
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105552
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105686
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105792
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018105928
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018106123
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018106324
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018106474
PR Review Comment: https://git.openjdk.org/jdk/pull/24282#discussion_r2018107303

From duke at openjdk.org Fri Mar 28 13:03:48 2025
From: duke at openjdk.org (Zihao Lin)
Date: Fri, 28 Mar 2025 13:03:48 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v4]
In-Reply-To:
References:
Message-ID:

> This patch removes the slice parameter from LoadNode::make
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, I'd appreciate any guidance. Thanks a lot!

Zihao Lin has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  8344116: C2: remove slice parameter from LoadNode::make

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24258/files
  - new: https://git.openjdk.org/jdk/pull/24258/files/08c1a382..f6b2fbec

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=02-03

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24258.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258

PR: https://git.openjdk.org/jdk/pull/24258

From gziemski at openjdk.org Fri Mar 28 14:33:40 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Fri, 28 Mar 2025 14:33:40 GMT
Subject: RFR: 8344883: Do not use mtNone if we know the tag type [v3]
In-Reply-To:
References:
Message-ID:

> This is a follow-up to #21843. Here we are focusing on removing the mem tag parameter with its default value of mtNone, to force everyone to provide a mem tag, if known.
>
> I tried to fill in the tag when I was pretty certain that I had the right type.
>
> At least one more follow-up will be needed after this, to change the remaining mtNone to valid values.

Gerard Ziemski has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Merge remote-tracking branch 'upstream/master' into JDK-8344883
 - work
 - work

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24282/files
  - new: https://git.openjdk.org/jdk/pull/24282/files/582b1860..5a1a75e9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24282&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24282&range=01-02

  Stats: 2531 lines in 75 files changed: 1529 ins; 871 del; 131 mod
  Patch: https://git.openjdk.org/jdk/pull/24282.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24282/head:pull/24282

PR: https://git.openjdk.org/jdk/pull/24282

From gziemski at openjdk.org Fri Mar 28 14:58:22 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Fri, 28 Mar 2025 14:58:22 GMT
Subject: RFR: 8344883: Do not use mtNone if we know the tag type [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Mar 2025 08:25:08 GMT, Stefan Karlsson wrote:

> I went over the patch and added suggestions for places where I think you're using the wrong tag, or where I think it is obvious that there's a better tag than mtNone. I've also suggested removal of the now redundant 'executable' argument, which I want to see as little of as possible given that it is a wart on the memory reservation APIs (IMHO).

Thank you for the feedback!

I was wondering whether I should get rid of the exec argument, since it has a default parameter value, so I am glad that you have suggested it. It will be a definite improvement to get rid of it.

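The effect of dropping a defaulted mem tag can be sketched in isolation. What follows is a minimal, hypothetical illustration and not HotSpot code: `MemTag`, `reserved_by_tag`, `reserve_with_default`, `reserve_tagged`, and `untagged_after_demo` are stand-in names invented for this sketch, and no real memory is reserved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for the mem tags named in this thread.
enum class MemTag { mtNone, mtTest, mtInternal };

// Toy bookkeeping: bytes "reserved" per tag (nothing is actually mapped).
inline std::map<MemTag, std::size_t>& reserved_by_tag() {
  static std::map<MemTag, std::size_t> table;
  return table;
}

// Old style (assumed shape): the defaulted tag lets a caller silently
// fall back to mtNone.
inline void reserve_with_default(std::size_t bytes, MemTag tag = MemTag::mtNone) {
  reserved_by_tag()[tag] += bytes;
}

// New style (assumed shape): no default, so every call site must name a tag.
inline void reserve_tagged(std::size_t bytes, MemTag tag) {
  reserved_by_tag()[tag] += bytes;
}

// Makes one old-style and one new-style call and returns the bytes that
// ended up attributed to mtNone.
inline std::size_t untagged_after_demo() {
  reserved_by_tag().clear();
  reserve_with_default(4096);             // accounted under mtNone
  reserve_tagged(4096, MemTag::mtTest);   // accounted under mtTest
  // reserve_tagged(4096);                // would not compile: tag is mandatory
  return reserved_by_tag()[MemTag::mtNone];
}
```

Calling `untagged_after_demo()` from a small `main` shows that only the old-style call leaves bytes under `mtNone`; with the default removed, forgetting the tag becomes a compile-time error rather than an accounting gap.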
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24282#issuecomment-2761605696

From wkemper at openjdk.org Fri Mar 28 19:47:31 2025
From: wkemper at openjdk.org (William Kemper)
Date: Fri, 28 Mar 2025 19:47:31 GMT
Subject: Integrated: 8344049: Shenandoah: Eliminate init-update-refs safepoint
In-Reply-To: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com>
References: <8gwa_ocC60WaJ-nI6d-IOIcHOqVdY_1a-OI68Uot3lg=.ae0588ae-607f-429b-bd41-8539727519b3@github.com>
Message-ID:

On Mon, 24 Mar 2025 23:20:30 GMT, William Kemper wrote:

> Not clean, has two follow up fixes in this PR.

This pull request has now been integrated.

Changeset: b5cf88ce
Author: William Kemper
URL: https://git.openjdk.org/shenandoah-jdk21u/commit/b5cf88ce43050a5a54386f1347fe84a42fe08da3
Stats: 328 lines in 15 files changed: 186 ins; 92 del; 50 mod

8344049: Shenandoah: Eliminate init-update-refs safepoint
8344050: Shenandoah: Retire GC LABs concurrently
8344055: Shenandoah: Make all threads use local gc state
8348268: Test gc/shenandoah/TestResizeTLAB.java#compact: fatal error: Before Updating References: Thread C2 CompilerThread1: expected gc-state 9, actual 21
8348092: Shenandoah: assert(nk >= _lowest_valid_narrow_klass_id && nk <= _highest_valid_narrow_klass_id) failed: narrowKlass ID out of range (3131947710)

Reviewed-by: kdnilsen
Backport-of: 764d70b7df18e288582e616c62b0d7078f1ff3aa

-------------

PR: https://git.openjdk.org/shenandoah-jdk21u/pull/161

From wkemper at openjdk.org Fri Mar 28 21:12:08 2025
From: wkemper at openjdk.org (William Kemper)
Date: Fri, 28 Mar 2025 21:12:08 GMT
Subject: RFR: 8347617: Shenandoah: Use consistent name for update references phase
Message-ID:

Trivial merge conflict.

-------------

Commit messages:
 - Backport 0330ca4221ba7bacb0eaeed1a8cdc3d5c3653a83

Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/163/files
Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=163&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8347617
Stats: 64 lines in 18 files changed: 0 ins; 0 del; 64 mod
Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/163.diff
Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/163/head:pull/163

PR: https://git.openjdk.org/shenandoah-jdk21u/pull/163

From kdnilsen at openjdk.org Sat Mar 29 00:07:40 2025
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Sat, 29 Mar 2025 00:07:40 GMT
Subject: RFR: 8347617: Shenandoah: Use consistent name for update references phase
In-Reply-To:
References:
Message-ID:

On Fri, 28 Mar 2025 21:06:47 GMT, William Kemper wrote:

> Trivial merge conflict.

Marked as reviewed by kdnilsen (Committer).

-------------

PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/163#pullrequestreview-2727090760

From kdnilsen at openjdk.org Sat Mar 29 00:15:08 2025
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Sat, 29 Mar 2025 00:15:08 GMT
Subject: RFR: 8351892: GenShen: Remove enforcement of generation sizes [v2]
In-Reply-To: <-BEi4FpPLjKx07-J7ix9fHkKVhkcYylA0ojI-a1zrJs=.a3c073d3-7e52-46fd-8e2a-1ea601bd2074@github.com>
References: <-BEi4FpPLjKx07-J7ix9fHkKVhkcYylA0ojI-a1zrJs=.a3c073d3-7e52-46fd-8e2a-1ea601bd2074@github.com>
Message-ID:

On Thu, 27 Mar 2025 22:09:05 GMT, William Kemper wrote:

>> * The options to configure minimum and maximum sizes for the young generation have been combined into `ShenandoahInitYoungPercentage`.
>> * The remaining functionality in `shGenerationSizer` wasn't enough to warrant being its own class, so the functionality was rolled into `shGenerationalHeap`.
>
> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
>
>   Don't let old have the entire heap

Looks good.
Thanks for this simplification and improved consistency.

src/hotspot/share/gc/shenandoah/shenandoahGenerationalFullGC.cpp line 120:

> 118: if (old_capacity > old_usage) {
> 119: size_t excess_old_regions = (old_capacity - old_usage) / ShenandoahHeapRegion::region_size_bytes();
> 120: gen_heap->transfer_to_young(excess_old_regions);

Should we assert that the result is successful? Or replace this with force_transfer? (It just seems bad practice to ignore a status result.)

src/hotspot/share/gc/shenandoah/shenandoahGenerationalHeap.cpp line 134:

> 132: ShenandoahHeap::initialize_heuristics();
> 133:
> 134: // Max capacity is the maximum _allowed_ capacity. This means the sum of the maximum

I don't understand the relevance of this comment. Is there still a maximum allowed for old and a maximum allowed for young?

-------------

Marked as reviewed by kdnilsen (Committer).

PR Review: https://git.openjdk.org/jdk/pull/24268#pullrequestreview-2727093354
PR Review Comment: https://git.openjdk.org/jdk/pull/24268#discussion_r2019606575
PR Review Comment: https://git.openjdk.org/jdk/pull/24268#discussion_r2019607905

From duke at openjdk.org Sat Mar 29 00:35:12 2025
From: duke at openjdk.org (duke)
Date: Sat, 29 Mar 2025 00:35:12 GMT
Subject: RFR: 8352185: Shenandoah: Invalid logic for remembered set verification [v14]
In-Reply-To:
References:
Message-ID: <0SZ62G1n4JFHTQ0XnfQMmWTp5Wkhi9SFS0f22p5cgA8=.7b47c928-2606-4bbc-a35d-fbbc3366633c@github.com>

On Wed, 26 Mar 2025 20:37:59 GMT, Xiaolong Peng wrote:

>> There are some scenarios in which GenShen may have improper remembered set verification logic:
>>
>> 1. Concurrent young cycles following a Full GC:
>>
>> At the end of ShenandoahFullGC, it resets the bitmaps for the entire heap without resetting the marking context to incomplete, but ShenandoahVerifier has code like the following to get a complete old marking context for remembered set verification:
>>
>>
>> ShenandoahVerifier
>> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>>   shenandoah_assert_generations_reconciled();
>>   if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>>     return _heap->complete_marking_context();
>>   }
>>   return nullptr;
>> }
>>
>>
>> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale and may cause unexpected results.
>>
>> 2. The implementation of `ShenandoahVerifier::get_marking_context_for_old` mentioned above always returns a marking context for a global GC, but the marking bitmaps have already been reset before init-mark, so `ShenandoahVerifier::help_verify_region_rem_set` always skips verification in this case.
>>
>> 3. ShenandoahConcurrentGC always cleans the remembered set read table, but only swaps the read/write tables when the gc generation is young. This causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2.
>>
>> 4. After a concurrent young cycle evacuates objects from a young region, it updates refs using the marking bitmaps from the marking context, so it won't update references of dead old objects (is_marked(obj) is false: obj is not marked strong/weak and it is below tams). In this case, if the next cycle is a global concurrent GC, the remembered set can't be verified before init-mark because of the dead pointers.
>>
>> ### Solution
>> * After a full GC, always set the marking completeness flag to false after resetting the marking bitmaps.
>> * Because there could be dead pointers in old gen that were not updated to point to the new address after evacuation and refs update, we should disable rem-set validation before init-mark and update-refs if the old marking context is incomplete.
>>
>> ### Test
>> - [x] `make test TEST=hotspot_gc_shenandoah`
>> - [x] GHA
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add comments

@pengxiaolong
Your change (at version e11c6fc3f8ccc25064be26c87273d10125540222) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24092#issuecomment-2762939177

From duke at openjdk.org Sat Mar 29 07:19:21 2025
From: duke at openjdk.org (Zihao Lin)
Date: Sat, 29 Mar 2025 07:19:21 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v5]
In-Reply-To:
References:
Message-ID:

> This patch removes the slice parameter from LoadNode::make
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, I'd appreciate any guidance. Thanks a lot!

Zihao Lin has updated the pull request incrementally with two additional commits since the last revision:

 - Fix build
 - Fix test failed

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24258/files
  - new: https://git.openjdk.org/jdk/pull/24258/files/f6b2fbec..a1924c35

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=03-04

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/24258.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258

PR: https://git.openjdk.org/jdk/pull/24258