From kdnilsen at openjdk.org Sat Nov 1 05:40:56 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sat, 1 Nov 2025 05:40:56 GMT Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v4] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: fix bugs in implementation of weakly referenced object handling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/80198abe..e16ea231 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=02-03 Stats: 9 lines in 2 files changed: 1 ins; 2 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From kbarrett at openjdk.org Sat Nov 1 07:34:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 1 Nov 2025 07:34:03 GMT Subject: RFR: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library [v2] In-Reply-To: <4Yh4KeUItjdDYihe9d1u66tBROofb0pRLDKIdhw-XZo=.cf383363-c963-4871-941d-5e358f74c048@github.com> References:

<6u0Ia6A2xQnv51Ti0Jchg6rnncfRRvtK5oGS963CeIA=.bc9c1481-60ad-443d-923b-dc86277dbb14@github.com> <4Yh4KeUItjdDYihe9d1u66tBROofb0pRLDKIdhw-XZo=.cf383363-c963-4871-941d-5e358f74c048@github.com> Message-ID: On Mon, 27 Oct 2025 08:24:41 GMT, Florian Weimer wrote: >> We (you and me, @fweimer-rh) discussed this a couple of years ago: >> https://mail.openjdk.org/pipermail/hotspot-dev/2023-December/082324.html >> >> Quoting from here: >> https://mail.openjdk.org/pipermail/hotspot-dev/2023-December/083142.html >> >> " >> Empirically, a recursive initialization attempt doesn't make any attempt to >> throw. Rather, it blocks forever waiting for a futex signal from a thread that >> succeeds in the initialization. Which of course will never come. >> >> And that makes sense, now that I've looked at the code. >> >> In __cxa_guard_acquire, with _GLIBCXX_USE_FUTEX, if the guard indicates >> initialization hasn't yet been completed, then it goes into a while loop. >> This while loop tries to claim initialization. Failing that, it checks >> whether initialization is complete. Failing that, it does a SYS_futex >> syscall, waiting for some other thread to perform the initialization. There's >> nothing there to check for recursion. >> >> throw_recursive_init_exception is only called if single-threaded (either by >> configuration or at runtime). >> " >> >> It doesn't look like there have been any relevant changes in that area since >> then. So I think there is still not a problem here. > > @kimbarrett Sorry, I forgot about the old thread. You can get the exception in a single-threaded scenario, something like this: > > > struct S { > S() { > static S s; > *this = s; > } > } global; > > > Maybe the actual rule is more like this? > >> Functions that may throw exceptions must not be used, unless individual calls ensure that these particular invocations cannot throw exceptions. Recursively entering a block-scoped static is undefined behavior. 
That some configurations of glibc might throw an exception in that situation (even despite the caller being compiled with exceptions disabled) seems like a mistake in glibc, and not really our concern. Our code should avoid such a situation because it's UB, regardless of whether the actual behavior involves exceptions or nasal demons. The exception only gets thrown when the application is single-threaded. But at least the common way to start java (via the launcher) is already multi-threaded on entry to Threads::create_vm(). So that case doesn't normally apply to us anyway. Also, I really don't think we want people trying to figure out whether a particular call might or might not throw (neither when writing nor when reading code). So no, I don't think the proposed rule should be changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27601#discussion_r2483178071 From kvn at openjdk.org Sat Nov 1 15:06:06 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 1 Nov 2025 15:06:06 GMT Subject: RFR: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library [v4] In-Reply-To: References:

Message-ID: On Sat, 18 Oct 2025 17:11:48 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to suggest that C++ >> Standard Library components may be used, after appropriate vetting and >> discussion, rather than just a blanket "no, don't use it" with a few very >> narrow exceptions. It provides some guidance on that vetting process and >> the criteria to use, along with usage patterns. >> >> In particular, it proposes that Standard Library headers should not be >> included directly, but instead through HotSpot-provided wrapper headers. This >> gives us a place to document usage, provide workarounds for platform issues in >> a single place, and so on. >> >> Such wrapper headers are provided by this PR for ``, ``, and >> ``, along with updates to use them. I have a separate change for >> `` that I plan to propose later, under JDK-8369187. There will be >> additional followups for other C compatibility headers besides ``. >> >> This PR also cleans up some nomenclature issues around forbid vs exclude and >> the like. >> >> Testing: mach5 tier1-5, GHA sanity tests > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - jrose comments > - move tuple to undecided category > - add wrapper for > - add wrapper for > - add wrapper for > - style guide permits some standard library facilities Thanks for reviews and comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27601#issuecomment-3477511265 From kbarrett at openjdk.org Sun Nov 2 07:05:26 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 2 Nov 2025 07:05:26 GMT Subject: Integrated: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library In-Reply-To: References: Message-ID: On Thu, 2 Oct 2025 07:11:51 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide to suggest that C++ > Standard Library components may be used, after appropriate vetting and > discussion, rather than just a blanket "no, don't use it" with a few very > narrow exceptions. It provides some guidance on that vetting process and > the criteria to use, along with usage patterns. > > In particular, it proposes that Standard Library headers should not be > included directly, but instead through HotSpot-provided wrapper headers. This > gives us a place to document usage, provide workarounds for platform issues in > a single place, and so on. > > Such wrapper headers are provided by this PR for ``, ``, and > ``, along with updates to use them. I have a separate change for > `` that I plan to propose later, under JDK-8369187. There will be > additional followups for other C compatibility headers besides ``. > > This PR also cleans up some nomenclature issues around forbid vs exclude and > the like. > > Testing: mach5 tier1-5, GHA sanity tests This pull request has now been integrated. 
Changeset: e8a1a870 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/e8a1a8707ee6192c85ac62a2a51c815e07613c38 Stats: 670 lines in 68 files changed: 430 ins; 134 del; 106 mod 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library Reviewed-by: jrose, lkorinth, iwalulya, kvn, stefank ------------- PR: https://git.openjdk.org/jdk/pull/27601 From andrew at openjdk.org Sun Nov 2 22:55:48 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Sun, 2 Nov 2025 22:55:48 GMT Subject: RFR: Merge jdk8u:master Message-ID: <4Bn0JXdtXoqusP9GZa4nbc8wHQ67W8eGH4KioZ69XK8=.7bfd3448-9996-4fc7-aaac-8285c7e626c3@github.com> Merge `jdk8u472-b08` ------------- Commit messages: - Merge jdk8u472-b08 - 8360937: Enhance certificate handling - 8356294: Enhance Path Factories - 8352637: Enhance bytecode verification The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/114/files Stats: 196 lines in 11 files changed: 157 ins; 8 del; 31 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/114.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/114/head:pull/114 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/114 From jsikstro at openjdk.org Mon Nov 3 09:57:38 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 3 Nov 2025 09:57:38 GMT Subject: RFR: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods Message-ID: Hello, The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. 
Testing: * Running through tier1-2 ------------- Commit messages: - 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods Changes: https://git.openjdk.org/jdk/pull/28107/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28107&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371131 Stats: 65 lines in 20 files changed: 0 ins; 2 del; 63 mod Patch: https://git.openjdk.org/jdk/pull/28107.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28107/head:pull/28107 PR: https://git.openjdk.org/jdk/pull/28107 From jsikstro at openjdk.org Mon Nov 3 09:57:39 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 3 Nov 2025 09:57:39 GMT Subject: RFR: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 09:49:02 GMT, Joel Sikström wrote: > Hello, > > The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. > > Testing: > * Running through tier1-2 src/hotspot/share/gc/shared/threadLocalAllocBuffer.inline.hpp line 57: > 55: // Compute the size for the new TLAB. > 56: // The "last" tlab may be smaller to reduce fragmentation. > 57: // unsafe_max_tlab_alloc is just a hint. I suggest we remove this comment because it is misleading: not all GCs treat the `available_size` calculated here as a hint; some treat it as an unconditional size. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28107#discussion_r2485894298 From ayang at openjdk.org Mon Nov 3 10:02:04 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 3 Nov 2025 10:02:04 GMT Subject: RFR: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 09:49:02 GMT, Joel Sikström wrote: > Hello, > > The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. > > Testing: > * Running through tier1-2 Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28107#pullrequestreview-3410254336 From tschatzl at openjdk.org Mon Nov 3 12:09:23 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 3 Nov 2025 12:09:23 GMT Subject: RFR: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods In-Reply-To: References: Message-ID: <88fRRh2O2B2STEwKyd27MN-c6akGhu6JIRd5IWXDzXo=.2d643739-44c4-4f3b-83b7-6def71ff28a4@github.com> On Mon, 3 Nov 2025 09:49:02 GMT, Joel Sikström wrote: > Hello, > > The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. > > Testing: > * Oracle's tier1-2 Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28107#pullrequestreview-3410714487 From wkemper at openjdk.org Mon Nov 3 18:01:31 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 3 Nov 2025 18:01:31 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: References: Message-ID: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> > When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. > > # Background > When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more in line with the natural way of things. William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots - Flush SATB buffers upon entering degenerated cycle when old marking is in progress This has to happen at least once during the degenerated cycle. 
Doing it at the start, rather than the end, simplifies the verifier. - Fix typo in comment - Remove duplicate satb flush closure - Only flush satb once during degenerated cycle - Cleanup and comments - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots - Fix assertion - Oops, move inline definition out of ifdef ASSERT - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de ------------- Changes: https://git.openjdk.org/jdk/pull/27983/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27983&range=02 Stats: 309 lines in 11 files changed: 98 ins; 188 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/27983.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27983/head:pull/27983 PR: https://git.openjdk.org/jdk/pull/27983 From duke at openjdk.org Tue Nov 4 00:11:20 2025 From: duke at openjdk.org (duke) Date: Tue, 4 Nov 2025 00:11:20 GMT Subject: Withdrawn: 8354555: Add generic JFR events for TaskTerminator In-Reply-To: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> References: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> Message-ID: <6C63KrwSNS7jTc5SHtpknktCbt6kCAx12FM6NDEDPt8=.0da94a30-3bae-40ac-b4c0-a40b55876123@github.com> On Wed, 16 Apr 2025 08:24:15 GMT, Xiaolong Peng wrote: > The purpose of the PR is to add generic JFR events for TaskTerminator to track the attempts and timings that GC threads have tried to terminate GC tasks. > > Today only G1 emits JFR event with name `Termination` from [G1ParEvacuateFollowersClosure](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1YoungCollector.cpp#L555-L563), all other garbage collectors don't emit any JFR event for the termination attempt at all. 
> > By adding this, it gives performance engineers the visibility to the termination attempts and termination time when GC threads trying to finish GC tasks, we could build tool to analyze the jfr events to determine if there is potential data structure issue in application code, e.g. very large LinkedList or LinkedBlockingQueue. > > For the test, I have manually tested different GCs with Flight Recording enabled and verified the events: > G1: > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0108 ms > gcId = 0 > gcWorkerId = 8 > name = "Termination" > eventThread = "GC Thread#4" (osThreadId = 20483) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0467 ms > gcId = 0 > gcWorkerId = 2 > name = "Termination" > eventThread = "GC Thread#2" (osThreadId = 21251) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0474 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination" > eventThread = "GC Thread#8" (osThreadId = 36359) > } > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000834 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000166 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > > Shenandoah: > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0202 ms > gcId = 0 > gcWorkerId = 0 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#3" (osThreadId = 13827) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0205 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#1" (osThreadId = 14339) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > 
duration = 0.0127 ms > gcId = 0 > gcWorkerId = 5 > name = "Termination: Final Mark" > eventThread = "Shenandoah G... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/24676 From jsikstro at openjdk.org Tue Nov 4 09:39:52 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 4 Nov 2025 09:39:52 GMT Subject: RFR: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods In-Reply-To: References:

Message-ID: On Mon, 3 Nov 2025 09:59:39 GMT, Albert Mingkun Yang wrote: >> Hello, >> >> The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. >> >> Testing: >> * Oracle's tier1-2 > > Marked as reviewed by ayang (Reviewer). Thank you for the reviews! @albertnetymk @tschatzl Nice to get this cleanup in. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28107#issuecomment-3484890434 From jsikstro at openjdk.org Tue Nov 4 09:39:54 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 4 Nov 2025 09:39:54 GMT Subject: Integrated: 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods In-Reply-To: References: Message-ID: <5BDwGukXMsP_RV8oGiVADupc5T2gZRhcVe_GGXYRhJE=.a3b3714e-f322-4d13-9d22-f583e45d74ea@github.com> On Mon, 3 Nov 2025 09:49:02 GMT, Joel Sikström wrote: > Hello, > > The last usage of the Thread parameter that is passed to `CollectedHeap::{tlab_capacity, tlab_used, unsafe_max_tlab_alloc}` was removed in [JDK-8370345](https://bugs.openjdk.org/browse/JDK-8370345). Following this we should remove the Thread parameter completely from CollectedHeap and all GCs that derive from it. > > Testing: > * Oracle's tier1-2 This pull request has now been integrated. 
Changeset: 19cca0a2 Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/19cca0a2a829396291fa4140b2082ef518425518 Stats: 65 lines in 20 files changed: 0 ins; 2 del; 63 mod 8371131: Cleanup Thread parameter in CollectedHeap TLAB methods Reviewed-by: ayang, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28107 From duke at openjdk.org Tue Nov 4 15:41:20 2025 From: duke at openjdk.org (duke) Date: Tue, 4 Nov 2025 15:41:20 GMT Subject: git: openjdk/shenandoah-jdk8u: master: 4 new changesets Message-ID: <4f1ce1d1-1b68-4616-9434-250ce60267e9@openjdk.org> Changeset: 32730331 Branch: master Author: Jan Kratochvil Committer: Andrew John Hughes Date: 2025-09-10 18:12:18 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/327303315367f15113c17c8525d8eb3bc0f66782 8352637: Enhance bytecode verification Reviewed-by: fferrari, andrew Backport-of: 974f4da2e53ebde8b06224f4ba0b80aa74c5f434 ! hotspot/src/share/vm/classfile/stackMapTable.cpp ! hotspot/src/share/vm/classfile/stackMapTable.hpp ! hotspot/src/share/vm/classfile/verifier.cpp ! hotspot/src/share/vm/interpreter/bytecodeStream.hpp ! jdk/src/share/native/common/check_code.c Changeset: dde48aed Branch: master Author: Aleksei Voitylov Committer: Andrew John Hughes Date: 2025-09-03 01:47:08 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/dde48aedce2fe649e1737b3eec829428381bcb8f 8356294: Enhance Path Factories Reviewed-by: abakhtin, fferrari, andrew Backport-of: 2c7f45612d11199a9d5eaa6d61a2893ec4afa687 ! jaxp/src/com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.java ! jaxp/src/com/sun/org/apache/xpath/internal/jaxp/XPathImpl.java ! 
jaxp/src/jdk/xml/internal/XMLSecurityManager.java Changeset: d5ac2ad8 Branch: master Author: Alexey Bakhtin Committer: Andrew John Hughes Date: 2025-08-27 13:50:17 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/d5ac2ad89a369697a48e7f3e6b889e22afa50a2f 8360937: Enhance certificate handling Reviewed-by: fferrari, andrew Backport-of: d3b1c2be9e87aad07cac29d94679130fe5807c17 ! jdk/src/share/classes/sun/security/util/DerValue.java ! jdk/src/share/classes/sun/security/x509/AVA.java ! jdk/test/java/security/testlibrary/CertificateBuilder.java Changeset: 3a926122 Branch: master Author: Andrew Hughes Date: 2025-10-17 23:09:02 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3a926122dc31e0f0a184d7bedba27167079763e8 Merge jdk8u472-b08 Added tag jdk8u472-b08 for changeset d5ac2ad89a3 From andrew at openjdk.org Tue Nov 4 15:41:30 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Nov 2025 15:41:30 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u472-b08 for changeset 3a926122 Message-ID: Tagged by: Andrew John Hughes Date: 2025-11-02 22:45:33 +0000 Added tag shenandoah8u472-b08 for changeset 3a926122dc3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRRMled0VQO0j4ExaDP2g+bNZZCIgUCaQffDQAKCRDP2g+bNZZC IhRjAP0UHp6GoRGLeZut37FB0UrkHMxUacJcclpqvX3dNsYOwwEA0MVnK/+S6xgQ mYoxhh5AiJEqWEGud5+fON1z1Ld75wU= =2BcR -----END PGP SIGNATURE----- Changeset: 3a926122 Author: Andrew Hughes Date: 2025-10-17 23:09:02 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3a926122dc31e0f0a184d7bedba27167079763e8 From andrew at openjdk.org Tue Nov 4 15:41:34 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Nov 2025 15:41:34 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u472-ga for changeset 3a926122 Message-ID: Tagged by: Andrew John Hughes Date: 2025-11-02 22:46:06 +0000 Added tag shenandoah8u472-ga for changeset 3a926122dc3 -----BEGIN PGP SIGNATURE----- 
iHUEABYKAB0WIQRRMled0VQO0j4ExaDP2g+bNZZCIgUCaQffLgAKCRDP2g+bNZZC ItibAQC6StJRRgeno5BVS5/KpH1P69A1bLO6dD/C+YpbArx4GQD/XDKkWTUlQCAS UiQ5sDj7Gp8h0I0wwkH5/pw0HXGeHgk= =UHSn -----END PGP SIGNATURE----- Changeset: 3a926122 Author: Andrew Hughes Date: 2025-10-17 23:09:02 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/3a926122dc31e0f0a184d7bedba27167079763e8 From iris at openjdk.org Tue Nov 4 15:42:01 2025 From: iris at openjdk.org (Iris Clark) Date: Tue, 4 Nov 2025 15:42:01 GMT Subject: Withdrawn: Merge jdk8u:master In-Reply-To: <4Bn0JXdtXoqusP9GZa4nbc8wHQ67W8eGH4KioZ69XK8=.7bfd3448-9996-4fc7-aaac-8285c7e626c3@github.com> References: <4Bn0JXdtXoqusP9GZa4nbc8wHQ67W8eGH4KioZ69XK8=.7bfd3448-9996-4fc7-aaac-8285c7e626c3@github.com> Message-ID: On Sun, 2 Nov 2025 22:50:43 GMT, Andrew John Hughes wrote: > Merge `jdk8u472-b08` This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/shenandoah-jdk8u/pull/114 From andrew at openjdk.org Tue Nov 4 15:42:00 2025 From: andrew at openjdk.org (Andrew John Hughes) Date: Tue, 4 Nov 2025 15:42:00 GMT Subject: RFR: Merge jdk8u:master [v2] In-Reply-To: <4Bn0JXdtXoqusP9GZa4nbc8wHQ67W8eGH4KioZ69XK8=.7bfd3448-9996-4fc7-aaac-8285c7e626c3@github.com> References: <4Bn0JXdtXoqusP9GZa4nbc8wHQ67W8eGH4KioZ69XK8=.7bfd3448-9996-4fc7-aaac-8285c7e626c3@github.com> Message-ID: > Merge `jdk8u472-b08` Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk8u/pull/114/files - new: https://git.openjdk.org/shenandoah-jdk8u/pull/114/files/3a926122..3a926122 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=114&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=114&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/114.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u.git pull/114/head:pull/114 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/114 From xpeng at openjdk.org Tue Nov 4 16:47:20 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 4 Nov 2025 16:47:20 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References: <7KjxoZ7UQ9LugnCVgBT6moBrQEfxmPNBXrvyWXD-MQ8=.37907346-c252-4daf-b7bb-9d0f89291753@github.com>

Message-ID: On Fri, 31 Oct 2025 21:00:46 GMT, Xiaolong Peng wrote: >> The young and old collector reserves both have `FREE` regions, but we need to make sure we don't exceed the reserves for one or the other. > > I have added the can_allocate_in_new_region check back, it should have same behavior now. I have removed can_allocate_in_new_region again after merging @kdnilsen's change unifying accountings, since we don't have consistency in accountings anymore. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28036#discussion_r2491266912 From wkemper at openjdk.org Tue Nov 4 22:27:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Nov 2025 22:27:52 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> Message-ID: <7kzHfoGKxHS3U5xhG81AuWhzghnHr-iWKuMyDZe-o6Q=.0e08f7b0-7ced-42df-9e39-d3134a095b73@github.com> On Mon, 3 Nov 2025 18:01:31 GMT, William Kemper wrote: >> When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. >> >> # Background >> When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. 
If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more in line with the natural way of things. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: > > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Flush SATB buffers upon entering degenerated cycle when old marking is in progress > > This has to happen at least once during the degenerated cycle. Doing it at the start, rather than the end, simplifies the verifier. > - Fix typo in comment > - Remove duplicate satb flush closure > - Only flush satb once during degenerated cycle > - Cleanup and comments > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Fix assertion > - Oops, move inline definition out of ifdef ASSERT > - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de I've run more tests and confirmed that critical and max `jops` are 3% improved on a variety of heap sizes and configurations. 
Additionally, after running more tests with `extremem`, the apparent regression at `p100` has evaporated:

genshen/extremem/control
Category | Count | Total | GeoMean | Average | Trim 0.1 | StdDev | Minimum | Maximum
sales_transaction_p100 | 23 | 38817.000 | 1685.651 | 1687.696 | 1690.158 | 84.299 | 1539.000 | 1823.000
browsing_history_p100 | 23 | 32145.000 | 1391.269 | 1397.609 | 1382.211 | 139.779 | 1175.000 | 1769.000
customer_replacement_p100 | 23 | 107141.000 | 4652.940 | 4658.304 | 4664.211 | 227.279 | 4093.000 | 5053.000
product_replacement_p100 | 23 | 58315.000 | 2526.181 | 2535.435 | 2523.737 | 223.834 | 2203.000 | 3041.000
customer_preparation_p100 | 23 | 142953.000 | 5442.520 | 6215.348 | 6064.000 | 3132.333 | 2502.000 | 11547.000
customer_purchase_p100 | 23 | 491201.000 | 18622.792 | 21356.565 | 20385.263 | 11424.057 | 6582.000 | 48274.000
customer_save_for_later_p100 | 23 | 880105.000 | 36675.774 | 38265.435 | 37248.895 | 11777.757 | 23023.000 | 65591.000
customer_abandonment_p100 | 23 | 701197.000 | 28578.563 | 30486.826 | 29311.105 | 11521.080 | 16390.000 | 59179.000

genshen/extremem/experiment
Category | Count | Total | GeoMean | Average | Trim 0.1 | StdDev | Minimum | Maximum
sales_transaction_p100 | 23 | 40007.000 | 1730.529 | 1739.435 | 1711.947 | 193.896 | 1528.000 | 2490.000
browsing_history_p100 | 23 | 33032.000 | 1425.803 | 1436.174 | 1436.316 | 175.753 | 1123.000 | 1729.000
customer_replacement_p100 | 23 | 107516.000 | 4653.546 | 4674.609 | 4599.579 | 488.862 | 4072.000 | 6579.000
product_replacement_p100 | 23 | 56647.000 | 2451.855 | 2462.913 | 2469.789 | 235.896 | 1968.000 | 2903.000
customer_preparation_p100 | 23 | 136482.000 | 5224.974 | 5934.000 | 5766.684 | 3027.151 | 2924.000 | 10652.000
customer_purchase_p100 | 23 | 464888.000 | 17675.074 | 20212.522 | 18641.263 | 11355.233 | 7921.000 | 55420.000
customer_save_for_later_p100 | 23 | 854932.000 | 35744.969 | 37170.957 | 35660.579 | 11323.562 | 24370.000 | 71376.000
customer_abandonment_p100 | 23 | 686923.000 | 28261.963 | 29866.217 | 28589.737 | 10907.943 | 17791.000 | 65329.000

Indeed, the _experiment_ looks slightly better in some cases (slightly worse in others). The results also show the expected reduction in safepoint times as we are no longer flushing SATB buffers during `final_update_refs` or `init_mark`:

-133.33% extremem/shenandoahfinalupdaterefs_stopped_max p=0.00000 (Welch's T-Test)
  Control: 1.103 (+/- 0.14 ) 23
  Test: 0.473 (+/- 0.06 ) 23

The effect is more pronounced on `specjbb2015`:

-216.54% specjbb2015/shenandoahfinalupdaterefs_stopped_max p=0.00000 (Mann-Whitney)
  Control: 2.217 (+/- 0.09 ) 22
  Test: 0.700 (+/- 0.19 ) 22

-791.43% specjbb2015/shenandoahinitmark_stopped_max p=0.00173 (Mann-Whitney)
  Control: 3.408 (+/- 3.12 ) 22
  Test: 0.382 (+/- 0.07 ) 22

------------- PR Comment: https://git.openjdk.org/jdk/pull/27983#issuecomment-3488220025
From wkemper at openjdk.org Tue Nov 4 23:10:34 2025 From: wkemper at openjdk.org (William Kemper) Date: Tue, 4 Nov 2025 23:10:34 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References: <7KjxoZ7UQ9LugnCVgBT6moBrQEfxmPNBXrvyWXD-MQ8=.37907346-c252-4daf-b7bb-9d0f89291753@github.com>

Message-ID: On Tue, 4 Nov 2025 16:43:58 GMT, Xiaolong Peng wrote: >> I have added the can_allocate_in_new_region check back, it should have same behavior now. > > I have removed can_allocate_in_new_region again after merging @kdnilsen's change unifying accountings, since we don't have consistency in accountings anymore. We still need to enforce that requests for a young evacuation don't take memory that was reserved for old evacuations. `can_allocate_in_new_region` isn't about consistency between freeset and generation accounting, it's used to maintain old/young collector reserves. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28036#discussion_r2492277440 From kdnilsen at openjdk.org Wed Nov 5 02:24:11 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 5 Nov 2025 02:24:11 GMT Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v6] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Revert "Fixup handling of weakly marked objects in remembered set" This reverts commit 80198abe5d06c3532d9a43a53691376e990ed45f. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/d341522e..643cdfd6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=04-05 Stats: 28 lines in 1 file changed: 0 ins; 16 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From duke at openjdk.org Wed Nov 5 05:08:57 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:08:57 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains nine additional commits since the last revision: - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - Fix test failed - 8344116: C2: remove slice parameter from LoadNode::make ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/ea83736e..6d122039 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=06-07 Stats: 526337 lines in 7522 files changed: 349612 ins; 122587 del; 54138 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Wed Nov 5 05:09:05 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:09:05 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v6] In-Reply-To: References:

Message-ID: On Tue, 8 Apr 2025 13:04:12 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > src/hotspot/share/gc/shared/c2/barrierSetC2.cpp line 223: > >> 221: MergeMemNode* mm = opt_access.mem(); >> 222: PhaseGVN& gvn = opt_access.gvn(); >> 223: Node* mem = mm->memory_at(gvn.C->get_alias_index(access.addr().type())); > > Can we get rid of all uses of `access.addr().type()`? Get rid of all access.addr().type() > src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp line 105: > >> 103: // stores. In theory we could relax the load from ctrl() to >> 104: // no_ctrl, but that doesn't buy much latitude. >> 105: Node* card_val = __ load( __ ctrl(), card_adr, TypeInt::BYTE, T_BYTE); > > We could asssert that `C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw`, that is that computed slice is the same as hardcoded slide. Similar asserts could be added for every location where a slice/address type is removed in this patch. Sure, I add more assert for this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2484816831 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2492987998 From ayang at openjdk.org Wed Nov 5 10:16:39 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 5 Nov 2025 10:16:39 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue Message-ID: Removing effectively dead code. 
Test: tier1, GHA ------------- Commit messages: - remove-barrier-arg Changes: https://git.openjdk.org/jdk/pull/28146/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28146&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371321 Stats: 38 lines in 16 files changed: 0 ins; 6 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/28146.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28146/head:pull/28146 PR: https://git.openjdk.org/jdk/pull/28146 From fandreuzzi at openjdk.org Wed Nov 5 10:33:04 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 5 Nov 2025 10:33:04 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: <_jy1EXbNvAcvuv1R0jwOUXuE7dQX1YSLjsnM5ijyBNM=.20d22624-f6ff-4de7-8701-7ff1536ffa5b@github.com> On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Marked as reviewed by fandreuzzi (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/28146#pullrequestreview-3421109827 From roland at openjdk.org Wed Nov 5 13:23:18 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 5 Nov 2025 13:23:18 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References:

Message-ID: On Wed, 5 Nov 2025 05:08:57 GMT, Zihao Lin wrote: >> This patch remove slice parameter from LoadNode::make >> >> I have done more work which remove slice paramater from StoreNode::make. >> >> Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 >> >> Hi team, I am new, I'd appreciate any guidance. Thank a lot! > > Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: > > - fix assert > - add more assert > - rid of access.addr().type() > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Fix build > - Fix test failed > - 8344116: C2: remove slice parameter from LoadNode::make Can we remove `C2AccessValuePtr` entirely and use: Node* _addr; where, currently, there's: C2AccessValuePtr& _addr; ? 
src/hotspot/share/opto/callnode.cpp line 1740: > 1738: Node* klass_node = in(AllocateNode::KlassNode); > 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); > 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw ------------- PR Review: https://git.openjdk.org/jdk/pull/24258#pullrequestreview-3421940817 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2494424924 From kdnilsen at openjdk.org Wed Nov 5 18:20:45 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 5 Nov 2025 18:20:45 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> Message-ID: On Mon, 3 Nov 2025 18:01:31 GMT, William Kemper wrote: >> When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. >> >> # Background >> When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. 
Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more inline with the natural way of things. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: > > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Flush SATB buffers upon entering degenerated cycle when old marking is in progress > > This has to happen at least once during the degenerated cycle. Doing it at the start, rather than the end, simplifies the verifier. > - Fix typo in comment > - Remove duplicate satb flush closure > - Only flush satb once during degenerated cycle > - Cleanup and comments > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Fix assertion > - Oops, move inline definition out of ifdef ASSERT > - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de Thank you for bringing this to closure... ------------- Marked as reviewed by kdnilsen (Committer). PR Review: https://git.openjdk.org/jdk/pull/27983#pullrequestreview-3423680188 From kdnilsen at openjdk.org Wed Nov 5 22:31:22 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 5 Nov 2025 22:31:22 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v5] In-Reply-To: References:

Message-ID: On Mon, 27 Oct 2025 21:59:35 GMT, Xiaolong Peng wrote: >> Shenandoah always allocates memory with heap lock, we have observed heavy heap lock contention on memory allocation path in performance analysis of some service in which we tried to adopt Shenandoah. This change is to propose an optimization for the code path of mutator memory allocation to improve heap lock contention, at vey high level, here is how it works: >> * ShenandoahFreeSet holds a N (default to 13) number of ShenandoahHeapRegion* which are used by mutator threads for regular object allocations, they are called shared regions/directly allocatable regions, which are stored in PaddedEnd data structure(padded array). >> * Each mutator thread will be assigned one of the directly allocatable regions, the thread will try to allocate in the directly allocatable region with CAS atomic operation, if fails will try 2 more consecutive directly allocatable regions in the array storing directly allocatable region. >> * If mutator thread fails after trying 3 directly allocatable regions, it will: >> * Take heap lock >> * Try to retire the directly allocatable regions which are ready to retire. >> * Iterator mutator partition and allocate directly allocatable regions and store to the padded array if any need to be retired. >> * Satisfy mutator allocation request if possible. >> >> >> I'm not expecting significant performance impact for most of the cases since in most case the contention on heap lock it not high enough to cause performance issue, I have done many tests, here are some of them: >> >> 1. 
Dacapo lusearch test on EC2 host with 96 CPU cores: >> Openjdk TIP: >> >> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing" >> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events ===== >> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events ===== >> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 ... > > Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 135 commits: > > - Merge branch 'openjdk:master' into cas-alloc-1 > - Merge branch 'openjdk:master' into cas-alloc-1 > - format > - Merge branch 'openjdk:master' into cas-alloc-1 > - Merge branch 'openjdk:master' into cas-alloc-1 > - Merge branch 'master' into cas-alloc-1 > - Move ShenandoahHeapRegionIterationClosure to shenandoahFreeSet.hpp > - Merge branch 'openjdk:master' into cas-alloc-1 > - Fix errors caused by renaming ofAtomic to AtomicAccess > - Merge branch 'openjdk:master' into cas-alloc-1 > - ... and 125 more: https://git.openjdk.org/jdk/compare/2f613911...e6bfef05 src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 762: > 760: #endif > 761: > 762: PaddedEnd* ShenandoahDirectlyAllocatableRegionAffinity::_affinity = nullptr; IIUC, each of the DirectlyAllocatableRegions has an affinity to a particular thread. 
If some other thread randomly selects to allocate from this region, we change the region's affinity. Now, if the original thread tries another allocation, it will be redirected to a different randomly selected region. It feels to me like this introduces more churn than necessary. It should be ok for multiple threads to have affinity to the same directly allocatable region so they can preserve "locality". The locality to be preserved is the value of region->top() and region->end(), in local caches. Sometimes, the multiple threads that are affiliated with a particular region will be running on the same core, which helps improve cache performance. If they are on different cores, that's no worse than the current implementation.

So my thinking is that each thread should maintain an affinity to a particular index within the directly allocatable region array. It should reaffiliate with a different region only if that region entry becomes nullptr, or in case its attempt to CAS allocate fails more than twice in a row (because of heavy contention on the value of top for that region).

Please clarify if I've misunderstood. Please explain rationale if there is good reason for preferring the design as it is currently implemented.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 848:

> 846: HeapWord* ShenandoahFreeSet::allocate_with_affiliation(Iter& iterator, ShenandoahAffiliation affiliation, ShenandoahAllocRequest& req, bool& in_new_region) {
> 847: for (idx_t idx = iterator.current(); iterator.has_next(); idx = iterator.next()) {
> 848: ShenandoahHeapRegion* r = _heap->get_region(idx);

I wonder if we could refine this a little bit. When the region is moved into the "directly allocatable" set, wouldn't we remove it from its partition? Then, we wouldn't have to test for !r->reserved_for_direct_allocation() here because the iterator wouldn't produce it.

We could maybe replace this test with an assert that !r->reserved_for_direct_allocation().
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1268:

> 1266: // If region is not completely free, the current [beg; end] is useless, and we may fast-forward. If we can extend
> 1267: // the existing range, we can exploit that certain regions are already known to be in the Mutator free set.
> 1268: ShenandoahHeapRegion* region = _heap->get_region(end);

Here also, if we remove the region from the partition when we make it directly allocatable, we would not need to rewrite this loop.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2070:

> 2068: // retired, the sum of used and capacities within regions that are still in the Mutator free partition may not match
> 2069: // my internally tracked values of used() and free().
> 2070: //TODO remove assert, it is not possible to mach since mutators may allocate on region w/o acquiring lock

It seems that if we are properly adjusting used() when we make a region directly allocatable, then this assert would still be valid. Why not?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2197:

> 2195: // Intentionally not using AtomicAccess::load, if a mutator see a stale region it will fail to allocate anyway.
> 2196: if ((r = shared_region._address) != nullptr && r->reserved_for_direct_allocation()) {
> 2197: obj = cas_allocate_in_for_mutator(r, req, in_new_region);

There are multiple things that can go wrong here, and it's not clear how we distinguish between them:

1. Region r may be retirable and not have enough memory to satisfy req.
2. Region r may not be retirable and not have enough memory to satisfy req.
3. Region r may have sufficient memory to satisfy req but we experience heavy contention (multiple CAS failures) while we attempt to allocate from r.

In my mental model, I think I might be inclined to behave as follows:

1. Inside cas_allocate_in_for_mutator(), if we satisfy our allocation request, but only after "too many" (more than 2) CAS retries, do not update my thread-local _index variable.
Leave it at its previously selected value. Otherwise, on successful allocation, update _index to represent current slot. (might have to pass current slot in as an argument) 2. If cas_allocate_in_for_mutator() returns nullptr, ask the question "is this region retirable?" If so, increment a local count of retirable_regions. If the incremented count of retirable_regions exceeds some threshold value (DirectlyAllocatableRegionCount / 4?) (and there may be more than this number of retirable regions, but we've seen at least this many), grab the heap lock to retire and replenish all retirable regions. Then restart this loop with i = 0; src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2204: > 2202: i++; > 2203: } > 2204: return obj; I think obj always equals nullptr at this point. Seems the code would be easier to understand (and would depend less on effective compiler optimization) if we just made that explicit. Can we just say: return nullptr? src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2206: > 2204: return obj; > 2205: } > 2206: template space above this line? src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2219: > 2217: const uint max_probes = ShenandoahDirectAllocationMaxProbes; > 2218: for (;;) { > 2219: HeapWord* obj = nullptr; This is written as an infinite loop, but I'm not sure it intends to iterate? It looks like there's no way for the loop body to get to a second iteration. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2235: > 2233: start_idx = next_start_index; > 2234: } else { > 2235: // try to steal from other directly allocatable regions Why not just let the enclosing infinite loop (for(;;)) iterate, which will call cas_allocate_single_for_mutator()? You could overwrite start_idx with a new value if you want. I'm not sure I agree with choosing start_idx + max_probes. It seems a more likely new value would be next_start_index. 
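
For readers following the thread, the two-step behaviour proposed in the comment on line 2197 above can be modelled roughly as follows. This is only a single-threaded sketch of the decision logic, not code from the PR: every name in it (SketchSlot, sketch_probe, kSlots, kRetireThreshold, kMaxCasRetries) is hypothetical, real CAS contention is replaced by a simulated_cas_failures field, and kSlots is set to 8 purely for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical constants; the PR's own flags and defaults may differ.
constexpr std::size_t kSlots = 8;                     // size of the directly allocatable array
constexpr std::size_t kRetireThreshold = kSlots / 4;  // "DirectlyAllocatableRegionCount / 4"
constexpr unsigned kMaxCasRetries = 2;                // "too many" means more than 2 retries

// Hypothetical stand-in for one entry of the directly allocatable array.
struct SketchSlot {
  std::size_t free_bytes;           // space left in the region behind this slot
  std::size_t min_useful;           // below this much free space the region is retirable
  unsigned simulated_cas_failures;  // test hook standing in for real CAS contention
};

enum class ProbeOutcome { Allocated, NeedsRefill, Exhausted };

struct ProbeResult {
  ProbeOutcome outcome;
  std::size_t slot;  // slot that satisfied the request or triggered the refill
};

// Probe slots starting at the thread's preferred index.
// Step 1: on success, adopt the slot as the new preferred index only if it was not contended.
// Step 2: count retirable slots seen so far; once the count exceeds the threshold,
//         tell the caller to take the heap lock, retire and replenish, then restart.
inline ProbeResult sketch_probe(std::array<SketchSlot, kSlots>& slots,
                                std::size_t& preferred_index,
                                std::size_t bytes) {
  std::size_t retirable = 0;
  for (std::size_t i = 0; i < kSlots; i++) {
    const std::size_t idx = (preferred_index + i) % kSlots;
    SketchSlot& s = slots[idx];
    if (s.free_bytes >= bytes) {
      s.free_bytes -= bytes;
      if (s.simulated_cas_failures <= kMaxCasRetries) {
        preferred_index = idx;  // uncontended: remember this slot
      }
      return {ProbeOutcome::Allocated, idx};
    }
    if (s.free_bytes < s.min_useful) {
      if (++retirable > kRetireThreshold) {
        return {ProbeOutcome::NeedsRefill, idx};
      }
    }
  }
  return {ProbeOutcome::Exhausted, kSlots};
}
```

In this model a caller that receives NeedsRefill would take the heap lock, retire and replenish the retirable slots, and call sketch_probe again from its preferred index. Whether the threshold should be a quarter of the array is left open in the review comment itself.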
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2256: > 2254: } > 2255: > 2256: // Explicit specializations I'm sorry. I don't understand what is happening here with "explicit specializations". Maybe a comment to explain this less common C++ syntax would help here. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2280: > 2278: } > 2279: > 2280: class DirectAllocatableRegionRefillClosure final : public ShenandoahHeapRegionIterationClosure { I don't think we want to subclass ShenandoahHeapRegionIterationClosure here. That iterates over all 2000 regions. We only want to iterate over the 13 Directly allocatable regions. Maybe we don't even need/want a closure iterator here. We could just write a loop. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2347: > 2345: } > 2346: > 2347: bool heap_region_do(ShenandoahHeapRegion *r) { Need a comment to explain what heap_region_do() is doing. I think it is doing: 1. Returns true if there is no need to iterate over additional regions, otherwise returns false 2. If we've already found a region to hold the requested allocation and there are no more regions to retire, we return true. 3. If the region is trash or is free and we're not doing concurrent_weak_roots, we try to recycle it, setting its affiliation to young. (I'm not understanding how this works. What if the region had been placed in the Collector or OldCollector partitions?) 4. If the requested object has not yet been allocated and this region has sufficient memory to represent the object, allocate it here. (We do all this while holding the heap lock, so we can check available memory at entry to the function, and use the available memory further below within the function.) 5. If this region has more than min_tlab_size memory (after allocating _obj) and there's another region to be retired, we use replace the next retirable region with this region. 
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2383: > 2381: ShenandoahDirectAllocationRegion& shared_region = _direct_allocation_regions[_next_retire_eligible_region]; > 2382: assert(AtomicAccess::load(&shared_region._address) == nullptr, "Must have been released."); > 2383: r->reserve_for_direct_allocation(); Here also, I wonder if we can simplify the protocol for assigning this region into the directly allocatable array. Maybe the key idea could be to introduce a new volatile variable into ShenandoahHeapRegion known as top_for_cas. cas allocations allocate by incrementing top_for_cas. Before removing a region from the directly allocatable set, we use cas to increase top_for_cas to end(). Before placing a region into the directly allocatable set (and while holding the heap lock), we copy top() to top_for_cas. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2418: > 2416: iterate_regions_for_alloc(&cl, false); > 2417: if (cl._next_region_with_sufficient_mem != ShenandoahDirectlyAllocatableRegionCount && obj == nullptr) { > 2418: new_start_index = cl._next_region_with_sufficient_mem; I think an accessor method is preferred over direct access to a "private" variable. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 263: > 261: _capacity[int(which_partition)] = value; > 262: _available[int(which_partition)] = value - _used[int(which_partition)]; > 263: AtomicAccess::store(_capacity + int(which_partition), value); Shouldn't require AtomicAccess here, because we hold heap lock. src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 271: > 269: _used[int(which_partition)] = value; > 270: _available[int(which_partition)] = _capacity[int(which_partition)] - value; > 271: AtomicAccess::store(_used + int(which_partition), value); Also here, should not require AtomicAccess. 
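
The top_for_cas idea from the comment on line 2383 above might look roughly like the sketch below. It is a guess at the shape of the protocol, not code from the PR: SketchCasRegion and the sketch_* functions are hypothetical names, the heap lock is only implied by comments, and retiring uses an atomic exchange where the comment speaks of a CAS to end() (the effect in this model is the same: later lock-free allocations see no space).

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a heap region; not the HotSpot ShenandoahHeapRegion.
struct SketchCasRegion {
  std::uintptr_t top;                       // only touched while holding the (implied) heap lock
  std::uintptr_t end;
  std::atomic<std::uintptr_t> top_for_cas;  // bumped by lock-free allocators

  // A region starts "closed": top_for_cas == end, so CAS allocation finds no room.
  SketchCasRegion(std::uintptr_t bottom, std::uintptr_t limit)
      : top(bottom), end(limit), top_for_cas(limit) {}
};

// Caller is assumed to hold the heap lock: copy top() into top_for_cas
// before placing the region into the directly allocatable set.
inline void sketch_publish(SketchCasRegion& r) {
  r.top_for_cas.store(r.top, std::memory_order_release);
}

// Lock-free allocation by bumping top_for_cas. Returns 0 when there is no room.
// Assumes bytes > 0 and a non-zero region bottom.
inline std::uintptr_t sketch_cas_allocate(SketchCasRegion& r, std::size_t bytes) {
  std::uintptr_t old_top = r.top_for_cas.load(std::memory_order_acquire);
  while (r.end - old_top >= bytes) {
    if (r.top_for_cas.compare_exchange_strong(old_top, old_top + bytes,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
      return old_top;
    }
    // On failure old_top was reloaded with the current value; re-check the space.
  }
  return 0;
}

// Before removing the region from the directly allocatable set: swing top_for_cas
// to end so that racing allocators fail, and record where allocation stopped.
inline void sketch_retire(SketchCasRegion& r) {
  const std::uintptr_t final_top = r.top_for_cas.exchange(r.end, std::memory_order_acq_rel);
  r.top = final_top;
}
```

One property of this shape is that a stale pointer to a retired region is harmless to a mutator: its CAS allocation simply fails and it moves on to another slot.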
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 464:

> 462: HeapWord* cas_allocate_in_for_mutator(ShenandoahHeapRegion* region, ShenandoahAllocRequest &req, bool &in_new_region);
> 463:
> 464: bool try_allocate_directly_allocatable_regions(uint start_index,

Can we have a block comment describing what this function does, including its impact on var parameters?

src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp line 363:

> 361: }
> 362:
> 363: void ShenandoahHeapRegion::reset_alloc_metadata() {

Do we need to make these atomic because we now increment asynchronously from within mutator CAS allocations? Before, they were only adjusted while holding heap lock?

I'm wondering if add-with-fetch() or CAS() would be more/less efficient than AtomicAccess::stores. Can we test the tradeoffs?

src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.inline.hpp line 131:

> 129: }
> 130:
> 131: HeapWord* ShenandoahHeapRegion::allocate_atomic(size_t size, const ShenandoahAllocRequest& req) {

As mentioned above, this is maybe where we would want to set the thread-local _index variable, and this is also where we might want to count how many times we retry the try_allocate() request before we succeed. The point is that if we have multiple try_allocate() CAS failures, that means this region is heavily contended, and we don't want to make this our new "starting index".

src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 559:

> 557: range(1, 128) \
> 558: \
> 559: product(uintx, ShenandoahDirectAllocationMaxProbes, 3, EXPERIMENTAL, \

I think we found that setting ShenandoahDirectAllocationMaxProbes to equal ShenandoahDirectlyAllocatableRegionCount works "best". I'm inclined to remove this parameter entirely as it somewhat simplifies the implementation. If you think we want to keep it, can you explain the rationale? Would we change the default value?
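
As an illustration of the retry counting suggested for allocate_atomic() above, a bump-pointer allocation that reports how often it lost the CAS race could be sketched like this. The types and function names (SketchRegion, SketchAllocResult, sketch_allocate_atomic, sketch_should_adopt_as_start_index) are hypothetical and only model the idea; they are not the PR's API.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a region's bump pointer; not the HotSpot class.
struct SketchRegion {
  std::atomic<std::uintptr_t> top;
  std::uintptr_t end;
  SketchRegion(std::uintptr_t bottom, std::uintptr_t limit) : top(bottom), end(limit) {}
};

struct SketchAllocResult {
  std::uintptr_t address;  // 0 means the region could not satisfy the request
  unsigned cas_failures;   // how many times another thread moved top first
};

// Bump-allocate with CAS and count lost races. compare_exchange_strong is used so that
// spurious failures are not mistaken for contention. Assumes a non-zero region bottom.
inline SketchAllocResult sketch_allocate_atomic(SketchRegion& r, std::size_t bytes) {
  SketchAllocResult result{0, 0};
  std::uintptr_t old_top = r.top.load(std::memory_order_relaxed);
  for (;;) {
    if (r.end - old_top < bytes) {
      return result;  // not enough room left in this region
    }
    if (r.top.compare_exchange_strong(old_top, old_top + bytes, std::memory_order_relaxed)) {
      result.address = old_top;
      return result;
    }
    ++result.cas_failures;  // old_top now holds the value written by the winning thread
  }
}

// A successful but heavily contended allocation should not become the thread's
// new "starting index"; max_failures = 2 mirrors the "more than twice" wording above.
inline bool sketch_should_adopt_as_start_index(const SketchAllocResult& res,
                                               unsigned max_failures = 2) {
  return res.address != 0 && res.cas_failures <= max_failures;
}
```

With a counter like this the caller can decide locally whether to keep its current affinity, without any extra shared state.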
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495713187 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495759091 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495768272 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495789372 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495974976 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495837037 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495793783 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496020674 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496020913 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496084186 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496124916 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496344767 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496359448 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496109695 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496045832 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496047006 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496050940 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496384148 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496395761 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496399061 From kdnilsen at openjdk.org Wed Nov 5 22:31:23 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 5 Nov 2025 22:31:23 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v5] In-Reply-To: References:

Message-ID: <5YSA3F88CmDDv09M2KOm_EFNDh_09LPO2WMrgETfupI=.cc658dc9-829e-41a5-ad76-393d3eb0f75a@github.com> On Wed, 5 Nov 2025 19:00:03 GMT, Kelvin Nilsen wrote: >> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 135 commits: >> >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - format >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'master' into cas-alloc-1 >> - Move ShenandoahHeapRegionIterationClosure to shenandoahFreeSet.hpp >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Fix errors caused by renaming ofAtomic to AtomicAccess >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - ... and 125 more: https://git.openjdk.org/jdk/compare/2f613911...e6bfef05 > > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 848: > >> 846: HeapWord* ShenandoahFreeSet::allocate_with_affiliation(Iter& iterator, ShenandoahAffiliation affiliation, ShenandoahAllocRequest& req, bool& in_new_region) { >> 847: for (idx_t idx = iterator.current(); iterator.has_next(); idx = iterator.next()) { >> 848: ShenandoahHeapRegion* r = _heap->get_region(idx); > > I wonder if we could refine this a little bit. When the region is moved into the "directly allocatable" set, wouldn't we remove it from its partition? Then, we wouldn't have to test for !r->reserved_for_direct_allocation() here because the iterator wouldn't produce it. > > We could maybe replace this test with an assert that !r->reserved_for_direct_allocation(). Same issue in other uses of the allocation iterator. > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2280: > >> 2278: } >> 2279: >> 2280: class DirectAllocatableRegionRefillClosure final : public ShenandoahHeapRegionIterationClosure { > > I don't think we want to subclass ShenandoahHeapRegionIterationClosure here. 
That iterates over all 2000 regions. We only want to iterate over the 13 Directly allocatable regions. Maybe we don't even need/want a closure iterator here. We could just write a loop.

I think we should be borrowing from this code when replenishing the regions that are ready to be retired:

    if (_partitions.alloc_from_left_bias(ShenandoahFreeSetPartitionId::Mutator)) {
      // Allocate from low to high memory. This keeps the range of fully empty regions more tightly packed.
      // Note that the most recently allocated regions tend not to be evacuated in a given GC cycle. So this
      // tends to accumulate "fragmented" uncollected regions in high memory.
      ShenandoahLeftRightIterator iterator(&_partitions, ShenandoahFreeSetPartitionId::Mutator);
      return allocate_from_regions(iterator, req, in_new_region);
    }
    // Allocate from high to low memory. This preserves low memory for humongous allocations.
    ShenandoahRightLeftIterator iterator(&_partitions, ShenandoahFreeSetPartitionId::Mutator);
    return allocate_from_regions(iterator, req, in_new_region);

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2495761580
PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2496372013

From duke at openjdk.org Thu Nov 6 00:29:26 2025
From: duke at openjdk.org (Rui Li)
Date: Thu, 6 Nov 2025 00:29:26 GMT
Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space
Message-ID: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com>

Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that the jtreg default heap size on the reporter's machines is too small, and the randomness in the test sometimes required more heap than the test had available.
I calculated the worst case (see the code snippet at the end; it removes the Random from the original test and always fills the array completely) and the test needs at least 2g. I am starting with a 3g heap for safety, to reduce the noise.

I also used the test to compare Shenandoah with GenShen: on my laptop (Mac M3), Shenandoah failed at -Xmx2150m, while GenShen passed at -Xmx2150m and failed at -Xmx2050m, so GenShen isn't worse. The reported GenShen failure probably came from the Random.

    import java.util.ArrayList;
    import java.util.List;

    public class TestLargeObjectAlignmentDeterministic {

        static final int SLABS_COUNT = Integer.getInteger("slabs", 10000);
        static final int NODE_COUNT  = Integer.getInteger("nodes", 10000);
        static final long TIME_NS    = 1000L * 1000L * Integer.getInteger("timeMs", 5000);

        // Keeps every slab reachable, so the live set only grows.
        static Object[] objects;

        public static void main(String[] args) throws Exception {
            objects = new Object[SLABS_COUNT];

            for (int i = 0; i < SLABS_COUNT; i++) {
                objects[i] = createSome();
            }
        }

        // Worst case: every slab is filled with exactly NODE_COUNT nodes.
        public static Object createSome() {
            List<Integer> result = new ArrayList<>();
            for (int c = 0; c < NODE_COUNT; c++) {
                // new Integer (not Integer.valueOf) allocates a fresh object per node.
                result.add(new Integer(c));
            }
            return result;
        }

    }

-------------

Commit messages:
 - 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space

Changes: https://git.openjdk.org/jdk/pull/28167/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8361339
Stats: 9 lines in 1 file changed: 0 ins; 1 del; 8 mod
Patch: https://git.openjdk.org/jdk/pull/28167.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28167/head:pull/28167

PR: https://git.openjdk.org/jdk/pull/28167

From ysr at openjdk.org Thu Nov 6 01:25:11 2025
From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Thu, 6 Nov 2025 01:25:11 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> Message-ID: On Mon, 3 Nov 2025 18:01:31 GMT, William Kemper wrote: >> When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. >> >> # Background >> When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more inline with the natural way of things. > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: > > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Flush SATB buffers upon entering degenerated cycle when old marking is in progress > > This has to happen at least once during the degenerated cycle. 
Doing it at the start, rather than the end, simplifies the verifier. > - Fix typo in comment > - Remove duplicate satb flush closure > - Only flush satb once during degenerated cycle > - Cleanup and comments > - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots > - Fix assertion > - Oops, move inline definition out of ifdef ASSERT > - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de Changes look ok. I found some of the comments confusing -- I have left some remarks at those places, please have a look to see if they can be made clearer. The improvement in performance looks good. Do we track the number of SATB pointers processed by the old marking (to compare between before and after your changes here)? src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1143: > 1141: // be in the collection set. If this happens, the pointer will be preserved, essentially > 1142: // becoming part of the old snapshot. > 1143: // 2. The region is allocated during evacuation of old. This is also not a concern because One related question. In both these cases, I assume the reference will look "marked" because it's above TAMS for the purposes of the old marking? src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.hpp line 234: > 232: // marking phase. In some cases, this can cause a write to a perfectly > 233: // reachable oop to enqueue a pointer that later becomes garbage (because > 234: // it points at an object that is later chosen for the collection set). There are > ``` > // ... In some cases, this can cause a write to a perfectly > // reachable oop to enqueue a pointer that later becomes garbage (because > // it points at an object that is later chosen for the collection set). > ``` I don't understand this statement. The SATB is supposed to be pointers to objects that we will preserve because they were reachable when the snapshot (marking) was started. Can you elaborate what you mean here? 
Did you mean that the filtering of the SATB didn't filter a (sometime) young reference which was then processed by the old marking? src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.hpp line 237: > 235: // also cases where the referent of a weak reference ends up in the SATB > 236: // and is later collected. In these cases the oop in the SATB buffer becomes > 237: // invalid and the _next_ cycle will crash during its marking phase. To Again I don't understand the concept of an SATB pointer to an object that was later collected? Are we talking about young objects that are subsequently processed by old marking because they weren't filtered out when they should be? I think that is probably the case here, but it would be good to clean up these comments to avoid this confusion. ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27983#pullrequestreview-3425195561 PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2496682448 PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2496698952 PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2496702500 From duke at openjdk.org Thu Nov 6 01:45:26 2025 From: duke at openjdk.org (Rui Li) Date: Thu, 6 Nov 2025 01:45:26 GMT Subject: RFR: 8261743: Shenandoah: enable String deduplication with compact heuristics Message-ID: Enable `UseStringDeduplication` when using compact heuristics. Testing: ./build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact -XX:+PrintFlagsFinal --version | grep UseStringDeduplication bool UseStringDeduplication = true {product} {default} Note: The label should be `{product} {ergonomic}` in theory. 
Pending on a separate issue: [JDK-8371381](https://bugs.openjdk.org/browse/JDK-8371381) ------------- Commit messages: - 8261743: Shenandoah: enable String deduplication with compact heuristics Changes: https://git.openjdk.org/jdk/pull/28170/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28170&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8261743 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28170.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28170/head:pull/28170 PR: https://git.openjdk.org/jdk/pull/28170 From shade at openjdk.org Thu Nov 6 09:39:10 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Nov 2025 09:39:10 GMT Subject: RFR: 8261743: Shenandoah: enable String deduplication with compact heuristics In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:38:36 GMT, Rui Li wrote: > Enable `UseStringDeduplication` when using compact heuristics. > > Testing: > > ./build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact -XX:+PrintFlagsFinal --version | grep UseStringDeduplication > bool UseStringDeduplication = true {product} {default} > > > Note: The labels should be `{product} {ergonomic}` ideally. Pending on a separate issue: [JDK-8371381](https://bugs.openjdk.org/browse/JDK-8371381) Looks good. Have you measured any impact on pauses / GC durations in testing? ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28170#pullrequestreview-3427178962 From shade at openjdk.org Thu Nov 6 09:47:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Nov 2025 09:47:04 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: On Thu, 6 Nov 2025 00:21:55 GMT, Rui Li wrote: > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg default heap size on the reporter's machines is too small, and the randomness in the test just sometimes created a huge heap larger than what the test had. > > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random.
> > > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List result = new ArrayList(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } Right. One node is basically an `Integer` and the reference array slot. So about 20 bytes, give or take. There are 10K nodes per slab, meaning there is 200KB per slab. There are 10K slabs, meaning they take 2GB memory. So bumping the limit to 3G makes sense. I think you want to override `Xmx`, though -- that is the decisive factor for sizing heap regions. Sticking with `Xmx` means the test runs in the same conditions everywhere, helping reproducibility. Unless there is a strong reason to explore different heap regions sizes, which I think there is none: the test was about testing `ObjectAlignmentInBytes` first and foremost. 
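The estimate above can be written out as a small calculation. This is only a sketch: the 20 bytes per node is the rough figure from this message (an `Integer` plus its reference slot), not a measured value, and `FootprintEstimate` and `estimateBytes` are names made up for illustration.

```java
public class FootprintEstimate {
    // Rough per-node cost quoted in the review: one Integer plus one reference slot.
    // This is an assumption, not a measurement; real sizes depend on alignment and headers.
    static final long BYTES_PER_NODE = 20;

    /** Estimated live bytes for `slabs` lists holding `nodes` boxed integers each. */
    static long estimateBytes(long slabs, long nodes, long bytesPerNode) {
        return slabs * nodes * bytesPerNode;
    }

    public static void main(String[] args) {
        long perSlab = estimateBytes(1, 10_000, BYTES_PER_NODE);
        long total = estimateBytes(10_000, 10_000, BYTES_PER_NODE);
        System.out.println("per slab: " + perSlab + " bytes"); // 200000 bytes, about 200KB
        System.out.println("total:    " + total + " bytes");   // 2000000000 bytes, about 2GB
    }
}
```

With `-XX:ObjectAlignmentInBytes=16` the real per-node cost is likely somewhat higher than 20 bytes, which is one more reason to leave headroom above 2g.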
------------- PR Review: https://git.openjdk.org/jdk/pull/28167#pullrequestreview-3427215034 From syan at openjdk.org Thu Nov 6 11:20:02 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 6 Nov 2025 11:20:02 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: On Thu, 6 Nov 2025 00:21:55 GMT, Rui Li wrote: > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg default heap size on the reporter's machines is too small, and the randomness in the test just sometimes created a huge heap larger than what the test had. > > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random.
> > > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List result = new ArrayList(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } test/hotspot/jtreg/gc/shenandoah/TestLargeObjectAlignment.java line 34: > 32: * @library /test/lib > 33: * > 34: * @run main/othervm -Xms3g -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ObjectAlignmentInBytes=16 -Xint TestLargeObjectAlignment Since the initial heap memory set to 3G, maybe we should add '@requires os.maxMemory > 4g' to skip this test when the physical memory of test machine is less than 4g. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28167#discussion_r2498564622 From duke at openjdk.org Thu Nov 6 13:43:33 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 6 Nov 2025 13:43:33 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v9] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! 
Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: remove C2AccessValuePtr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/6d122039..e89910c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07-08 Stats: 58 lines in 8 files changed: 0 ins; 21 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Thu Nov 6 13:58:53 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 6 Nov 2025 13:58:53 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v10] In-Reply-To: References: Message-ID: <1zyQq98OPsZ-2nzYz21X_5v2RgKhWaZrZaJQevDMzo4=.138599b1-4797-42b0-a48a-829a112dfbe7@github.com> > This patch removes the slice parameter from LoadNode::make. > > I have also done further work to remove the slice parameter from StoreNode::make. > > Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new and would appreciate any guidance. Thanks a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - ...
and 2 more: https://git.openjdk.org/jdk/compare/c173d416...36e024db ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=09 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From kdnilsen at openjdk.org Thu Nov 6 14:22:16 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 6 Nov 2025 14:22:16 GMT Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v6] In-Reply-To: References:

Message-ID: <6QqmpWAvag906JW5UZj7T1tnWsDKuNNgVArEYv2pQ5g=.69b21a94-afc3-4812-ba9b-7217df4b6704@github.com> On Mon, 29 Sep 2025 16:40:51 GMT, William Kemper wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Fixup handling of weakly marked objects in remembered set" >> >> This reverts commit 80198abe5d06c3532d9a43a53691376e990ed45f. > > src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 132: > >> 130: >> 131: // Search for last one in the range [l_index, r_index). Return r_index if not found. >> 132: inline idx_t get_prev_one_offset (idx_t l_index, idx_t r_index) const; > > Nit: Some idiosyncratic formatting here (space before opening parenthesis). Thanks for catching this. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2499139532 From wkemper at openjdk.org Thu Nov 6 14:31:34 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 14:31:34 GMT Subject: RFR: Merge openjdk/jdk21u:master Message-ID: <3mTSioNe6lIuX9XiTKFwtTGHZYutyfFp2Xch97_1Ymg=.6a364e89-1c4f-4496-b995-dd1a38a73d42@github.com> Merges tag jdk-21.0.10+1 ------------- Commit messages: - 8369947: Bytecode rewriting causes Java heap corruption on RISC-V - 8353832: Opensource FontClass, Selection and Icon tests - 8346753: Test javax/swing/JMenuItem/RightLeftOrientation/RightLeftOrientation.java fails on Windows Server 2025 x64 because the icons of RBMenuItem and CBMenuItem are not visible in Nimbus LookAndFeel - 8369506: Bytecode rewriting causes Java heap corruption on AArch64 - 8325766: Extend CertificateBuilder to create trust and end entity certificates programmatically - 8358813: JPasswordField identifies spaces in password via delete shortcuts - 8355077: Compiler error at splashscreen_gif.c due to unterminated string initialization - 8355241: Move NativeDialogToFrontBackTest.java PL test to manual category - 8334509: Cancelling PageDialog does not 
return the same PageFormat object - 8328377: Convert java/awt/Cursor/MultiResolutionCursorTest test to main - ... and 263 more: https://git.openjdk.org/shenandoah-jdk21u/compare/7f146482...485ced0d The webrev contains the conflicts with master: - merge conflicts: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=226&range=00.conflicts Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/226/files Stats: 52671 lines in 2989 files changed: 33410 ins; 10089 del; 9172 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/226.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/226/head:pull/226 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/226 From wkemper at openjdk.org Thu Nov 6 17:04:58 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 17:04:58 GMT Subject: RFR: 8261743: Shenandoah: enable String deduplication with compact heuristics In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:38:36 GMT, Rui Li wrote: > Enable `UseStringDeduplication` when using compact heuristics. > > Testing: > > ./build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact -XX:+PrintFlagsFinal --version | grep UseStringDeduplication > bool UseStringDeduplication = true {product} {default} > > > Note: The labels should be `{product} {ergonomic}` ideally. Pending on a separate issue: [JDK-8371381](https://bugs.openjdk.org/browse/JDK-8371381) Marked as reviewed by wkemper (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28170#pullrequestreview-3429408699 From xpeng at openjdk.org Thu Nov 6 17:27:34 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Nov 2025 17:27:34 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References: <7KjxoZ7UQ9LugnCVgBT6moBrQEfxmPNBXrvyWXD-MQ8=.37907346-c252-4daf-b7bb-9d0f89291753@github.com>

Message-ID: On Tue, 4 Nov 2025 23:08:11 GMT, William Kemper wrote: >> I have removed can_allocate_in_new_region again after merging @kdnilsen's change unifying accountings, since we don't have consistency in accountings anymore. > > We still need to enforce that requests for a young evacuation don't take memory that was reserved for old evacuations. `can_allocate_in_new_region` isn't about consistency between freeset and generation accounting, it's used to maintain old/young collector reserves. I am confused again. My understanding is that we won't: that is handled by the partitions. For a young evacuation we only look for regions in the Collector partition; if a region is FREE and reserved for old evacuations, it will be in the OldCollector partition. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28036#discussion_r2500016821 From wkemper at openjdk.org Thu Nov 6 17:43:52 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 17:43:52 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References: <7KjxoZ7UQ9LugnCVgBT6moBrQEfxmPNBXrvyWXD-MQ8=.37907346-c252-4daf-b7bb-9d0f89291753@github.com>

Message-ID: On Thu, 6 Nov 2025 17:24:21 GMT, Xiaolong Peng wrote: >> We still need to enforce that requests for a young evacuation don't take memory that was reserved for old evacuations. `can_allocate_in_new_region` isn't about consistency between freeset and generation accounting, it's used to maintain old/young collector reserves. > > I am confused again. My understanding is that we won't: that is handled by the partitions. For a young evacuation we only look for regions in the Collector partition; if a region is FREE and reserved for old evacuations, it will be in the OldCollector partition. My apologies, you're right. The `FREE` region we remember while looking for a region with the same affiliation as the request comes from the same partition that the second loop would have searched. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28036#discussion_r2500097577 From wkemper at openjdk.org Thu Nov 6 17:47:42 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 17:47:42 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References:

Message-ID: <2wIxZ0sslT7_GkFAr5yccBB7zSzfUOZAeaOsf9Ljnqg=.851fb239-04f8-4a85-adf0-42879d646722@github.com> On Fri, 31 Oct 2025 22:09:18 GMT, Xiaolong Peng wrote: >> To allocate an object in the Collector/OldCollector partition, the current implementation may traverse the regions in the partition twice: >> 1. fast path: traverse regions between left most and right most in the partition, and try to allocate in an affiliated region in the partition; >> 2. if the fast path fails, traverse regions between left most empty and right most empty in the partition, and try to allocate in a FREE region. >> >> Step 2 can be skipped if we also remember the first FREE region seen in step 1. >> >> The PR makes the code much cleaner and more efficient (although the performance impact may not be measurable; I have run some DaCapo benchmarks and didn't see a meaningful difference). >> >> >> Test: >> - [x] hotspot_gc_shenandoah > > Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: > > - Remove can_allocate_in_new_region > - Merge remote-tracking branch 'origin/master' into collector-allocation > - Remove condition check for trash region > - Address the PR review comments > - Merge branch 'openjdk:master' into collector-allocation > - Touch up > - Remove test 'req.is_old()' when steal an empty region from the mutator view > - Update comment > - Fix wrong condition when steal an empty region from the mutator view > - Fix potential failure in young evac > - ... and 3 more: https://git.openjdk.org/jdk/compare/ec059c0e...8e01d691 Looks good, sorry for the mix up. ------------- Marked as reviewed by wkemper (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28036#pullrequestreview-3429628956 From wkemper at openjdk.org Thu Nov 6 17:53:12 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 17:53:12 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> Message-ID: <5J3gg2vmNAxe3JuZb88sM_CYm6DPwk9z8-o3ml3_l28=.38cf585c-c2ab-44da-9321-b448da98b4a7@github.com> On Thu, 6 Nov 2025 01:04:04 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: >> >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Flush SATB buffers upon entering degenerated cycle when old marking is in progress >> >> This has to happen at least once during the degenerated cycle. Doing it at the start, rather than the end, simplifies the verifier. >> - Fix typo in comment >> - Remove duplicate satb flush closure >> - Only flush satb once during degenerated cycle >> - Cleanup and comments >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Fix assertion >> - Oops, move inline definition out of ifdef ASSERT >> - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1143: > >> 1141: // be in the collection set. If this happens, the pointer will be preserved, essentially >> 1142: // becoming part of the old snapshot. >> 1143: // 2. The region is allocated during evacuation of old. This is also not a concern because > > One related question. 
In both these cases, I assume the reference will look "marked" because it's above TAMS for the purposes of the old marking? In the first case (in-place promotion) the pointer wouldn't necessarily be above TAMS. In this case, we would leave the object in the 'complete' buffer and it would become marked by the old marking threads (it becomes part of the old snapshot). The second case is not an issue because none of the cset regions will become trash until _after_ `final-update-refs` (so no regions could become old until after we filter the SATB buffers). For regions that are already old, then the TAMS and mark bit map work as usual. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2500141016 From wkemper at openjdk.org Thu Nov 6 18:06:03 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 18:06:03 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> Message-ID: On Thu, 6 Nov 2025 01:18:52 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: >> >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Flush SATB buffers upon entering degenerated cycle when old marking is in progress >> >> This has to happen at least once during the degenerated cycle. Doing it at the start, rather than the end, simplifies the verifier. 
>> - Fix typo in comment >> - Remove duplicate satb flush closure >> - Only flush satb once during degenerated cycle >> - Cleanup and comments >> - Merge remote-tracking branch 'jdk/master' into piggyback-satb-flush-on-update-roots >> - Fix assertion >> - Oops, move inline definition out of ifdef ASSERT >> - ... and 11 more: https://git.openjdk.org/jdk/compare/1922c4fd...4bd602de > > src/hotspot/share/gc/shenandoah/shenandoahOldGeneration.hpp line 237: > >> 235: // also cases where the referent of a weak reference ends up in the SATB >> 236: // and is later collected. In these cases the oop in the SATB buffer becomes >> 237: // invalid and the _next_ cycle will crash during its marking phase. To > > Again I don't understand the concept of an SATB pointer to an object that was later collected? Are we talking about young objects that are subsequently processed by old marking because they weren't filtered out when they should be? > > I think that is probably the case here, but it would be good to clean up these comments to avoid this confusion. Suppose we have a young collection running while the SATB barrier is active for old marking. The barrier will be enabled for the entirety of the young collection. Now, suppose we have a situation like this: +--Young, CSet------+ +--Young, Regular----+ | | | | | | | | | A <--------------------+ B | | | | | | | | | | | | | | | | | | | | | +-------------------+ +--------------------+ If a mutator overwrites the pointer in `B`, the SATB barrier will enqueue object `A`. These are the objects we need to filter out. 
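The filtering described here can be illustrated with a toy model. This is a minimal sketch in Java, not the HotSpot code (which is C++ and handles more cases, such as regions promoted in place): `Heap`, `Affiliation` and `filterForOldMarking` are hypothetical names, and addresses are plain offsets into fixed-size regions. It assumes the simplest policy, keeping only entries that point into old regions.

```java
import java.util.ArrayList;
import java.util.List;

public class SatbFilterSketch {
    enum Affiliation { FREE, YOUNG, OLD }

    /** Hypothetical stand-in for a heap made of fixed-size regions. */
    static final class Heap {
        final int regionSize;
        final Affiliation[] regions;

        Heap(int regionSize, Affiliation... regions) {
            this.regionSize = regionSize;
            this.regions = regions;
        }

        /** Affiliation of the region that contains the given address. */
        Affiliation affiliationOf(long address) {
            return regions[(int) (address / regionSize)];
        }
    }

    /** Keeps only entries that point into old regions; all other entries are dropped. */
    static List<Long> filterForOldMarking(Heap heap, List<Long> satbBuffer) {
        List<Long> kept = new ArrayList<>();
        for (long address : satbBuffer) {
            if (heap.affiliationOf(address) == Affiliation.OLD) {
                kept.add(address);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Region 0 plays the young cset region holding A, region 1 the young region
        // holding B, and region 2 is an old region.
        Heap heap = new Heap(1024, Affiliation.YOUNG, Affiliation.YOUNG, Affiliation.OLD);
        List<Long> buffer = List.of(100L, 1500L, 2100L);
        System.out.println(filterForOldMarking(heap, buffer)); // [2100]
    }
}
```

In this model the enqueued pointer to `A` (address 100) never reaches the old mark queue, which is the outcome the change is after.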
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2500196026 From wkemper at openjdk.org Thu Nov 6 18:36:48 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 18:36:48 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v4] In-Reply-To: References: Message-ID: > When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. > > # Background > When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more in line with the natural way of things.
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Improve comment describing the need for a method to filter SATB buffers for degenerated cycles ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27983/files - new: https://git.openjdk.org/jdk/pull/27983/files/4bd602de..c741bf6b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27983&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27983&range=02-03 Stats: 24 lines in 1 file changed: 6 ins; 6 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/27983.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27983/head:pull/27983 PR: https://git.openjdk.org/jdk/pull/27983 From wkemper at openjdk.org Thu Nov 6 18:38:08 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 18:38:08 GMT Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v6] In-Reply-To: References:

Message-ID: <3fNre-PPou9riBh4OJnCKeMgVIisuncArHrA44Ao4GQ=.4785a481-4280-414e-8a29-b9f7c654accb@github.com> On Thu, 6 Nov 2025 18:36:48 GMT, William Kemper wrote: >> When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. >> >> # Background >> When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more inline with the natural way of things. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment describing the need for a method to filter SATB buffers for degenerated cycles I approved this, but will follow up with you offline to more properly understand this. ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27983#pullrequestreview-3430002460 From ysr at openjdk.org Thu Nov 6 18:57:08 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 6 Nov 2025 18:57:08 GMT Subject: RFR: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old [v3] In-Reply-To: <5J3gg2vmNAxe3JuZb88sM_CYm6DPwk9z8-o3ml3_l28=.38cf585c-c2ab-44da-9321-b448da98b4a7@github.com> References: <-MlfBVpHD57gSd_4_0iIDHI0Cv6HIpTj9H9mX-UHS7g=.54423f43-32dd-49c4-8e7e-705dbbc8c825@github.com> <5J3gg2vmNAxe3JuZb88sM_CYm6DPwk9z8-o3ml3_l28=.38cf585c-c2ab-44da-9321-b448da98b4a7@github.com> Message-ID: On Thu, 6 Nov 2025 17:50:50 GMT, William Kemper wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1143: >> >>> 1141: // be in the collection set. If this happens, the pointer will be preserved, essentially >>> 1142: // becoming part of the old snapshot. >>> 1143: // 2. The region is allocated during evacuation of old. This is also not a concern because >> >> One related question. In both these cases, I assume the reference will look "marked" because it's above TAMS for the purposes of the old marking? > > In the first case (in-place promotion) the pointer wouldn't necessarily be above TAMS. In this case, we would leave the object in the 'complete' buffer and it would become marked by the old marking threads (it becomes part of the old snapshot). > > The second case is not an issue because none of the cset regions will become trash until _after_ `final-update-refs` (so no regions could become old until after we filter the SATB buffers). For regions that are already old, then the TAMS and mark bit map work as usual. Shouldn't promoted objects look black to the old collection, because that's what SATB marking would want? (In other words, the TAMS for the region should be bottom for that promoted in place region for the purposes of old gen marking.) I'll follow up off-line with you so I understand what's happening here better. Meanwhile, I'm going to re-approve this PR because I don't want to hold it hostage to my misunderstanding at this time. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27983#discussion_r2500408183 From xpeng at openjdk.org Thu Nov 6 19:01:22 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Nov 2025 19:01:22 GMT Subject: RFR: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration [v3] In-Reply-To: References:

Message-ID:

On Fri, 31 Oct 2025 22:09:18 GMT, Xiaolong Peng wrote:

>> To allocate an object in the Collector/OldCollector partition, the current implementation may traverse the regions in the partition twice:
>> 1. fast path: traverse the regions between leftmost and rightmost in the partition, and try to allocate in an affiliated region in the partition;
>> 2. if the fast path fails, traverse the regions between leftmost empty and rightmost empty in the partition, and try to allocate in a FREE region.
>>
>> Step 2 can be skipped if we also remember the first FREE region seen in step 1.
>>
>> The PR makes the code much cleaner and more efficient (although the performance impact may not be measurable; I have run some dacapo benchmarks and didn't see a meaningful difference).
>>
>>
>> Test:
>> - [x] hotspot_gc_shenandoah
>
> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
>
> - Remove can_allocate_in_new_region
> - Merge remote-tracking branch 'origin/master' into collector-allocation
> - Remove condition check for trash region
> - Address the PR review comments
> - Merge branch 'openjdk:master' into collector-allocation
> - Touch up
> - Remove test 'req.is_old()' when steal an empty region from the mutator view
> - Update comment
> - Fix wrong condition when steal an empty region from the mutator view
> - Fix potential failure in young evac
> - ... and 3 more: https://git.openjdk.org/jdk/compare/ec059c0e...8e01d691

Thanks a lot for the review.
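
The single-traversal idea from the quoted description can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in the PR: `Region`, `State`, and `pickRegion` are hypothetical names, regions are reduced to a state plus a free-word count, and the traversal direction and partition bookkeeping of the real free set are ignored.

```java
public class SinglePassAllocSketch {
    enum State { FREE, AFFILIATED }

    static final class Region {
        final State state;
        final int freeWords;

        Region(State state, int freeWords) {
            this.state = state;
            this.freeWords = freeWords;
        }
    }

    // Walks regions[leftmost..rightmost] once. An affiliated region with
    // enough room wins immediately; otherwise the first FREE region that
    // was seen along the way is used. Returns -1 if nothing fits.
    static int pickRegion(Region[] regions, int leftmost, int rightmost, int words) {
        int firstFree = -1;
        for (int i = leftmost; i <= rightmost; i++) {
            Region r = regions[i];
            if (r.freeWords < words) {
                continue; // cannot satisfy the request, whatever its state
            }
            if (r.state == State.AFFILIATED) {
                return i;
            }
            if (firstFree < 0) {
                firstFree = i; // remember the fallback, keep looking
            }
        }
        return firstFree;
    }

    public static void main(String[] args) {
        Region[] regions = {
            new Region(State.FREE, 100),
            new Region(State.AFFILIATED, 10),
            new Region(State.AFFILIATED, 50),
        };
        System.out.println("picked region " + pickRegion(regions, 0, 2, 20));
    }
}
```

Remembering the fallback index costs one extra local variable and removes the second loop over the partition.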
------------- PR Comment: https://git.openjdk.org/jdk/pull/28036#issuecomment-3498932713 From xpeng at openjdk.org Thu Nov 6 19:01:23 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Nov 2025 19:01:23 GMT Subject: Integrated: 8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 05:29:14 GMT, Xiaolong Peng wrote: > To allocate an object in Collector/OldCollector partition, current implementation may traverse the regions in the partition twice: > 1. fast path: traverse regions between left most and right most in the partition, and try to allocate in an affiliated region in the partition; > 2. if fails in fast path, traverse regions between left most empty and right most empty in the partition, and try try to allocate in a FREE region. > > 2 can be saved if we also remember the first FREE region seem in 1. > > The PR makes the code much cleaner, and more efficient(although the performance impact may not be measurable, I have run some dacapo benchmarks and didn't see meaningful difference) > > > Test: > - [x] hotspot_gc_shenandoah This pull request has now been integrated. 
Changeset: 9cc542eb
Author: Xiaolong Peng
URL: https://git.openjdk.org/jdk/commit/9cc542ebcb81552fe8c32a8cc3c63332853e5127
Stats: 77 lines in 2 files changed: 18 ins; 45 del; 14 mod

8370850: Shenandoah: Simplify collector allocation to save unnecessary region iteration

Reviewed-by: wkemper

-------------

PR: https://git.openjdk.org/jdk/pull/28036

From azeller at openjdk.org Thu Nov 6 19:36:06 2025
From: azeller at openjdk.org (Arno Zeller)
Date: Thu, 6 Nov 2025 19:36:06 GMT
Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space
In-Reply-To:
References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com>
Message-ID:

On Thu, 6 Nov 2025 11:17:01 GMT, SendaoYan wrote:

>> Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that the jtreg default heap size on the reporter's machines is too small, and the randomness in the test just sometimes created a huge heap larger than what the test had.
>>
>> Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise.
>>
>> Also used the test to compare Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random.
>> >> >> >> public class TestLargeObjectAlignmentDeterministic { >> >> static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); >> static final int NODE_COUNT = Integer.getInteger("nodes", 10000); >> static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); >> >> static Object[] objects; >> >> public static void main(String[] args) throws Exception { >> objects = new Object[SLABS_COUNT]; >> >> for (int i = 0; i < SLABS_COUNT; i++) { >> objects[i] = createSome(); >> } >> } >> >> public static Object createSome() { >> List result = new ArrayList(); >> for (int c = 0; c < NODE_COUNT; c++) { >> result.add(new Integer(c)); >> } >> return result; >> } >> >> } > > test/hotspot/jtreg/gc/shenandoah/TestLargeObjectAlignment.java line 34: > >> 32: * @library /test/lib >> 33: * >> 34: * @run main/othervm -Xms3g -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ObjectAlignmentInBytes=16 -Xint TestLargeObjectAlignment > > Since the initial heap memory set to 3G, maybe we should add '@requires os.maxMemory > 4g' to skip this test when the physical memory of test machine is less than 4g. I suggest to also set -Xmx3g - otherwise the test will fail in case an -Xmx value of less than 3GB is set by a jtreg -vmoption. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28167#discussion_r2500542446 From wkemper at openjdk.org Thu Nov 6 19:40:14 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 19:40:14 GMT Subject: Integrated: 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old In-Reply-To: References: Message-ID: On Fri, 24 Oct 2025 21:00:40 GMT, William Kemper wrote: > When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. 
It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'. > > # Background > When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers _after_ the thread local SATB buffers were completed. After this change, such pointers are filtered out _before_ the buffers are completed. This is more inline with the natural way of things. This pull request has now been integrated. Changeset: cad73d39 Author: William Kemper URL: https://git.openjdk.org/jdk/commit/cad73d39762974776dd6fda5efe4e2a271d69f14 Stats: 331 lines in 11 files changed: 108 ins; 198 del; 25 mod 8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old Reviewed-by: ysr, kdnilsen ------------- PR: https://git.openjdk.org/jdk/pull/27983 From duke at openjdk.org Thu Nov 6 20:46:13 2025 From: duke at openjdk.org (Rui Li) Date: Thu, 6 Nov 2025 20:46:13 GMT Subject: RFR: 8261743: Shenandoah: enable String deduplication with compact heuristics In-Reply-To: References:

Message-ID: <2pqCwlRDMHj8qRBGXssro17pUf9sGnMLaeb4J_mY_iQ=.99c31242-76e5-4f08-b818-e7dfd0a38ed1@github.com>

On Thu, 6 Nov 2025 09:36:06 GMT, Aleksey Shipilev wrote:

> Looks good. Have you measured any impact on pauses / GC durations in testing?

Yeah. Had a simple benchmark below (credit to [here](https://muratakkan.medium.com/understanding-string-deduplication-in-java-how-it-works-and-when-to-use-it-fbda71711435)):

    @Benchmark
    public void testMethod(Blackhole bh) {
        // This is a demo/sample template for building your JMH benchmarks. Edit as needed.
        // Put your benchmark code here.
        String[] strings = new String[1000000];
        for (int i = 0; i < strings.length; i++) {
            strings[i] = "This is a test string";
        }
        bh.consume(strings);
    }

Results:

    ########
    java -jar target/benchmarks.jar --jvmArgs "-XX:-UseStringDeduplication -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc
    ########
    Benchmark                                   Mode  Cnt        Score      Error   Units
    MyBenchmark.testMethod                     thrpt    3     3181.651 ±  248.835   ops/s
    MyBenchmark.testMethod:gc.alloc.rate       thrpt    3    12136.888 ±  949.185  MB/sec
    MyBenchmark.testMethod:gc.alloc.rate.norm  thrpt    3  4000016.188 ±    0.016    B/op
    MyBenchmark.testMethod:gc.count            thrpt    3     1568.000             counts
    MyBenchmark.testMethod:gc.time             thrpt    3     1882.000                 ms

    ########
    java -jar target/benchmarks.jar --jvmArgs "-XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc
    ########
    Benchmark                                   Mode  Cnt        Score      Error   Units
    MyBenchmark.testMethod                     thrpt    3     3155.961 ±  365.174   ops/s  # throughput decreased by 0.8%
    MyBenchmark.testMethod:gc.alloc.rate       thrpt    3    12038.882 ± 1394.186  MB/sec  # decreased by 0.8%
    MyBenchmark.testMethod:gc.alloc.rate.norm  thrpt    3  4000016.190 ±    0.022    B/op  # same
    MyBenchmark.testMethod:gc.count            thrpt    3     1172.000             counts  # decreased by 25%
    MyBenchmark.testMethod:gc.time             thrpt    3      726.000                 ms  # decreased by 61.4% (to 38.6% of the baseline)

The alloc rate / throughput didn't change much, but the gc count dropped by about 25% and the gc time by about 61% (to 38.6% of the baseline).
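
To double-check the percentages quoted for the two runs, a small helper can compute the relative change from the reported scores. The class and method names here are hypothetical and not part of the benchmark. Note that 726 ms is about 38.6% of 1882 ms, which corresponds to a reduction of about 61.4%.

```java
public class PercentChange {
    // Relative change from baseline to current, in percent.
    // A negative result means the value decreased.
    static double percentChange(double baseline, double current) {
        return (current - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the two JMH runs quoted in this message.
        System.out.printf("throughput:    %.1f%%%n", percentChange(3181.651, 3155.961));
        System.out.printf("gc.alloc.rate: %.1f%%%n", percentChange(12136.888, 12038.882));
        System.out.printf("gc.count:      %.1f%%%n", percentChange(1568.0, 1172.0));
        System.out.printf("gc.time:       %.1f%%%n", percentChange(1882.0, 726.0));
    }
}
```

With only three iterations per run and error bars of several percent on throughput, the sub-1% throughput difference is within noise; the gc.count and gc.time differences are much larger than that.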
Not that familiar with string deduplication, but according to https://openjdk.org/jeps/192, the results seem to match the implementation - does not dedup at allocation time, but dedup the string internal char array at gc time and gc would be more efficient. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28170#issuecomment-3499316594 From wkemper at openjdk.org Thu Nov 6 21:00:34 2025 From: wkemper at openjdk.org (William Kemper) Date: Thu, 6 Nov 2025 21:00:34 GMT Subject: RFR: 8370039: GenShen: array copy SATB barrier improvements Message-ID: When an array copy happens concurrently with old and young marking, Shenandoah's generational mode walks over the array twice. This is unnecessary and increases the workload for marking threads. It also has been unconditionally enqueuing old references during a young mark. This is also unnecessary and also increases marking workload. Finally, the barrier went through a somewhat complicated decision process based on affiliation of the region where the array resides. However, the barrier must consider the affiliation of objects that are pointed at by array elements. ------------- Commit messages: - More simplification of arraycopy mark barrier - Merge remote-tracking branch 'jdk/master' into satb-fixes - Merge remote-tracking branch 'jdk/master' into satb-fixes - Why does this break? - Simplify arraycopy barrier case for marking - Merge remote-tracking branch 'jdk/master' into satb-fixes - Do not unconditionally enqueue old objects during marking Changes: https://git.openjdk.org/jdk/pull/28183/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28183&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370039 Stats: 53 lines in 2 files changed: 0 ins; 46 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/28183.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28183/head:pull/28183 PR: https://git.openjdk.org/jdk/pull/28183 From ysr at openjdk.org Thu Nov 6 21:12:03 2025 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 6 Nov 2025 21:12:03 GMT Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v6] In-Reply-To: References:

Message-ID:

On Wed, 5 Nov 2025 02:24:11 GMT, Kelvin Nilsen wrote:

>> When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading.
>>
>> The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
> Revert "Fixup handling of weakly marked objects in remembered set"
>
> This reverts commit 80198abe5d06c3532d9a43a53691376e990ed45f.

It looks like a few older comments had not been published previously. I am flushing those and will take a fresh look at the review.

-------------

PR Review: https://git.openjdk.org/jdk/pull/27353#pullrequestreview-3300762300

From ysr at openjdk.org Thu Nov 6 21:12:12 2025
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Thu, 6 Nov 2025 21:12:12 GMT
Subject: RFR: 8358735: GenShen: bug in #undef'd code in block_start() [v2]
In-Reply-To: <7zV-fLvjb-4gBVTppg4XTXPNxEheqLfxB0v_WONuinI=.22775b58-42cf-499e-9007-fad07118217d@github.com>
References: <7zV-fLvjb-4gBVTppg4XTXPNxEheqLfxB0v_WONuinI=.22775b58-42cf-499e-9007-fad07118217d@github.com>
Message-ID:

On Fri, 3 Oct 2025 00:21:11 GMT, Kelvin Nilsen wrote:

>> When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading.
>>
>> The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
> fix idiosyncratic formatting

src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 129:

> 127: inline idx_t get_prev_bit_impl(idx_t l_index, idx_t r_index) const;
> 128:
> 129: inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const;

Please document this analogously to line 131.

src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 131:

> 129: inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const;
> 130:
> 131: // Search for last one in the range [l_index, r_index). Return r_index if not found.

Symmetry arguments wrt the spec for `get_next_one_offset` may have preferred the range `(l_index, r_index]`, returning `l_index` if none found. Maybe its (transitive) usage prefers this shape? (See similar comment at line 180.)

src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 134:

> 132: inline idx_t get_prev_one_offset(idx_t l_index, idx_t r_index) const;
> 133:
> 134: void clear_large_range(idx_t beg, idx_t end);

Please add a documentation comment.

src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.hpp line 180:

> 178: const HeapWord* limit) const;
> 179:
> 180: // Return the last marked address in the range [limit, addr], or addr+1 if none found.

Symmetry would have preferred `(limit, addr]` as the range, with `limit` if none found. However, maybe usage of this method prefers the present shape?

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 251:

> 249: // if marking context is valid and we are below tams, we use the marking bit map to find the first marked object that
> 250: // intersects with this card, and if no such object exists, we return null
> 251: if ((ctx != nullptr) && (left < tams)) {

It seems like the caller should check if `left >= tams` and short-circuit rather than have this method do that work.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 663:

> 661: // we expect that the marking context isn't available and the crossing maps are valid.
> 662: // Note that crossing maps may be invalid following class unloading and before dead
> 663: // or unloaded objects have been coalesced and filled (updating the crossing maps).

Good comment! What's still not clear is why `tams` and `last_relevant_card_index` are passed here. Does it reduce the work in the caller? I'd expect this to just return the first object on the card index or null if no such object exists. I realize `ctx` is used when one must consult the marking context in preference to the "crossing maps". The relevance of the last 2 arguments isn't clear from this documentation comment. Maybe I'll see why these are passed in when I look at the method definition, but I suspect there may be some leakage of abstraction & functionality here between caller and callee.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 177:

> 175: // common.
> 176: assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error");
> 177: HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, dirty_r);

Passing `ctx`, `tams`, and `dirty_r` into this method seems interesting. Let's see how they are used.
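
To make the two range conventions discussed for `get_prev_one_offset` in this review concrete, here is a small sketch using `java.util.BitSet` in place of the Shenandoah mark bitmap. It is only an illustration of the contracts, assuming plain `int` indices; the class and method names are hypothetical and this is not the HotSpot code.

```java
import java.util.BitSet;

public class PrevOneOffsetSketch {
    // Documented shape: last set bit in [l, r), or r if none is set.
    // r lies outside the searched range, so it is unambiguous as "not found".
    static int prevOneOffset(BitSet bits, int l, int r) {
        if (l >= r) {
            return r;
        }
        int i = bits.previousSetBit(r - 1); // -1 when no bit is set at or below r - 1
        return (i >= l) ? i : r;
    }

    // The symmetric alternative raised in the review: last set bit in (l, r],
    // or l if none is set. Here l is the out-of-range sentinel.
    static int prevOneOffsetSymmetric(BitSet bits, int l, int r) {
        if (l >= r) {
            return l;
        }
        int i = bits.previousSetBit(r);
        return (i > l) ? i : l;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(3);
        bits.set(7);
        System.out.println(prevOneOffset(bits, 0, 10));
        System.out.println(prevOneOffsetSymmetric(bits, 3, 10));
    }
}
```

Either convention works as long as the sentinel is outside the searched range; which one is more convenient depends on how callers compute their bounds.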
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403250259 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403253570 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403255239 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403247074 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403311952 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403309508 PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2403300158 From duke at openjdk.org Thu Nov 6 21:54:00 2025 From: duke at openjdk.org (Rui Li) Date: Thu, 6 Nov 2025 21:54:00 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: On Thu, 6 Nov 2025 00:21:55 GMT, Rui Li wrote: > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg deafult heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. > > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. 
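
The worst-case figure mentioned in the description above can be approximated with back-of-the-envelope arithmetic. The sketch below is not the author's calculation; it assumes 16 bytes per `Integer` (matching `-XX:ObjectAlignmentInBytes=16`), 4 bytes per reference, and an `ArrayList` backing array that starts at capacity 10 and grows by half each time, and it ignores object and array headers. All names are hypothetical.

```java
public class HeapEstimateSketch {
    // Backing-array capacity after the given number of add() calls, assuming
    // an initial capacity of 10 and growth of cap + (cap >> 1).
    static int arrayListCapacity(int adds) {
        if (adds <= 0) {
            return 0;
        }
        int cap = 10;
        while (cap < adds) {
            cap += cap >> 1;
        }
        return cap;
    }

    // Live bytes held by `slabs` lists of `nodes` boxed values each.
    static long estimateBytes(int slabs, int nodes, int bytesPerNode, int bytesPerRef) {
        long perSlab = (long) nodes * bytesPerNode
                     + (long) arrayListCapacity(nodes) * bytesPerRef;
        return (long) slabs * perSlab;
    }

    public static void main(String[] args) {
        // Defaults from the deterministic test variant: 10000 slabs of 10000 nodes.
        long bytes = estimateBytes(10_000, 10_000, 16, 4);
        System.out.printf("estimated live data: %.2f GiB%n",
                          bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```

Under these assumptions the estimate comes out near 2 GiB, which is in the same range as the "at least 2g" figure and the observed failure threshold of roughly 2050m to 2150m.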
> > > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List result = new ArrayList(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } Will set Xmx to 3g instead and add `@requires os.maxMemory > 3g` Initially set Xms to 3g was purely to satisfy the "minimal heap size needed" condition for the test. `@requires os.maxMemory` is apparently more close to what I was thinking and I didn't think about reproducibility. Thanks for the suggestions! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28167#issuecomment-3499525127 From duke at openjdk.org Thu Nov 6 22:25:32 2025 From: duke at openjdk.org (Rui Li) Date: Thu, 6 Nov 2025 22:25:32 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v2] In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg deafult heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. 
> > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. > > > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List result = new ArrayList(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } Rui Li has updated the pull request incrementally with one additional commit since the last revision: Set Xmx to 3g ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28167/files - new: https://git.openjdk.org/jdk/pull/28167/files/31e4adfc..6c442aa0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=00-01 Stats: 10 lines in 1 file changed: 2 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/28167.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28167/head:pull/28167 PR: https://git.openjdk.org/jdk/pull/28167 From duke at openjdk.org Thu Nov 6 22:35:09 2025 From: duke at openjdk.org (duke) Date: Thu, 6 Nov 2025 22:35:09 GMT Subject: RFR: 8261743: Shenandoah: enable String 
deduplication with compact heuristics In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:38:36 GMT, Rui Li wrote: > Enable `UseStringDeduplication` when using compact heuristics. > > Testing: > > ./build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact -XX:+PrintFlagsFinal --version | grep UseStringDeduplication > bool UseStringDeduplication = true {product} {default} > > > Note: The labels should be `{product} {ergonomic}` ideally. Pending on a separate issue: [JDK-8371381](https://bugs.openjdk.org/browse/JDK-8371381) > > > ------- > > Edit: add benchmark results: > > Had a simple benchmark below (credit to [here](https://muratakkan.medium.com/understanding-string-deduplication-in-java-how-it-works-and-when-to-use-it-fbda71711435)): > > > > @Benchmark > public void testMethod(Blackhole bh) { > String[] strings = new String[1000000]; > for (int i = 0; i < strings.length; i++) { > strings[i] = "This is a test string"; > } > bh.consume(strings); > } > > > Results: > > ######## > java -jar target/benchmarks.jar --jvmArgs "-XX:-UseStringDeduplication -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc > ######## > Benchmark Mode Cnt Score Error Units > MyBenchmark.testMethod thrpt 3 3181.651 ? 248.835 ops/s > MyBenchmark.testMethod:gc.alloc.rate thrpt 3 12136.888 ? 949.185 MB/sec > MyBenchmark.testMethod:gc.alloc.rate.norm thrpt 3 4000016.188 ? 0.016 B/op > MyBenchmark.testMethod:gc.count thrpt 3 1568.000 counts > MyBenchmark.testMethod:gc.time thrpt 3 1882.000 ms > > ######## > java -jar target/benchmarks.jar --jvmArgs "-XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc > ######## > Benchmark Mode Cnt Score Error Units > MyBenchmark.testMethod thrpt 3 3155.961 ? 365.174 ops/s # throuput decreased by 0.8% > MyBenchmark.testMethod:gc.alloc.rate thrpt 3 12038.882 ? 
1394.186 MB/sec # decreased by 0.8% > MyBenchmark.testMethod:gc.alloc.rate.norm thrpt 3 4000016.190 ? 0.022 B/op # same > MyBenchmark.testMethod:gc.count thrpt 3 1172.000 counts # decreased by 25% > MyBenchmark.testMethod:gc.time thrpt 3 726.000 ms # decrea... @rgithubli Your change (at version 2407976e137e42f3f831690c14501b0e338a4c4e) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28170#issuecomment-3499636944 From duke at openjdk.org Thu Nov 6 22:41:20 2025 From: duke at openjdk.org (Rui Li) Date: Thu, 6 Nov 2025 22:41:20 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v3] In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: <8F4jfHS1_VpRu9qzZdgO85PMEZRvL3FVMHMQDdmr7MU=.da7b2e1c-282b-47d1-9c46-aa684520acbb@github.com> > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg deafult heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. > > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. 
> > > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List result = new ArrayList(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } Rui Li has updated the pull request incrementally with one additional commit since the last revision: Adjust tag order ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28167/files - new: https://git.openjdk.org/jdk/pull/28167/files/6c442aa0..3dbe9a10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28167.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28167/head:pull/28167 PR: https://git.openjdk.org/jdk/pull/28167 From xpeng at openjdk.org Thu Nov 6 23:16:23 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Nov 2025 23:16:23 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v6] In-Reply-To: References: Message-ID: > Shenandoah always allocates memory with heap lock, we have observed heavy heap lock contention on memory allocation path in performance analysis of some service in which we tried to adopt Shenandoah. 
This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works:
> * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* pointers that mutator threads use for regular object allocations. These are called shared regions or directly allocatable regions, and they are stored in a PaddedEnd data structure (padded array).
> * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with an atomic CAS operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array.
> * If a mutator thread fails after trying 3 directly allocatable regions, it will:
>   * Take the heap lock.
>   * Try to retire the directly allocatable regions that are ready to retire.
>   * If any regions need to be retired, iterate over the mutator partition to allocate new directly allocatable regions and store them in the padded array.
>   * Satisfy the mutator allocation request if possible.
>
>
> I'm not expecting a significant performance impact in most cases, since contention on the heap lock is usually not high enough to cause a performance issue. I have done many tests; here are some of them:
>
> 1.
DaCapo lusearch test on EC2 host with 96 CPU cores:
> OpenJDK tip:
>
> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing"
> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====...

Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 138 commits:

 - Add assert back
 - Address test failures after merging the change from master which unify the accounting in FreeSet and ShenandoahGeneration
 - Merge remote-tracking branch 'origin/master' into cas-alloc-1
 - Merge branch 'openjdk:master' into cas-alloc-1
 - Merge branch 'openjdk:master' into cas-alloc-1
 - format
 - Merge branch 'openjdk:master' into cas-alloc-1
 - Merge branch 'openjdk:master' into cas-alloc-1
 - Merge branch 'master' into cas-alloc-1
 - Move ShenandoahHeapRegionIterationClosure to shenandoahFreeSet.hpp
 - ...
and 128 more: https://git.openjdk.org/jdk/compare/ec059c0e...dfb9c415 ------------- Changes: https://git.openjdk.org/jdk/pull/26171/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=05 Stats: 780 lines in 17 files changed: 690 ins; 23 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/26171.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171 PR: https://git.openjdk.org/jdk/pull/26171 From xpeng at openjdk.org Thu Nov 6 23:29:10 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 6 Nov 2025 23:29:10 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v5] In-Reply-To: References:

Message-ID:

On Wed, 5 Nov 2025 22:28:06 GMT, Kelvin Nilsen wrote:

>> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 135 commits:
>>
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - format
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - Merge branch 'master' into cas-alloc-1
>> - Move ShenandoahHeapRegionIterationClosure to shenandoahFreeSet.hpp
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - Fix errors caused by renaming ofAtomic to AtomicAccess
>> - Merge branch 'openjdk:master' into cas-alloc-1
>> - ... and 125 more: https://git.openjdk.org/jdk/compare/2f613911...e6bfef05
>
> src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 559:
>
>> 557: range(1, 128) \
>> 558: \
>> 559: product(uintx, ShenandoahDirectAllocationMaxProbes, 3, EXPERIMENTAL, \
>
> I think we found that setting DirectAllocationMaxProbes to equal ShenandoahDirectlyAlloctableRegionCount works "best". I'm inclined to remove this parameter entirely as it somewhat simplifies the implementation. If you think we want to keep it, can you explain the rationale? Would we change the default value?

I just fixed all the test failures after merging your changes from tip.

Totally agreed. I plan to remove DirectAllocationMaxProbes; based on the test results, we don't need to keep it. I'll update the PR to remove it shortly.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2501197258

From duke at openjdk.org Thu Nov 6 23:50:13 2025
From: duke at openjdk.org (Rui Li)
Date: Thu, 6 Nov 2025 23:50:13 GMT
Subject: Integrated: 8261743: Shenandoah: enable String deduplication with compact heuristics
In-Reply-To:
References:
Message-ID: <3KIq-OuvFuL4npIJZcS5IonVljwX8UMjOCfZNv4Pl-I=.ec7ead22-0fae-4993-ac22-6fa0b3a2c824@github.com>

On Thu, 6 Nov 2025 01:38:36 GMT, Rui Li wrote:

> Enable `UseStringDeduplication` when using compact heuristics.
>
> Testing:
>
> ./build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact -XX:+PrintFlagsFinal --version | grep UseStringDeduplication
> bool UseStringDeduplication = true {product} {default}
>
>
> Note: The labels should ideally be `{product} {ergonomic}`. This is pending on a separate issue: [JDK-8371381](https://bugs.openjdk.org/browse/JDK-8371381)
>
>
> -------
>
> Edit: added benchmark results:
>
> I ran the simple benchmark below (credit to [here](https://muratakkan.medium.com/understanding-string-deduplication-in-java-how-it-works-and-when-to-use-it-fbda71711435)):
>
>
>
> @Benchmark
> public void testMethod(Blackhole bh) {
>     String[] strings = new String[1000000];
>     for (int i = 0; i < strings.length; i++) {
>         strings[i] = "This is a test string";
>     }
>     bh.consume(strings);
> }
>
>
> Results:
>
> ########
> java -jar target/benchmarks.jar --jvmArgs "-XX:-UseStringDeduplication -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc
> ########
> Benchmark Mode Cnt Score Error Units
> MyBenchmark.testMethod thrpt 3 3181.651 ± 248.835 ops/s
> MyBenchmark.testMethod:gc.alloc.rate thrpt 3 12136.888 ± 949.185 MB/sec
> MyBenchmark.testMethod:gc.alloc.rate.norm thrpt 3 4000016.188 ±
0.016 B/op
> MyBenchmark.testMethod:gc.count thrpt 3 1568.000 counts
> MyBenchmark.testMethod:gc.time thrpt 3 1882.000 ms
>
> ########
> java -jar target/benchmarks.jar --jvmArgs "-XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=compact" -prof gc
> ########
> Benchmark Mode Cnt Score Error Units
> MyBenchmark.testMethod thrpt 3 3155.961 ± 365.174 ops/s # throughput decreased by 0.8%
> MyBenchmark.testMethod:gc.alloc.rate thrpt 3 12038.882 ± 1394.186 MB/sec # decreased by 0.8%
> MyBenchmark.testMethod:gc.alloc.rate.norm thrpt 3 4000016.190 ± 0.022 B/op # same
> MyBenchmark.testMethod:gc.count thrpt 3 1172.000 counts # decreased by 25%
> MyBenchmark.testMethod:gc.time thrpt 3 726.000 ms # decrea...

This pull request has now been integrated.

Changeset: e34a8318
Author: Rui Li
Committer: Xiaolong Peng
URL: https://git.openjdk.org/jdk/commit/e34a831814996be3e0a2df86b11b1718a76ea558
Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod

8261743: Shenandoah: enable String deduplication with compact heuristics

Reviewed-by: shade, wkemper

-------------

PR: https://git.openjdk.org/jdk/pull/28170

From kdnilsen at openjdk.org Thu Nov 6 23:52:05 2025
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Thu, 6 Nov 2025 23:52:05 GMT
Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v2]
In-Reply-To:
References: <7zV-fLvjb-4gBVTppg4XTXPNxEheqLfxB0v_WONuinI=.22775b58-42cf-499e-9007-fad07118217d@github.com>
Message-ID:

On Fri, 3 Oct 2025 20:47:16 GMT, Y. Srinivas Ramakrishna wrote:

>> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fix idiosyncratic formatting
>
> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 663:
>
>> 661: // we expect that the marking context isn't available and the crossing maps are valid.
>> 662: // Note that crossing maps may be invalid following class unloading and before dead
>> 663: // or unloaded objects have been coalesced and filled (updating the crossing maps).
>
> Good comment!
>
> What's still not clear is why `tams` and `last_relevant_card_index` are passed here. Does it reduce the work in the caller? I'd expect this to just return the first object on the card index or null if no such object exists. I realize `ctx` is used when one must consult the marking context in preference to the "crossing maps". The relevance of the last 2 arguments isn't clear from this documentation comment.
>
> Maybe I'll see why these are passed in when I look at the method definition, but I suspect there may be some leakage of abstraction & functionality here between caller and callee.

Thanks for identifying this "confusion". I'm attempting to improve the documentation in this comment.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2501237330

From xpeng at openjdk.org Fri Nov 7 00:12:25 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 7 Nov 2025 00:12:25 GMT
Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v7]
In-Reply-To:
References:
Message-ID:

> Shenandoah always allocates memory under the heap lock, and we have observed heavy heap lock contention on the memory allocation path in performance analysis of a service in which we tried to adopt Shenandoah. This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works:
> * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* pointers that mutator threads use for regular object allocations. These are called shared regions or directly allocatable regions, and they are stored in a PaddedEnd data structure (padded array).
> * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with an atomic CAS operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array.
> * If a mutator thread fails after trying 3 directly allocatable regions, it will:
>   * Take the heap lock.
>   * Try to retire the directly allocatable regions that are ready to retire.
>   * If any regions need to be retired, iterate over the mutator partition to allocate new directly allocatable regions and store them in the padded array.
>   * Satisfy the mutator allocation request if possible.
>
>
> I'm not expecting a significant performance impact in most cases, since contention on the heap lock is usually not high enough to cause a performance issue. I have done many tests; here are some of them:
>
> 1. DaCapo lusearch test on EC2 host with 96 CPU cores:
> OpenJDK tip:
>
> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing"
> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====...
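
The fast path and the lock-protected slow path in the quoted description can be modeled roughly as follows. This is a hypothetical Java sketch, not the HotSpot C++ code from the PR: the class `SharedRegionAllocator`, its fields, the `threadSlot` parameter, and the retirement rule (a probed region is replaced when it cannot fit the current request) are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Hypothetical model of CAS allocation in shared regions; all names are illustrative. */
final class SharedRegionAllocator {

    /** A region whose free space is [top, end); top only moves forward via CAS. */
    static final class Region {
        final long end;
        final AtomicLong top;

        Region(long start, long size) {
            this.top = new AtomicLong(start);
            this.end = start + size;
        }

        /** Returns the start address of the new block, or -1 if it does not fit. */
        long tryAllocate(long size) {
            while (true) {
                long cur = top.get();
                if (cur + size > end) {
                    return -1;
                }
                if (top.compareAndSet(cur, cur + size)) {
                    return cur;
                }
                // Another thread advanced top first; reload and retry.
            }
        }
    }

    private final AtomicReferenceArray<Region> shared; // stand-in for the padded array
    private final int maxProbes;
    private final long regionSize;
    private final long heapEnd;
    private final Object heapLock = new Object();
    private long nextRegionStart; // guarded by heapLock after construction

    SharedRegionAllocator(int sharedCount, int maxProbes, long regionSize, long heapSize) {
        this.shared = new AtomicReferenceArray<>(sharedCount);
        this.maxProbes = maxProbes;
        this.regionSize = regionSize;
        this.heapEnd = heapSize;
        for (int i = 0; i < sharedCount; i++) {
            shared.set(i, takeFreeRegion());
        }
    }

    /** Carves the next unused region out of the heap, or returns null when none is left. */
    private Region takeFreeRegion() {
        if (nextRegionStart + regionSize > heapEnd) {
            return null;
        }
        Region r = new Region(nextRegionStart, regionSize);
        nextRegionStart += regionSize;
        return r;
    }

    /** Lock-free fast path: probe up to maxProbes consecutive shared regions. */
    private long tryShared(int threadSlot, long size) {
        for (int i = 0; i < maxProbes; i++) {
            Region r = shared.get(Math.floorMod(threadSlot + i, shared.length()));
            long addr = (r == null) ? -1 : r.tryAllocate(size);
            if (addr >= 0) {
                return addr;
            }
        }
        return -1;
    }

    /** Returns an address, or -1 when the request cannot be satisfied. */
    long allocate(int threadSlot, long size) {
        if (size > regionSize) {
            return -1; // oversized requests are out of scope for this sketch
        }
        long addr = tryShared(threadSlot, size);
        if (addr >= 0) {
            return addr;
        }
        synchronized (heapLock) {
            // Slow path: replace probed regions that cannot fit the request, then retry.
            for (int i = 0; i < maxProbes; i++) {
                int idx = Math.floorMod(threadSlot + i, shared.length());
                Region r = shared.get(idx);
                if (r == null || r.top.get() + size > r.end) {
                    Region fresh = takeFreeRegion();
                    if (fresh != null) {
                        shared.set(idx, fresh);
                    }
                }
            }
            return tryShared(threadSlot, size);
        }
    }

    public static void main(String[] args) {
        SharedRegionAllocator heap = new SharedRegionAllocator(2, 2, 64, 256);
        for (int i = 0; i < 5; i++) {
            System.out.println(heap.allocate(0, 48));
        }
    }
}
```

The point of the design is that the common case touches only one atomic word per allocation and never takes the lock; the lock is needed only when the probed regions are exhausted. The sketch ignores everything a real collector has to handle at retirement (wasted tail space, accounting, GC safepoints).
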
Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:

  Set region back to empty and unaffiliated when release a directly allocatable region(may happen in full GC)

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26171/files
  - new: https://git.openjdk.org/jdk/pull/26171/files/dfb9c415..3fdb0bc0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=05-06

  Stats: 12 lines in 2 files changed: 10 ins; 1 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/26171.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171

PR: https://git.openjdk.org/jdk/pull/26171

From xpeng at openjdk.org Fri Nov 7 00:49:53 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 7 Nov 2025 00:49:53 GMT
Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v8]
In-Reply-To:
References:
Message-ID:

> Shenandoah always allocates memory under the heap lock, and we have observed heavy heap lock contention on the memory allocation path in performance analysis of a service in which we tried to adopt Shenandoah. This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works:
> * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* pointers that mutator threads use for regular object allocations. These are called shared regions or directly allocatable regions, and they are stored in a PaddedEnd data structure (padded array).
> * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with an atomic CAS operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array.
> * If a mutator thread fails after trying 3 directly allocatable regions, it will:
>   * Take the heap lock.
>   * Try to retire the directly allocatable regions that are ready to retire.
>   * If any regions need to be retired, iterate over the mutator partition to allocate new directly allocatable regions and store them in the padded array.
>   * Satisfy the mutator allocation request if possible.
>
>
> I'm not expecting a significant performance impact in most cases, since contention on the heap lock is usually not high enough to cause a performance issue. I have done many tests; here are some of them:
>
> 1. DaCapo lusearch test on EC2 host with 96 CPU cores:
> OpenJDK tip:
>
> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing"
> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====...
Xiaolong Peng has updated the pull request incrementally with three additional commits since the last revision:

 - Remove unnecessary fences
 - tidy up
 - Remove ShenandoahDirectAllocationMaxProbes, simplifications

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26171/files
  - new: https://git.openjdk.org/jdk/pull/26171/files/3fdb0bc0..68b4673a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=06-07

  Stats: 55 lines in 3 files changed: 12 ins; 34 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/26171.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171

PR: https://git.openjdk.org/jdk/pull/26171

From xpeng at openjdk.org Fri Nov 7 01:07:51 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 7 Nov 2025 01:07:51 GMT
Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v9]
In-Reply-To:
References:
Message-ID: <_31UGErPvJMV2V7F92kffihoUTtJ7L3HrX-SGgTXj_M=.030a9b2a-8ef0-4d23-a2dd-69ad70e17656@github.com>

> Shenandoah always allocates memory under the heap lock, and we have observed heavy heap lock contention on the memory allocation path in performance analysis of a service in which we tried to adopt Shenandoah. This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works:
> * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* pointers that mutator threads use for regular object allocations. These are called shared regions or directly allocatable regions, and they are stored in a PaddedEnd data structure (padded array).
> * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with an atomic CAS operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array.
> * If a mutator thread fails after trying 3 directly allocatable regions, it will:
>   * Take the heap lock.
>   * Try to retire the directly allocatable regions that are ready to retire.
>   * If any regions need to be retired, iterate over the mutator partition to allocate new directly allocatable regions and store them in the padded array.
>   * Satisfy the mutator allocation request if possible.
>
>
> I'm not expecting a significant performance impact in most cases, since contention on the heap lock is usually not high enough to cause a performance issue. I have done many tests; here are some of them:
>
> 1. DaCapo lusearch test on EC2 host with 96 CPU cores:
> OpenJDK tip:
>
> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing"
> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====...

Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 143 commits:

 - Merge branch 'openjdk:master' into cas-alloc-1
 - Remove unnecessary fences
 - tidy up
 - Remove ShenandoahDirectAllocationMaxProbes, simplifications
 - Set region back to empty and unaffiliated when release a directly allocatable region(may happen in full GC)
 - Add assert back
 - Address test failures after merging the change from master which unify the accounting in FreeSet and ShenandoahGeneration
 - Merge remote-tracking branch 'origin/master' into cas-alloc-1
 - Merge branch 'openjdk:master' into cas-alloc-1
 - Merge branch 'openjdk:master' into cas-alloc-1
 - ... and 133 more: https://git.openjdk.org/jdk/compare/e34a8318...e65457f6

-------------

Changes: https://git.openjdk.org/jdk/pull/26171/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=08
  Stats: 767 lines in 17 files changed: 677 ins; 23 del; 67 mod
  Patch: https://git.openjdk.org/jdk/pull/26171.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171

PR: https://git.openjdk.org/jdk/pull/26171

From syan at openjdk.org Fri Nov 7 02:12:06 2025
From: syan at openjdk.org (SendaoYan)
Date: Fri, 7 Nov 2025 02:12:06 GMT
Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v3]
In-Reply-To: <8F4jfHS1_VpRu9qzZdgO85PMEZRvL3FVMHMQDdmr7MU=.da7b2e1c-282b-47d1-9c46-aa684520acbb@github.com>
References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com>
 <8F4jfHS1_VpRu9qzZdgO85PMEZRvL3FVMHMQDdmr7MU=.da7b2e1c-282b-47d1-9c46-aa684520acbb@github.com>
Message-ID:

On Thu, 6 Nov 2025 22:41:20 GMT, Rui Li wrote:

>> Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that the jtreg default heap size on the reporter's machines is too small, and the randomness in the test sometimes required more heap than the test had.
>>
>> Did a calculation for the worst case (see the code snippet at the end; it removes the Random in the original test and always fills the array completely), and the test needs at least 2g. Setting a 3g heap for safety to reduce the noise.
>>
>> Also used the test to compare Shenandoah and GenShen: on my laptop (Mac M3), Shenandoah failed at Xmx2150m, while GenShen passed at Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse; it's actually better. The reported GenShen failure probably came from the Random.
>>
>>
>>
>> public class TestLargeObjectAlignmentDeterministic {
>>
>>     static final int SLABS_COUNT = Integer.getInteger("slabs", 10000);
>>     static final int NODE_COUNT = Integer.getInteger("nodes", 10000);
>>     static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000);
>>
>>     static Object[] objects;
>>
>>     public static void main(String[] args) throws Exception {
>>         objects = new Object[SLABS_COUNT];
>>
>>         for (int i = 0; i < SLABS_COUNT; i++) {
>>             objects[i] = createSome();
>>         }
>>     }
>>
>>     public static Object createSome() {
>>         List result = new ArrayList();
>>         for (int c = 0; c < NODE_COUNT; c++) {
>>             result.add(new Integer(c));
>>         }
>>         return result;
>>     }
>>
>> }
>
> Rui Li has updated the pull request incrementally with one additional commit since the last revision:
>
> Adjust tag order

test/hotspot/jtreg/gc/shenandoah/TestLargeObjectAlignment.java line 32:

> 30: * @requires vm.gc.Shenandoah
> 31: * @requires vm.bits == "64"
> 32: * @requires os.maxMemory > 3G

I think os.maxMemory should be slightly larger than the maximum heap size, since the heap is not the only memory the JVM uses.
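
As a rough cross-check of the quoted worst-case figure, the live set of the deterministic snippet can be estimated by hand. The sketch below is only an estimate under stated assumptions that do not come from the thread: 16 bytes per `Integer`, 4-byte compressed references, a 16-byte array header, 8-byte object alignment, and an `ArrayList` backing array that starts at capacity 10 and grows by half each time. The class and method names are hypothetical.

```java
/** Back-of-the-envelope live-set estimate for the deterministic test variant. */
final class WorstCaseHeapEstimate {
    // Assumed object sizes for a 64-bit HotSpot JVM with default settings.
    static final long INTEGER_BYTES = 16;      // 12-byte header + 4-byte int value
    static final long ARRAY_HEADER_BYTES = 16; // 12-byte header + 4-byte length
    static final long REF_BYTES = 4;           // compressed oops

    /** Backing-array capacity after n add() calls on a default-constructed ArrayList. */
    static int arrayListCapacityFor(int n) {
        int cap = 10;
        while (cap < n) {
            cap += cap >> 1; // grow by 50%, rounded down
        }
        return cap;
    }

    /** Rounds an object size up to the next multiple of 8 bytes. */
    static long alignTo8(long bytes) {
        return (bytes + 7) & ~7L;
    }

    /** Live bytes held by `slabs` lists of `nodes` boxed integers each. */
    static long liveBytes(int slabs, int nodes) {
        long boxed = nodes * INTEGER_BYTES;
        long backing = alignTo8(ARRAY_HEADER_BYTES + REF_BYTES * arrayListCapacityFor(nodes));
        return slabs * (boxed + backing);
    }

    public static void main(String[] args) {
        long bytes = liveBytes(10000, 10000);
        System.out.println(bytes + " bytes, about " + (bytes >> 20) + " MiB");
    }
}
```

Under these assumptions the live data alone comes to a little over 2 GB, which is in line with the "at least 2g" figure quoted above. The estimate ignores the small `ArrayList` objects themselves, the garbage left behind by array growth, and region fragmentation, and the JVM needs native memory on top of the heap, so the real requirement is somewhat higher.
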
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28167#discussion_r2501446720 From kdnilsen at openjdk.org Fri Nov 7 06:14:45 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 7 Nov 2025 06:14:45 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v7] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: fix up comments and simplify API for ShenandoahScanRemembered::first_object_start() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/643cdfd6..637c1775 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=05-06 Stats: 43 lines in 3 files changed: 19 ins; 13 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From kdnilsen at openjdk.org Fri Nov 7 18:25:37 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 7 Nov 2025 18:25:37 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v8] In-Reply-To: References: Message-ID: <6QHIbCuNkwTlUX2e3kjvaQF0G6WjJSQbP0bmqLtoYXQ=.1b42521a-eff2-4eab-834b-97c9dc537f2b@github.com> > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. 
The existing code is not reliable following class unloading.
>
> The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable.

Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:

  consider last_relevant_card in determining right-most address

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/27353/files
  - new: https://git.openjdk.org/jdk/pull/27353/files/637c1775..0e2120b8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=06-07

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/27353.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353

PR: https://git.openjdk.org/jdk/pull/27353

From xpeng at openjdk.org Fri Nov 7 19:42:31 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 7 Nov 2025 19:42:31 GMT
Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v10]
In-Reply-To:
References:
Message-ID:

> Shenandoah always allocates memory under the heap lock, and we have observed heavy heap lock contention on the memory allocation path in performance analysis of a service in which we tried to adopt Shenandoah. This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works:
> * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* pointers that mutator threads use for regular object allocations. These are called shared regions or directly allocatable regions, and they are stored in a PaddedEnd data structure (padded array).
> * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with an atomic CAS operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array.
> * If a mutator thread fails after trying 3 directly allocatable regions, it will:
>   * Take the heap lock.
>   * Try to retire the directly allocatable regions that are ready to retire.
>   * If any regions need to be retired, iterate over the mutator partition to allocate new directly allocatable regions and store them in the padded array.
>   * Satisfy the mutator allocation request if possible.
>
>
> I'm not expecting a significant performance impact in most cases, since contention on the heap lock is usually not high enough to cause a performance issue. I have done many tests; here are some of them:
>
> 1. DaCapo lusearch test on EC2 host with 96 CPU cores:
> OpenJDK tip:
>
> [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing"
> ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events =====
> ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====...

Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 144 commits: - Merge branch 'openjdk:master' into cas-alloc-1 - Merge branch 'openjdk:master' into cas-alloc-1 - Remove unnecessary fences - tidy up - Remove ShenandoahDirectAllocationMaxProbes, simplifications - Set region back to empty and unaffiliated when release a directly allocatable region(may happen in full GC) - Add assert back - Address test failures after merging the change from master which unify the accounting in FreeSet and ShenandoahGeneration - Merge remote-tracking branch 'origin/master' into cas-alloc-1 - Merge branch 'openjdk:master' into cas-alloc-1 - ... and 134 more: https://git.openjdk.org/jdk/compare/2c3c4707...69aec972 ------------- Changes: https://git.openjdk.org/jdk/pull/26171/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=09 Stats: 767 lines in 17 files changed: 677 ins; 23 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/26171.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171 PR: https://git.openjdk.org/jdk/pull/26171 From xpeng at openjdk.org Fri Nov 7 19:42:31 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Nov 2025 19:42:31 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v5] In-Reply-To: <5YSA3F88CmDDv09M2KOm_EFNDh_09LPO2WMrgETfupI=.cc658dc9-829e-41a5-ad76-393d3eb0f75a@github.com> References:

<5YSA3F88CmDDv09M2KOm_EFNDh_09LPO2WMrgETfupI=.cc658dc9-829e-41a5-ad76-393d3eb0f75a@github.com>
Message-ID:

On Wed, 5 Nov 2025 19:00:37 GMT, Kelvin Nilsen wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 848:
>>
>>> 846: HeapWord* ShenandoahFreeSet::allocate_with_affiliation(Iter& iterator, ShenandoahAffiliation affiliation, ShenandoahAllocRequest& req, bool& in_new_region) {
>>> 847: for (idx_t idx = iterator.current(); iterator.has_next(); idx = iterator.next()) {
>>> 848: ShenandoahHeapRegion* r = _heap->get_region(idx);
>>
>> I wonder if we could refine this a little bit. When the region is moved into the "directly allocatable" set, wouldn't we remove it from its partition? Then, we wouldn't have to test for !r->reserved_for_direct_allocation() here because the iterator wouldn't produce it.
>>
>> We could maybe replace this test with an assert that !r->reserved_for_direct_allocation().
>
> Same issue in other uses of the allocation iterator.

You are right. allocate_with_affiliation is only called from the collector, so it won't see any regions from the Mutator partition; we don't even need the assert here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2505191139

From xpeng at openjdk.org Fri Nov 7 19:48:13 2025
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Fri, 7 Nov 2025 19:48:13 GMT
Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v5]
In-Reply-To:
References:

Message-ID: On Wed, 5 Nov 2025 19:02:59 GMT, Kelvin Nilsen wrote: >> Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 135 commits: >> >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - format >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Merge branch 'master' into cas-alloc-1 >> - Move ShenandoahHeapRegionIterationClosure to shenandoahFreeSet.hpp >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - Fix errors caused by renaming ofAtomic to AtomicAccess >> - Merge branch 'openjdk:master' into cas-alloc-1 >> - ... and 125 more: https://git.openjdk.org/jdk/compare/2f613911...e6bfef05 > > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1268: > >> 1266: // If region is not completely free, the current [beg; end] is useless, and we may fast-forward. If we can extend >> 1267: // the existing range, we can exploit that certain regions are already known to be in the Mutator free set. >> 1268: ShenandoahHeapRegion* region = _heap->get_region(end); > > Here also, if we remove the region from the partition when we make it directly allocatable, we would not need to rewrite this loop. Yes, same as your comment above. Mutator thread shouldn't see any region which has been made directly allocatable in the Mutator partition here. > src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 2204: > >> 2202: i++; >> 2203: } >> 2204: return obj; > > I think obj always equals nullptr at this point. Seems the code would be easier to understand (and would depend less on effective compiler optimization) if we just made that explicit. Can we just say: > > return nullptr? Yes, it is always `nullptr`, `return nullptr` will make the code more readable. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2505212589 PR Review Comment: https://git.openjdk.org/jdk/pull/26171#discussion_r2505221448 From duke at openjdk.org Fri Nov 7 21:02:34 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 21:02:34 GMT Subject: RFR: 8371284: GenShen: Avoid unnecessary card marking Message-ID: Exclude young-young, old-old and honor UseCondCardMark in dirty card marking. ------------- Commit messages: - clean whitespace - minor cleanup - Merge branch 'openjdk:master' into 8371284 - exclude young-yong, old-old and honor UseCondCardMark in dirty card marking Changes: https://git.openjdk.org/jdk/pull/28204/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28204&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371284 Stats: 21 lines in 1 file changed: 20 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28204.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28204/head:pull/28204 PR: https://git.openjdk.org/jdk/pull/28204 From xpeng at openjdk.org Fri Nov 7 21:06:28 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 7 Nov 2025 21:06:28 GMT Subject: RFR: 8361099: Shenandoah: Improve heap lock contention by using CAS for memory allocation [v11] In-Reply-To: References: Message-ID: <6pTKMOkjQwuamgBRyOItAlgwlNQD18aupF0ykkYN3GA=.690dc564-2ce0-4684-9a25-0c7162857167@github.com> > Shenandoah always allocates memory with heap lock, we have observed heavy heap lock contention on memory allocation path in performance analysis of some service in which we tried to adopt Shenandoah. 
This change proposes an optimization for the mutator memory allocation path to reduce heap lock contention. At a very high level, here is how it works: > * ShenandoahFreeSet holds N (default 13) ShenandoahHeapRegion* which are used by mutator threads for regular object allocations; they are called shared regions/directly allocatable regions, and are stored in a PaddedEnd data structure (padded array). > * Each mutator thread is assigned one of the directly allocatable regions. The thread tries to allocate in that region with a CAS atomic operation; if that fails, it tries 2 more consecutive directly allocatable regions in the array. > * If the mutator thread fails after trying 3 directly allocatable regions, it will: > * Take the heap lock > * Try to retire the directly allocatable regions which are ready to retire. > * Iterate over the mutator partition, allocate directly allocatable regions, and store them in the padded array if any need to be retired. > * Satisfy the mutator allocation request if possible. > > > I'm not expecting significant performance impact for most cases, since in most cases the contention on the heap lock is not high enough to cause a performance issue. I have done many tests; here are some of them: > > 1.
Dacapo lusearch test on EC2 host with 96 CPU cores: > Openjdk TIP: > > [ec2-user at ip-172-31-42-91 jdk]$ ./master-jdk/bin/java -XX:-TieredCompilation -XX:+AlwaysPreTouch -Xms4G -Xmx4G -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-ShenandoahUncommit -XX:ShenandoahGCMode=generational -XX:+UseTLAB -jar ~/tools/dacapo/dacapo-23.11-MR2-chopin.jar -n 10 lusearch | grep "metered full smoothing" > ===== DaCapo tail latency, metered full smoothing: 50% 131684 usec, 90% 200192 usec, 99% 211369 usec, 99.9% 212517 usec, 99.99% 213043 usec, max 235289 usec, measured over 524288 events ===== > ===== DaCapo tail latency, metered full smoothing: 50% 1568 usec, 90% 36101 usec, 99% 42172 usec, 99.9% 42928 usec, 99.99% 43100 usec, max 43305 usec, measured over 524288 events ===== > ===== DaCapo tail latency, metered full smoothing: 50% 52644 usec, 90% 124393 usec, 99% 137711 usec, 99.9% 139355 usec, 99.99% 139749 usec, max 146722 usec, measured over 524288 events ====... 
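The fast path described in the bullets above (CAS bump allocation in an assigned shared region, then up to two more consecutive slots before falling back to the heap lock) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the pull request: `ToyAllocRegion`, `Slot`, `SharedSlots`, `Allocation` and `allocate_fast` are hypothetical names, offsets stand in for `HeapWord*` addresses, the 64-byte padding is an assumed cache-line size, and the memory orderings are chosen for the toy only.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap region with a bump pointer (not ShenandoahHeapRegion).
struct ToyAllocRegion {
  static constexpr std::size_t kFail = static_cast<std::size_t>(-1);

  std::atomic<std::size_t> top{0};  // next free offset
  const std::size_t end;            // capacity of the region

  explicit ToyAllocRegion(std::size_t capacity) : end(capacity) {}

  // Lock-free bump allocation: returns the start offset, or kFail if the request does not fit.
  std::size_t try_allocate(std::size_t size) {
    std::size_t old_top = top.load(std::memory_order_relaxed);
    while (true) {
      if (size > end - old_top) return kFail;
      // On failure, compare_exchange_weak reloads old_top with the current value.
      if (top.compare_exchange_weak(old_top, old_top + size, std::memory_order_relaxed)) {
        return old_top;
      }
    }
  }
};

constexpr std::size_t kSharedRegions = 13;  // default N quoted in the description
constexpr std::size_t kMaxProbes = 3;       // assigned slot plus two more

// Padded so that neighbouring slots do not share a cache line (64 bytes assumed).
struct alignas(64) Slot {
  std::atomic<ToyAllocRegion*> region{nullptr};
};
using SharedSlots = std::array<Slot, kSharedRegions>;

struct Allocation {
  ToyAllocRegion* region;  // nullptr means the fast path failed
  std::size_t offset;
};

// Probe up to kMaxProbes consecutive slots, wrapping around the array.
// A nullptr result means the caller would take the lock-protected slow path.
Allocation allocate_fast(SharedSlots& slots, std::size_t start_slot, std::size_t size) {
  for (std::size_t probe = 0; probe < kMaxProbes; ++probe) {
    Slot& slot = slots[(start_slot + probe) % kSharedRegions];
    ToyAllocRegion* r = slot.region.load(std::memory_order_acquire);
    if (r == nullptr) continue;  // empty slot, try the next one
    std::size_t offset = r->try_allocate(size);
    if (offset != ToyAllocRegion::kFail) return Allocation{r, offset};
  }
  return Allocation{nullptr, 0};
}
```

In this sketch a thread would call `allocate_fast(slots, my_slot, size)` first and only take the lock when `region` comes back as `nullptr`; retiring full regions and refilling slots under the lock is left out.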
Xiaolong Peng has updated the pull request incrementally with four additional commits since the last revision: - Fix wrong asserts - Remove unnecessary test for directly allocable region - Merge branch 'cas-alloc-1' of https://github.com/pengxiaolong/jdk into cas-alloc-1 - tidy up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26171/files - new: https://git.openjdk.org/jdk/pull/26171/files/69aec972..b1c3b90f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26171&range=09-10 Stats: 11 lines in 1 file changed: 1 ins; 4 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/26171.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26171/head:pull/26171 PR: https://git.openjdk.org/jdk/pull/26171 From wkemper at openjdk.org Fri Nov 7 21:44:01 2025 From: wkemper at openjdk.org (William Kemper) Date: Fri, 7 Nov 2025 21:44:01 GMT Subject: RFR: 8371284: GenShen: Avoid unnecessary card marking In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 20:42:25 GMT, Nityanand Rai wrote: > Exclude young-young, old-old and honor UseCondCardMark in dirty card marking. Let's make sure we don't mark cards for null objects. src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 200: > 198: // Exclude old-old > 199: T heap_oop = RawAccess<>::oop_load(field); > 200: if (!CompressedOops::is_null(heap_oop)) { We can return early if `CompressedOops::is_null` (we don't need to remember old pointers that are null). ------------- Changes requested by wkemper (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28204#pullrequestreview-3436538439 PR Review Comment: https://git.openjdk.org/jdk/pull/28204#discussion_r2505630257 From duke at openjdk.org Fri Nov 7 21:51:22 2025 From: duke at openjdk.org (Rui Li) Date: Fri, 7 Nov 2025 21:51:22 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v4] In-Reply-To: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: > Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg default heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. > > Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. > > Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. 
> > > > import java.util.ArrayList; > import java.util.List; > > public class TestLargeObjectAlignmentDeterministic { > > static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); > static final int NODE_COUNT = Integer.getInteger("nodes", 10000); > static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); > > static Object[] objects; > > public static void main(String[] args) throws Exception { > objects = new Object[SLABS_COUNT]; > > for (int i = 0; i < SLABS_COUNT; i++) { > objects[i] = createSome(); > } > } > > public static Object createSome() { > List<Integer> result = new ArrayList<>(); > for (int c = 0; c < NODE_COUNT; c++) { > result.add(new Integer(c)); > } > return result; > } > > } Rui Li has updated the pull request incrementally with one additional commit since the last revision: Bump maxMem to 4g ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28167/files - new: https://git.openjdk.org/jdk/pull/28167/files/3dbe9a10..142d00e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28167&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28167.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28167/head:pull/28167 PR: https://git.openjdk.org/jdk/pull/28167 From duke at openjdk.org Fri Nov 7 23:09:47 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 23:09:47 GMT Subject: RFR: 8371284: GenShen: Avoid unnecessary card marking [v2] In-Reply-To: References: Message-ID: > Exclude young-young, old-old and honor UseCondCardMark in dirty card marking. 
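The filtering discussed in this review thread (skip stores into young fields, skip null stores, skip old-to-old stores, and optionally skip cards that are already dirty) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyCardHeap`, `post_write_barrier`, the 512-byte card size, the 0/1 card encoding and the address-range test for the young generation are hypothetical and are not the HotSpot barrier set or card table API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kCardShift = 9;      // 512-byte cards, assumed for the toy
constexpr std::uint8_t kCleanCard = 0;     // toy encoding, not HotSpot's card values
constexpr std::uint8_t kDirtyCard = 1;

// Hypothetical toy heap: addresses are plain integers and address 0 plays the role of null.
struct ToyCardHeap {
  std::size_t young_start;          // [young_start, young_end) is the young generation
  std::size_t young_end;
  std::vector<std::uint8_t> cards;  // one byte per card
  bool use_cond_card_mark;          // stands in for the UseCondCardMark flag
  std::size_t card_writes = 0;      // counts actual stores to the card table

  ToyCardHeap(std::size_t heap_size, std::size_t ys, std::size_t ye, bool cond)
    : young_start(ys), young_end(ye),
      cards((heap_size >> kCardShift) + 1, kCleanCard),
      use_cond_card_mark(cond) {}

  bool is_in_young(std::size_t addr) const {
    return addr >= young_start && addr < young_end;
  }

  // Called after a reference to value_addr has been stored into the field at field_addr.
  void post_write_barrier(std::size_t field_addr, std::size_t value_addr) {
    if (is_in_young(field_addr)) return;    // stores into young fields are not remembered
    if (value_addr == 0) return;            // null references are not remembered
    if (!is_in_young(value_addr)) return;   // old -> old is not remembered
    std::uint8_t& card = cards[field_addr >> kCardShift];
    if (use_cond_card_mark && card == kDirtyCard) return;  // already dirty, avoid the write
    card = kDirtyCard;
    ++card_writes;
  }
};
```

Only an old field that now points at a young object reaches the card write; with the conditional check enabled, a second such store into the same card leaves the card table untouched.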
Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: early return of oop is null ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28204/files - new: https://git.openjdk.org/jdk/pull/28204/files/09ad25ab..59f7a0d0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28204&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28204&range=00-01 Stats: 8 lines in 1 file changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/28204.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28204/head:pull/28204 PR: https://git.openjdk.org/jdk/pull/28204 From duke at openjdk.org Fri Nov 7 23:09:48 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 23:09:48 GMT Subject: RFR: 8371284: GenShen: Avoid unnecessary card marking [v2] In-Reply-To: References:

Message-ID: On Fri, 7 Nov 2025 21:40:38 GMT, William Kemper wrote: >> Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: >> >> early return of oop is null > > src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 200: > >> 198: // Exclude old-old >> 199: T heap_oop = RawAccess<>::oop_load(field); >> 200: if (!CompressedOops::is_null(heap_oop)) { > > We can return early if `CompressedOops::is_null` (we don't need to remember old pointers that are null). Good point, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28204#discussion_r2505799473 From syan at openjdk.org Sat Nov 8 02:31:03 2025 From: syan at openjdk.org (SendaoYan) Date: Sat, 8 Nov 2025 02:31:03 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v4] In-Reply-To: References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: On Fri, 7 Nov 2025 21:51:22 GMT, Rui Li wrote: >> Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg default heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. >> >> Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. >> >> Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. 
>> >> >> >> public class TestLargeObjectAlignmentDeterministic { >> >> static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); >> static final int NODE_COUNT = Integer.getInteger("nodes", 10000); >> static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); >> >> static Object[] objects; >> >> public static void main(String[] args) throws Exception { >> objects = new Object[SLABS_COUNT]; >> >> for (int i = 0; i < SLABS_COUNT; i++) { >> objects[i] = createSome(); >> } >> } >> >> public static Object createSome() { >> List result = new ArrayList(); >> for (int c = 0; c < NODE_COUNT; c++) { >> result.add(new Integer(c)); >> } >> return result; >> } >> >> } > > Rui Li has updated the pull request incrementally with one additional commit since the last revision: > > Bump maxMem to 4g Marked as reviewed by syan (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28167#pullrequestreview-3437094158 From kdnilsen at openjdk.org Sat Nov 8 19:54:44 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sat, 8 Nov 2025 19:54:44 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v9] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. 
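The idea in the description above (below TAMS, trust only the mark bitmap because dead objects may no longer be parsable after class unloading; at or above TAMS, treat every object as live and walk object sizes) can be illustrated with a small model. This is a sketch under loud assumptions and does not reproduce the patch: `ToyMarkedRegion`, `live_block_start` and `kNone` are hypothetical, the region is modelled as word indices, and the real function's exact return semantics were still being refined in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kNone = static_cast<std::size_t>(-1);

// Hypothetical toy region indexed by word. Objects are laid out contiguously in [0, top).
struct ToyMarkedRegion {
  std::vector<std::size_t> size_at;  // size_at[i] > 0 iff an object of that many words starts at i
  std::vector<bool> marked;          // marked[i] iff the object starting at word i was marked live
  std::size_t tams;                  // top-at-mark-start: objects at or above it are presumed live
  std::size_t top;                   // end of the allocated part of the region
};

// Returns the start of the live object covering word `addr`, or kNone if no live object covers it.
std::size_t live_block_start(const ToyMarkedRegion& r, std::size_t addr) {
  if (addr >= r.top) return kNone;
  if (addr >= r.tams) {
    // Above TAMS everything is parsable: walk forward object by object from TAMS.
    std::size_t i = r.tams;
    while (i < r.top) {
      std::size_t size = r.size_at[i];
      if (size == 0) return kNone;  // malformed toy input, stop rather than loop forever
      if (i + size > addr) return i;
      i += size;
    }
    return kNone;
  }
  // Below TAMS: consult only the mark bitmap, scanning backward for the nearest marked start.
  // The size of a marked object is safe to read because the object is known to be live.
  for (std::size_t i = addr + 1; i-- > 0; ) {
    if (r.marked[i]) {
      return (i + r.size_at[i] > addr) ? i : kNone;
    }
  }
  return kNone;
}
```

The backward scan never reads `size_at` for an unmarked start below TAMS, which is the property the toy is meant to show; a word that lies inside a dead object simply yields `kNone`.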
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Refinements and debugging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/0e2120b8..29f5d42c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=07-08 Stats: 46 lines in 4 files changed: 27 ins; 4 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From kdnilsen at openjdk.org Sun Nov 9 16:11:46 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sun, 9 Nov 2025 16:11:46 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v10] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: fix multiple errors introduced by minor refactoring of API ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/29f5d42c..cee16f88 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=08-09 Stats: 16 lines in 4 files changed: 13 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From kdnilsen at openjdk.org Sun Nov 9 16:16:25 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sun, 9 Nov 2025 16:16:25 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v11] In-Reply-To: References: Message-ID: > When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading. > > The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove debug instrumentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27353/files - new: https://git.openjdk.org/jdk/pull/27353/files/cee16f88..9f629a2a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27353&range=09-10 Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27353.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353 PR: https://git.openjdk.org/jdk/pull/27353 From shade at openjdk.org Mon Nov 10 09:12:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 09:12:04 GMT Subject: RFR: 8361339: Test gc/shenandoah/TestLargeObjectAlignment.java#generational fails on macOS aarch64 with OOM: Java heap space [v4] In-Reply-To: References: <4hAMzlEVTLb91k4l8Hd2ysUFx7FEe2erCAB_ReeHU2E=.9cae38e5-261a-499f-aee9-770775c02708@github.com> Message-ID: On Fri, 7 Nov 2025 21:51:22 GMT, Rui Li wrote: >> Sporadic failures were observed for TestLargeObjectAlignment.java#generational. The current theory is that jtreg default heap size on the reporter's machines is too small, and the randomness in test just sometimes created a huge heap larger than what the test had. >> >> Did a calculation for the worst case (see the code snippet at the end - it removes the Random in the original test and always allocates the array to full) and the test needs at least 2g. Initiating 3g heap for safety to reduce the noise. >> >> Also use the test to compare between Shenandoah vs GenShen: on my laptop (Mac M3), Shen failed at 2150m Xmx, GenShen could pass Xmx2150m and failed at Xmx2050m (step: 50m), so GenShen isn't worse, it's actually better. The reported GenShen failure observation probably came from the Random. 
>> >> >> >> public class TestLargeObjectAlignmentDeterministic { >> >> static final int SLABS_COUNT = Integer.getInteger("slabs", 10000); >> static final int NODE_COUNT = Integer.getInteger("nodes", 10000); >> static final long TIME_NS = 1000L * 1000L * Integer.getInteger("timeMs", 5000); >> >> static Object[] objects; >> >> public static void main(String[] args) throws Exception { >> objects = new Object[SLABS_COUNT]; >> >> for (int i = 0; i < SLABS_COUNT; i++) { >> objects[i] = createSome(); >> } >> } >> >> public static Object createSome() { >> List result = new ArrayList(); >> for (int c = 0; c < NODE_COUNT; c++) { >> result.add(new Integer(c)); >> } >> return result; >> } >> >> } > > Rui Li has updated the pull request incrementally with one additional commit since the last revision: > > Bump maxMem to 4g Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28167#pullrequestreview-3441786698 From shade at openjdk.org Mon Nov 10 11:43:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 11:43:04 GMT Subject: RFR: 8371284: GenShen: Avoid unnecessary card marking [v2] In-Reply-To: References:

Message-ID: On Fri, 7 Nov 2025 23:09:47 GMT, Nityanand Rai wrote: >> Exclude young-young, old-old and honor UseCondCardMark in dirty card marking. > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > early return of oop is null Generally looks fine, just tighten up the comments. src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 197: > 195: if (_heap->is_in_young(field)) { > 196: return; > 197: } Suggestion: if (_heap->is_in_young(field)) { // Young field stores do not require card mark. return; } src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 206: > 204: if (!_heap->is_in_young(obj)) { > 205: return; > 206: } Suggestion: if (!_heap->is_in_young(obj)) { // Young object -> old field stores do not require card mark. return; } src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp line 207: > 205: return; > 206: } > 207: // Honor UseCondCardMark: check if card is already dirty before writing Suggestion: ------------- PR Review: https://git.openjdk.org/jdk/pull/28204#pullrequestreview-3442710354 PR Review Comment: https://git.openjdk.org/jdk/pull/28204#discussion_r2510147860 PR Review Comment: https://git.openjdk.org/jdk/pull/28204#discussion_r2510152910 PR Review Comment: https://git.openjdk.org/jdk/pull/28204#discussion_r2510153202 From adinn at openjdk.org Mon Nov 10 13:19:08 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 10 Nov 2025 13:19:08 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References:

Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR makes changes to the GC TypeFunc creation similar to those done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851), as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686), I have put a guard on the Shenandoah GC specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages Looks good to me. @Harshit470250 You need another reviewer before you can push this. Perhaps @dean-long can help -- he reviewed the earlier commit which led to this one being created. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3443199481 PR Comment: https://git.openjdk.org/jdk/pull/27279#issuecomment-3511586105 From tschatzl at openjdk.org Mon Nov 10 13:47:20 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 10 Nov 2025 13:47:20 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Looks good. Thanks for cleaning this up. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28146#pullrequestreview-3443342641 From ayang at openjdk.org Mon Nov 10 14:29:56 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 10 Nov 2025 14:29:56 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28146#issuecomment-3512011561 From kdnilsen at openjdk.org Mon Nov 10 14:31:08 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 10 Nov 2025 14:31:08 GMT Subject: RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v2] In-Reply-To: References: <7zV-fLvjb-4gBVTppg4XTXPNxEheqLfxB0v_WONuinI=.22775b58-42cf-499e-9007-fad07118217d@github.com> Message-ID: On Fri, 3 Oct 2025 20:49:01 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> fix idiosyncratic formatting > > src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 251: > >> 249: // if marking context is valid and we are below tams, we use the marking bit map to find the first marked object that >> 250: // intersects with this card, and if no such object exists, we return null >> 251: if ((ctx != nullptr) && (left < tams)) { > > It seems like the caller should check if `left >= tams` and short-circuit rather than have this method do that work. That comment is wrong, which is what caused you to request the alternative semantics for this function. Your comments and questions motivated me to rewrite the comments describing the behavior of this function. Rewriting the comments helped me realize the API was a bit ill-defined. I made some improvements to the behavior so that the definition could be more clearly defined. 
The new implementation now passes all tests again. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2510782671 From ayang at openjdk.org Mon Nov 10 14:32:20 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 10 Nov 2025 14:32:20 GMT Subject: Integrated: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: <3MvmZGPFXA7cqYhLs44MVY_P3BFyuUh4y5mlSWbGUxA=.d2d98f00-d991-43cd-8dd1-7d5d96c4031c@github.com> On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA This pull request has now been integrated. Changeset: 9d2fa8fe Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/9d2fa8fe22652cbf1c70b953247bd154b363b383 Stats: 38 lines in 16 files changed: 0 ins; 6 del; 32 mod 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue Reviewed-by: fandreuzzi, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28146 From kdnilsen at openjdk.org Mon Nov 10 14:39:09 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 10 Nov 2025 14:39:09 GMT Subject: RFR: 8353115: GenShen: mixed evacuation candidate regions need accurate live_data [v13] In-Reply-To: References: Message-ID: > The existing implementations of get_live_data_bytes() and get_live_data_words() do not always behave as might be expected. In particular, the value returned ignores any allocations that occur subsequent to the most recent mark effort that identified live data within the region. This is typically ok for young regions, where the amount of live data determines whether a region should be added to the collection set during the final-mark safepoint. > > However, old-gen regions that are placed into the set of candidates for mixed evacuation are more complicated. 
In particular, by the time the old-gen region is added to a mixed evacuation, its live data may be much larger than at the time concurrent old marking ended. > > This PR provides comments to clarify the shortcomings of the existing functions, and adds new functions that provide more accurate accountings of live data for mixed-evacuation candidate regions. Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 58 commits: - Fix mistaken merge resolution - Merge remote-tracking branch 'jdk/master' into fix-live-data-for-mixed-evac-candidates The resulting fastdebug build has 64 failures. I need to debug these. Probably introduced by improper resolution of merge conflicts - fix error in merge conflict resolution - Merge remote-tracking branch 'jdk/master' into fix-live-data-for-mixed-evac-candidates - rework CompressedClassSpaceSizeinJmapHeap.java - fix errors in CompressedClassSpaceSizeInJmapHeap.java - Add debug instrumentation to CompressedClassSpaceSizeInJmapHeap.java - fix two indexing bugs - add an assert to detect suspected bug - Remove debug scaffolding - ... 
and 48 more: https://git.openjdk.org/jdk/compare/c272aca8...16cd6f8a ------------- Changes: https://git.openjdk.org/jdk/pull/24319/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24319&range=12 Stats: 284 lines in 31 files changed: 115 ins; 30 del; 139 mod Patch: https://git.openjdk.org/jdk/pull/24319.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24319/head:pull/24319 PR: https://git.openjdk.org/jdk/pull/24319 From kdnilsen at openjdk.org Mon Nov 10 14:39:12 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 10 Nov 2025 14:39:12 GMT Subject: RFR: 8353115: GenShen: mixed evacuation candidate regions need accurate live_data [v12] In-Reply-To: <2yyHGosCQCiu03jYG7lG82_2WrppAZqVQgDATo8bcfQ=.9c8bed3d-980f-499f-a467-a5cd1d90b6b8@github.com> References: <2yyHGosCQCiu03jYG7lG82_2WrppAZqVQgDATo8bcfQ=.9c8bed3d-980f-499f-a467-a5cd1d90b6b8@github.com> Message-ID: On Mon, 13 Oct 2025 22:18:38 GMT, Kelvin Nilsen wrote: >> The existing implementations of get_live_data_bytes() and get_live_data_words() do not always behave as might be expected. In particular, the value returned ignores any allocations that occur subsequent to the most recent mark effort that identified live data within the region. This is typically ok for young regions, where the amount of live data determines whether a region should be added to the collection set during the final-mark safepoint. >> >> However, old-gen regions that are placed into the set of candidates for mixed evacuation are more complicated. In particular, by the time the old-gen region is added to a mixed evacuation, its live data may be much larger than at the time concurrent old marking ended. >> >> This PR provides comments to clarify the shortcomings of the existing functions, and adds new functions that provide more accurate accountings of live data for mixed-evacuation candidate regions. > > Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 45 commits: > > - Merge remote-tracking branch 'jdk/master' into fix-live-data-for-mixed-evac-candidates > - remove _mixed_candidate_garbage_words from ShenandoahHeapRegion > - reviewer feedback to reduce code duplication > - Fix compilation errors after merge > - Merge remote-tracking branch 'jdk/master' into fix-live-data-for-mixed-evac-candidates > - Fix uninitialized variable > - Remove deprecation conditional compiles > - Adjust candidate live memory for each mixed evac > - Refactor for better abstraction > - Fix set_live() after full gc > - ... and 35 more: https://git.openjdk.org/jdk/compare/92f2ab2e...24322e75 I'm going to place this in draft while I experiment with the code to diagnose apparent regressions with traditional shenandoah mode. I have placed instrumentation into the code to confirm that the live_data reported by ShenandoahHeapRegion is the same before and after this PR for traditional Shenandoah mode. So the regressions are either "signal noise", or perhaps inefficiencies introduced regarding how we compute the live_data. After refactoring the code to perform better in the "tight" rebuild free-set and build-collection set loops, the Shenandoah results show very slight improvement (rather than regression) on specjbb2015. Here is the performance regression that we saw before commit https://github.com/openjdk/jdk/pull/24319/commits/ecdec6363ee9e0a27f4207350cf0b51f8c99bab5

Here are comparisons (in a slightly different environment) after that same commit:

------------- PR Comment: https://git.openjdk.org/jdk/pull/24319#issuecomment-3401760959 PR Comment: https://git.openjdk.org/jdk/pull/24319#issuecomment-3416424293 PR Comment: https://git.openjdk.org/jdk/pull/24319#issuecomment-3453055680 From kdnilsen at openjdk.org Mon Nov 10 15:37:59 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 10 Nov 2025 15:37:59 GMT Subject: RFR: 8371573: Shenandoah: Fix style of include directive Message-ID: Fix style of include directive for improved consistency and compatibility with GraalVM conventions. ------------- Commit messages: - Fix style of include directive Changes: https://git.openjdk.org/jdk/pull/28219/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28219&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371573 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28219.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28219/head:pull/28219 PR: https://git.openjdk.org/jdk/pull/28219 From wkemper at openjdk.org Mon Nov 10 15:45:55 2025 From: wkemper at openjdk.org (William Kemper) Date: Mon, 10 Nov 2025 15:45:55 GMT Subject: RFR: 8371573: Shenandoah: Fix style of include directive In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 15:32:03 GMT, Kelvin Nilsen wrote: > Fix style of include directive for improved consistency and compatibility with GraalVM conventions. This include looks vestigial. Can we just delete it? ------------- Changes requested by wkemper (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28219#pullrequestreview-3443924743 From kdnilsen at openjdk.org Mon Nov 10 15:55:12 2025 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 10 Nov 2025 15:55:12 GMT Subject: RFR: 8357471: GenShen: Share collector reserves between young and old [v2] In-Reply-To: References:

Message-ID: <_3OuNx5vGJMp_kP0hRZgLvkJgOUXuKVpJZ8ores0H-8=.3ca53897-5f22-4d1c-b3e6-a8e71422612a@github.com> On Sat, 31 May 2025 02:53:52 GMT, Y. Srinivas Ramakrishna