From kbarrett at openjdk.org Sun Feb 1 00:21:01 2026 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 1 Feb 2026 00:21:01 GMT Subject: RFR: 8376131: Convert ContiguousSpace to use Atomic In-Reply-To: References: Message-ID: On Thu, 22 Jan 2026 17:51:08 GMT, Thomas Schatzl wrote: > Hi all, > > please review this conversions of `ContiguousSpace` to use `Atomic`. > > Testing: gha, tier1-5 > > Thanks, > Thomas Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29370#pullrequestreview-3734044371 From jbhateja at openjdk.org Sun Feb 1 07:41:59 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 1 Feb 2026 07:41:59 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v5] In-Reply-To: References: Message-ID: > As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. > > Patch add new lane type constants and pass them to vector intrinsic entry points. > > All existing Vector API jtreg test are passing with the patch. > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolution ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29481/files - new: https://git.openjdk.org/jdk/pull/29481/files/ff73dc3d..0c60016b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=03-04 Stats: 401 lines in 39 files changed: 28 ins; 62 del; 311 mod Patch: https://git.openjdk.org/jdk/pull/29481.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29481/head:pull/29481 PR: https://git.openjdk.org/jdk/pull/29481 From jbhateja at openjdk.org Sun Feb 1 07:42:04 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 1 Feb 2026 07:42:04 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v4] In-Reply-To: <-fsfUEvFpvmAsupQFgx1CBkH9vr_efE5-qYeUzy5VFQ=.4abb05e0-1f82-4d6c-8bc4-ca4bc6fc5e80@github.com> References: <-fsfUEvFpvmAsupQFgx1CBkH9vr_efE5-qYeUzy5VFQ=.4abb05e0-1f82-4d6c-8bc4-ca4bc6fc5e80@github.com> Message-ID: On Fri, 30 Jan 2026 23:31:29 GMT, Paul Sandoz wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolutions > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractSpecies.java line 152: > >> 150: int laneTypeOrdinal() { >> 151: return laneType.ordinal(); >> 152: } > > Is this needed? Won't all concrete sub types override this? This interface provides access to lane type constant though species, its used for consistency, please have a look at following line and other places around it. 
https://github.com/jatin-bhateja/jdk/blob/ff73dc3d48a9435c4395556c8325fbce7610cba9/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/DoubleVector.java#L3374 > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 60: > >> 58: >> 59: static final int LANE_TYPE_ORDINAL = LT_BYTE; >> 60: > > You can move this up to `ByteVector` and then reuse it to replace `byte.class`, so it is used consistently. Done > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java line 821: > >> 819: convert(String name, char kind, Class dom, Class ran, int opCode, int flags) { >> 820: int domran = ((LaneType.of(dom).ordinal() << VO_DOM_SHIFT) + >> 821: (LaneType.of(ran).ordinal() << VO_RAN_SHIFT)); > > As i understand this is still correct because the maximum ordinal value is less than 16 (as was already the case for the basic type). Correct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2750675259 PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2750675162 PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2750675209 From psandoz at openjdk.org Sun Feb 1 17:11:11 2026 From: psandoz at openjdk.org (Paul Sandoz) Date: Sun, 1 Feb 2026 17:11:11 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v4] In-Reply-To: References: <-fsfUEvFpvmAsupQFgx1CBkH9vr_efE5-qYeUzy5VFQ=.4abb05e0-1f82-4d6c-8bc4-ca4bc6fc5e80@github.com> Message-ID: On Sun, 1 Feb 2026 07:36:35 GMT, Jatin Bhateja wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractSpecies.java line 152: >> >>> 150: int laneTypeOrdinal() { >>> 151: return laneType.ordinal(); >>> 152: } >> >> Is this needed? Won't all concrete sub types override this? > > This interface provides access to lane type constant though species, its used for consistency, please have a look at following line and other places around it. 
> https://github.com/jatin-bhateja/jdk/blob/ff73dc3d48a9435c4395556c8325fbce7610cba9/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/DoubleVector.java#L3374 Agreed that this method is required, but i was wondering why `AbstractSpecies` need to implement it. Ok, i see now you are copying the same pattern as some other methods such as `elementType`, so this is a more general issue we should not resolve in this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2751614740 From psandoz at openjdk.org Sun Feb 1 17:15:09 2026 From: psandoz at openjdk.org (Paul Sandoz) Date: Sun, 1 Feb 2026 17:15:09 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v5] In-Reply-To: References: Message-ID: On Sun, 1 Feb 2026 07:41:59 GMT, Jatin Bhateja wrote: >> As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. >> >> Patch add new lane type constants and pass them to vector intrinsic entry points. >> >> All existing Vector API jtreg test are passing with the patch. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolution src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 580: > 578: public static ByteVector zero(VectorSpecies species) { > 579: ByteSpecies vsp = (ByteSpecies) species; > 580: return VectorSupport.fromBitsCoerced(vsp.vectorType(), vsp.laneTypeOrdinal(), species.length(), You can now use `LANE_TYPE_ORDINAL` rather than `vsp.laneTypeOrdinal()`, which better fits the prior pattern. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2751629721

From dholmes at openjdk.org Mon Feb 2 01:21:41 2026
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Feb 2026 01:21:41 GMT
Subject: RFR: 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature
Message-ID:

An ASAN enabled build reported heap-buffer-overflow in `MethodHandles::is_basic_type_signature` with `ASAN_OPTIONS=strict_string_checks=true` when running test `jdk/jdk/jfr/api/metadata/annotations/TestThrottle.java`

The code is here:

    bool MethodHandles::is_basic_type_signature(Symbol* sig) {
      assert(vmSymbols::object_signature()->utf8_length() == (int)OBJ_SIG_LEN, "");
      assert(vmSymbols::object_signature()->equals(OBJ_SIG), "");
      for (SignatureStream ss(sig, sig->starts_with(JVM_SIGNATURE_FUNC)); !ss.is_done(); ss.next()) {
        switch (ss.type()) {
        case T_OBJECT:
          // only java/lang/Object is valid here
          if (strncmp((char*) ss.raw_bytes(), OBJ_SIG, OBJ_SIG_LEN) != 0)

The ASAN `strncmp` interceptor acts as follows:

    INTERCEPTOR(int, strncmp, const char *s1, const char *s2, size_t n) {
      void *ctx;
      ASAN_INTERCEPTOR_ENTER(linker, strncmp); // Sets up context
      ASAN_READ_RANGE(s1, n);                  // Validates s1
      ASAN_READ_RANGE(s2, n);                  // Validates s2
      return REAL(strncmp)(s1, s2, n);         // Calls original function
    }

With the test given `s1` is a buffer of size 15, containing a non-nul-terminated string, and `n` is 18, so `ASAN_READ_RANGE` fails for `s1` as we could potentially read beyond the end of the buffer.

In practice however, given `s1` is guaranteed to be a valid type-string from a signature symbol of type `T_OBJECT`, its final character is `;` and the final character of `s2` is also `;` (it is the string constant `Ljava/lang/Object;`). Hence the comparison must terminate before we can run off the end of `s1`.

To appease ASAN we can make a simple change to the `strncmp` call to compare at most `ss.raw_length()` bytes.
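
For readers following along, the idea of bounding the comparison by the real length of `s1` can be sketched outside HotSpot roughly as follows. This is a standalone illustration and not the patch itself: `is_object_sig` is a hypothetical helper, the two constants are stand-ins for the HotSpot ones named above, and the explicit length check is an extra guard added here because a free-standing function cannot rely on the trailing `;` guarantee described in the message.

```cpp
#include <cstddef>
#include <cstring>

// Stand-ins for the HotSpot constants named in the message (assumed values).
static const char OBJ_SIG[] = "Ljava/lang/Object;";
static const std::size_t OBJ_SIG_LEN = sizeof(OBJ_SIG) - 1;  // 18, excluding the NUL

// Hypothetical helper: returns true if the `len` bytes at `s1` spell OBJ_SIG.
// `s1` need not be NUL-terminated. strncmp is asked to look at no more than
// `len` bytes, so it never reads past the end of the caller's buffer.
bool is_object_sig(const char* s1, std::size_t len) {
  if (len != OBJ_SIG_LEN) {
    return false;  // extra guard in this sketch: a different length cannot match
  }
  return std::strncmp(s1, OBJ_SIG, len) == 0;
}
```

With a 15-byte, non-terminated buffer such as the one in the report, the helper returns false without touching byte 15 or beyond.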
Testing - ASAN no longer reports an error - tiers 1-3 sanity Thanks ------------- Commit messages: - copyright-year - 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature Changes: https://git.openjdk.org/jdk/pull/29516/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29516&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8376855 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/29516.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29516/head:pull/29516 PR: https://git.openjdk.org/jdk/pull/29516 From dholmes at openjdk.org Mon Feb 2 02:14:06 2026 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Feb 2026 02:14:06 GMT Subject: RFR: 8373367: interp-only mechanism fails to work for carrier threads in a corner case [v3] In-Reply-To: References: <4kL5ukI7hOKtKX0zkyc6K_7RMq3v1t_fJdvdwvmXfsw=.60ebbe1d-0133-4bff-953c-db953eed86db@github.com> Message-ID: On Fri, 30 Jan 2026 07:45:04 GMT, Serguei Spitsyn wrote: >> The `interp-only` mechanism is based on the `JavaThread` objects. Carrier and virtual threads can temporary share the same `JavaThread`. The `java_thread->jvmti_thread_state()` is re-linked to a virtual thread at `mount` and to the carrier thread at `unmount`. The `JvmtiThreadState` has a back link to the `JavaThread` which is also set for virtual thread at a `mount` and carrier thread at an `unmount`. Just one of these two links at the same time is set to the `JavaThread`, the other one has to be set to `nullptr`. The `interp-only` mechanism needs this invariant. >> However, there is a corner case when this invariant is broken. It happens when the `JvmtiThreadState` for carrier thread has just been created. In such case, the link to `JavaThread` is always `non-nullptr` even though a virtual thread is currently mounted on a carrier thread. This simple update fixes the issue in the `JvmtiThreadState` ctor. 
>> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: moved and extended comment in JvmtiThreadState ctor I appreciate the expanded comments but I still don't fully understand what `_thread` and `_saved_thread` point to at different times. The lifecycle of these fields really needs to be clearly described somewhere. A couple of typos are present - see below. Thanks src/hotspot/share/prims/jvmtiThreadState.cpp line 61: > 59: > 60: // The _thread field is a link to the JavaThread associated with JvmtiThreadState. > 61: // The _thread_saved field is used for carrier threads only when a virtual thread, Suggestion: // The _thread_saved field is used for carrier threads only when a virtual thread src/hotspot/share/prims/jvmtiThreadState.cpp line 65: > 63: // Carrier and virtual threads can temporarily share same JavaThread. In such a case, > 64: // only virtual _thread should have a link from JvmtiThreadState to JavaThread. > 65: // The carrier thread _thread filed is set to nullptr if a virtual thread is monted. Suggestion: // The carrier thread _thread field is set to nullptr if a virtual thread is mounted. ------------- Changes requested by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29436#pullrequestreview-3737008826 PR Review Comment: https://git.openjdk.org/jdk/pull/29436#discussion_r2752276119 PR Review Comment: https://git.openjdk.org/jdk/pull/29436#discussion_r2752275470 From shade at openjdk.org Mon Feb 2 07:15:19 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 07:15:19 GMT Subject: RFR: 8376472: Shenandoah: Assembler store barriers read destination memory despite the decorators [v2] In-Reply-To: References: Message-ID: <-XK_Jf4sJArKYhzJltTtV3CUe3k4iI1ZpVT4E5QaDbo=.52cd97d5-9d47-44bd-9618-01f10fb04ed9@github.com> On Fri, 30 Jan 2026 10:16:19 GMT, Aleksey Shipilev wrote: >> The issue is really a correctness issue, and it readily manifests in Valhalla, which sometimes does the stores with `IS_DEST_UNINITIALIZED` set. Unfortunately, Shenandoah SATB barriers ignore this attribute, and attempt to read the memory at store address. At best it crashes the VM with the "oopness" asserts, at worst it feeds "garbage" pointers into SATB machinery, which then wrecks havoc on everything else. >> >> We need to make sure store barriers are consistently checking these attributes. Unfortunately, that would mean doing the changes in arch-specific assembler code. >> >> This PR makes sure the ShenandoahBarrierSetAssembler store barriers are roughly in the same shape, and that they consult `ShenandoahBarrierSet::need_*_barrier` to make the proper decisions whether to use SATB/card barriers. >> >> `hotspot_gc_shenandoah` is enough to sanity-check this patch, but I am also running `all` tests for extra safety. 
>> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `hotspot_gc_shenandoah` >> - [x] Linux AArch64 server fastdebug, `hotspot_gc_shenandoah` >> - [x] Linux x86_64 server fastdebug, `all` + `-XX:+UseShenandoahGC` >> - [x] Linux AArch64 server fastdebug, `all` + `-XX:+UseShenandoahGC` >> - [x] Linux {PPC64, RISC-V, S390X} server fastdebug, cross-compilation > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Missing return in PPC64 for non-reference stores > - Merge branch 'master' into JDK-8376472-shenandoah-store-barriers > - More polish > - RISC-V version > - More touchups, AArch64 version > - Store barrier cleanup Let's go! Thanks for reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29444#issuecomment-3833369307 From shade at openjdk.org Mon Feb 2 07:15:20 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 07:15:20 GMT Subject: Integrated: 8376472: Shenandoah: Assembler store barriers read destination memory despite the decorators In-Reply-To: References: Message-ID: On Tue, 27 Jan 2026 10:47:54 GMT, Aleksey Shipilev wrote: > The issue is really a correctness issue, and it readily manifests in Valhalla, which sometimes does the stores with `IS_DEST_UNINITIALIZED` set. Unfortunately, Shenandoah SATB barriers ignore this attribute, and attempt to read the memory at store address. At best it crashes the VM with the "oopness" asserts, at worst it feeds "garbage" pointers into SATB machinery, which then wrecks havoc on everything else. > > We need to make sure store barriers are consistently checking these attributes. Unfortunately, that would mean doing the changes in arch-specific assembler code. 
> > This PR makes sure the ShenandoahBarrierSetAssembler store barriers are roughly in the same shape, and that they consult `ShenandoahBarrierSet::need_*_barrier` to make the proper decisions whether to use SATB/card barriers. > > `hotspot_gc_shenandoah` is enough to sanity-check this patch, but I am also running `all` tests for extra safety. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `hotspot_gc_shenandoah` > - [x] Linux AArch64 server fastdebug, `hotspot_gc_shenandoah` > - [x] Linux x86_64 server fastdebug, `all` + `-XX:+UseShenandoahGC` > - [x] Linux AArch64 server fastdebug, `all` + `-XX:+UseShenandoahGC` > - [x] Linux {PPC64, RISC-V, S390X} server fastdebug, cross-compilation This pull request has now been integrated. Changeset: f8b0ff26 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/f8b0ff26c9e6643e96f06c18c509ddaf50326205 Stats: 270 lines in 10 files changed: 48 ins; 61 del; 161 mod 8376472: Shenandoah: Assembler store barriers read destination memory despite the decorators Reviewed-by: mdoerr, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/29444 From shade at openjdk.org Mon Feb 2 07:45:03 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 07:45:03 GMT Subject: RFR: 8376355: Update to use jtreg 8.2.1 In-Reply-To: References: Message-ID: On Tue, 27 Jan 2026 15:26:20 GMT, Christian Stein wrote: > Please review the change to update to using jtreg 8.2.1. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. Nice to see no actual test changes are required for compatibility. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29452#pullrequestreview-3737777336 From tschatzl at openjdk.org Mon Feb 2 08:01:19 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 08:01:19 GMT Subject: RFR: 8376131: Convert ContiguousSpace to use Atomic In-Reply-To: References: Message-ID: On Wed, 28 Jan 2026 04:59:09 GMT, David Holmes wrote: >> Hi all, >> >> please review this conversions of `ContiguousSpace` to use `Atomic`. >> >> Testing: gha, tier1-5 >> >> Thanks, >> Thomas > > Looks good. Thanks Thanks @dholmes-ora @kimbarrett for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/29370#issuecomment-3833527727 From tschatzl at openjdk.org Mon Feb 2 08:01:21 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 08:01:21 GMT Subject: Integrated: 8376131: Convert ContiguousSpace to use Atomic In-Reply-To: References: Message-ID: On Thu, 22 Jan 2026 17:51:08 GMT, Thomas Schatzl wrote: > Hi all, > > please review this conversions of `ContiguousSpace` to use `Atomic`. > > Testing: gha, tier1-5 > > Thanks, > Thomas This pull request has now been integrated. Changeset: f22bc1cd Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/f22bc1cd518bc7f09dc49b78e40d06210226d2b7 Stats: 21 lines in 4 files changed: 2 ins; 7 del; 12 mod 8376131: Convert ContiguousSpace to use Atomic Reviewed-by: dholmes, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/29370 From lkorinth at openjdk.org Mon Feb 2 08:05:21 2026 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 2 Feb 2026 08:05:21 GMT Subject: RFR: 8367993: G1: Speed up ConcurrentMark initialization [v9] In-Reply-To: References: Message-ID: On Thu, 29 Jan 2026 14:47:12 GMT, Leo Korinth wrote: >> This change moves almost all of the ConcurrentMark initialisation from its constructor to the method `G1ConcurrentMark::fully_initialize()`. Thus, creation time of the VM can be slightly improved by postponing creation of ConcurrentMark. 
Most time is saved postponing creation of statistics buffers and threads. >> >> It is not obvious that this is the best solution. I have earlier experimented with lazily allocating statistics buffers _only_. One could also initialise a little bit more eagerly (for example the concurrent mark thread) and maybe get a slightly cleaner change. However IMO it seems better to not have ConcurrentMark "half initiated" with a created mark thread, but un-initialised worker threads. >> >> This change is depending on the integration of https://bugs.openjdk.org/browse/JDK-8373253. >> >> I will be out for vacation, and will be back after new year (and will not answer questions during that time), but I thought I get the pull request out now so that you can have a look. > > Leo Korinth has updated the pull request incrementally with two additional commits since the last revision: > > - Reapply "remove commented out code" > > This reverts commit d0d1860058f0dae7813c3e5115e2784da8331f3b. > - Reapply "Stefan J 4" > > This reverts commit c5a7e2bb44ce111f8c8d1d7f728f1bf8013475e0. Thanks everyone! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28723#issuecomment-3833554735 From lkorinth at openjdk.org Mon Feb 2 08:05:23 2026 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 2 Feb 2026 08:05:23 GMT Subject: Integrated: 8367993: G1: Speed up ConcurrentMark initialization In-Reply-To: References: Message-ID: On Tue, 9 Dec 2025 14:56:49 GMT, Leo Korinth wrote: > This change moves almost all of the ConcurrentMark initialisation from its constructor to the method `G1ConcurrentMark::fully_initialize()`. Thus, creation time of the VM can be slightly improved by postponing creation of ConcurrentMark. Most time is saved postponing creation of statistics buffers and threads. > > It is not obvious that this is the best solution. I have earlier experimented with lazily allocating statistics buffers _only_. 
One could also initialise a little bit more eagerly (for example the concurrent mark thread) and maybe get a slightly cleaner change. However IMO it seems better to not have ConcurrentMark "half initiated" with a created mark thread, but un-initialised worker threads. > > This change is depending on the integration of https://bugs.openjdk.org/browse/JDK-8373253. > > I will be out for vacation, and will be back after new year (and will not answer questions during that time), but I thought I get the pull request out now so that you can have a look. This pull request has now been integrated. Changeset: 766e03b1 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/766e03b151b2972108ddc207eed10428e9a91c30 Stats: 57 lines in 9 files changed: 30 ins; 6 del; 21 mod 8367993: G1: Speed up ConcurrentMark initialization Reviewed-by: sjohanss, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28723 From erfang at openjdk.org Mon Feb 2 08:25:32 2026 From: erfang at openjdk.org (Eric Fang) Date: Mon, 2 Feb 2026 08:25:32 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction [v2] In-Reply-To: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> Message-ID: <595lDgLFjcH0tzzdeacMVa_1fPt3PQhKIhibehSvpZk=.3f01b98a-ce6b-4c81-92da-235443e81f9b@github.com> > When optimizing some VectorMask related APIs , we found an optimization opportunity related to the `cpy (immediate, zeroing)` instruction [1]. Implementing the functionality of this instruction using `cpy (immediate, merging)` instruction [2] leads to better performance. > > Currently the `cpy (imm, zeroing)` instruction is used in code generated by `VectorStoreMaskNode` and `VectorReinterpretNode`. 
Doing this optimization benefits all vector APIs that generate these two IRs potentially, such as `VectorMask.intoArray()` and `VectorMask.toLong()`. > > Microbenchmarks show this change brings performance uplift ranging from **11%** to **33%**, depending on the specific operation and data types. > > The specific changes in this PR: > 1. Achieve the functionality of the `cpy (imm, zeroing)` instruction with the `movi + cpy (imm, merging)` instructions in assembler: > > cpy z17.d, p1/z, #1 => > > movi v17.2d, #0 // this instruction is zero cost > cpy z17.d, p1/m, #1 > > > 2. Add a new option `PreferSVEMergingModeCPY` to indicate whether to apply this optimization or not. > - This option belongs to the Arch product category. > - The default value is true on Neoverse-V1/V2 where the improvement has been confirmed, false on others. > - When its value is true, the change is applied. > > 3. Add a jtreg test to verify the behavior of this option. > > This PR was tested on aarch64 and x86 machines with different configurations, and all tests passed. 
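
The equivalence that item 1 relies on (zeroing copy = clear the destination, then merging copy) can be checked with a small scalar model. This is only a sketch of the lane semantics as described above, not of the generated code or of its performance: `Vec`, `Pred`, `kLanes` and the three function names are hypothetical, and two 64-bit lanes are assumed to match the 128-bit `.d` example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 2;  // assumed: two 64-bit lanes in a 128-bit vector
using Vec  = std::array<std::uint64_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// Model of "cpy zd.d, pg/z, #imm": active lanes get imm, inactive lanes become 0.
Vec cpy_zeroing(const Pred& pg, std::uint64_t imm) {
  Vec zd{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    zd[i] = pg[i] ? imm : 0;
  }
  return zd;
}

// Model of "cpy zd.d, pg/m, #imm": active lanes get imm, inactive lanes keep
// whatever the destination held before.
void cpy_merging(Vec& zd, const Pred& pg, std::uint64_t imm) {
  for (std::size_t i = 0; i < kLanes; ++i) {
    if (pg[i]) {
      zd[i] = imm;
    }
  }
}

// Model of the replacement sequence: "movi vd.2d, #0" followed by the merging copy.
Vec movi_then_cpy_merging(const Pred& pg, std::uint64_t imm) {
  Vec zd;
  zd.fill(0);                // movi: clear the destination first
  cpy_merging(zd, pg, imm);  // merging copy then only touches active lanes
  return zd;
}
```

For every predicate value the two paths produce the same vector; the merging form only differs from the zeroing form when the destination is not cleared beforehand.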
>
> JMH benchmarks:
>
> On a Nvidia Grace (Neoverse-V2) machine with 128-bit SVE2:
>
> Benchmark           Unit    size    Before     Error    After      Error    Uplift
> byteIndexInRange    ops/ms    7.00  471816.15  1125.96  473237.77  1593.92  1.00
> byteIndexInRange    ops/ms  256.00  149654.21   416.57  149259.95   116.59  1.00
> byteIndexInRange    ops/ms  259.00  177850.31   991.13  179785.19  1110.07  1.01
> byteIndexInRange    ops/ms  512.00  133393.26   167.26  133484.61   281.83  1.00
> doubleIndexInRange  ops/ms    7.00  302176.39  12848.8  299813.02    37.76  0.99
> doubleIndexInRange  ops/ms  256.00   47831.93    56.70   46708.70    56.11  0.98
> doubleIndexInRange  ops/ms  259.00   11550.02    27.95   15333.50    10.40  1.33
> doubleIndexInRange  ops/ms  512.00   23687.76    61.65   23996.08    69.52  1.01
> floatIndexInRange   ops/ms    7.00  412195.79   124.71  411770.23    78.73  1.00
> floatIndexInRange   ops/ms  256.00   84479.98    70.69   84237.31    70.15  1.00
> floatIndexInRange   ops/ms  259.00   22585.65    80.07   28296.21     7.98  1.25
> floatIndexInRange   ops/ms  512.00   46902.99    51.60   46686.68    66.01  1.00
> intIndexInRange     ops/ms    7.00  413411.70    50.59  420684.66   253.55  1.02
> intIndexInRange     ops/...
Eric Fang has updated the pull request incrementally with one additional commit since the last revision: Move the implementation into C2_MacroAssembler ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29359/files - new: https://git.openjdk.org/jdk/pull/29359/files/4f5a7bd7..884a11f2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29359&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29359&range=00-01 Stats: 240 lines in 10 files changed: 37 ins; 171 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/29359.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29359/head:pull/29359 PR: https://git.openjdk.org/jdk/pull/29359 From erfang at openjdk.org Mon Feb 2 08:25:33 2026 From: erfang at openjdk.org (Eric Fang) Date: Mon, 2 Feb 2026 08:25:33 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction In-Reply-To: References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> <_qJ_Qo_Mqexx7dYu0Vkc9ru4SxZ0izfqifaUIAL1iyQ=.741b11d4-a89d-495e-8d31-78fed690abf6@github.com> Message-ID: On Wed, 28 Jan 2026 10:17:30 GMT, Andrew Haley wrote: >> @fg1417 thanks for your help, this is really helpful! >> >> You've also noticed slight regression in a few cases, which is reasonable. The optimization effect is influenced by multiple factors, such as the alignment you mentioned on N2, as well as code generation and register allocation. The underlying principle of this optimization is that the latency of the `cpy(imm, zeroing)` instruction seems quite high, while the `movi + cpy(imm, merging)` combination improves the parallelism of the program. In some cases, a `mov` or other instruction with the same effect is already generated before the `cpy(imm, zeroing)` instruction, thus achieving the optimization effect of the `movi + cpy(imm, merging)` instruction combination. Therefore, the slight regression caused by the extra `movi` instruction in these cases is reasonable. 
However, for cases where this optimization applies, the performance improvement will be more significant. For example, in the following case, I even saw a **2x** performance improvement on Neoverse-V2.
>>
>>     @Param({"128"})
>>     private int loop_iteration;
>>     private static final VectorSpecies ispecies = VectorSpecies.ofLargestShape(int.class);
>>     private boolean[] mask_arr;
>>
>>     @Setup(Level.Trial)
>>     public void BmSetup() {
>>         int array_size = loop_iteration * bspecies.length();
>>         mask_arr = new boolean[array_size];
>>         Random r = new Random();
>>         for (int i = 0; i < array_size; i++) {
>>             mask_arr[i] = r.nextBoolean();
>>         }
>>     }
>>
>>     @CompilerControl(CompilerControl.Mode.INLINE)
>>     private long testIndexInRangeToLongKernel(VectorSpecies species) {
>>         long sum = 0;
>>         VectorMask m = VectorMask.fromArray(species, mask_arr, 0);
>>         for (int i = 0; i < loop_iteration; i++) {
>>             sum += m.indexInRange(i & (m.length() - 1), m.length()).toLong();
>>         }
>>         return sum;
>>     }
>>
>>     @Benchmark
>>     public long indexInRangeToLongInt() {
>>         return testIndexInRangeToLongKernel(ispecies);
>>     }
>>
>>
>> Therefore, when you test this change using the C case, you will see a significant performance improvement.
>>> I see 2% uplift on these numbers.
>>
>> @theRealAph And I think this also explains your question on these numbers.
>>
>>> One thing you can do is add a flag to control this minor optimization, but make it constexpr bool = true until we know what other SVE implementations might do.
>> In general:
>> Dea...
>
>> Therefore, when you test this change using the C case, you will see a significant performance improvement.
>>
>> > I see 2% uplift on these numbers.
>>
>> @theRealAph And I think this also explains your question on these numbers.
>
> Not at all.
>
> The performance claim above was:
>
>> Microbenchmarks show this change brings performance uplift ranging from 11% to 33%, depending on the specific operation and data types.
>
> But the real performance uplift, as measured in Java microbenchmarks, is 2%.

Hi @theRealAph, I have moved the implementation into C2_MacroAssembler and added a constexpr flag to guard this optimization. Would you mind taking another look? Thanks~

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29359#issuecomment-3833659702

From iwalulya at openjdk.org Mon Feb 2 08:43:30 2026
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Mon, 2 Feb 2026 08:43:30 GMT
Subject: RFR: 8375438: G1: Convert G1HeapRegion related classes to use Atomic [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 20 Jan 2026 11:32:13 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review conversion of G1HeapRegion related classes to use Atomic.
>>
>> Testing: tier1, tier4, tier5
>>
>> (The PipelineLeaksFD failure in gha is a known issue)
>>
>> Thanks,
>> Thomas
>
> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>
> * shade review

Marked as reviewed by iwalulya (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/29301#pullrequestreview-3738026225

From iwalulya at openjdk.org Mon Feb 2 08:47:24 2026
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Mon, 2 Feb 2026 08:47:24 GMT
Subject: RFR: 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic
In-Reply-To:
References:
Message-ID:

On Thu, 22 Jan 2026 12:58:39 GMT, Thomas Schatzl wrote:

> Hi all,
>
> use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out.
>
> Testing: gha
>
> Thanks,
> Thomas

Marked as reviewed by iwalulya (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/29360#pullrequestreview-3738043319 From iwalulya at openjdk.org Mon Feb 2 08:57:16 2026 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 2 Feb 2026 08:57:16 GMT Subject: RFR: 8376570: GrowableArray::remove_{till,range} should work on empty list In-Reply-To: References: Message-ID: On Wed, 28 Jan 2026 09:48:58 GMT, Aleksey Shipilev wrote: > Split from [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046), we want to make sure GrowableArray removal methods work appropriately with empty lists. > > Testing: > - [x] New test > - [x] Linux x86_64 server fastdebug, `all` (in course of [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046) testing) LGTM! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29462#pullrequestreview-3738088571 From jbhateja at openjdk.org Mon Feb 2 09:07:21 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Feb 2026 09:07:21 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v6] In-Reply-To: References: Message-ID: > As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. > > Patch add new lane type constants and pass them to vector intrinsic entry points. > > All existing Vector API jtreg test are passing with the patch. > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comment resolution ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29481/files - new: https://git.openjdk.org/jdk/pull/29481/files/0c60016b..23022d42 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=04-05 Stats: 115 lines in 7 files changed: 0 ins; 0 del; 115 mod Patch: https://git.openjdk.org/jdk/pull/29481.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29481/head:pull/29481 PR: https://git.openjdk.org/jdk/pull/29481 From jbhateja at openjdk.org Mon Feb 2 09:07:23 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Feb 2026 09:07:23 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v5] In-Reply-To: References: Message-ID: On Sun, 1 Feb 2026 17:12:49 GMT, Paul Sandoz wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolution > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 580: > >> 578: public static ByteVector zero(VectorSpecies species) { >> 579: ByteSpecies vsp = (ByteSpecies) species; >> 580: return VectorSupport.fromBitsCoerced(vsp.vectorType(), vsp.laneTypeOrdinal(), species.length(), > > You can now use `LANE_TYPE_ORDINAL` rather than `vsp.laneTypeOrdinal()`, which better fits the prior pattern. 
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2753281411 From aph at openjdk.org Mon Feb 2 09:08:48 2026 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Feb 2026 09:08:48 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction [v2] In-Reply-To: <595lDgLFjcH0tzzdeacMVa_1fPt3PQhKIhibehSvpZk=.3f01b98a-ce6b-4c81-92da-235443e81f9b@github.com> References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> <595lDgLFjcH0tzzdeacMVa_1fPt3PQhKIhibehSvpZk=.3f01b98a-ce6b-4c81-92da-235443e81f9b@github.com> Message-ID: <4zd3EIKU033iAFrjn7h3BQMbG-4R0DlhELQ_yEAXaZ0=.cfd0af29-7b06-45f9-8547-cf94b91aaab2@github.com> On Mon, 2 Feb 2026 08:25:32 GMT, Eric Fang wrote: >> When optimizing some VectorMask related APIs , we found an optimization opportunity related to the `cpy (immediate, zeroing)` instruction [1]. Implementing the functionality of this instruction using `cpy (immediate, merging)` instruction [2] leads to better performance. >> >> Currently the `cpy (imm, zeroing)` instruction is used in code generated by `VectorStoreMaskNode` and `VectorReinterpretNode`. Doing this optimization benefits all vector APIs that generate these two IRs potentially, such as `VectorMask.intoArray()` and `VectorMask.toLong()`. >> >> Microbenchmarks show this change brings performance uplift ranging from **11%** to **33%**, depending on the specific operation and data types. >> >> The specific changes in this PR: >> 1. Achieve the functionality of the `cpy (imm, zeroing)` instruction with the `movi + cpy (imm, merging)` instructions in assembler: >> >> cpy z17.d, p1/z, #1 => >> >> movi v17.2d, #0 // this instruction is zero cost >> cpy z17.d, p1/m, #1 >> >> >> 2. Add a new option `PreferSVEMergingModeCPY` to indicate whether to apply this optimization or not. >> - This option belongs to the Arch product category. 
>> - The default value is true on Neoverse-V1/V2 where the improvement has been confirmed, false on others. >> - When its value is true, the change is applied. >> >> 3. Add a jtreg test to verify the behavior of this option. >> >> This PR was tested on aarch64 and x86 machines with different configurations, and all tests passed. >> >> JMH benchmarks: >> >> On a Nvidia Grace (Neoverse-V2) machine with 128-bit SVE2: >> >> Benchmark Unit size Before Error After Error Uplift >> byteIndexInRange ops/ms 7.00 471816.15 1125.96 473237.77 1593.92 1.00 >> byteIndexInRange ops/ms 256.00 149654.21 416.57 149259.95 116.59 1.00 >> byteIndexInRange ops/ms 259.00 177850.31 991.13 179785.19 1110.07 1.01 >> byteIndexInRange ops/ms 512.00 133393.26 167.26 133484.61 281.83 1.00 >> doubleIndexInRange ops/ms 7.00 302176.39 12848.8 299813.02 37.76 0.99 >> doubleIndexInRange ops/ms 256.00 47831.93 56.70 46708.70 56.11 0.98 >> doubleIndexInRange ops/ms 259.00 11550.02 27.95 15333.50 10.40 1.33 >> doubleIndexInRange ops/ms 512.00 23687.76 61.65 23996.08 69.52 1.01 >> floatIndexInRange ops/ms 7.00 412195.79 124.71 411770.23 78.73 1.00 >> floatIndexInRange ops/ms 256.00 84479.98 70.69 84237.31 70.15 1.00 >> floatIndexInRange ops/ms 259.00 22585.65 80.07 28296.21 7.98 1.25 >> floatIndexInRange ops/ms 512.00 46902.99 51.60 46686.68 66.01 1.00 >> intInd... > > Eric Fang has updated the pull request incrementally with one additional commit since the last revision: > > Move the implementation into C2_MacroAssembler src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2846: > 2844: void C2_MacroAssembler::sve_cpy_optimized(FloatRegister dst, SIMD_RegVariant T, > 2845: PRegister pg, int imm8, bool isMerge) { > 2846: // When prefer_sve_merging_mode_cpy is enabled, optimize the SVE `cpy This comment says nothing that is not obvious from the code. 
src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2848: > 2846: // When prefer_sve_merging_mode_cpy is enabled, optimize the SVE `cpy > 2847: // (immediate, zeroing)` instruction as `movi + cpy (immediate, merging)` > 2848: // instructions for better performance. Most of this comment is obvious from reading the code. src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2855: > 2853: // Z above 128, so this `movi` instruction effectively zeroes the > 2854: // entire Z register. According to the Arm Software Optimization > 2855: // Guide, `movi` is zero cost. I don't think it says that exactly. movi is handled early during renaming, but still occupies a decode slot. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753291396 PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753296599 PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753295400 From dfenacci at openjdk.org Mon Feb 2 09:31:40 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 09:31:40 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v9] In-Reply-To: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: > ## Issue > > This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. 
> > This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. > > ## Causes > > The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top. > > ## Fix > > A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: > https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 > This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. 
> > # Testing > > * Tier 1-3+ > * 2 JTReg tests added > * `TestRangeCheck.java` as regression test for the reported issue > * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion Damon Fenacci has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8374582' of https://github.com/dafedafe/jdk into JDK-8374582 - JDK-8374582: add assert in opaque constructor ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29164/files - new: https://git.openjdk.org/jdk/pull/29164/files/c5390e4a..5e7df6f4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=07-08 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/29164.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29164/head:pull/29164 PR: https://git.openjdk.org/jdk/pull/29164 From mhaessig at openjdk.org Mon Feb 2 09:36:25 2026 From: mhaessig at openjdk.org (Manuel Hässig) Date: Mon, 2 Feb 2026 09:36:25 GMT Subject: RFR: 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations [v8] In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 15:04:25 GMT, Roland Westrelin wrote: >> For this failure memory stats are: >> >> >> Total Usage: 1095525816 >> --- Arena Usage by Arena Type and compilation phase, at arena usage peak of 1095525816 --- >> Phase Total ra node comp type states reglive regsplit regmask superword cienv ha other >> none 5976032 331560 5402064 197512 33712 10200 0 0 984 0 0 0 0 >> parse 2716464 65456 1145480 196408 1112752 0 0 0 0 0 196368 0 0 >> optimizer 98184 0 32728 0 65456 0 0 0 0 0 0 0 0 >> connectionGraph 32728 0 0 32728 0 0 0 0 0 0 0 0 0 >> iterGVN 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> idealLoop 918189632 0 38687056 872824784 392776 0 0 0 0 0 6285016 0 0 >> idealLoopVerify 2228144 0 0 2228144 0 0 0 0 0 0 0 0 0 >>
macroExpand 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> graphReshape 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> matcher 20135944 3369848 9033208 7536400 65456 131032 0 0 0 0 0 0 0 >> postselect_cleanup 294872 294872 0 0 0 0 0 0 0 0 0 0 0 >> scheduler 752944 196488 556456 0 0 0 0 0 0 0 0 0 0 >> regalloc 388736 388736 0 0 0 0 0 0 0 0 0 0 0 >> ... > > Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/memory/arena.hpp > > Co-authored-by: Manuel Hässig > - Update src/hotspot/share/opto/loopnode.cpp > > Co-authored-by: Manuel Hässig > - Update src/hotspot/share/opto/loopnode.hpp > > Co-authored-by: Manuel Hässig Thank you for updating the copyright years. Testing passed as well. ------------- Marked as reviewed by mhaessig (Committer). PR Review: https://git.openjdk.org/jdk/pull/28581#pullrequestreview-3738252300 From roland at openjdk.org Mon Feb 2 09:36:27 2026 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 2 Feb 2026 09:36:27 GMT Subject: RFR: 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations [v6] In-Reply-To: References: Message-ID: <_CO2G_HBJteRozKtjofE4Esyfk0qgZYiqO1uhQxH6Sc=.029ae153-446a-4cf9-a561-a94b5eaca6ed@github.com> On Fri, 30 Jan 2026 16:10:25 GMT, Benoît Maillard wrote: >>> I was able to come up with this test, which is a bit more than 2 times faster than the original one on my machine. Its `memlimit` is set to `600M`, which is enough to make the old version fail. With the new one, the test passes even with a `memlimit` of `200M`, so this should be a good enough margin. >> >> Great. The new test looks good to me. I replaced the existing test with that one. Thanks for taking the time to do that. >> >>> While looking into this I have also found out that some programs have an unexpectedly high usage of `output` (as was the case in the test case that I initially suggested). I am trying to get a good reproducer and will most likely file a follow-up.
>> >> Can you post links to the bugs? Thanks. > >> Can you post links to the bugs? Thanks. > > I haven't filed it yet. I observed something suspicious once, but at the moment I am not able to reproduce it anymore. I will take another look, and I will post here or tag you in the issue if there is any update @rwestrel. @benoitmaillard @mhaessig thanks for the reviews. @eme64 would you mind approving it again? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28581#issuecomment-3833993449 From dfenacci at openjdk.org Mon Feb 2 09:55:37 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 09:55:37 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v10] In-Reply-To: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: > ## Issue > > This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. > > This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. > > ## Causes > > The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top. 
> > ## Fix > > A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: > https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 > This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. > > # Testing > > * Tier 1-3+ > * 2 JTReg tests added > * `TestRangeCheck.java` as regression test for the reported issue > * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion Damon Fenacci has updated the pull request incrementally with one additional commit since the last revision: JDK-8374582: revert wrong copyright change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29164/files - new: https://git.openjdk.org/jdk/pull/29164/files/5e7df6f4..5ac3e6e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29164.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29164/head:pull/29164 PR: https://git.openjdk.org/jdk/pull/29164 From erfang at openjdk.org Mon Feb 2 09:59:19 2026 From: erfang at openjdk.org (Eric Fang) Date: Mon, 2 Feb 2026 09:59:19 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction [v2] In-Reply-To: <4zd3EIKU033iAFrjn7h3BQMbG-4R0DlhELQ_yEAXaZ0=.cfd0af29-7b06-45f9-8547-cf94b91aaab2@github.com> References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> 
<595lDgLFjcH0tzzdeacMVa_1fPt3PQhKIhibehSvpZk=.3f01b98a-ce6b-4c81-92da-235443e81f9b@github.com> <4zd3EIKU033iAFrjn7h3BQMbG-4R0DlhELQ_yEAXaZ0=.cfd0af29-7b06-45f9-8547-cf94b91aaab2@github.com> Message-ID: <9ODLGgIWL6x0UlzS81yDsxVxWWESoCtZh77EcIAjH0U=.f8832988-2525-49dc-9a19-1c1d6f7a1d81@github.com> On Mon, 2 Feb 2026 09:04:21 GMT, Andrew Haley wrote: >> Eric Fang has updated the pull request incrementally with one additional commit since the last revision: >> >> Move the implementation into C2_MacroAssembler > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2846: > >> 2844: void C2_MacroAssembler::sve_cpy_optimized(FloatRegister dst, SIMD_RegVariant T, >> 2845: PRegister pg, int imm8, bool isMerge) { >> 2846: // When prefer_sve_merging_mode_cpy is enabled, optimize the SVE `cpy > > This comment says nothing that is not obvious from the code. I'd like to briefly document the main idea of this method. How about adding a brief comment before the method like `Provide an optimized implementation for cpy (imm, zeroing) instruction`, or do you think it would be better to remove the comment? > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2855: > >> 2853: // Z above 128, so this `movi` instruction effectively zeroes the >> 2854: // entire Z register. According to the Arm Software Optimization >> 2855: // Guide, `movi` is zero cost. > > I don't think it says that exactly. movi is handled early during renaming, but still occupies a decode slot. Yeah you are right, and the movi uop gets eliminated shortly downstream of the decoder. I should say `zero latency`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753482758 PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753500143 From epeter at openjdk.org Mon Feb 2 10:04:44 2026 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Feb 2026 10:04:44 GMT Subject: RFR: 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations [v8] In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 15:04:25 GMT, Roland Westrelin wrote: >> For this failure memory stats are: >> >> >> Total Usage: 1095525816 >> --- Arena Usage by Arena Type and compilation phase, at arena usage peak of 1095525816 --- >> Phase Total ra node comp type states reglive regsplit regmask superword cienv ha other >> none 5976032 331560 5402064 197512 33712 10200 0 0 984 0 0 0 0 >> parse 2716464 65456 1145480 196408 1112752 0 0 0 0 0 196368 0 0 >> optimizer 98184 0 32728 0 65456 0 0 0 0 0 0 0 0 >> connectionGraph 32728 0 0 32728 0 0 0 0 0 0 0 0 0 >> iterGVN 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> idealLoop 918189632 0 38687056 872824784 392776 0 0 0 0 0 6285016 0 0 >> idealLoopVerify 2228144 0 0 2228144 0 0 0 0 0 0 0 0 0 >> macroExpand 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> graphReshape 32728 0 32728 0 0 0 0 0 0 0 0 0 0 >> matcher 20135944 3369848 9033208 7536400 65456 131032 0 0 0 0 0 0 0 >> postselect_cleanup 294872 294872 0 0 0 0 0 0 0 0 0 0 0 >> scheduler 752944 196488 556456 0 0 0 0 0 0 0 0 0 0 >> regalloc 388736 388736 0 0 0 0 0 0 0 0 0 0 0 >> ... > > Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/memory/arena.hpp > > Co-authored-by: Manuel Hässig > - Update src/hotspot/share/opto/loopnode.cpp > > Co-authored-by: Manuel Hässig > - Update src/hotspot/share/opto/loopnode.hpp > > Co-authored-by: Manuel Hässig Looks good, thanks for the updates @rwestrel ! ------------- Marked as reviewed by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28581#pullrequestreview-3738415604 From shade at openjdk.org Mon Feb 2 10:36:29 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 10:36:29 GMT Subject: RFR: 8376570: GrowableArray::remove_{till,range} should work on empty list In-Reply-To: References: Message-ID: On Wed, 28 Jan 2026 09:48:58 GMT, Aleksey Shipilev wrote: > Split from [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046), we want to make sure GrowableArray removal methods work appropriately with empty lists. > > Testing: > - [x] New test > - [x] Linux x86_64 server fastdebug, `all` (in course of [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046) testing) Thank you! Let's go. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29462#issuecomment-3834285269 From shade at openjdk.org Mon Feb 2 10:36:30 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 10:36:30 GMT Subject: Integrated: 8376570: GrowableArray::remove_{till, range} should work on empty list In-Reply-To: References: Message-ID: On Wed, 28 Jan 2026 09:48:58 GMT, Aleksey Shipilev wrote: > Split from [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046), we want to make sure GrowableArray removal methods work appropriately with empty lists. > > Testing: > - [x] New test > - [x] Linux x86_64 server fastdebug, `all` (in course of [JDK-8375046](https://bugs.openjdk.org/browse/JDK-8375046) testing) This pull request has now been integrated. 
Changeset: e370b8a1 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/e370b8a1d834a0a6ebcd1d5946a5533c015ed960 Stats: 122 lines in 2 files changed: 115 ins; 0 del; 7 mod 8376570: GrowableArray::remove_{till,range} should work on empty list Reviewed-by: kbarrett, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/29462 From shade at openjdk.org Mon Feb 2 11:07:46 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Feb 2026 11:07:46 GMT Subject: RFR: 8375438: G1: Convert G1HeapRegion related classes to use Atomic [v2] In-Reply-To: References: Message-ID: On Tue, 20 Jan 2026 11:32:13 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review conversion of G1HeapRegion related classes to use Atomic. >> >> Testing: tier1, tier4, tier5 >> >> (The PipelineLeaksFD failure in gha is a known issue) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * shade review Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/29301#pullrequestreview-3738736964 From aph at openjdk.org Mon Feb 2 11:18:38 2026 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Feb 2026 11:18:38 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction [v2] In-Reply-To: <9ODLGgIWL6x0UlzS81yDsxVxWWESoCtZh77EcIAjH0U=.f8832988-2525-49dc-9a19-1c1d6f7a1d81@github.com> References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> <595lDgLFjcH0tzzdeacMVa_1fPt3PQhKIhibehSvpZk=.3f01b98a-ce6b-4c81-92da-235443e81f9b@github.com> <4zd3EIKU033iAFrjn7h3BQMbG-4R0DlhELQ_yEAXaZ0=.cfd0af29-7b06-45f9-8547-cf94b91aaab2@github.com> <9ODLGgIWL6x0UlzS81yDsxVxWWESoCtZh77EcIAjH0U=.f8832988-2525-49dc-9a19-1c1d6f7a1d81@github.com> Message-ID: <7g2UNaXs2NbRXX7r7YTHZFdq1X-q0Ix8wjSnHDoZnoQ=.59ae7bec-4a3e-4264-a997-e0ecb9fe0f06@github.com> On Mon, 2 Feb 2026 09:52:31 GMT, Eric Fang wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2846: >> >>> 2844: void C2_MacroAssembler::sve_cpy_optimized(FloatRegister dst, SIMD_RegVariant T, >>> 2845: PRegister pg, int imm8, bool isMerge) { >>> 2846: // When prefer_sve_merging_mode_cpy is enabled, optimize the SVE `cpy >> >> This comment says nothing that is not obvious from the code. > > I?d like to briefly document the main idea of this method. How about adding a brief comment before the method like `Provide an optimized implementation for cpy (imm, zeroing) instruction`, or do you think it would be better to remove the comment? If a comment says nothing that is not obvious from reading the code, the comment should be removed. It makes sense to explain why this is better, maybe with reference to documentation elsewhere. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29359#discussion_r2753831979 From azafari at openjdk.org Mon Feb 2 11:29:03 2026 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 2 Feb 2026 11:29:03 GMT Subject: RFR: 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 01:13:35 GMT, David Holmes wrote: > An ASAN enabled build reported heap-buffer-overflow in `MethodHandles::is_basic_type_signature` with `ASAN_OPTIONS=strict_string_checks=true` when running test `jdk/jdk/jfr/api/metadata/annotations/TestThrottle.java` > > The code is here: > > bool MethodHandles::is_basic_type_signature(Symbol* sig) { > assert(vmSymbols::object_signature()->utf8_length() == (int)OBJ_SIG_LEN, ""); > assert(vmSymbols::object_signature()->equals(OBJ_SIG), ""); > for (SignatureStream ss(sig, sig->starts_with(JVM_SIGNATURE_FUNC)); !ss.is_done(); ss.next()) { > switch (ss.type()) { > case T_OBJECT: > // only java/lang/Object is valid here > if (strncmp((char*) ss.raw_bytes(), OBJ_SIG, OBJ_SIG_LEN) != 0) > > The ASAN `strncmp` interceptor acts as follows: > > INTERCEPTOR(int, strncmp, const char *s1, const char *s2, size_t n) { > void *ctx; > ASAN_INTERCEPTOR_ENTER(linker, strncmp); // Sets up context > ASAN_READ_RANGE(s1, n); // Validates s1 > ASAN_READ_RANGE(s2, n); // Validates s2 > return REAL(strncmp)(s1, s2, n); // Calls original function > } > > With the test given `s1` is a buffer of size 15, containing a non-nul-terminated string, and `n` is 18, so `ASAN_READ_RANGE` fails for `s1` as we could potentially read beyond the end of the buffer. In practice however, given `s1` is guaranteed to be a valid type-string from a signature symbol of type `T_OBJECT`, its final character is `;` and the final character of `s2` is also `;` (it is the string constant `Ljava/lang/Object;`). Hence the comparison must terminate before we can run off the end of `s1`. 
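[Editorial sketch, not part of the thread.] The over-read concern described in the quoted report can be illustrated outside HotSpot. The snippet below is a minimal standalone sketch and is not the change made in the pull request: `matches_object_sig`, `raw`, and `raw_len` are hypothetical names standing in for the `ss.raw_bytes()` / `ss.raw_length()` pair, and the explicit length-equality guard is an added assumption so that the helper is correct for arbitrary input, not only for well-formed signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the constant used in the quoted code (assumed value).
static const char OBJ_SIG[] = "Ljava/lang/Object;";
static const std::size_t OBJ_SIG_LEN = sizeof(OBJ_SIG) - 1;  // 18, excludes the NUL

// Hypothetical helper: `raw` points at `raw_len` bytes that are NOT
// NUL-terminated, similar to a slice of a signature symbol.
static bool matches_object_sig(const char* raw, std::size_t raw_len) {
  assert(raw != nullptr);
  // Added guard (an assumption of this sketch): different lengths can
  // never match, so no bytes need to be compared at all.
  if (raw_len != OBJ_SIG_LEN) {
    return false;
  }
  // The bound passed to strncmp is the real buffer length, so a checker
  // that validates [raw, raw + n) never sees a range past the buffer.
  return std::strncmp(raw, OBJ_SIG, raw_len) == 0;
}
```

With a 15-byte buffer such as `Ljava/lang/Foo;` the helper returns before any comparison, so no byte beyond the buffer is examined; with an 18-byte buffer the comparison range equals the buffer size.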
> > To appease ASAN we can make a simple change to the `strncmp` call to compare at most `ss.raw_length()` bytes. > > Testing > - ASAN no longer reports an error > - tiers 1-3 sanity > > Thanks Thank you David for taking and fixing this. ------------- Marked as reviewed by azafari (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29516#pullrequestreview-3738832412 From thartmann at openjdk.org Mon Feb 2 11:40:03 2026 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 2 Feb 2026 11:40:03 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v10] In-Reply-To: References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: On Mon, 2 Feb 2026 09:55:37 GMT, Damon Fenacci wrote: >> ## Issue >> >> This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. >> >> This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. >> >> ## Causes >> >> The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top. 
>> >> ## Fix >> >> A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: >> https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 >> This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. >> >> # Testing >> >> * Tier 1-3+ >> * 2 JTReg tests added >> * `TestRangeCheck.java` as regression test for the reported issue >> * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion > > Damon Fenacci has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8374582: revert wrong copyright change Thanks for working on this Damon. I added a few comments, otherwise it looks good! src/hotspot/share/opto/library_call.cpp line 894: > 892: > 893: inline Node* LibraryCallKit::generate_negative_guard(Node* index, RegionNode* region, > 894: Node** pos_index, bool is_opaque) { As we discussed offline, I think `with_opaque` is better here. src/hotspot/share/opto/opaquenode.hpp line 145: > 143: // with false in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. > 144: // In debug builds, we keep the actual checks as additional verification code (i.e. removing OpaqueConstantBoolNodes and > 145: // use the BoolNode inputs instead). Nice comment! src/hotspot/share/opto/opaquenode.hpp line 148: > 146: class OpaqueConstantBoolNode : public Node { > 147: private: > 148: bool _constant; Should this be `const`? 
src/hotspot/share/opto/opaquenode.hpp line 150: > 148: bool _constant; > 149: public: > 150: OpaqueConstantBoolNode(Compile* C, Node* tst, bool constant) : Node(nullptr, tst), _constant(constant) { An alternative would be to have the `constant` be an actual input node instead of a field. In macro expansion, you could then do `_igvn.replace_node(n, n->in(2));` instead (maybe define an enum for the input indices). I don't have a strong opinion on this though and leave it up to you to decide. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29164#pullrequestreview-3738450475 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2753636949 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2753548067 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2753551976 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2753586409 From roland at openjdk.org Mon Feb 2 11:46:30 2026 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 2 Feb 2026 11:46:30 GMT Subject: RFR: 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations [v8] In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 10:01:33 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update src/hotspot/share/memory/arena.hpp >> >> Co-authored-by: Manuel Hässig >> - Update src/hotspot/share/opto/loopnode.cpp >> >> Co-authored-by: Manuel Hässig >> - Update src/hotspot/share/opto/loopnode.hpp >> >> Co-authored-by: Manuel Hässig > > Looks good, thanks for the updates @rwestrel ! @eme64 thanks!
------------- PR Comment: https://git.openjdk.org/jdk/pull/28581#issuecomment-3834624426 From roland at openjdk.org Mon Feb 2 11:46:32 2026 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 2 Feb 2026 11:46:32 GMT Subject: Integrated: 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations In-Reply-To: References: Message-ID: On Mon, 1 Dec 2025 15:40:00 GMT, Roland Westrelin wrote: > For this failure memory stats are: > > > Total Usage: 1095525816 > --- Arena Usage by Arena Type and compilation phase, at arena usage peak of 1095525816 --- > Phase Total ra node comp type states reglive regsplit regmask superword cienv ha other > none 5976032 331560 5402064 197512 33712 10200 0 0 984 0 0 0 0 > parse 2716464 65456 1145480 196408 1112752 0 0 0 0 0 196368 0 0 > optimizer 98184 0 32728 0 65456 0 0 0 0 0 0 0 0 > connectionGraph 32728 0 0 32728 0 0 0 0 0 0 0 0 0 > iterGVN 32728 0 32728 0 0 0 0 0 0 0 0 0 0 > idealLoop 918189632 0 38687056 872824784 392776 0 0 0 0 0 6285016 0 0 > idealLoopVerify 2228144 0 0 2228144 0 0 0 0 0 0 0 0 0 > macroExpand 32728 0 32728 0 0 0 0 0 0 0 0 0 0 > graphReshape 32728 0 32728 0 0 0 0 0 0 0 0 0 0 > matcher 20135944 3369848 9033208 7536400 65456 131032 0 0 0 0 0 0 0 > postselect_cleanup 294872 294872 0 0 0 0 0 0 0 0 0 0 0 > scheduler 752944 196488 556456 0 0 0 0 0 0 0 0 0 0 > regalloc 388736 388736 0 0 0 0 0 0 0 0 0 0 0 > ctorChaitin 160032 ... This pull request has now been integrated. 
Changeset: 176422b8 Author: Roland Westrelin URL: https://git.openjdk.org/jdk/commit/176422b885d2d045dd44b61b7fcdcb01be2d00a7 Stats: 171 lines in 4 files changed: 147 ins; 14 del; 10 mod 8370519: C2: Hit MemLimit when running with +VerifyLoopOptimizations Co-authored-by: Benoît Maillard Reviewed-by: mhaessig, bmaillard, epeter ------------- PR: https://git.openjdk.org/jdk/pull/28581 From dfenacci at openjdk.org Mon Feb 2 12:01:51 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 12:01:51 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v11] In-Reply-To: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: > ## Issue > > This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. > > This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. > > ## Causes > > The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top.
> > ## Fix > > A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: > https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 > This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. > > # Testing > > * Tier 1-3+ > * 2 JTReg tests added > * `TestRangeCheck.java` as regression test for the reported issue > * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion Damon Fenacci has updated the pull request incrementally with two additional commits since the last revision: - JDK-8374582: add const - JDK-8374582: with_opaque ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29164/files - new: https://git.openjdk.org/jdk/pull/29164/files/5ac3e6e3..0d4eef88 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=09-10 Stats: 7 lines in 3 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/29164.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29164/head:pull/29164 PR: https://git.openjdk.org/jdk/pull/29164 From dfenacci at openjdk.org Mon Feb 2 12:13:36 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 12:13:36 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v10] In-Reply-To: References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: On Mon, 2 Feb 2026 10:29:20 GMT, Tobias Hartmann wrote: >> Damon Fenacci has updated the 
pull request incrementally with one additional commit since the last revision: >> >> JDK-8374582: revert wrong copyright change > > src/hotspot/share/opto/library_call.cpp line 894: > >> 892: >> 893: inline Node* LibraryCallKit::generate_negative_guard(Node* index, RegionNode* region, >> 894: Node** pos_index, bool is_opaque) { > > As we discussed offline, I think `with_opaque` is better here. Renamed. Thanks @TobiHartmann. > src/hotspot/share/opto/opaquenode.hpp line 148: > >> 146: class OpaqueConstantBoolNode : public Node { >> 147: private: >> 148: bool _constant; > > Should this be `const`? Yep, fixed. > src/hotspot/share/opto/opaquenode.hpp line 150: > >> 148: bool _constant; >> 149: public: >> 150: OpaqueConstantBoolNode(Compile* C, Node* tst, bool constant) : Node(nullptr, tst), _constant(constant) { > > An alternative would be to have the `constant` be an actual input node instead of a field. In macro expansion, you could then do `_igvn.replace_node(n, n->in(2));` instead (maybe define an enum for the input indices). I don't have a strong opinion on this though and leave it up to you to decide. Cool trick! ... but now I can't decide between the two. @chhagedorn do you fancy being the tiebreaker? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2754030906 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2754030537 PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2754032138 From duke at openjdk.org Mon Feb 2 12:34:01 2026 From: duke at openjdk.org (duke) Date: Mon, 2 Feb 2026 12:34:01 GMT Subject: Withdrawn: 8369021: A crash in ConstantPool::klass_at_impl In-Reply-To: References: Message-ID: On Wed, 1 Oct 2025 20:21:45 GMT, Jan Kratochvil wrote: > https://bugs.openjdk.org/browse/JDK-8369021 This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.org/jdk/pull/27595 From duke at openjdk.org Mon Feb 2 12:42:26 2026 From: duke at openjdk.org (Ruben) Date: Mon, 2 Feb 2026 12:42:26 GMT Subject: RFR: 8372942: AArch64: Set JVM flags for Neoverse V3AE core [v2] In-Reply-To: <8eI5E6cyFbIzKfiWurr2ovAUQEML2LDiJIW11BFX27w=.962bede5-2cc2-4e90-97f9-4953750f4b11@github.com> References: <8eI5E6cyFbIzKfiWurr2ovAUQEML2LDiJIW11BFX27w=.962bede5-2cc2-4e90-97f9-4953750f4b11@github.com> Message-ID: On Wed, 14 Jan 2026 09:02:33 GMT, Andrew Haley wrote: >> Thanks, this is fine. >> I wonder if we should be thinking about replacing some of this open-coded logic with something more expressive and concise. This bunch of model_is() expressions could be a switch, for example. > >> Thank you for review, @theRealAph, >> >> > I wonder if we should be thinking about replacing some of this open-coded logic with something more expressive and concise. This bunch of model_is() expressions could be a switch, for example. >> >> While switch-case might not be easily applicable because we have two variables, both of which have to be compared with the values, > > I only see one here, the `model_is`. > >> perhaps an interface like `bool is_model_any_of(std::initializer_list list)` can simplify the code. Would this approach be suitable? > > Maybe, if it's made as simple as possible. > >> Would you like this to be changed within this PR? > > I think so. @theRealAph, > I'm thinking of bool is_model_any_of(std::initializer_list list) which would iterate over the list and call model_is for each candidate I've added this as `model_is_in` interface. Does the new implementation look suitable? 
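For reference, a minimal sketch of the shape such a helper could take. This is an illustration rather than the code in the PR: the `Cpu` struct, its two model fields and the `MODEL_*` values are hypothetical, and only the names `model_is` and `model_is_in` come from this thread.

```cpp
#include <initializer_list>

// Hypothetical model identifiers, for illustration only.
enum CpuModel : int { MODEL_A = 0x1, MODEL_B = 0x2, MODEL_C = 0x3 };

struct Cpu {
  int model;   // primary model id (assumed field)
  int model2;  // secondary model id (assumed field)

  // True if either recorded model id equals the candidate.
  bool model_is(int candidate) const {
    return model == candidate || model2 == candidate;
  }

  // True if model_is() holds for at least one candidate in the list.
  bool model_is_in(std::initializer_list<int> candidates) const {
    for (int candidate : candidates) {
      if (model_is(candidate)) {
        return true;
      }
    }
    return false;
  }
};
```

A chain such as `model_is(A) || model_is(B) || model_is(C)` then collapses to `model_is_in({A, B, C})`, which keeps the two-variable comparison inside `model_is`.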
------------- PR Comment: https://git.openjdk.org/jdk/pull/28607#issuecomment-3834887509 From chagedorn at openjdk.org Mon Feb 2 13:28:57 2026 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Feb 2026 13:28:57 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v10] In-Reply-To: References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: On Mon, 2 Feb 2026 12:10:48 GMT, Damon Fenacci wrote: >> src/hotspot/share/opto/opaquenode.hpp line 150: >> >>> 148: bool _constant; >>> 149: public: >>> 150: OpaqueConstantBoolNode(Compile* C, Node* tst, bool constant) : Node(nullptr, tst), _constant(constant) { >> >> An alternative would be to have the `constant` be an actual input node instead of a field. In macro expansion, you could then do `_igvn.replace_node(n, n->in(2));` instead (maybe define an enum for the input indices). I don't have a strong opinion on this though and leave it up to you to decide. > > Cool trick! ... but now I can't decide between the two. @chhagedorn do you fancy being the tiebreaker? The old `Opaque4` nodes used to have two data inputs where the second one was the replacement. I found it a little harder to view graphs in IGV with one more input. With a field, you also do not need to worry about trying to understand what the second input means. So, I would rather have a field if I may break the tie but both options are fine :-) When going with a field, you could add a `NOT_PRODUCT(void dump_spec(outputStream* st) const);` that prints `#true` or `#false` depending on `_constant` (that could also then be shown in IGV with the "Show custom node info").
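To illustrate the `dump_spec` suggestion, here is a standalone sketch. It is not the HotSpot code: `OpaqueConstantBoolSketch` is a hypothetical stand-in for the node class, `std::ostream` replaces `outputStream`, and the `NOT_PRODUCT` macro is re-declared locally so the example compiles on its own.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Local re-declaration for this sketch: the wrapped code is kept only in
// non-product builds, i.e. when PRODUCT is not defined.
#ifdef PRODUCT
#define NOT_PRODUCT(code)
#else
#define NOT_PRODUCT(code) code
#endif

// Hypothetical stand-in for the node; only the _constant field is modelled.
class OpaqueConstantBoolSketch {
 private:
  const bool _constant;
 public:
  explicit OpaqueConstantBoolSketch(bool constant) : _constant(constant) {}
  NOT_PRODUCT(void dump_spec(std::ostream& st) const;)
};

#ifndef PRODUCT
// Prints the constant that will replace the node at macro expansion.
void OpaqueConstantBoolSketch::dump_spec(std::ostream& st) const {
  st << (_constant ? "#true" : "#false");
}
#endif

// Small helper for trying the sketch out; returns the dumped text.
std::string dump_to_string(const OpaqueConstantBoolSketch& node) {
  std::ostringstream out;
  NOT_PRODUCT(node.dump_spec(out);)
  return out.str();
}
```

Because the replacement value lives in a field, the node keeps a single data input, and the dump output is the only place where the constant shows up.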
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2754324868 From dfenacci at openjdk.org Mon Feb 2 14:03:00 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 14:03:00 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v12] In-Reply-To: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: <2_bA8sRgRlbc279Aia0oD9gPBn8bcD5kLP3RnA4Xl4Q=.deaeaaf0-27a1-40f8-81f3-c8283c4d9529@github.com> > ## Issue > > This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. > > This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. > > ## Causes > > The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top. 
> > ## Fix > > A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: > https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 > This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. > > # Testing > > * Tier 1-3+ > * 2 JTReg tests added > * `TestRangeCheck.java` as regression test for the reported issue > * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion Damon Fenacci has updated the pull request incrementally with two additional commits since the last revision: - JDK-8374582: remove empty line - JDK-8374582: add constant dump ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29164/files - new: https://git.openjdk.org/jdk/pull/29164/files/0d4eef88..44b68dbc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29164&range=10-11 Stats: 7 lines in 2 files changed: 7 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/29164.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29164/head:pull/29164 PR: https://git.openjdk.org/jdk/pull/29164 From dfenacci at openjdk.org Mon Feb 2 14:05:43 2026 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Feb 2026 14:05:43 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v10] In-Reply-To: References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> Message-ID: 
<9TRsuJgH4W8hsmU02_3jXvLwPotWWdihBDjXoA_DZ3A=.577eb61f-6333-4cca-ad89-bc17c73bb660@github.com> On Mon, 2 Feb 2026 13:26:12 GMT, Christian Hagedorn wrote: > So, I would rather have a field if I may break Let's go with the field then. > ```NOT_PRODUCT(void dump_spec(outputStream* st) const);``` Good idea! Added. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29164#discussion_r2754497278 From jsjolen at openjdk.org Mon Feb 2 14:05:43 2026 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 2 Feb 2026 14:05:43 GMT Subject: RFR: 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 01:13:35 GMT, David Holmes wrote:
> An ASAN enabled build reported heap-buffer-overflow in `MethodHandles::is_basic_type_signature` with `ASAN_OPTIONS=strict_string_checks=true` when running test `jdk/jdk/jfr/api/metadata/annotations/TestThrottle.java`
>
> The code is here:
>
>     bool MethodHandles::is_basic_type_signature(Symbol* sig) {
>       assert(vmSymbols::object_signature()->utf8_length() == (int)OBJ_SIG_LEN, "");
>       assert(vmSymbols::object_signature()->equals(OBJ_SIG), "");
>       for (SignatureStream ss(sig, sig->starts_with(JVM_SIGNATURE_FUNC)); !ss.is_done(); ss.next()) {
>         switch (ss.type()) {
>         case T_OBJECT:
>           // only java/lang/Object is valid here
>           if (strncmp((char*) ss.raw_bytes(), OBJ_SIG, OBJ_SIG_LEN) != 0)
>
> The ASAN `strncmp` interceptor acts as follows:
>
>     INTERCEPTOR(int, strncmp, const char *s1, const char *s2, size_t n) {
>       void *ctx;
>       ASAN_INTERCEPTOR_ENTER(linker, strncmp); // Sets up context
>       ASAN_READ_RANGE(s1, n); // Validates s1
>       ASAN_READ_RANGE(s2, n); // Validates s2
>       return REAL(strncmp)(s1, s2, n); // Calls original function
>     }
>
> With the test given `s1` is a buffer of size 15, containing a non-nul-terminated string, and `n` is 18, so `ASAN_READ_RANGE` fails for `s1` as we could potentially read beyond the end of the buffer.
In practice however, given `s1` is guaranteed to be a valid type-string from a signature symbol of type `T_OBJECT`, its final character is `;` and the final character of `s2` is also `;` (it is the string constant `Ljava/lang/Object;`). Hence the comparison must terminate before we can run off the end of `s1`. > > To appease ASAN we can make a simple change to the `strncmp` call to compare at most `ss.raw_length()` bytes. > > Testing > - ASAN no longer reports an error > - tiers 1-3 sanity > > Thanks LGTM ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29516#pullrequestreview-3739605984 From iwalulya at openjdk.org Mon Feb 2 14:53:45 2026 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 2 Feb 2026 14:53:45 GMT Subject: RFR: 8376357: Parallel: Convert MutableSpace classes to use Atomic In-Reply-To: References: Message-ID: On Mon, 26 Jan 2026 17:14:07 GMT, Thomas Schatzl wrote: > Hi all, > > please review these changes that convert `MutableSpace` classes to use `Atomic`. > > Testing: gha, tier1-5 > > Thanks, > Thomas Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/29427#pullrequestreview-3739922441 From tschatzl at openjdk.org Mon Feb 2 15:19:28 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 15:19:28 GMT Subject: RFR: 8375438: G1: Convert G1HeapRegion related classes to use Atomic [v2] In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 11:05:07 GMT, Aleksey Shipilev wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> * shade review > > Marked as reviewed by shade (Reviewer). 
Thanks @shipilev @walulyai for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/29301#issuecomment-3835773407 From tschatzl at openjdk.org Mon Feb 2 15:19:40 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 15:19:40 GMT Subject: RFR: 8376357: Parallel: Convert MutableSpace classes to use Atomic In-Reply-To: <6AE-M8uvq7FAfVoArxQTWI5zM61RxTapF0OjUHz8YNY=.fed7cbf8-6f33-4c78-a38b-1beb56580beb@github.com> References: <6AE-M8uvq7FAfVoArxQTWI5zM61RxTapF0OjUHz8YNY=.fed7cbf8-6f33-4c78-a38b-1beb56580beb@github.com> Message-ID: On Wed, 28 Jan 2026 21:18:07 GMT, David Holmes wrote: >> Hi all, >> >> please review these changes that convert `MutableSpace` classes to use `Atomic`. >> >> Testing: gha, tier1-5 >> >> Thanks, >> Thomas > > Overall looks good to me, but one nit. > > Thanks Thanks @dholmes-ora @walulyai for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/29427#issuecomment-3835756480 From tschatzl at openjdk.org Mon Feb 2 15:19:43 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 15:19:43 GMT Subject: Integrated: 8376357: Parallel: Convert MutableSpace classes to use Atomic In-Reply-To: References: Message-ID: On Mon, 26 Jan 2026 17:14:07 GMT, Thomas Schatzl wrote: > Hi all, > > please review these changes that convert `MutableSpace` classes to use `Atomic`. > > Testing: gha, tier1-5 > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: b7128b7c Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/b7128b7c30f3de2c1dcee2be567bb25d407c71a2 Stats: 30 lines in 4 files changed: 3 ins; 8 del; 19 mod 8376357: Parallel: Convert MutableSpace classes to use Atomic Reviewed-by: dholmes, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/29427 From tschatzl at openjdk.org Mon Feb 2 15:22:10 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 15:22:10 GMT Subject: Integrated: 8375438: G1: Convert G1HeapRegion related classes to use Atomic In-Reply-To: References: Message-ID: On Mon, 19 Jan 2026 13:32:46 GMT, Thomas Schatzl wrote: > Hi all, > > please review conversion of G1HeapRegion related classes to use Atomic. > > Testing: tier1, tier4, tier5 > > (The PipelineLeaksFD failure in gha is a known issue) > > Thanks, > Thomas This pull request has now been integrated. Changeset: 903b3fe1 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/903b3fe19596adaeac7cfb0d749b6e83f668f52f Stats: 60 lines in 8 files changed: 19 ins; 3 del; 38 mod 8375438: G1: Convert G1HeapRegion related classes to use Atomic Reviewed-by: shade, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/29301 From tschatzl at openjdk.org Mon Feb 2 15:43:59 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 15:43:59 GMT Subject: RFR: 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic [v2] In-Reply-To: References: Message-ID: <9Ky4MZ0LMzsR9J_k7ushMAg37KSBjhcjdBFN0zNvOQk=.352bedd6-6eba-49e1-90a0-4aebbc46510d@github.com> > Hi all, > > use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. > > Testing: gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Merge branch 'master' into submit/8375535-use-atomic-t-cardtablebarrierset - 8375535 Hi all, use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. Testing: gha Thanks, Thomas ------------- Changes: https://git.openjdk.org/jdk/pull/29360/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29360&range=01 Stats: 35 lines in 7 files changed: 6 ins; 0 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/29360.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29360/head:pull/29360 PR: https://git.openjdk.org/jdk/pull/29360 From qpzhang at openjdk.org Mon Feb 2 15:47:00 2026 From: qpzhang at openjdk.org (Patrick Zhang) Date: Mon, 2 Feb 2026 15:47:00 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v12] In-Reply-To: References: Message-ID: > Issue: > In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`. However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. > > This PR: > 1. Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. > 2. Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. 
Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. > 3. Added more test sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects. > > Tests: > 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory` and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. > 2. Jtreg tier1 tests on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. GHA Sanity Checks passed.
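The flag handling described in item 1 of the PR summary can be sketched roughly as follows. This is a simplified model, not the patch itself: the `ZeroingFlags` struct, the function names, the warning text, and the convention that a ZVA length of 0 means "DC ZVA unsupported" are all assumptions made for the example.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the JVM flags discussed in the PR.
struct ZeroingFlags {
  bool use_block_zeroing;        // models UseBlockZeroing
  int  block_zeroing_low_limit;  // models BlockZeroingLowLimit, in bytes
  bool low_limit_is_default;     // false when set on the command line
};

// Default low limit: 4 * ZVA length when DC ZVA is available, else 256.
// Treating zva_length == 0 as "unsupported" is an assumption of this sketch.
int default_low_limit(int zva_length) {
  return zva_length > 0 ? 4 * zva_length : 256;
}

// Resets a user-provided low limit that would otherwise be silently ignored.
// Returns true if the limit was reset.
bool reset_low_limit_if_ignored(ZeroingFlags& flags, int zva_length) {
  if (flags.use_block_zeroing || flags.low_limit_is_default) {
    return false;  // the limit is either in effect or was never overridden
  }
  std::fprintf(stderr,
               "warning: BlockZeroingLowLimit is ignored when "
               "UseBlockZeroing is disabled\n");
  flags.block_zeroing_low_limit = default_low_limit(zva_length);
  flags.low_limit_is_default = true;
  return true;
}
```

With `-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8`, as in the negative test above, this model would warn and fall back to the default limit instead of letting the value 8 influence code generation.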
Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: Trigger OCA recheck ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26917/files - new: https://git.openjdk.org/jdk/pull/26917/files/5535721e..082bafe0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26917&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26917&range=10-11 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26917.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26917/head:pull/26917 PR: https://git.openjdk.org/jdk/pull/26917 From thartmann at openjdk.org Mon Feb 2 16:02:56 2026 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 2 Feb 2026 16:02:56 GMT Subject: RFR: 8374582: [REDO] Move input validation checks to Java for java.lang.StringCoding intrinsics [v12] In-Reply-To: <2_bA8sRgRlbc279Aia0oD9gPBn8bcD5kLP3RnA4Xl4Q=.deaeaaf0-27a1-40f8-81f3-c8283c4d9529@github.com> References: <3ci9RXEra2BlQPhYl-M0Wnu3hRpWaDvxPnMRzFnJA_k=.67795fb3-95d1-449b-a7a9-44b3776aa626@github.com> <2_bA8sRgRlbc279Aia0oD9gPBn8bcD5kLP3RnA4Xl4Q=.deaeaaf0-27a1-40f8-81f3-c8283c4d9529@github.com> Message-ID: <_mVonDnsPn3yCi7haKqAlC_3iD8GNOojYbMt4xuUf_Y=.2c887c19-5dba-4501-bec4-faba0a2dca9b@github.com> On Mon, 2 Feb 2026 14:03:00 GMT, Damon Fenacci wrote: >> ## Issue >> >> This is a redo of [JDK-8361842](https://bugs.openjdk.org/browse/JDK-8361842) which was backed out by [JDK-8374210](https://bugs.openjdk.org/browse/JDK-8374210) due to C2-related regressions. The original change moved input validation checks for java.lang.StringCoding from the intrinsic to Java code (leaving the intrinsic check only with the `VerifyIntrinsicChecks` flag). Refer to the [original PR](https://github.com/openjdk/jdk/pull/25998) for details. 
>> >> This additional issue happens because, in some cases, for instance when the Java checking code is not inlined and we give an out-of-range constant as input, we fold the data path but not the control path and we crash in the backend. >> >> ## Causes >> >> The cause of this is that the out-of-range constant (e.g. -1) floats into the intrinsic and there (assuming the input is valid) we add a constraint to its type to positive integers (e.g. to compute the array address) which makes it top. >> >> ## Fix >> >> A possible fix is to introduce an opaque node (OpaqueGuardNode) similar to what we do in `must_be_not_null` for values that we know cannot be null: >> https://github.com/openjdk/jdk/blob/ce721665cd61d9a319c667d50d9917c359d6c104/src/hotspot/share/opto/graphKit.cpp#L1484 >> This will temporarily add the range check to ensure that C2 figures that out-of-range values cannot reach the intrinsic. Then, during macro expansion, we replace the opaque node with the corresponding constant (true/false) in product builds such that the actually unneeded guards are folded and do not end up in the emitted code. >> >> # Testing >> >> * Tier 1-3+ >> * 2 JTReg tests added >> * `TestRangeCheck.java` as regression test for the reported issue >> * `TestOpaqueGuardNodes.java` to check that opaque guard nodes are added when parsing and removed at macro expansion > > Damon Fenacci has updated the pull request incrementally with two additional commits since the last revision: > > - JDK-8374582: remove empty line > - JDK-8374582: add constant dump That looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29164#pullrequestreview-3740390312 From kbarrett at openjdk.org Mon Feb 2 16:06:34 2026 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Feb 2026 16:06:34 GMT Subject: RFR: 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic [v2] In-Reply-To: <9Ky4MZ0LMzsR9J_k7ushMAg37KSBjhcjdBFN0zNvOQk=.352bedd6-6eba-49e1-90a0-4aebbc46510d@github.com> References: <9Ky4MZ0LMzsR9J_k7ushMAg37KSBjhcjdBFN0zNvOQk=.352bedd6-6eba-49e1-90a0-4aebbc46510d@github.com> Message-ID: On Mon, 2 Feb 2026 15:43:59 GMT, Thomas Schatzl wrote: >> Hi all, >> >> use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. >> >> Testing: gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - Merge branch 'master' into submit/8375535-use-atomic-t-cardtablebarrierset > - 8375535 > > Hi all, > > use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. > > Testing: gha > > Thanks, > Thomas Still good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29360#pullrequestreview-3740383119 From tschatzl at openjdk.org Mon Feb 2 16:06:35 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 16:06:35 GMT Subject: RFR: 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic [v2] In-Reply-To: References: <9Ky4MZ0LMzsR9J_k7ushMAg37KSBjhcjdBFN0zNvOQk=.352bedd6-6eba-49e1-90a0-4aebbc46510d@github.com> Message-ID: On Mon, 2 Feb 2026 15:59:12 GMT, Kim Barrett wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: >> >> - Merge branch 'master' into submit/8375535-use-atomic-t-cardtablebarrierset >> - 8375535 >> >> Hi all, >> >> use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. >> >> Testing: gha >> >> Thanks, >> Thomas > > Still good. Thanks @kimbarrett @walulyai for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/29360#issuecomment-3836097517 From tschatzl at openjdk.org Mon Feb 2 16:06:37 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 2 Feb 2026 16:06:37 GMT Subject: Integrated: 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic In-Reply-To: References: Message-ID: On Thu, 22 Jan 2026 12:58:39 GMT, Thomas Schatzl wrote: > Hi all, > > use `Atomic` instead of `AtomicAccess` in `CardTableBarrierSet` and subclasses. Since this modifies `CardTableBarrierSet::_card_table` the change has some fan-out. > > Testing: gha > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 9871e2d3 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/9871e2d3f771ee2bc1b2473c0eb28a0bfc1c5456 Stats: 35 lines in 7 files changed: 6 ins; 0 del; 29 mod 8375535: G1: Convert CardTableBarrierSet and subclasses to use Atomic Reviewed-by: kbarrett, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/29360 From aph at openjdk.org Mon Feb 2 16:26:10 2026 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Feb 2026 16:26:10 GMT Subject: RFR: 8372942: AArch64: Set JVM flags for Neoverse V3AE core [v2] In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 22:22:50 GMT, Ruben wrote: >> For Neoverse N1, N2, N3, V1, V2 and V3, the following JVM flags are set: >> - UseSIMDForMemoryOps=true >> - OnSpinWaitInst=isb >> - OnSpinWaitInstCount=1 >> - AlwaysMergeDMB=false >> >> Additionally, for Neoverse V1, V2 and V3 only, these flags are set: >> - UseCryptoPmullForCRC32=true >> - CodeEntryAlignment=32 >> >> Enable the same flags for Neoverse V3AE. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Introduce `model_is_in` Great! Ship it. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28607#pullrequestreview-3740534141 From iwalulya at openjdk.org Mon Feb 2 17:10:28 2026 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 2 Feb 2026 17:10:28 GMT Subject: RFR: 8376195: Convert ThreadLocalAllocBuffer to use Atomic [v2] In-Reply-To: <2rAYqg4JIvntyUtk-qDU1oywjCA372LK1JyAZYQxTss=.12c4303f-cbcf-464e-83fd-edde06c83f30@github.com> References: <1TAUwWsHEcAIzMF35Q3v9xsDhsNV6ZhGZDCO9fh93KI=.f14438d0-8874-4259-b33d-83ad8b9bf2b3@github.com> <2rAYqg4JIvntyUtk-qDU1oywjCA372LK1JyAZYQxTss=.12c4303f-cbcf-464e-83fd-edde06c83f30@github.com> Message-ID: On Mon, 26 Jan 2026 11:00:10 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review the change to use `Atomic` in `ThreadLocalAllocBuffer`. 
>> >> Testing: gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with two additional commits since the last revision: > > - * kbarrett review > - * kbarrett review Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/29386#pullrequestreview-3740800467 From aph at openjdk.org Mon Feb 2 18:08:54 2026 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Feb 2026 18:08:54 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching [v26] In-Reply-To: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> References: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> Message-ID: <5ndj0gE-T75cTq9SIs6slsLOnumMzlXPWOFGk3KZvgE=.a4d9ede5-5e71-4da5-a8f3-d380e58f1a34@github.com> > In MacOS/AArch64 HotSpot, we have to deal with the fact that a thread must be in one of two modes: it either may write to code cache memory or it may execute (and read) code or data in it. A system call `pthread_jit_write_protect_np(int enabled)` changes from one to the other. > > Today, we change mode whenever making a transition from interpreter to VM. This means that we change mode a lot: experiments have shown that during `jshell` startup we change mode 4 million times. Other experiments have shown that we only needed to change mode 45 thousand times. > > This "eager" mode switching is perhaps too eager, and we'd be better off switching lazily. While the system call that changes mode is very fast, mode switching still amounts to about 100ms of startup time. Switching eagerly also means that some native calls (e.g. to do arithmetic) are disproportionately expensive, given that they have no need of mode switching at all. > > The approach in this PR is to defer transitioning from exec-but-don't-write mode (`WXExec`) to write-but-don't-exec mode (`WXWrite`) until we need to write. 
Instead of enabling `WXWrite` immediately, we switch to a mode called `WXArmedForWrite`. In this mode, when we need to write into code memory, we call `os_bsd_jit_exec_enabled(false)` to enable writing and then set the current mode to `WXWrite`.
>
> We mark all sites that we know will write to code memory with
> `MACOS_AARCH64_ONLY(os::thread_wx_enable_write());` Judicious placement of these markers, such as when entering patching code, means that we have a fairly small number of these.
>
> We also keep track (in thread-local storage) of the current state of `pthread_jit_write_protect_np` in order to avoid making the system call unnecessarily.
>
> It is possible that we have missed some sites where we do need to make a transition from write-protected to write-enabled. While we haven't seen any in testing, we have a fallback path. An attempt to write into code memory triggers a `SIGILL` signal. A signal handler detects this, and if the current mode is `WXArmedForWrite` it changes mode to write-enabled and returns. In addition, the handler "heals" the VM entry point so that next time the same point is entered (and for the rest of the lifetime of the VM) it will immediately transition to `WXWrite`.
>
> One other possibility remains: we could omit all of the `wx_enable_write` markers and use healing instead. We've experimented with this. It works well enough, but is rather crude, and it's better to be able to ...
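
The lazy switching scheme described above can be sketched as a small portable simulation. This is only an illustration under stated assumptions, not the code in the PR: the mode names `WXExec`, `WXArmedForWrite` and `WXWrite` come from the description, while `WXThreadState`, `set_hw_write`, `enter_vm`, `leave_vm` and `simulate_lazy` are hypothetical names, and a counter stands in for the real `pthread_jit_write_protect_np` call.

```cpp
#include <cassert>

// Mode names are taken from the PR description; everything else is hypothetical.
enum class WXMode { WXExec, WXArmedForWrite, WXWrite };

struct WXThreadState {
  WXMode mode = WXMode::WXExec;
  bool write_enabled = false;  // cached per-thread view of the protection state
  int syscalls = 0;            // stand-in for calls to the real mode-switch API

  // Stand-in for the real mode switch; skipped when the state would not change.
  void set_hw_write(bool enable) {
    if (write_enabled == enable) return;
    write_enabled = enable;
    ++syscalls;
  }

  // Entering the VM: arm for writing, but do not switch yet.
  void enter_vm() {
    if (mode == WXMode::WXExec) mode = WXMode::WXArmedForWrite;
  }

  // Marker placed at sites known to write into the code cache.
  void enable_write() {
    assert(mode != WXMode::WXExec && "must be inside the VM");
    set_hw_write(true);
    mode = WXMode::WXWrite;
  }

  // Returning to generated code: make the code cache executable again.
  void leave_vm() {
    set_hw_write(false);
    mode = WXMode::WXExec;
  }
};

// Runs `transitions` VM entries, of which the first `writers` patch code.
// Returns the number of simulated mode-switch calls.
inline int simulate_lazy(int transitions, int writers) {
  WXThreadState t;
  for (int i = 0; i < transitions; ++i) {
    t.enter_vm();
    if (i < writers) t.enable_write();
    t.leave_vm();
  }
  return t.syscalls;
}
```

In this model only the VM entries that actually write pay for a switch (two calls each: one to enable writing, one to restore execution), which is the effect the lazy scheme is after. The `SIGILL` fallback and the "healing" of entry points are not modelled here.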
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Back out 37730e6aac899e1fbdcf4f201ac2ae1013201432 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26562/files - new: https://git.openjdk.org/jdk/pull/26562/files/c6652628..05860429 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26562&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26562&range=24-25 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26562/head:pull/26562 PR: https://git.openjdk.org/jdk/pull/26562 From aph at openjdk.org Mon Feb 2 18:11:12 2026 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Feb 2026 18:11:12 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching [v23] In-Reply-To: References: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> Message-ID: On Fri, 30 Jan 2026 03:42:50 GMT, Dean Long wrote: >> So, I'll happily drop this one change. > > Yes, drop this change and I'll test it again. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26562#discussion_r2755622645 From missa at openjdk.org Mon Feb 2 18:40:37 2026 From: missa at openjdk.org (Mohamed Issa) Date: Mon, 2 Feb 2026 18:40:37 GMT Subject: RFR: 8371955: Support AVX10 floating point comparison instructions [v9] In-Reply-To: References: Message-ID: <71uI1BCZPJmBfUhtRMBcRREf63StolB9Ch0vhgPgZeU=.c3bbfca7-e9da-4caf-82c6-be28ef4f98fe@github.com> On Thu, 29 Jan 2026 09:02:03 GMT, Emanuel Peter wrote: > FYI: testing launched ? @eme64 Did tests pass? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28337#issuecomment-3836979281 From psandoz at openjdk.org Mon Feb 2 20:25:15 2026 From: psandoz at openjdk.org (Paul Sandoz) Date: Mon, 2 Feb 2026 20:25:15 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v6] In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 09:07:21 GMT, Jatin Bhateja wrote: >> As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. >> >> Patch add new lane type constants and pass them to vector intrinsic entry points. >> >> All existing Vector API jtreg test are passing with the patch. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comment resolution Very good. Approved, there is just one comment related to adding a comment for the LT_* values. Thank you for separating this out from the float16 PR. Needs a HotSpot reviewer too. We will run it through tier 1 to 3 testing. src/hotspot/share/prims/vectorSupport.hpp line 140: > 138: }; > 139: > 140: enum LaneType { Please add a comment referencing `LaneType` and that the values in this enum correspond to the LaneType ordinal values. ------------- Marked as reviewed by psandoz (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29481#pullrequestreview-3741431390 PR Review Comment: https://git.openjdk.org/jdk/pull/29481#discussion_r2755893774 From dholmes at openjdk.org Mon Feb 2 22:42:11 2026 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Feb 2026 22:42:11 GMT Subject: RFR: 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 11:26:01 GMT, Afshin Zafari wrote: >> An ASAN enabled build reported heap-buffer-overflow in `MethodHandles::is_basic_type_signature` with `ASAN_OPTIONS=strict_string_checks=true` when running test `jdk/jdk/jfr/api/metadata/annotations/TestThrottle.java` >> >> The code is here: >> >> bool MethodHandles::is_basic_type_signature(Symbol* sig) { >> assert(vmSymbols::object_signature()->utf8_length() == (int)OBJ_SIG_LEN, ""); >> assert(vmSymbols::object_signature()->equals(OBJ_SIG), ""); >> for (SignatureStream ss(sig, sig->starts_with(JVM_SIGNATURE_FUNC)); !ss.is_done(); ss.next()) { >> switch (ss.type()) { >> case T_OBJECT: >> // only java/lang/Object is valid here >> if (strncmp((char*) ss.raw_bytes(), OBJ_SIG, OBJ_SIG_LEN) != 0) >> >> The ASAN `strncmp` interceptor acts as follows: >> >> INTERCEPTOR(int, strncmp, const char *s1, const char *s2, size_t n) { >> void *ctx; >> ASAN_INTERCEPTOR_ENTER(linker, strncmp); // Sets up context >> ASAN_READ_RANGE(s1, n); // Validates s1 >> ASAN_READ_RANGE(s2, n); // Validates s2 >> return REAL(strncmp)(s1, s2, n); // Calls original function >> } >> >> With the test given `s1` is a buffer of size 15, containing a non-nul-terminated string, and `n` is 18, so `ASAN_READ_RANGE` fails for `s1` as we could potentially read beyond the end of the buffer. In practice however, given `s1` is guaranteed to be a valid type-string from a signature symbol of type `T_OBJECT`, its final character is `;` and the final character of `s2` is also `;` (it is the string constant `Ljava/lang/Object;`). 
Hence the comparison must terminate before we can run off the end of `s1`. >> >> To appease ASAN we can make a simple change to the `strncmp` call to compare at most `ss.raw_length()` bytes. >> >> Testing >> - ASAN no longer reports an error >> - tiers 1-3 sanity >> >> Thanks > > Thank you David for taking and fixing this. Thanks for the reviews @afshin-zafari and @jdksjolen ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/29516#issuecomment-3837656354 From dholmes at openjdk.org Mon Feb 2 22:42:13 2026 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Feb 2026 22:42:13 GMT Subject: Integrated: 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature In-Reply-To: References: Message-ID: <6hTVYTUP53ChQvL9GpxoDlnZ-WWyVgRa3CN-kt0hvDU=.1d5ef6c2-9c06-4b85-85b4-bcd3f98157f2@github.com> On Mon, 2 Feb 2026 01:13:35 GMT, David Holmes wrote: > An ASAN enabled build reported heap-buffer-overflow in `MethodHandles::is_basic_type_signature` with `ASAN_OPTIONS=strict_string_checks=true` when running test `jdk/jdk/jfr/api/metadata/annotations/TestThrottle.java` > > The code is here: > > bool MethodHandles::is_basic_type_signature(Symbol* sig) { > assert(vmSymbols::object_signature()->utf8_length() == (int)OBJ_SIG_LEN, ""); > assert(vmSymbols::object_signature()->equals(OBJ_SIG), ""); > for (SignatureStream ss(sig, sig->starts_with(JVM_SIGNATURE_FUNC)); !ss.is_done(); ss.next()) { > switch (ss.type()) { > case T_OBJECT: > // only java/lang/Object is valid here > if (strncmp((char*) ss.raw_bytes(), OBJ_SIG, OBJ_SIG_LEN) != 0) > > The ASAN `strncmp` interceptor acts as follows: > > INTERCEPTOR(int, strncmp, const char *s1, const char *s2, size_t n) { > void *ctx; > ASAN_INTERCEPTOR_ENTER(linker, strncmp); // Sets up context > ASAN_READ_RANGE(s1, n); // Validates s1 > ASAN_READ_RANGE(s2, n); // Validates s2 > return REAL(strncmp)(s1, s2, n); // Calls original function > } > > With the test given `s1` is a buffer of size 15, 
containing a non-nul-terminated string, and `n` is 18, so `ASAN_READ_RANGE` fails for `s1` as we could potentially read beyond the end of the buffer. In practice however, given `s1` is guaranteed to be a valid type-string from a signature symbol of type `T_OBJECT`, its final character is `;` and the final character of `s2` is also `;` (it is the string constant `Ljava/lang/Object;`). Hence the comparison must terminate before we can run off the end of `s1`. > > To appease ASAN we can make a simple change to the `strncmp` call to compare at most `ss.raw_length()` bytes. > > Testing > - ASAN no longer reports an error > - tiers 1-3 sanity > > Thanks This pull request has now been integrated. Changeset: 1cb4ef85 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/1cb4ef8581b5c5572474a5376baf4fd88c5ffeab Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8376855: ASAN reports out-of-range read in strncmp in MethodHandles::is_basic_type_signature Reviewed-by: azafari, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/29516 From xuelei at openjdk.org Mon Feb 2 23:37:55 2026 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Mon, 2 Feb 2026 23:37:55 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: <43jWfoF7waaehspCCA-pV-eWsXF5AGCKvjyiC2uguTU=.297fbe19-7cb1-49e9-9994-f4b8ffb1ef09@github.com> Message-ID: <64hZpvWXWK3cRG_gVpeRvkMT2f35dkwJqgO0ZfY4YHY=.fe44589f-bf65-4ad7-bb57-f02da7f6548e@github.com> On Mon, 2 Feb 2026 20:17:15 GMT, Ashutosh Mehra wrote: >> Yes. Please refer to test/hotspot/jtreg/resourcehogs/runtime/aot/LargeArchive.java, where the archive size is more than 2GB. Without this update, the test will fail. > > umm, I removed this change to `os.cpp` and ran `LargeArchive.java` test on x86-64 system and it passed. On which platform/OS did you see the failure? I was on MacOS. 
Here is the failure without the os.cpp update:

% make test TEST="test/hotspot/jtreg/resourcehogs/runtime/aot/LargeArchive.java" JTREG="JAVA_OPTIONS=-Dtest.archive.large.all.workflows=true"
...
[0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Reference$ReferenceHandler
[0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Finalizer$FinalizerThread
[0.039s][info][cds] Loading classes to share ...
[0.039s][info][cds] Parsing LargeArchive.classlist
[0.047s][info][aot] JVM_StartThread() ignored: jdk.internal.misc.InnocuousThread
[136.124s][info][cds] Parsing /Users/xuelei.fan/workspace/openjdk/jdk-xf.git/build/macosx-aarch64-server-release/images/jdk/lib/classlist (lambda form invokers only)
[136.126s][info][cds] Loading classes to share: done.
[136.127s][info][aot] Rewriting and linking classes ...
[136.245s][info][aot] Rewriting and linking classes: done
[136.245s][info][aot] Regenerate MethodHandle Holder classes...
[136.344s][info][aot] Regenerate MethodHandle Holder classes...done
[136.351s][info][cds] Dumping shared data to file: LargeArchive.static.jsa
[136.351s][info][cds] Gathering all archivable objects ...
[136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_F: dynamically generated
[136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_J: dynamically generated
[136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader$Source: used only when dumping CDS archive
[136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader: used only when dumping CDS archive
[136.417s][info][cds] Heap range = [0x00000003c0000000 - 0x00000004c0000000]
[136.430s][info][aot] Archived 7975 interned strings
[136.431s][info][cds] Gathering classes and symbols ...
[143.110s][info][cds] Sorting symbols ...
[143.113s][info][cds] Sorting classes ...
[149.873s][info][cds] Reserved output buffer space at 0x0000007000000000 [34359738368 bytes]
[149.900s][info][cds] Allocating RW objects ...
[149.954s][info][cds] done (46510 objects)
[149.954s][info][cds] Allocating RO objects ...
[151.033s][info][cds] done (142334 objects)
[151.033s][info][cds] Relocating embedded pointers in core regions ...
[152.400s][info][cds] Relocating 150345176 pointers, 0 tagged, 17461 nulled
[152.400s][info][aot] Make classes shareable
[152.558s][info][cds] Number of classes 4485
[152.558s][info][cds] instance classes = 4340, aot-linked = 0, inited = 0
[152.558s][info][cds] boot = 838, aot-linked = 0, inited = 0
[152.558s][info][cds] vm = 153, aot-linked = 0, inited = 0
[152.558s][info][cds] platform = 0, aot-linked = 0, inited = 0
[152.558s][info][cds] app = 3502, aot-linked = 0, inited = 0
[152.558s][info][cds] unregistered = 0, aot-linked = 0, inited = 0
[152.558s][info][cds] (enum) = 30, aot-linked = 0, inited = 0
[152.558s][info][cds] (hidden) = 8, aot-linked = 0, inited = 0
[152.558s][info][cds] (old) = 0, aot-linked = 0, inited = 0
[152.558s][info][cds] (unlinked) = 0, boot = 0, plat = 0, app = 0, unreg = 0
[152.558s][info][cds] obj array classes = 136
[152.558s][info][cds] type array classes = 9
[152.558s][info][cds] symbols = 93208
[153.627s][info][aot] sorting heap objects
[153.628s][info][aot] computed ranks
[153.629s][info][aot] sorting heap objects done
[153.635s][info][aot] Size of heap region = 1461632 bytes, 31877 objects, 13159 roots, 0 native ptrs
[153.642s][info][aot] oopmap = 4 ... 365408 ( 0% ... 100% = 99%)
[153.642s][info][aot] ptrmap = 35175 ... 142547 ( 19% ... 78% = 58%)
[153.642s][info][aot] Dumping symbol table ...
[153.652s][info][aot] Archived 0 method handle intrinsics (16 bytes)
[153.652s][info][aot] Adjust lambda proxy class dictionary
[153.652s][info][cds] Make training data shareable
[153.890s][info][cds] Shared file region (rw) 0: 305713152 bytes, addr 0x0000000800004000 file offset 0x00004000 crc 0xa7da7141
[154.232s][info][cds] Shared file region (ro) 1: 3269057552 bytes, addr 0x0000000812394000 file offset 0x12394000 crc 0xdce21a0b
[154.237s][error][cds] An error has occurred while writing the shared archive file.
[154.237s][error][cds] Unable to write to shared archive.
[154.237s][error][cds] Unable to seek to position 3574808575 (errno=9: Bad file descriptor)
[154.237s][info ][cds] An error has occurred while processing the shared archive file. Run with -Xlog:aot,cds for details.
[154.237s][info ][cds] unrecoverable error
Error occurred during initialization of VM
Unable to use shared archive. Unrecoverable archive loading error (run with -Xlog:aot,cds for details): unrecoverable error
]; stderr: []
exitValue = 1
java.lang.RuntimeException: Expected to get exit value of [0], exit value is: [1]
at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:549)
at jdk.test.lib.cds.CDSAppTester.executeAndCheck(CDSAppTester.java:219)
at jdk.test.lib.cds.CDSAppTester.dumpStaticArchive(CDSAppTester.java:319)
at jdk.test.lib.cds.CDSAppTester.runStaticWorkflow(CDSAppTester.java:470)
at jdk.test.lib.cds.SimpleCDSAppTester.runStaticWorkflow(SimpleCDSAppTester.java:196)
at LargeArchive.main(LargeArchive.java:77)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:565)
at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
at java.base/java.lang.Thread.run(Thread.java:1516)
JavaTest Message: Test threw exception: java.lang.RuntimeException: Expected to get exit value of [0], exit value is: [1]
JavaTest Message: shutting down
test ... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2756468548 From abakhtin at openjdk.org Tue Feb 3 00:41:00 2026 From: abakhtin at openjdk.org (Alexey Bakhtin) Date: Tue, 3 Feb 2026 00:41:00 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: References: Message-ID: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> On Sun, 1 Feb 2026 04:34:02 GMT, Xue-Lei Andrew Fan wrote: >> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. I noticed there are only 1-2 instances that use these "unchecked" apis. > >> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. I noticed there are only 1-2 instances that use these "unchecked" apis. > > Thanks for the suggestion. I looked into this and found that buffer_to_offset() and any_to_offset() serve a different purpose than the _u4 versions. The _u4 versions use scaled encoding (with MetadataOffsetShift) and return a compact u4 for metadata pointer storage. The raw versions return unscaled byte offsets stored in larger types. These usages cannot switch to _u4 versions because they need raw byte offsets (not scaled) and store them in 64-bit types. > > However, the comments for the methods may be misleading after introducing the _u4 methods. What do you think to revise the comment as: > > // The address p points to an object inside the output buffer. 
When the archive is mapped
> // at the requested address, what's the byte offset of this object from _requested_static_archive_bottom?
> uintx buffer_to_offset(address p) const;
>
> // Same as buffer_to_offset, except that the address p points to either (a) an object
> // inside the output buffer, or (b), an object in the currently mapped static archive.
> uintx any_to_offset(address p) const;
>
> // The reverse of buffer_to_offset_u4() - converts scaled offset units back to buffered address.
> address offset_to_buffered_address(u4 offset_units) const;
>
>
> I am also OK to rename the method names to: `buffer_to_offset_bytes()` and `any_to_offset_bytes()`, if the new names are clearer.
>
> @ashu-mehra What do you think?

Hi @XueleiFan, I've tried the suggested code with an archive larger than 4GB, but it fails with an assertion:

# Internal Error (aotMetaspace.cpp:1955), pid=96332, tid=4099
# guarantee(archive_space_size < max_encoding_range_size - class_space_alignment) failed: Archive too large

The CDS archive was created successfully:

[187.068s][info ][cds ] Shared file region (rw) 0: 822453584 bytes, addr 0x0000000800004000 file offset 0x00004000 crc 0x132b652e
[189.176s][info ][cds ] Shared file region (ro) 1: 3576115584 bytes, addr 0x0000000831060000 file offset 0x31060000 crc 0x71b020a2
[197.653s][info ][cds ] Shared file region (ac) 4: 0 bytes
[198.870s][info ][cds ] Shared file region (bm) 2: 56555664 bytes, addr 0x0000000000000000 file offset 0x1062d4000 crc 0xbd87f804
[199.504s][info ][cds ] Shared file region (hp) 3: 16091256 bytes, addr 0x00000000ff000000 file offset 0x1098c4000 crc 0x7834b7c3
[199.684s][debug ][cds ] bm space: 56555664 [ 1.3% of total] out of 56555664 bytes [100.0% used]
[199.684s][debug ][cds ] hp space: 16091256 [ 0.4% of total] out of 16091256 bytes [100.0% used] at 0x0000000c6d000000
[199.684s][debug ][cds ] total : 4471216088 [100.0% of total] out of 4471228536 bytes [100.0% used]

------------- PR Comment:
https://git.openjdk.org/jdk/pull/29494#issuecomment-3838062386 From jbhateja at openjdk.org Tue Feb 3 03:31:52 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Feb 2026 03:31:52 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v7] In-Reply-To: References: Message-ID: > As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. > > Patch add new lane type constants and pass them to vector intrinsic entry points. > > All existing Vector API jtreg test are passing with the patch. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolution ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29481/files - new: https://git.openjdk.org/jdk/pull/29481/files/23022d42..c1935efc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29481&range=05-06 Stats: 3 lines in 2 files changed: 1 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/29481.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29481/head:pull/29481 PR: https://git.openjdk.org/jdk/pull/29481 From jbhateja at openjdk.org Tue Feb 3 03:31:54 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Feb 2026 03:31:54 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v6] In-Reply-To: References: Message-ID: <1g1hwUyCoVEwQmSnil3tnLEbyNDXAUGkfPSz3R8lNAg=.ca6498cb-acec-4b00-9b38-a01e720046df@github.com> On Mon, 2 Feb 2026 20:22:46 GMT, Paul Sandoz wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comment resolution 
> > Very good. Approved, there is just one comment related to adding a comment for the LT_* values. Thank you for separating this out from the float16 PR. Needs a HotSpot reviewer too. We will run it through tier 1 to 3 testing. Thanks @PaulSandoz , @merykitty please let me know if this is good to land. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29481#issuecomment-3838835040 From iklam at openjdk.org Tue Feb 3 04:01:02 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 04:01:02 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: Message-ID: <2HWRyOkAnKfSNQEOxjsezqs0Hgx-2w0PltNil27r86o=.7d4b31ee-c06e-4c54-9eb2-b46103e2a69d@github.com> On Sun, 1 Feb 2026 03:47:08 GMT, Xue-Lei Andrew Fan wrote: >> src/hotspot/share/cds/aotMetaspace.cpp line 2102: >> >>> 2100: unmap_archive(mapinfo); >>> 2101: return MAP_ARCHIVE_OTHER_FAILURE; >>> 2102: } >> >> Since `ArchiveUtils::OFFSET_SHIFT` is a constant for this JVM build, there's no need to save it into the archive and validate the saved value at runtime. We don't perform such checks for other constants. >> >> The archive contains the VM version string, so it cannot be used by a different JVM build. > > Make sense to me. Updated. > > Is it OK to keep the CURRENT_CDS_ARCHIVE_VERSION stay as 19 in src/hotspot/share/include/cds.h? > > - #define CURRENT_CDS_ARCHIVE_VERSION 19 > + #define CURRENT_CDS_ARCHIVE_VERSION 20 Since the header has not been changed, I think we should leave the version number unchanged. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2757071704 From iklam at openjdk.org Tue Feb 3 04:13:01 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 04:13:01 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: <64hZpvWXWK3cRG_gVpeRvkMT2f35dkwJqgO0ZfY4YHY=.fe44589f-bf65-4ad7-bb57-f02da7f6548e@github.com> References: <43jWfoF7waaehspCCA-pV-eWsXF5AGCKvjyiC2uguTU=.297fbe19-7cb1-49e9-9994-f4b8ffb1ef09@github.com> <64hZpvWXWK3cRG_gVpeRvkMT2f35dkwJqgO0ZfY4YHY=.fe44589f-bf65-4ad7-bb57-f02da7f6548e@github.com> Message-ID: On Mon, 2 Feb 2026 23:35:13 GMT, Xue-Lei Andrew Fan wrote: >> umm, I removed this change to `os.cpp` and ran `LargeArchive.java` test on x86-64 system and it passed. On which platform/OS did you see the failure? > > I was on MacOS. Here is the failure without the os.cpp update: > > % make test TEST="test/hotspot/jtreg/resourcehogs/runtime/aot/LargeArchive.java" JTREG="JAVA_OPTIONS=-Dtest.archive.large.all.workflows=true" > ... > [0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Reference$ReferenceHandler > [0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Finalizer$FinalizerThread > [0.039s][info][cds] Loading classes to share ... > [0.039s][info][cds] Parsing LargeArchive.classlist > [0.047s][info][aot] JVM_StartThread() ignored: jdk.internal.misc.InnocuousThread > [136.124s][info][cds] Parsing /Users/xuelei.fan/workspace/openjdk/jdk-xf.git/build/macosx-aarch64-server-release/images/jdk/lib/classlist (lambda form invokers only) > [136.126s][info][cds] Loading classes to share: done. > [136.127s][info][aot] Rewriting and linking classes ... > [136.245s][info][aot] Rewriting and linking classes: done > [136.245s][info][aot] Regenerate MethodHandle Holder classes... 
> [136.344s][info][aot] Regenerate MethodHandle Holder classes...done > [136.351s][info][cds] Dumping shared data to file: LargeArchive.static.jsa > [136.351s][info][cds] Gathering all archivable objects ... > [136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_F: dynamically generated > [136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_J: dynamically generated > [136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader$Source: used only when dumping CDS archive > [136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader: used only when dumping CDS archive > [136.417s][info][cds] Heap range = [0x00000003c0000000 - 0x00000004c0000000] > [136.430s][info][aot] Archived 7975 interned strings > [136.431s][info][cds] Gathering classes and symbols ... > [143.110s][info][cds] Sorting symbols ... > [143.113s][info][cds] Sorting classes ... > [149.873s][info][cds] Reserved output buffer space at 0x0000007000000000 [34359738368 bytes] > [149.900s][info][cds] Allocating RW objects ... > [149.954s][info][cds] done (46510 objects) > [149.954s][info][cds] Allocating RO objects ... > [151.033s][info][cds] done (142334 objects) > [151.033s][info][cds] Relocating embedded pointers in core regions ... > [152.400s][info][cds] Relocating 150345176 pointers, 0 tagged, 17461 nulled > [152.400s][info][aot] Make classes shareable > [152.558s][info][cds] Number of classes 4485 > [152.558s][info][cds] instance classes = 4340, aot-linked = 0, inited = 0 > [152.558s][info][cds] ... According to https://gitlab.haskell.org/ghc/ghc/-/issues/17414 > File reads/writes bigger than 2GB result in an "Invalid argument" exception on macOS. Files bigger than 2GB still work, but individual read/write operations bigger than 2GB fail. I think it's better to move this fix into `os::pd_write()` (within `#ifdef __APPLE__`) to limit the writes to less than 2GB. 
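
A minimal sketch of the capped-write idea suggested above, assuming a POSIX-style write call that returns the number of bytes written or a negative value on error. It is not the HotSpot implementation of `os::pd_write()`: the names `WriteFn` and `write_in_chunks` are hypothetical, and the underlying write is passed in as a callback so the loop can be exercised without touching a real file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the platform write call: returns the number of
// bytes written, or a negative value on error.
using WriteFn = std::function<long(const char* buf, std::size_t len)>;

// Writes `len` bytes, never asking the underlying call for more than
// `max_chunk` bytes at a time. Returns true only if everything was written.
inline bool write_in_chunks(const WriteFn& raw_write, const char* buf,
                            std::size_t len, std::size_t max_chunk) {
  assert(max_chunk > 0);
  while (len > 0) {
    std::size_t request = std::min(len, max_chunk);
    long n = raw_write(buf, request);
    if (n <= 0) {
      return false;  // report the failure to the caller instead of carrying on
    }
    buf += n;  // short writes are fine: continue from where the call stopped
    len -= static_cast<std::size_t>(n);
  }
  return true;
}
```

With a cap such as `INT_MAX`, every individual request stays below 2GB while the file as a whole can still grow past that size. Returning `false` on the first failed call leaves it to the caller to stop and report the error.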
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2757096105 From asmehra at openjdk.org Tue Feb 3 04:28:04 2026 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Feb 2026 04:28:04 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: <43jWfoF7waaehspCCA-pV-eWsXF5AGCKvjyiC2uguTU=.297fbe19-7cb1-49e9-9994-f4b8ffb1ef09@github.com> <64hZpvWXWK3cRG_gVpeRvkMT2f35dkwJqgO0ZfY4YHY=.fe44589f-bf65-4ad7-bb57-f02da7f6548e@github.com> Message-ID: On Tue, 3 Feb 2026 04:10:41 GMT, Ioi Lam wrote: >> I was on MacOS. Here is the failure without the os.cpp update: >> >> % make test TEST="test/hotspot/jtreg/resourcehogs/runtime/aot/LargeArchive.java" JTREG="JAVA_OPTIONS=-Dtest.archive.large.all.workflows=true" >> ... >> [0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Reference$ReferenceHandler >> [0.018s][info][aot] JVM_StartThread() ignored: java.lang.ref.Finalizer$FinalizerThread >> [0.039s][info][cds] Loading classes to share ... >> [0.039s][info][cds] Parsing LargeArchive.classlist >> [0.047s][info][aot] JVM_StartThread() ignored: jdk.internal.misc.InnocuousThread >> [136.124s][info][cds] Parsing /Users/xuelei.fan/workspace/openjdk/jdk-xf.git/build/macosx-aarch64-server-release/images/jdk/lib/classlist (lambda form invokers only) >> [136.126s][info][cds] Loading classes to share: done. >> [136.127s][info][aot] Rewriting and linking classes ... >> [136.245s][info][aot] Rewriting and linking classes: done >> [136.245s][info][aot] Regenerate MethodHandle Holder classes... >> [136.344s][info][aot] Regenerate MethodHandle Holder classes...done >> [136.351s][info][cds] Dumping shared data to file: LargeArchive.static.jsa >> [136.351s][info][cds] Gathering all archivable objects ... 
>> [136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_F: dynamically generated >> [136.371s][info][cds] Skipping java/lang/invoke/BoundMethodHandle$Species_J: dynamically generated >> [136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader$Source: used only when dumping CDS archive >> [136.413s][info][cds] Skipping jdk/internal/misc/CDS$UnregisteredClassLoader: used only when dumping CDS archive >> [136.417s][info][cds] Heap range = [0x00000003c0000000 - 0x00000004c0000000] >> [136.430s][info][aot] Archived 7975 interned strings >> [136.431s][info][cds] Gathering classes and symbols ... >> [143.110s][info][cds] Sorting symbols ... >> [143.113s][info][cds] Sorting classes ... >> [149.873s][info][cds] Reserved output buffer space at 0x0000007000000000 [34359738368 bytes] >> [149.900s][info][cds] Allocating RW objects ... >> [149.954s][info][cds] done (46510 objects) >> [149.954s][info][cds] Allocating RO objects ... >> [151.033s][info][cds] done (142334 objects) >> [151.033s][info][cds] Relocating embedded pointers in core regions ... >> [152.400s][info][cds] Relocating 150345176 pointers, 0 tagged, 17461 nulled >> [152.400s][info][aot] Make classes shareable >> [152.558s][info][cds] Number of classes 4485 >> [152.558s][info][cds] instance class... > > According to https://gitlab.haskell.org/ghc/ghc/-/issues/17414 > >> File reads/writes bigger than 2GB result in an "Invalid argument" exception on macOS. Files bigger than 2GB still work, but individual read/write operations bigger than 2GB fail. > > I think it's better to move this fix into `os::pd_write()` (within `#ifdef __APPLE__`) to limit the writes to less than 2GB. @iklam thanks for digging that up. It explains why the INT_MAX limit worked. But I should also mention that the above output does not show the actual reason for the failure. Here the `os::write` failed causing the `fd` to be closed in `FileMapInfo::write_bytes`. 
However, the error is not propagated up the call chain and we end up calling `FileMapInfo::seek_to_position` which throws EBADF (Bad file descriptor). So while we can keep the change in `os::write` (or `os::pd_write` as suggested) I think we should also fix `FileMapInfo::write_bytes` to 1) print the os error, and 2) terminate the write operation gracefully. I am also fine if this is done in a follow-up pr. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2757134702 From kbarrett at openjdk.org Tue Feb 3 06:15:36 2026 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 3 Feb 2026 06:15:36 GMT Subject: RFR: 8332189: Enable -Wzero-as-null-pointer-constant for gcc/clang Message-ID: Please review this change which enables `-Wzero-as-null-pointer-constant` warnings in HotSpot code when building with gcc or clang. There are three parts to this change. The first part augments the warning flags setup to support adding warning options that are only applied to HotSpot, rather than the JDK as a whole. There was previously some unused and possibly incomplete support for this when using gcc. Note that the Windows/Visual Studio support hasn't been tested much, and I think might not be working yet. I'm going to investigate that further in followup work. The second part enables `-Wzero-as-null-pointer-constant` for HotSpot code. This follows the guidance to avoid such in the HotSpot Style Guide. The third part removes a note in the HotSpot Style Guide about lingering uses of literal 0 as a null pointer constant. Those have been removed, and this change will block backsliding. Testing: mach5 tier1, GHA Sanity tests Integration of this change needs to wait for JDK-8376758. 
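
For readers less familiar with the warning, here is a small illustration of what `-Wzero-as-null-pointer-constant` is about. It is not taken from the patch; `Node` and `list_length` are made-up names, and the flagged forms appear only in comments so that the snippet itself compiles cleanly with the warning enabled.

```cpp
struct Node {
  Node* next;
  int value;
};

// With -Wzero-as-null-pointer-constant, gcc and clang warn when a literal 0
// is used where a null pointer is meant, for example:
//   Node* p = 0;
//   if (p == 0) { ... }
// The function below uses nullptr for pointers and keeps 0 for integers.
inline int list_length(const Node* head) {
  int n = 0;  // integer zero: not a pointer, so no warning
  for (const Node* p = head; p != nullptr; p = p->next) {
    ++n;
  }
  return n;
}
```

Compiling with `g++ -Wzero-as-null-pointer-constant` (or the same flag under clang) is one way to try the warning outside the HotSpot build.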
------------- Commit messages: - remove obsolete note from style guide - enable -Wzero-as-null-pointer-constant for VM with gcc/clang - support hotspot-specific warnings Changes: https://git.openjdk.org/jdk/pull/29497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8332189 Stats: 40 lines in 3 files changed: 14 ins; 13 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/29497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29497/head:pull/29497 PR: https://git.openjdk.org/jdk/pull/29497 From dholmes at openjdk.org Tue Feb 3 06:34:02 2026 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Feb 2026 06:34:02 GMT Subject: RFR: 8332189: Enable -Wzero-as-null-pointer-constant for gcc/clang In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 00:16:54 GMT, Kim Barrett wrote: > Please review this change which enables `-Wzero-as-null-pointer-constant` > warnings in HotSpot code when building with gcc or clang. > > There are three parts to this change. > > The first part augments the warning flags setup to support adding warning > options that are only applied to HotSpot, rather than the JDK as a whole. > There was previously some unused and possibly incomplete support for this when > using gcc. Note that the Windows/Visual Studio support hasn't been tested > much, and I think might not be working yet. I'm going to investigate that > further in followup work. > > The second part enables `-Wzero-as-null-pointer-constant` for HotSpot code. > This follows the guidance to avoid such in the HotSpot Style Guide. > > The third part removes a note in the HotSpot Style Guide about lingering uses > of literal 0 as a null pointer constant. Those have been removed, and this > change will block backsliding. > > Testing: mach5 tier1, GHA Sanity tests > > Integration of this change needs to wait for JDK-8376758. Looks reasonable to me. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29497#pullrequestreview-3743249921 From iklam at openjdk.org Tue Feb 3 06:40:02 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 06:40:02 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: References: Message-ID: <2wZEIEuVyQR2YTbWlib002hxcA5VGuGbPgijtBNqE7k=.d43836be-5ace-4445-9a84-986f31d45f9b@github.com> On Sun, 1 Feb 2026 04:34:02 GMT, Xue-Lei Andrew Fan wrote: >> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. I noticed there are only 1-2 instances that use these "unchecked" apis. > >> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. I noticed there are only 1-2 instances that use these "unchecked" apis. > > Thanks for the suggestion. I looked into this and found that buffer_to_offset() and any_to_offset() serve a different purpose than the _u4 versions. The _u4 versions use scaled encoding (with MetadataOffsetShift) and return a compact u4 for metadata pointer storage. The raw versions return unscaled byte offsets stored in larger types. These usages cannot switch to _u4 versions because they need raw byte offsets (not scaled) and store them in 64-bit types. > > However, the comments for the methods may be misleading after introducing the _u4 methods. What do you think to revise the comment as: > > // The address p points to an object inside the output buffer. 
When the archive is mapped
> // at the requested address, what's the byte offset of this object from _requested_static_archive_bottom?
> uintx buffer_to_offset(address p) const;
>
> // Same as buffer_to_offset, except that the address p points to either (a) an object
> // inside the output buffer, or (b), an object in the currently mapped static archive.
> uintx any_to_offset(address p) const;
>
> // The reverse of buffer_to_offset_u4() - converts scaled offset units back to buffered address.
> address offset_to_buffered_address(u4 offset_units) const;
>
>
> I am also OK with renaming the methods to `buffer_to_offset_bytes()` and `any_to_offset_bytes()`, if the new names are clearer.
>
> @ashu-mehra What do you think?

> Hi @XueleiFan,
>
> I've tried the suggested code with an archive size of more than 4 GB, but it fails with an assertion:
>
> ```
> # Internal Error (aotMetaspace.cpp:1955), pid=96332, tid=4099
> # guarantee(archive_space_size < max_encoding_range_size - class_space_alignment) failed: Archive too large
> ```
>
> The CDS archive was created successfully:
>
> ```
> [187.068s][info ][cds ] Shared file region (rw) 0: 822453584 bytes, addr 0x0000000800004000 file offset 0x00004000 crc 0x132b652e
> [189.176s][info ][cds ] Shared file region (ro) 1: 3576115584 bytes, addr 0x0000000831060000 file offset 0x31060000 crc 0x71b020a2
> [197.653s][info ][cds ] Shared file region (ac) 4: 0 bytes
> [198.870s][info ][cds ] Shared file region (bm) 2: 56555664 bytes, addr 0x0000000000000000 file offset 0x1062d4000 crc 0xbd87f804
> [199.504s][info ][cds ] Shared file region (hp) 3: 16091256 bytes, addr 0x00000000ff000000 file offset 0x1098c4000 crc 0x7834b7c3
> [199.684s][debug ][cds ] bm space: 56555664 [ 1.3% of total] out of 56555664 bytes [100.0% used]
> [199.684s][debug ][cds ] hp space: 16091256 [ 0.4% of total] out of 16091256 bytes [100.0% used] at 0x0000000c6d000000
> [199.684s][debug ][cds ] total : 4471216088 [100.0% of total] out of 4471228536 bytes [100.0% used]
> ``` I think we need to make `ArchiveUtils::MaxMetadataOffsetBytes` around 3.5 GB, since all AOT metadata are mapped into the compressed klass space, whose max size is 4GB. We want to leave some headroom for loading new classes in the production run. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839353960 From dholmes at openjdk.org Tue Feb 3 06:56:04 2026 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Feb 2026 06:56:04 GMT Subject: RFR: 8376568: Change Thread::getStackTrace to use handshake op for all cases [v3] In-Reply-To: References: <6WdkzWF-d6yGLKVUP9pCiYE1ghOdL5sTlcBiA1bE4c0=.802606b6-f958-4dea-a6a7-3d8a406c177c@github.com> Message-ID: On Fri, 30 Jan 2026 08:07:44 GMT, Alan Bateman wrote: >> Still not clear to me why any new thread is not already filtered out long before now; nor why we have not needed this in the past. > > We want ThreadSnapshot.of(Thread) to accept a Thread in any state. Existing behavior is to return null for platform threads that are not alive. For virtual threads it will return a snapshot so we want to change that. The ThreadNotAlive test in the PR allows us to test these cases as they are hard to demonstrate with the thread dump. > > ThreadSnapshot.of(Thread) does not filter out the "not alive" cases. It could, in which case ThreadSnapshotFactory::get_thread_snapshot would need to assert if called with a new/unstarted thread. The terminating thread case would still need to be handled by ThreadSnapshotFactory::get_thread_snapshot. For platform threads there is no JavaThread so it bails easy. For virtual threads it needs to examine the state. Would you prefer if ThreadSnapshot.of(Thread) pre-checked the state so that get_thread_snapshot could be guaranteed to never see a new/unstarted thread? > > Update: I changed ThreadSnapshot.of(Thread) to filter before calling get_thread_snapshot, hopefully this will be easier to understand. 
I was assuming/expecting that the top-level code in `ThreadDumper` would filter out not-alive threads the same way `Thread.getStackTrace` does. You don't want lower-level code to have to worry about NEW threads, though of course they still have to deal with races against termination. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29461#discussion_r2757501486 From rsunderbabu at openjdk.org Tue Feb 3 07:06:15 2026 From: rsunderbabu at openjdk.org (Ramkumar Sunderbabu) Date: Tue, 3 Feb 2026 07:06:15 GMT Subject: RFR: 8375443: AVX-512: Disabling through UseSHA doesn't affect UseSHA3Intrinsics [v4] In-Reply-To: References: Message-ID: > UseSHA flag is not respected while enabling/disabling UseSHA3Intrinsics flag in x86 builds. > Added UseSHA in the mix. > > Testing: Only Basic testing done. I will run more compiler related testing. Ramkumar Sunderbabu has updated the pull request incrementally with two additional commits since the last revision: - add test for unsupported platform - simpler requires condition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29266/files - new: https://git.openjdk.org/jdk/pull/29266/files/84acd692..a09cb5ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29266&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29266&range=02-03 Stats: 153 lines in 2 files changed: 150 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/29266.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29266/head:pull/29266 PR: https://git.openjdk.org/jdk/pull/29266 From stuefe at openjdk.org Tue Feb 3 07:08:05 2026 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 3 Feb 2026 07:08:05 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> Message-ID: 
On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan wrote:

>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>>
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>>
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>>
>>
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 * 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>>
>> Current: address = base + offset_bytes (max ~2GB)
>> Proposed: address = base + (offset_units << 3) (max 32GB)
>>
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero; we're wasting them!
>>
>> Current byte offset (aligned to 8 bytes):
>> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                   ^^^ Always 000!
>>
>> Scaled offset (shift=3):
>> 0x00000200 = Same address, but stored in 29 bits instead of 32
>> Frees up 3 bits -> 8x larger range!
>>
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>>
>> **Test**
>> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>>
>> Archive:
>> - 300000 simple classes
>> - 2000 mega-classes
>> - 5000 FieldObject classes
>> - Total: 307000 classes
>>
>> AOT Cache:
>> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>
>> Static CDS:
>> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
>
>   add hotspot_resourcehogs_no_cds test group

This issue definitely needs more discussion.

How would this work with compressed class pointers and a limited encoding space of 4G? Note that we are in the process of removing the uncompressed Klass pointer mode: see https://bugs.openjdk.org/browse/JDK-8363996, https://bugs.openjdk.org/browse/JDK-8372065 and https://github.com/openjdk/jdk/pull/28366. See also the preceding discussions. In the future, we plan to make compact object headers the default.

The current limit gives us 4GB of encoding space; that is enough (with -UseCompressedKlassPointers) for roughly 5-6 million classes, possibly more. What scenario would require more classes than that?

@ping rkennke

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839483028

From duke at openjdk.org Tue Feb 3 07:10:02 2026
From: duke at openjdk.org (Shawn M Emery)
Date: Tue, 3 Feb 2026 07:10:02 GMT
Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel()
In-Reply-To:
References:
Message-ID: <1ueZt1yRnN71yJlDZ1jsOpXgGkp4bzOxNpWjbdiXx6I=.f58e8738-db1a-40bb-8d8a-bee26d7547fe@github.com>

On Wed, 21 Jan 2026 08:32:59 GMT, Guanqiang Han wrote:

> Please review this change. Thanks!
> > **Description:** > > VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). > > **Fix:** > > Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. > > **Test:** > > GHA Nice work! Just a couple of suggestions/comments. src/hotspot/cpu/x86/vm_version_x86.cpp line 1141: > 1139: FLAG_SET_DEFAULT(UseAESIntrinsics, false); > 1140: if (UseAESCTRIntrinsics && !FLAG_IS_DEFAULT(UseAESCTRIntrinsics)) { > 1141: warning("AES_CTR intrinsics require UseAES flag to be enabled. Intrinsics will be disabled."); I propose the following changes: OLD "Intrinsics will be disabled." NEW "AES_CTR intrinsics will be disabled." test/hotspot/jtreg/compiler/cpuflags/TestUseAESCTRIntrinsicsWithUseAESDisabled.java line 28: > 26: * @bug 8374516 > 27: * @summary Regression test for -XX:+UseAESCTRIntrinsics -XX:-UseAES crash > 28: * @requires os.arch=="amd64" | os.arch=="x86_64" These are the only two architectures that exhibit this bug? I was able to reproduce the problem with this test case on my x86_64 desktop and confirmed that the fix did indeed resolve the problem. 
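The flag dependency described in the fix can be sketched as follows. This is only an illustration under stated assumptions: `AesFlags` and `enforce_aes_dependency` are hypothetical stand-ins rather than the HotSpot flag macros, and the warning text uses the wording proposed in the review comment above.

```cpp
#include <cstdio>

// Hypothetical stand-in for the relevant VM flags; not the HotSpot flag API.
struct AesFlags {
  bool use_aes;
  bool use_aes_ctr_intrinsics;
  bool ctr_set_on_command_line;  // models !FLAG_IS_DEFAULT(UseAESCTRIntrinsics)
};

// Enforces the dependency: AES-CTR intrinsics must not stay enabled when
// UseAES is off. Returns true if a warning was printed.
inline bool enforce_aes_dependency(AesFlags& f) {
  bool warned = false;
  if (!f.use_aes && f.use_aes_ctr_intrinsics) {
    if (f.ctr_set_on_command_line) {
      // Warn only when the user asked for the intrinsics explicitly.
      std::fprintf(stderr,
                   "warning: AES_CTR intrinsics require UseAES flag to be enabled. "
                   "AES_CTR intrinsics will be disabled.\n");
      warned = true;
    }
    f.use_aes_ctr_intrinsics = false;
  }
  return warned;
}
```

Normalizing the flags before any stub generation is what keeps a later `assert(UseAES)` from being reached with an inconsistent combination.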
-------------

PR Review: https://git.openjdk.org/jdk/pull/29338#pullrequestreview-3743400759
PR Review Comment: https://git.openjdk.org/jdk/pull/29338#discussion_r2757536259
PR Review Comment: https://git.openjdk.org/jdk/pull/29338#discussion_r2757539553

From duke at openjdk.org Tue Feb 3 07:10:04 2026
From: duke at openjdk.org (Shawn M Emery)
Date: Tue, 3 Feb 2026 07:10:04 GMT
Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel()
In-Reply-To:
References:
Message-ID:

On Wed, 28 Jan 2026 11:06:09 GMT, Guanqiang Han wrote:

>> Please review this change. Thanks!
>>
>> **Description:**
>>
>> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel().
>>
>> **Fix:**
>>
>> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states.
>>
>> **Test:**
>>
>> GHA
>
> Hi @vnkozlov and @ascarpino, sorry for the ping; could you please take a look at this PR when you have a moment?

Hi @hgqxjj, I will take a look at the changes later today.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/29338#issuecomment-3837219100

From stuefe at openjdk.org Tue Feb 3 07:23:08 2026
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Tue, 3 Feb 2026 07:23:08 GMT
Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3]
In-Reply-To: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com>
References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com>
Message-ID:

On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan wrote:

>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>>
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>>
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>>
>>
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 * 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>>
>> Current: address = base + offset_bytes (max ~2GB)
>> Proposed: address = base + (offset_units << 3) (max 32GB)
>>
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero; we're wasting them!
>>
>> Current byte offset (aligned to 8 bytes):
>> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                   ^^^ Always 000!
>>
>> Scaled offset (shift=3):
>> 0x00000200 = Same address, but stored in 29 bits instead of 32
>> Frees up 3 bits -> 8x larger range!
>>
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>>
>> **Test**
>> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>>
>> Archive:
>> - 300000 simple classes
>> - 2000 mega-classes
>> - 5000 FieldObject classes
>> - Total: 307000 classes
>>
>> AOT Cache:
>> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>
>> Static CDS:
>> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
>
>   add hotspot_resourcehogs_no_cds test group

Looking at the issue more closely, and at the provided example: the classes seem to be both numerous and monstrous. How realistic is this scenario? Such objects would pose other challenges too, e.g. to GC.

We can, and should, certainly make the dividing line between CDS and class space more fluid to allow for a larger CDS at the cost of the class space. As @iklam wrote, 3.5 GB, possibly even more, can be done.
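For reference, the scaled-offset scheme quoted in the PR description above can be sketched as follows. This is a minimal illustration, not the code from the PR: the shift of 3 and the 8-byte alignment come from the quoted description, while the names `kOffsetShift`, `encode_offset` and `decode_offset` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Shift value taken from the quoted description (OFFSET_SHIFT = 3);
// the constant and function names here are illustrative only.
constexpr unsigned kOffsetShift = 3;
constexpr uint64_t kAlignment = uint64_t{1} << kOffsetShift;                  // 8 bytes
constexpr uint64_t kMaxEncodableBytes = (uint64_t{1} << 32) << kOffsetShift;  // 32 GB

// Converts an 8-byte-aligned byte offset into scaled units that fit in 32 bits.
inline uint32_t encode_offset(uint64_t offset_bytes) {
  assert(offset_bytes % kAlignment == 0);     // low 3 bits must be zero
  assert(offset_bytes < kMaxEncodableBytes);  // must fit after scaling
  return static_cast<uint32_t>(offset_bytes >> kOffsetShift);
}

// Converts scaled units back into a byte offset from the archive base.
inline uint64_t decode_offset(uint32_t offset_units) {
  return static_cast<uint64_t>(offset_units) << kOffsetShift;
}
```

The sketch only shows why a 32-bit field can address 32 GB of archive. It says nothing about the compressed class space limit raised in this thread, which constrains where the metadata can be mapped regardless of how offsets are stored.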
------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839553187 From iklam at openjdk.org Tue Feb 3 07:38:04 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 07:38:04 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 20:34:29 GMT, Ashutosh Mehra wrote: > > These usages cannot switch to _u4 versions because they need raw byte offsets (not scaled) and store them in 64-bit types. > > I am not sure why we can't store the scaled offsets in such cases. Are data structures not aligned properly that prevents from storing as scaled offsets. Its true they are stored in 64-bit types but that doesn't prevent scaling the offsets. IMO I would rather have a single API to compute offsets, otherwise we will end up with a system that has two types of offsets and it would be confusing when to use which. @iklam what do you think? I tried switching everything to the encoded offsets, but the changes are quite extensive. Most tests passed but serviceability/sa/ClhsdbCDSCore.java is still failing. Here's my patch: https://github.com/openjdk/jdk/commit/3f6dea9963bba05ca2f22abfe02199fa7767f82d I think this should be done in a follow-up RFE. In this PR, I think we should update the APIs so it's more obvious which "offset" we are talking about: - byte offsets should be called "raw offset". 
- the "u4 offset" should be called "encoded offset" So we'd have - `ArchiveUtils::encoded_offset_to_archived_address()` - `ArchiveBuilder::buffer_to_raw_offset()` - `ArchiveBuilder::any_to_encoded_offset()` - etc Eventually, I want to move the encoding logic to its own class (patterned after `CompressedKlassPointers`): https://github.com/openjdk/jdk/commit/8d5b3d5e684381005f1631e1577af2f716c4be9c ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839602927 From iklam at openjdk.org Tue Feb 3 07:47:01 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 07:47:01 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> Message-ID: On Tue, 3 Feb 2026 07:05:08 GMT, Thomas Stuefe wrote: > The current limit gives us 4GB of encoding space; that is enough (with -UseCompressedKlassPointers) for roughly 5-6 million classes, possibly more. What scenario would require more classes than that? The problem is that CDS stuffs all data (not just Klasses) into the ro/rw regions, which are mapped into the compressed class space. If we want to support millions of classes, we would need to split the classes out into its own region, and map only that into the CCS. In any case, supporting very large set of classes is not our priority. I think it's OK to make small tweaks to allow more classes, but we won't have time for more drastic changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839637819 From stuefe at openjdk.org Tue Feb 3 08:00:06 2026 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 3 Feb 2026 08:00:06 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> Message-ID: On Tue, 3 Feb 2026 07:44:09 GMT, Ioi Lam wrote: > > The current limit gives us 4GB of encoding space; that is enough (with -UseCompressedKlassPointers) for roughly 5-6 million classes, possibly more. What scenario would require more classes than that? > > The problem is that CDS stuffs all data (not just Klasses) into the ro/rw regions, which are mapped into the compressed class space. If we want to support millions of classes, we would need to split the classes out into its own region, and map only that into the CCS. Ah, right. This is an issue. It would be very nice to solve that. How complex would it be? > > In any case, supporting very large set of classes is not our priority. I think it's OK to make small tweaks to allow more classes, but we won't have time for more drastic changes. I tend to agree. The implications of increasing the nK encoding size make me apprehensive. I am all for tweaking the dividing line, though. Similar to how we did for https://bugs.openjdk.org/browse/JDK-8332514, just into the other direction. That should be simple enough. In fact, I would have assumed that already works. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839684485 From iklam at openjdk.org Tue Feb 3 08:00:08 2026 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Feb 2026 08:00:08 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: Message-ID: On Mon, 2 Feb 2026 21:58:58 GMT, Xue-Lei Andrew Fan wrote: >> Sounds good to me then. 
>>
>> Could you add the following to `test/hotspot/jtreg/TEST.groups`:
>>
>>
>> hotspot_resourcehogs_no_cds = \
>>   :hotspot_resourcehogs \
>>   - resourcehogs/runtime/aot
>
> updated. Thank you!

I tried running the test on my Intel Core i7-14700 box and it took more than 10 minutes on a fastdebug build.

I am not sure if we actually need this test case -- archiving a lot of classes doesn't increase coverage in any meaningful way.

We need a test case for the `Out of memory in the CDS archive` error in archiveUtils.cpp, but that's not tested in this test. If we really want to do it, it's much easier to add a new diagnostic switch to limit the size of the buffer, so you can get into the out of memory condition quickly.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2757715162

From iklam at openjdk.org Tue Feb 3 08:14:03 2026
From: iklam at openjdk.org (Ioi Lam)
Date: Tue, 3 Feb 2026 08:14:03 GMT
Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3]
In-Reply-To:
References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com>
Message-ID:

On Tue, 3 Feb 2026 07:55:47 GMT, Thomas Stuefe wrote:

> > > The current limit gives us 4GB of encoding space; that is enough (with -UseCompressedKlassPointers) for roughly 5-6 million classes, possibly more. What scenario would require more classes than that?
> >
> >
> > The problem is that CDS stuffs all data (not just Klasses) into the ro/rw regions, which are mapped into the compressed class space. If we want to support millions of classes, we would need to split the classes out into its own region, and map only that into the CCS.
>
> Ah, right. This is an issue. It would be very nice to solve that. How complex would it be?

We would add a new "class" region and reserve 4GB for it. Then the ro/rw regions would need to be moved to above the 4GB boundary, and they can grow to 32GB.
We are also thinking of mapping the code cache (which will hold both AOT and JIT compiled methods) at a fixed offset to the metadata, so we can map it at the 32 GB boundary, growing upwards. In any case, that's not a problem I want to solve in the near future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839752816 From erfang at openjdk.org Tue Feb 3 08:28:47 2026 From: erfang at openjdk.org (Eric Fang) Date: Tue, 3 Feb 2026 08:28:47 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction [v3] In-Reply-To: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> Message-ID: > When optimizing some VectorMask related APIs , we found an optimization opportunity related to the `cpy (immediate, zeroing)` instruction [1]. Implementing the functionality of this instruction using `cpy (immediate, merging)` instruction [2] leads to better performance. > > Currently the `cpy (imm, zeroing)` instruction is used in code generated by `VectorStoreMaskNode` and `VectorReinterpretNode`. Doing this optimization benefits all vector APIs that generate these two IRs potentially, such as `VectorMask.intoArray()` and `VectorMask.toLong()`. > > Microbenchmarks show this change brings performance uplift ranging from **11%** to **33%**, depending on the specific operation and data types. > > The specific changes in this PR: > 1. Achieve the functionality of the `cpy (imm, zeroing)` instruction with the `movi + cpy (imm, merging)` instructions in assembler: > > cpy z17.d, p1/z, #1 => > > movi v17.2d, #0 // this instruction is zero cost > cpy z17.d, p1/m, #1 > > > 2. Add a new option `PreferSVEMergingModeCPY` to indicate whether to apply this optimization or not. > - This option belongs to the Arch product category. 
> - The default value is true on Neoverse-V1/V2 where the improvement has been confirmed, false on others.
> - When its value is true, the change is applied.
>
> 3. Add a jtreg test to verify the behavior of this option.
>
> This PR was tested on aarch64 and x86 machines with different configurations, and all tests passed.
>
> JMH benchmarks:
>
> On an Nvidia Grace (Neoverse-V2) machine with 128-bit SVE2:
>
> Benchmark           Unit       size     Before     Error      After     Error  Uplift
> byteIndexInRange    ops/ms     7.00  471816.15   1125.96  473237.77   1593.92    1.00
> byteIndexInRange    ops/ms   256.00  149654.21    416.57  149259.95    116.59    1.00
> byteIndexInRange    ops/ms   259.00  177850.31    991.13  179785.19   1110.07    1.01
> byteIndexInRange    ops/ms   512.00  133393.26    167.26  133484.61    281.83    1.00
> doubleIndexInRange  ops/ms     7.00  302176.39   12848.8  299813.02     37.76    0.99
> doubleIndexInRange  ops/ms   256.00   47831.93     56.70   46708.70     56.11    0.98
> doubleIndexInRange  ops/ms   259.00   11550.02     27.95   15333.50     10.40    1.33
> doubleIndexInRange  ops/ms   512.00   23687.76     61.65   23996.08     69.52    1.01
> floatIndexInRange   ops/ms     7.00  412195.79    124.71  411770.23     78.73    1.00
> floatIndexInRange   ops/ms   256.00   84479.98     70.69   84237.31     70.15    1.00
> floatIndexInRange   ops/ms   259.00   22585.65     80.07   28296.21      7.98    1.25
> floatIndexInRange   ops/ms   512.00   46902.99     51.60   46686.68     66.01    1.00
> intIndexInRange     ops/ms     7.00  413411.70     50.59  420684.66    253.55    1.02
> intIndexInRange ops/...

Eric Fang has updated the pull request incrementally with one additional commit since the last revision: Refine the code comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29359/files - new: https://git.openjdk.org/jdk/pull/29359/files/884a11f2..94d6d144 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29359&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29359&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29359.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29359/head:pull/29359 PR: https://git.openjdk.org/jdk/pull/29359 From alanb at openjdk.org Tue Feb 3 08:31:13 2026 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 3 Feb 2026 08:31:13 GMT Subject: RFR: 8376568: Change Thread::getStackTrace to use handshake op for all cases [v3] In-Reply-To: References: <6WdkzWF-d6yGLKVUP9pCiYE1ghOdL5sTlcBiA1bE4c0=.802606b6-f958-4dea-a6a7-3d8a406c177c@github.com> Message-ID: On Tue, 3 Feb 2026 06:53:36 GMT, David Holmes wrote: >> We want ThreadSnapshot.of(Thread) to accept a Thread in any state. Existing behavior is to return null for platform threads that are not alive. For virtual threads it will return a snapshot so we want to change that. The ThreadNotAlive test in the PR allows us to test these cases as they are hard to demonstrate with the thread dump. >> >> ThreadSnapshot.of(Thread) does not filter out the "not alive" cases. It could, in which case ThreadSnapshotFactory::get_thread_snapshot would need to assert if called with a new/unstarted thread. The terminating thread case would still need to be handled by ThreadSnapshotFactory::get_thread_snapshot. For platform threads there is no JavaThread so it bails easy. For virtual threads it needs to examine the state. Would you prefer if ThreadSnapshot.of(Thread) pre-checked the state so that get_thread_snapshot could be guaranteed to never see a new/unstarted thread? 
>> >> Update: I changed ThreadSnapshot.of(Thread) to filter before calling get_thread_snapshot, hopefully this will be easier to understand. > > I was assuming/expecting that the top-level code in `ThreadDumper` would filter out not-alive threads the same way `Thread.getStackTrace` does. You don't want lower-level code to have to worry about NEW threads, though of course they still have to deal with races against termination. The proposal is that ThreadSnapshot.of(Thread) can be called with any platform or virtual Thread in any state. With the update, it eagerly tests with isAlive so will filter NEW and already TERMINATED threads. If/when we change Thread.getStackTrace to use ThreadSnapshot then the isAlive check can be dropped from Thread.getStackTrace. The underlying implementation in get_thread_snapshot does not need to deal with the NEW state. There is no need for ThreadDumper to do any additional filtering. The thread streams that it consumes filter Threads that are not alive. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29461#discussion_r2757854181 From erfang at openjdk.org Tue Feb 3 08:33:21 2026 From: erfang at openjdk.org (Eric Fang) Date: Tue, 3 Feb 2026 08:33:21 GMT Subject: RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction In-Reply-To: References: <_0ouKSVAIyzg0g9hA2jZXNH-_cCqJjNCSh7kM2dn80w=.b93145c3-c465-423a-ab68-c8d7bd7e4280@github.com> <_qJ_Qo_Mqexx7dYu0Vkc9ru4SxZ0izfqifaUIAL1iyQ=.741b11d4-a89d-495e-8d31-78fed690abf6@github.com> Message-ID: <6tSti7c6Y4rC4XCtDImv0BO1WYfUB0QG8Ytu-CqhbZ4=.015d979c-4865-4fec-81fa-0bd80e2bf14d@github.com> On Wed, 28 Jan 2026 10:17:30 GMT, Andrew Haley wrote: >> @fg1417 thanks for your help, this is really helpful! >> >> You've also noticed slight regression in a few cases, which is reasonable. The optimization effect is influenced by multiple factors, such as the alignment you mentioned on N2, as well as code generation and register allocation. 
The underlying principle of this optimization is that the latency of the `cpy(imm, zeroing)` instruction seems quite high, while the `movi + cpy(imm, merging)` combination improves the parallelism of the program. In some cases, a `mov` or other instruction with the same effect is already generated before the `cpy(imm, zeroing)` instruction, thus achieving the optimization effect of the `movi + cpy(imm, merging)` instruction combination. Therefore, the slight regression caused by the extra `movi` instruction in these cases is reasonable. However, for cases where this optimization applies, the performance improvement will be more significant. For example, in the following case, I even saw a **2x** performance improvement on Neoverse-V2. >> >> @Param({"128"}) >> private int loop_iteration; >> private static final VectorSpecies ispecies = VectorSpecies.ofLargestShape(int.class); >> private boolean[] mask_arr; >> >> @Setup(Level.Trial) >> public void BmSetup() { >> int array_size = loop_iteration * bspecies.length(); >> mask_arr = new boolean[array_size]; >> Random r = new Random(); >> for (int i = 0; i < array_size; i++) { >> mask_arr[i] = r.nextBoolean(); >> } >> } >> >> @CompilerControl(CompilerControl.Mode.INLINE) >> private long testIndexInRangeToLongKernel(VectorSpecies species) { >> long sum = 0; >> VectorMask m = VectorMask.fromArray(species, mask_arr, 0); >> for (int i = 0; i < loop_iteration; i++) { >> sum += m.indexInRange(i & (m.length() - 1), m.length()).toLong(); >> } >> return sum; >> } >> >> @Benchmark >> public long indexInRangeToLongInt() { >> return testIndexInRangeToLongKernel(ispecies); >> } >> >> >> Therefore, when you test this change using the C case, you will see a significant performance improvement. >>> I see 2% uplift on these numbers. >> >> @theRealAph And I think this also explains your question on these numbers. 
>> >>> One thing you can do is add a flag to control this minor optimization, but make it constexpr bool = true until we know what other SVE implementations might do. >> In general: >> Dea... > >> Therefore, when you test this change using the C case, you will see a significant performance improvement. >> >> > I see 2% uplift on these numbers. >> >> @theRealAph And I think this also explains your question on these numbers. > > Not at all. > > The performance claim above was: > >> Microbenchmarks show this change brings performance uplift ranging from 11% to 33%, depending on the specific operation and data types. > > But the real performance uplift, as measured in Java microbenchmarks, is 2%. Hi @theRealAph, I've updated the code comments based on your suggestions. Thank you for your patient review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/29359#issuecomment-3839845869 From dlong at openjdk.org Tue Feb 3 08:36:33 2026 From: dlong at openjdk.org (Dean Long) Date: Tue, 3 Feb 2026 08:36:33 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching [v26] In-Reply-To: <5ndj0gE-T75cTq9SIs6slsLOnumMzlXPWOFGk3KZvgE=.a4d9ede5-5e71-4da5-a8f3-d380e58f1a34@github.com> References: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> <5ndj0gE-T75cTq9SIs6slsLOnumMzlXPWOFGk3KZvgE=.a4d9ede5-5e71-4da5-a8f3-d380e58f1a34@github.com> Message-ID: On Mon, 2 Feb 2026 18:08:54 GMT, Andrew Haley wrote: >> In MacOS/AArch64 HotSpot, we have to deal with the fact that a thread must be in one of two modes: it either may write to code cache memory or it may execute (and read) code or data in it. A system call `pthread_jit_write_protect_np(int enabled)` changes from one to the other. >> >> Today, we change mode whenever making a transition from interpreter to VM. This means that we change mode a lot: experiments have shown that during `jshell` startup we change mode 4 million times. 
Other experiments have shown that we only needed to change mode 45 thousand times. >> >> This "eager" mode switching is perhaps too eager, and we'd be better off switching lazily. While the system call that changes mode is very fast, mode switching still amounts to about 100ms of startup time. Switching eagerly also means that some native calls (e.g. to do arithmetic) are disproportionately expensive, given that they have no need of mode switching at all. >> >> The approach in this PR is to defer transitioning from exec-but-don't-write mode (`WXExec`) to write-but-don't-exec mode (`WXWrite`) until we need to write. Instead of enabling `WXWrite` immediately, we switch to a mode called `WXArmedForWrite`. When in this mode, when we need to write into code memory we call `os_bsd_jit_exec_enabled(false)` to enable writing and then set the current mode to `WXWrite`. >> >> We mark all sites that we know will write to code memory with >> `MACOS_AARCH64_ONLY(os::thread_wx_enable_write());` Judicious placement of these markers, such as when entering patching code, means that we have a fairly small number of these. >> >> We also keep track (in thread-local storage) of the current state of `pthread_jit_write_protect_np` in order to avoid making the system call unnecessarily. >> >> It is possible that we have missed some sites where we do need to make a transition from write-protected to -enabled. While we haven't seen any in testing, we have a fallback path. An attempt to write into code memory triggers a `SIGILL` signal. A signal handler detects this, and if the current mode `WXArmedForWrite` it changes mode to write-enabled and returns. In addition, the handler "heals" the VM entry point so that next time the same point is entered (and for the rest of the lifetime of the VM) it will immediately transition to `WXWrite`. >> >> One other possibility remains: we could omit all of the `wx_enable_write` markers and use healing instead. We've experimented with this. 
It works well enough, but is rather crude... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Back out 37730e6aac899e1fbdcf4f201ac2ae1013201432 Testing passed. Let's ship it! Looks like you need 1 more review. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26562#pullrequestreview-3743825703 From duke at openjdk.org Tue Feb 3 09:01:19 2026 From: duke at openjdk.org (Harshit470250) Date: Tue, 3 Feb 2026 09:01:19 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v6] In-Reply-To: References: Message-ID: > This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. 
Harshit470250 has updated the pull request incrementally with four additional commits since the last revision: - remove import of barrierSetC2 from runtime - make shenandoah types private - move _clone_type_Type to BarrierSetC2 - remove whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27279/files - new: https://git.openjdk.org/jdk/pull/27279/files/6a02f24e..a4302cba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27279&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27279&range=04-05 Stats: 29 lines in 6 files changed: 14 ins; 10 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/27279.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27279/head:pull/27279 PR: https://git.openjdk.org/jdk/pull/27279 From sspitsyn at openjdk.org Tue Feb 3 09:20:55 2026 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 3 Feb 2026 09:20:55 GMT Subject: RFR: 8373367: interp-only mechanism fails to work for carrier threads in a corner case [v3] In-Reply-To: References: <4kL5ukI7hOKtKX0zkyc6K_7RMq3v1t_fJdvdwvmXfsw=.60ebbe1d-0133-4bff-953c-db953eed86db@github.com> Message-ID: <5reQqFFmFv9fMfLlNkBbrK_M7KwSrhsJ7N8IR4Fl-zs=.ecaf046d-23f1-42b9-9fcf-99226efa5826@github.com> On Mon, 2 Feb 2026 02:06:24 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: moved and extended comment in JvmtiThreadState ctor > > src/hotspot/share/prims/jvmtiThreadState.cpp line 61: > >> 59: >> 60: // The _thread field is a link to the JavaThread associated with JvmtiThreadState. >> 61: // The _thread_saved field is used for carrier threads only when a virtual thread, > > Suggestion: > > // The _thread_saved field is used for carrier threads only when a virtual thread Good catches, thanks! Fixed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29436#discussion_r2758063991 From sspitsyn at openjdk.org Tue Feb 3 09:29:06 2026 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 3 Feb 2026 09:29:06 GMT Subject: RFR: 8373367: interp-only mechanism fails to work for carrier threads in a corner case [v4] In-Reply-To: <4kL5ukI7hOKtKX0zkyc6K_7RMq3v1t_fJdvdwvmXfsw=.60ebbe1d-0133-4bff-953c-db953eed86db@github.com> References: <4kL5ukI7hOKtKX0zkyc6K_7RMq3v1t_fJdvdwvmXfsw=.60ebbe1d-0133-4bff-953c-db953eed86db@github.com> Message-ID: > The `interp-only` mechanism is based on the `JavaThread` objects. Carrier and virtual threads can temporary share the same `JavaThread`. The `java_thread->jvmti_thread_state()` is re-linked to a virtual thread at `mount` and to the carrier thread at `unmount`. The `JvmtiThreadState` has a back link to the `JavaThread` which is also set for virtual thread at a `mount` and carrier thread at an `unmount`. Just one of these two links at the same time is set to the `JavaThread`, the other one has to be set to `nullptr`. The `interp-only` mechanism needs this invariant. > However, there is a corner case when this invariant is broken. It happens when the `JvmtiThreadState` for carrier thread has just been created. In such case, the link to `JavaThread` is always `non-nullptr` even though a virtual thread is currently mounted on a carrier thread. This simple update fixes the issue in the `JvmtiThreadState` ctor. 
> > Testing: > - TBD: Mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: fixed minor typos in newly added comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29436/files - new: https://git.openjdk.org/jdk/pull/29436/files/e5735668..65784f93 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29436&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29436&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/29436.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29436/head:pull/29436 PR: https://git.openjdk.org/jdk/pull/29436 From sspitsyn at openjdk.org Tue Feb 3 09:29:08 2026 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 3 Feb 2026 09:29:08 GMT Subject: RFR: 8373367: interp-only mechanism fails to work for carrier threads in a corner case [v3] In-Reply-To: References: <4kL5ukI7hOKtKX0zkyc6K_7RMq3v1t_fJdvdwvmXfsw=.60ebbe1d-0133-4bff-953c-db953eed86db@github.com> Message-ID: On Mon, 2 Feb 2026 02:11:35 GMT, David Holmes wrote: > I appreciate the expanded comments but I still don't fully understand what _thread and _saved_thread point to at different times. The lifecycle of these fields really needs to be clearly described somewhere. I had a plan to get rid of the `_saved_thread` filed. I can try to do that in this PR to make it simpler. Removing this field can also simplify the description of the `_thread` field lifecycle. Please, let me try to do this first. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/29436#issuecomment-3840113575

From jsjolen at openjdk.org Tue Feb 3 09:55:58 2026
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 3 Feb 2026 09:55:58 GMT
Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3]
In-Reply-To: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com>
References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com>
Message-ID:

On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan wrote:

>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>>
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>>
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>>
>>
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 * 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>>
>> Current: address = base + offset_bytes (max ~2GB)
>> Proposed: address = base + (offset_units << 3) (max 32GB)
>>
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero; we're wasting them!
>>
>> Current byte offset (aligned to 8 bytes):
>> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                   ^^^ Always 000!
>>
>> Scaled offset (shift=3):
>> 0x00000200 = Same address, but stored in 29 bits instead of 32
>> Frees up 3 bits -> 8x larger range!
>>
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>>
>> **Test**
>> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>>
>> Archive:
>> - 300000 simple classes
>> - 2000 mega-classes
>> - 5000 FieldObject classes
>> - Total: 307000 classes
>>
>> AOT Cache:
>> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>
>> Static CDS:
>> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
>
> add hotspot_resourcehogs_no_cds test group

This is a drive-by comment. You are going to have to track whether a value is "offset-scaled" or "raw", and using `enum class`es can remove the risk of not catching a mistake.

```c++
enum class archive_offset : uintx {};

template <typename T> T static offset_to_archived_address(archive_offset offset_units) {
  assert((uintx)offset_units != 0, "sanity");
  uintx offset_bytes = ((uintx)offset_units) << MetadataOffsetShift;
  T p = (T)(SharedBaseAddress + offset_bytes);
  assert(Metaspace::in_aot_cache(p), "must be");
  return p;
}
```

This might be overkill, but I thought it prudent to float the idea.
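
The suggestion can also be tried outside HotSpot. What follows is a minimal standalone sketch, not code from the PR: the names `ScaledOffset`, `ByteOffset`, `to_scaled`, `to_bytes` and `kOffsetShift` are hypothetical, and only the 8-byte unit (a shift of 3) is taken from the quoted description.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; only the shift of 3 comes from the PR text.
constexpr unsigned kOffsetShift = 3;  // one unit == 8 bytes

// Two distinct scoped enum types: a scaled value cannot be passed where
// raw bytes are expected (or vice versa) without an explicit cast.
enum class ScaledOffset : std::uint32_t {};
enum class ByteOffset : std::uint64_t {};

// Encode an 8-byte-aligned byte offset as a 32-bit count of 8-byte units.
inline ScaledOffset to_scaled(ByteOffset bytes) {
  const std::uint64_t raw = static_cast<std::uint64_t>(bytes);
  assert((raw & ((std::uint64_t{1} << kOffsetShift) - 1)) == 0 && "must be 8-byte aligned");
  assert((raw >> kOffsetShift) <= UINT32_MAX && "offset does not fit in 32 bits of units");
  return static_cast<ScaledOffset>(raw >> kOffsetShift);
}

// Decode a count of 8-byte units back into a byte offset.
inline ByteOffset to_bytes(ScaledOffset units) {
  return static_cast<ByteOffset>(static_cast<std::uint64_t>(units) << kOffsetShift);
}
```

With these types, handing a `ByteOffset` to a function that takes a `ScaledOffset` fails to compile, which is the class of mistake the comment above is concerned with. As in the quoted example, `to_scaled` maps byte offset 0x1000 to 0x200 units.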
------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3840272170 From jsjolen at openjdk.org Tue Feb 3 09:56:02 2026 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 3 Feb 2026 09:56:02 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: References: Message-ID: On Sun, 1 Feb 2026 03:59:08 GMT, Xue-Lei Andrew Fan wrote: >> src/hotspot/share/cds/filemap.cpp line 723: >> >>> 721: void FileMapInfo::seek_to_position(size_t pos) { >>> 722: if (os::lseek(_fd, (jlong)pos, SEEK_SET) < 0) { >>> 723: aot_log_error(aot)("Unable to seek to position %zu (errno=%d: %s)", pos, errno, os::strerror(errno)); >> >> This change seems to be unrelated to this patch. Can it be done in a different patch? > > The "(long)pos" to "(jlong)pos" is related, I think. I run into AOT/CDS testing issues without this update. @ashu-mehra Are you OK to keep it part of the pull request? > static jlong lseek(int fd, jlong offset, int whence); It was previously wrong, so this does technically fix a separate bug. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2758211145 From adinn at openjdk.org Tue Feb 3 09:58:46 2026 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 3 Feb 2026 09:58:46 GMT Subject: RFR: 8372617: Save and restore stubgen stubs when using an AOT code cache [v9] In-Reply-To: References: Message-ID: > This PR adds save and restore of all generated stubs to the AOT code cache on x86 and aarch64. Other arches are modified to deal with the related generic API changes. > > Small changes were required to the aarch64 and x86_64 generator code in order to meet two key constraints: > 1. the first declared entry of every stub starts at the first instruction in the stub code range > 2. 
all data/code cross-references from one stub to another target a declared stub entry Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: configure low heap size to exercise more stub code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28433/files - new: https://git.openjdk.org/jdk/pull/28433/files/1c4b3f43..16e6d2a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28433&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28433&range=07-08 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28433.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28433/head:pull/28433 PR: https://git.openjdk.org/jdk/pull/28433 From adinn at openjdk.org Tue Feb 3 10:36:05 2026 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 3 Feb 2026 10:36:05 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching [v26] In-Reply-To: <5ndj0gE-T75cTq9SIs6slsLOnumMzlXPWOFGk3KZvgE=.a4d9ede5-5e71-4da5-a8f3-d380e58f1a34@github.com> References: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> <5ndj0gE-T75cTq9SIs6slsLOnumMzlXPWOFGk3KZvgE=.a4d9ede5-5e71-4da5-a8f3-d380e58f1a34@github.com> Message-ID: On Mon, 2 Feb 2026 18:08:54 GMT, Andrew Haley wrote: >> In MacOS/AArch64 HotSpot, we have to deal with the fact that a thread must be in one of two modes: it either may write to code cache memory or it may execute (and read) code or data in it. A system call `pthread_jit_write_protect_np(int enabled)` changes from one to the other. >> >> Today, we change mode whenever making a transition from interpreter to VM. This means that we change mode a lot: experiments have shown that during `jshell` startup we change mode 4 million times. Other experiments have shown that we only needed to change mode 45 thousand times. >> >> This "eager" mode switching is perhaps too eager, and we'd be better off switching lazily. 
While the system call that changes mode is very fast, mode switching still amounts to about 100ms of startup time. Switching eagerly also means that some native calls (e.g. to do arithmetic) are disproportionately expensive, given that they have no need of mode switching at all. >> >> The approach in this PR is to defer transitioning from exec-but-don't-write mode (`WXExec`) to write-but-don't-exec mode (`WXWrite`) until we need to write. Instead of enabling `WXWrite` immediately, we switch to a mode called `WXArmedForWrite`. When in this mode, when we need to write into code memory we call `os_bsd_jit_exec_enabled(false)` to enable writing and then set the current mode to `WXWrite`. >> >> We mark all sites that we know will write to code memory with >> `MACOS_AARCH64_ONLY(os::thread_wx_enable_write());` Judicious placement of these markers, such as when entering patching code, means that we have a fairly small number of these. >> >> We also keep track (in thread-local storage) of the current state of `pthread_jit_write_protect_np` in order to avoid making the system call unnecessarily. >> >> It is possible that we have missed some sites where we do need to make a transition from write-protected to -enabled. While we haven't seen any in testing, we have a fallback path. An attempt to write into code memory triggers a `SIGILL` signal. A signal handler detects this, and if the current mode `WXArmedForWrite` it changes mode to write-enabled and returns. In addition, the handler "heals" the VM entry point so that next time the same point is entered (and for the rest of the lifetime of the VM) it will immediately transition to `WXWrite`. >> >> One other possibility remains: we could omit all of the `wx_enable_write` markers and use healing instead. We've experimented with this. It works well enough, but is rather crude... 
> > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Back out 37730e6aac899e1fbdcf4f201ac2ae1013201432 Looks good I've been following this and am happy to do the honours. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26562#pullrequestreview-3744469329 PR Comment: https://git.openjdk.org/jdk/pull/26562#issuecomment-3840491586 From azafari at openjdk.org Tue Feb 3 10:49:32 2026 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 3 Feb 2026 10:49:32 GMT Subject: RFR: 8332189: Enable -Wzero-as-null-pointer-constant for gcc/clang In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 00:16:54 GMT, Kim Barrett wrote: > Please review this change which enables `-Wzero-as-null-pointer-constant` > warnings in HotSpot code when building with gcc or clang. > > There are three parts to this change. > > The first part augments the warning flags setup to support adding warning > options that are only applied to HotSpot, rather than the JDK as a whole. > There was previously some unused and possibly incomplete support for this when > using gcc. Note that the Windows/Visual Studio support hasn't been tested > much, and I think might not be working yet. I'm going to investigate that > further in followup work. > > The second part enables `-Wzero-as-null-pointer-constant` for HotSpot code. > This follows the guidance to avoid such in the HotSpot Style Guide. > > The third part removes a note in the HotSpot Style Guide about lingering uses > of literal 0 as a null pointer constant. Those have been removed, and this > change will block backsliding. > > Testing: mach5 tier1, GHA Sanity tests > > Integration of this change needs to wait for JDK-8376758. Thanks for this, Kim! Copyright year is to be updated, IMHO. 
------------- PR Review: https://git.openjdk.org/jdk/pull/29497#pullrequestreview-3744538574 From dbriemann at openjdk.org Tue Feb 3 10:52:17 2026 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 3 Feb 2026 10:52:17 GMT Subject: RFR: 8376113: PPC64: Implement special MachNodes for floating point Min / Max [v3] In-Reply-To: References: Message-ID: > Add mach nodes MinF, MaxF, MinD, MaxD for PPC. David Briemann has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - add match_rule_supported check for MinF, MaxF, MinD, MaxD and PPC >= 9 - 8376113: PPC64: Implement special MachNodes for floating point Min / Max ------------- Changes: https://git.openjdk.org/jdk/pull/29361/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29361&range=02 Stats: 63 lines in 3 files changed: 63 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/29361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29361/head:pull/29361 PR: https://git.openjdk.org/jdk/pull/29361 From rsunderbabu at openjdk.org Tue Feb 3 10:54:33 2026 From: rsunderbabu at openjdk.org (Ramkumar Sunderbabu) Date: Tue, 3 Feb 2026 10:54:33 GMT Subject: RFR: 8375443: AVX-512: Disabling through UseSHA doesn't affect UseSHA3Intrinsics [v5] In-Reply-To: References: Message-ID: > UseSHA flag is not respected while enabling/disabling UseSHA3Intrinsics flag in x86 builds. > Added UseSHA in the mix. 
Ramkumar Sunderbabu has updated the pull request incrementally with one additional commit since the last revision: fixed whitespace issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29266/files - new: https://git.openjdk.org/jdk/pull/29266/files/a09cb5ad..e61d58eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29266&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29266&range=03-04 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/29266.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29266/head:pull/29266 PR: https://git.openjdk.org/jdk/pull/29266 From dholmes at openjdk.org Tue Feb 3 11:07:51 2026 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Feb 2026 11:07:51 GMT Subject: RFR: 8376568: Change Thread::getStackTrace to use handshake op for all cases [v3] In-Reply-To: References: <6WdkzWF-d6yGLKVUP9pCiYE1ghOdL5sTlcBiA1bE4c0=.802606b6-f958-4dea-a6a7-3d8a406c177c@github.com> Message-ID: On Tue, 3 Feb 2026 08:28:41 GMT, Alan Bateman wrote: > There is no need for ThreadDumper to do any additional filtering. The thread streams that it consumes filter out Threads that are not alive. Okay that is not at all obvious. But in that case you are never passing a NEW thread to `ThreadSnapshot.of()`. But if that is to be the primary API for getting stacktraces then having it filter on isAlive is reasonable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29461#discussion_r2758520184 From dbriemann at openjdk.org Tue Feb 3 11:24:13 2026 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 3 Feb 2026 11:24:13 GMT Subject: RFR: 8376113: PPC64: Implement special MachNodes for floating point Min / Max [v4] In-Reply-To: References: Message-ID: <4urTG34Sbor6QUhUFuiJgcuZ6jEwDWnnqM32JUG6KKY=.7361d419-1b0d-47bb-95bb-aa6042c34a7d@github.com> > Add mach nodes MinF, MaxF, MinD, MaxD for PPC. 
David Briemann has updated the pull request incrementally with one additional commit since the last revision: add missing format strings, enable IR matching for >= PPC9 in TestMinMaxIdentity.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29361/files - new: https://git.openjdk.org/jdk/pull/29361/files/c4417170..4621017d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29361&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29361&range=02-03 Stats: 9 lines in 3 files changed: 7 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/29361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29361/head:pull/29361 PR: https://git.openjdk.org/jdk/pull/29361 From ghan at openjdk.org Tue Feb 3 12:00:09 2026 From: ghan at openjdk.org (Guanqiang Han) Date: Tue, 3 Feb 2026 12:00:09 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() In-Reply-To: <1ueZt1yRnN71yJlDZ1jsOpXgGkp4bzOxNpWjbdiXx6I=.f58e8738-db1a-40bb-8d8a-bee26d7547fe@github.com> References: <1ueZt1yRnN71yJlDZ1jsOpXgGkp4bzOxNpWjbdiXx6I=.f58e8738-db1a-40bb-8d8a-bee26d7547fe@github.com> Message-ID: On Tue, 3 Feb 2026 07:04:28 GMT, Shawn M Emery wrote: >> Please review this change. Thanks! >> >> **Description:** >> >> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). >> >> **Fix:** >> >> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. 
When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. >> >> **Test:** >> >> GHA > > test/hotspot/jtreg/compiler/cpuflags/TestUseAESCTRIntrinsicsWithUseAESDisabled.java line 28: > >> 26: * @bug 8374516 >> 27: * @summary Regression test for -XX:+UseAESCTRIntrinsics -XX:-UseAES crash >> 28: * @requires os.arch=="amd64" | os.arch=="x86_64" > > These are the only two architectures that exhibit this bug? > I was able to reproduce the problem with this test case on my x86_64 desktop and confirmed that the fix did indeed resolve the problem. > All AES Java and hotspot regression tests have also passed. Hi @smemery , thanks for the review. Before making this change, I did a quick check across other architectures and found that some architectures don't support this flag, and some already handle it correctly. Only the x86 architecture wasn't handling it correctly. So I only tested on those two architectures. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29338#discussion_r2758718178 From ghan at openjdk.org Tue Feb 3 12:04:50 2026 From: ghan at openjdk.org (Guanqiang Han) Date: Tue, 3 Feb 2026 12:04:50 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v2] In-Reply-To: References: Message-ID: > Please review this change. Thanks! > > **Description:** > > VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). 
> > **Fix:** > > Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. > > **Test:** > > GHA Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: optimize warning log ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29338/files - new: https://git.openjdk.org/jdk/pull/29338/files/6e7035dd..7eb2b386 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29338&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29338&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29338.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29338/head:pull/29338 PR: https://git.openjdk.org/jdk/pull/29338 From kfarrell at openjdk.org Tue Feb 3 12:05:42 2026 From: kfarrell at openjdk.org (Kieran Farrell) Date: Tue, 3 Feb 2026 12:05:42 GMT Subject: RFR: 8359706: Add file descriptor count to VM.info [v8] In-Reply-To: References: Message-ID: On Thu, 29 Jan 2026 09:40:53 GMT, Thomas Stuefe wrote: >> Kieran Farrell has updated the pull request incrementally with one additional commit since the last revision: >> >> minor updates > > src/hotspot/os/linux/os_linux.cpp line 5412: > >> 5410: timed_out = true; >> 5411: break; >> 5412: } > > Can you please do a little manual test like this: > > if (fds > some number) sleep(Timeout * 2); > > and check if the timeout works? 
the above returns `Open File Descriptors: > (number)` as expected ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2758735078 From ghan at openjdk.org Tue Feb 3 12:07:52 2026 From: ghan at openjdk.org (Guanqiang Han) Date: Tue, 3 Feb 2026 12:07:52 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v2] In-Reply-To: <1ueZt1yRnN71yJlDZ1jsOpXgGkp4bzOxNpWjbdiXx6I=.f58e8738-db1a-40bb-8d8a-bee26d7547fe@github.com> References: <1ueZt1yRnN71yJlDZ1jsOpXgGkp4bzOxNpWjbdiXx6I=.f58e8738-db1a-40bb-8d8a-bee26d7547fe@github.com> Message-ID: On Tue, 3 Feb 2026 07:03:35 GMT, Shawn M Emery wrote: >> Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: >> >> optimize warning log > > src/hotspot/cpu/x86/vm_version_x86.cpp line 1141: > >> 1139: FLAG_SET_DEFAULT(UseAESIntrinsics, false); >> 1140: if (UseAESCTRIntrinsics && !FLAG_IS_DEFAULT(UseAESCTRIntrinsics)) { >> 1141: warning("AES_CTR intrinsics require UseAES flag to be enabled. Intrinsics will be disabled."); > > I propose the following changes: > OLD > "Intrinsics will be disabled." > NEW > "AES_CTR intrinsics will be disabled." @smemery Fixed? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29338#discussion_r2758743279 From jbhateja at openjdk.org Tue Feb 3 12:56:52 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Feb 2026 12:56:52 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v12] In-Reply-To: References: Message-ID: On Mon, 26 Jan 2026 17:43:41 GMT, Sandhya Viswanathan wrote: >> LGTM! Thanks for your updating! > >> Hi @XiaohongGong , your comments have been addressed. Hi @sviswa7, can you kindly review x86 part. > > Thanks @jatin-bhateja. I will take a look next week. 
Hi @sviswa7 , Please let me know if you have comments on x86 backend part. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24104#issuecomment-3841145292 From mbaesken at openjdk.org Tue Feb 3 13:36:46 2026 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 3 Feb 2026 13:36:46 GMT Subject: RFR: 8376956: Add JVMTI phase entering/setting to hserr event log Message-ID: We should add some info to the hserr/hsinfo event logs about JVMTI phases. ------------- Commit messages: - JDK-8376956 Changes: https://git.openjdk.org/jdk/pull/29525/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29525&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8376956 Stats: 8 lines in 1 file changed: 7 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29525.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29525/head:pull/29525 PR: https://git.openjdk.org/jdk/pull/29525 From mdoerr at openjdk.org Tue Feb 3 13:45:19 2026 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Feb 2026 13:45:19 GMT Subject: RFR: 8376113: PPC64: Implement special MachNodes for floating point Min / Max [v4] In-Reply-To: <4urTG34Sbor6QUhUFuiJgcuZ6jEwDWnnqM32JUG6KKY=.7361d419-1b0d-47bb-95bb-aa6042c34a7d@github.com> References: <4urTG34Sbor6QUhUFuiJgcuZ6jEwDWnnqM32JUG6KKY=.7361d419-1b0d-47bb-95bb-aa6042c34a7d@github.com> Message-ID: On Tue, 3 Feb 2026 11:24:13 GMT, David Briemann wrote: >> Add mach nodes MinF, MaxF, MinD, MaxD for PPC. > > David Briemann has updated the pull request incrementally with one additional commit since the last revision: > > add missing format strings, enable IR matching for >= PPC9 in TestMinMaxIdentity.java Thanks! ------------- Marked as reviewed by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29361#pullrequestreview-3745355514 From egahlin at openjdk.org Tue Feb 3 14:08:54 2026 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 3 Feb 2026 14:08:54 GMT Subject: RFR: 8373096: JFR: Path-to-gc-roots search should be non-recursive [v6] In-Reply-To: References: Message-ID: On Thu, 29 Jan 2026 14:57:04 GMT, Thomas Stuefe wrote: >> This is a continuation - second attempt - of https://github.com/openjdk/jdk/pull/28659. >> >> ---- >> >> A customer reported a native stack overflow when producing a JFR recording with path-to-gc-roots=true. This happens regularly, see similar cases in JBS (e.g. https://bugs.openjdk.org/browse/JDK-8371630, https://bugs.openjdk.org/browse/JDK-8282427 etc). >> >> We limit the maximum graph search depth (DFSClosure::max_dfs_depth) to prevent stack overflows. That solution is brittle, however, since recursion depth is not a good proxy for thread stack usage: it depends on many factors, e.g., compiler inlining decisions and platform specifics. In this case, the VMThread's stack was too small. >> >> This patch rewrites the DFS heap tracer to be non-recursive. This is mostly textbook stuff, but the devil is in the details. Nevertheless, the algorithm should be a straightforward read. >> >> ### Memory usage of old vs new algorithm: >> >> The new algorithm uses, on average, a bit less memory than the old one. The old algorithm did cost ((avg stackframe size in bytes) * depth). As we have seen, e.g., in JDK-8371630, a depth of 3200 can max out ~1MB of stack space. >> >> The new algorithm costs ((avg number of outgoing refs per instanceKlass oop) * depth * 16. For a depth of 3200, we get typical probe stack sizes of 100KB..200KB. But we also cap probestack size, similar to how we cap the max. graph depth. >> >> In any case, these numbers are nothing to worry about. For a more in-depth explanation about memory cost, please see the comment in dfsClosure.cpp. 
>> >> ### Possible improvements/simplifications in the future: >> >> DFS works perfectly well alone now. It no longer depends on stack size, and its memory usage is typically smaller than BFS. IMHO, it would be perfectly fine to get rid of BFS and rely solely on the non-recursive DFS. The benefit would be a decrease in complexity and fewer tests to run and maintain. It should also be easy to convert into a parallelized version later. >> >> I kept the _max_dfs_depth_ parameter for now, but tbh it is no longer very useful. Before, it prevented stack overflows. Now, it is just an indirect way to limit probe stack size. But we also explicitly cap the probe stack size, so _max_dfs_depth_ is redundant. Removing it would require changing the statically allocated reference stack to be dynamically allocated, but that should not be difficult. >> >> ### Observable differences >> >> There is one observable side effect to the changed a... > > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - remove unnecessary copyright change > - remove debug output > - Erics test suggestions There is still a log_info in DFSClosure destructor that should be log_debug. I wonder if you have looked at performance. For example, in which order is it best to check for nullptr, oop has been visited or whether the stack is full? You added a _num_objects_processed variable, but it's never used, so you might want to remove it.
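To make the question about check ordering concrete, here is a minimal sketch of a non-recursive DFS over a toy graph. It is not the DFSClosure code from the patch: the `Graph` type, `iterative_dfs`, and `max_stack` are hypothetical names, nodes are plain indices, and `-1` stands in for a null reference. The order of the three checks in the inner loop is an arbitrary choice for illustration, not a measured recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy object graph: node i references the nodes listed in refs[i];
// -1 models a null reference. (Hypothetical stand-in for oops.)
using Graph = std::vector<std::vector<int>>;

// Hypothetical sketch, not the code from the PR.
// Visits everything reachable from root using an explicit stack instead of
// recursion and returns the number of nodes visited. max_stack caps the
// explicit stack; references that would overflow it are dropped.
static std::size_t iterative_dfs(const Graph& refs, int root, std::size_t max_stack) {
  if (root < 0 || static_cast<std::size_t>(root) >= refs.size() || max_stack == 0) {
    return 0;
  }
  std::vector<bool> marked(refs.size(), false);
  std::vector<int> stack;
  std::size_t visited = 0;

  marked[root] = true;   // mark on push so no node is pushed twice
  stack.push_back(root);

  while (!stack.empty()) {
    const int node = stack.back();
    stack.pop_back();
    visited++;
    for (int ref : refs[node]) {
      if (ref < 0) continue;                 // null reference
      if (stack.size() >= max_stack) break;  // stack full: drop remaining refs
      if (marked[ref]) continue;             // already seen
      marked[ref] = true;
      stack.push_back(ref);
    }
  }
  return visited;
}
```

Because the traversal state lives in a heap-allocated vector, the depth of the graph no longer translates into native stack usage; only `max_stack` bounds memory.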
------------- PR Comment: https://git.openjdk.org/jdk/pull/29382#issuecomment-3841536590 From stuefe at openjdk.org Tue Feb 3 14:17:14 2026 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 3 Feb 2026 14:17:14 GMT Subject: RFR: 8359706: Add file descriptor count to VM.info [v8] In-Reply-To: References: Message-ID: On Tue, 20 Jan 2026 19:53:41 GMT, Kieran Farrell wrote: >> Currently, it is only possible to read the number of open file descriptors of a Java process via the `UnixOperatingSystemMXBean` which is only accessible via JMX enabled tools. To improve serviceability, it would be beneficial to be able to view this information from jcmd VM.info output or hs_err_pid crash logs. This could help diagnose resource exhaustion and troubleshoot "too many open files" errors in Java processes on Unix platforms. >> >> This PR adds reporting the current open file descriptor count to both jcmd VM.info output and hs_err_pid crash logs by refactoring the native JNI logic from `Java_com_sun_management_internal_OperatingSystemImpl_getOpenFileDescriptorCount0` of the `UnixOperatingSystemMXBean` into hotspot. Apple's API for retrieving open file descriptor count provides an array of the actual FDs to determine the count. To avoid using `malloc` to store this array in a potential signal handling context where stack space may be limited, the apple implementation instead allocates a fixed 32KB struct on the stack to store the open FDs and only reports the result if the struct is less than the max (1024 FDs). This should cover the majority of use cases. > > Kieran Farrell has updated the pull request incrementally with one additional commit since the last revision: > > minor updates src/hotspot/os/bsd/os_bsd.cpp line 2589: > 2587: #ifdef __APPLE__ > 2588: const int MAX_SAFE_FDS = 1024; > 2589: struct proc_fdinfo fds[MAX_SAFE_FDS]; Hmm, this may be a bad idea. This function is called during signal handling.
Don't allocate massive amounts of stack storage here, that may lead to secondary crashes during error handling which we could not recover from. See below, I am not sure it is even necessary. src/hotspot/os/bsd/os_bsd.cpp line 2608: > 2606: > 2607: nfiles = res / sizeof(struct proc_fdinfo); > 2608: if (nfiles >= MAX_SAFE_FDS) { About MAX_SAFE_FDS: What is really returned by `pid_for_task`? If return values > MAX_SAFE_FDS are possible, would that be the reliable "number of open fds"? If so, why not print that instead of ">1024"? In fact, if so, why even bother returning an array at all, why not make the array 1-element-sized only, pro forma? All we are interested in is the number. src/hotspot/os/bsd/os_bsd.cpp line 2615: > 2613: st->print_cr("Open File Descriptors: %d", nfiles); > 2614: #else > 2615: st->print_cr("Open File Descriptors: unknown"); Indentation looks off. Also, the code could be condensed quite a bit. src/hotspot/os/linux/os_linux.cpp line 5406: > 5404: if (isdigit(dentp->d_name[0])) fds++; > 5405: if (fds % 100 == 0) { > 5406: clock_gettime(CLOCK_MONOTONIC, &now); You don't query the return code here. If clock_gettime fails, content of `now` is undefined. Could lead to premature abortion of this loop. If runtime errors are possible, they should be handled; otherwise an assertion would be the right thing to do.
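As an illustration of the last point, here is a minimal sketch of a directory-counting loop that checks the result of `clock_gettime` before using the timestamp. It is not the code from the PR: `FdCountResult`, `count_fds_with_timeout`, and the choice to stop early (reporting a lower bound) when the clock fails are all assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <ctime>
#include <dirent.h>

// Hypothetical result type for this sketch.
struct FdCountResult {
  int  count;      // directory entries whose name starts with a digit
  bool timed_out;  // true if counting stopped early; count is a lower bound
  bool ok;         // false if the directory could not be opened
};

// Hypothetical helper, not the code from the PR.
// Counts digit-named entries in dir_path (e.g. /proc/self/fd on Linux) and
// gives up once timeout_ms has elapsed. A failing clock_gettime is treated
// like a timeout instead of reading an undefined timespec.
static FdCountResult count_fds_with_timeout(const char* dir_path, long timeout_ms) {
  FdCountResult r = {0, false, false};
  DIR* dirp = opendir(dir_path);
  if (dirp == nullptr) {
    return r;
  }
  r.ok = true;

  timespec start;
  const bool have_clock = (clock_gettime(CLOCK_MONOTONIC, &start) == 0);

  struct dirent* dentp;
  while ((dentp = readdir(dirp)) != nullptr) {
    if (!isdigit(static_cast<unsigned char>(dentp->d_name[0]))) {
      continue;
    }
    r.count++;
    if (r.count % 100 == 0) {  // only look at the clock every 100 entries
      timespec now;
      if (!have_clock || clock_gettime(CLOCK_MONOTONIC, &now) != 0) {
        r.timed_out = true;    // no usable clock: stop, report a lower bound
        break;
      }
      const long elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L +
                              (now.tv_nsec - start.tv_nsec) / 1000000L;
      if (elapsed_ms >= timeout_ms) {
        r.timed_out = true;
        break;
      }
    }
  }
  closedir(dirp);
  return r;
}
```

Whether a clock failure should end the loop or trigger an assertion is a policy decision; the sketch only shows that `now` is never read unless the call succeeded.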
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2759298203 PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2759274151 PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2759277532 PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2759289309 From egahlin at openjdk.org Tue Feb 3 14:25:46 2026 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 3 Feb 2026 14:25:46 GMT Subject: RFR: 8373096: JFR: Path-to-gc-roots search should be non-recursive [v6] In-Reply-To: References: Message-ID: On Thu, 29 Jan 2026 14:57:04 GMT, Thomas Stuefe wrote: >> This is a continuation - second attempt - of https://github.com/openjdk/jdk/pull/28659. >> >> ---- >> >> A customer reported a native stack overflow when producing a JFR recording with path-to-gc-roots=true. This happens regularly, see similar cases in JBS (e.g. https://bugs.openjdk.org/browse/JDK-8371630, https://bugs.openjdk.org/browse/JDK-8282427 etc). >> >> We limit the maximum graph search depth (DFSClosure::max_dfs_depth) to prevent stack overflows. That solution is brittle, however, since recursion depth is not a good proxy for thread stack usage: it depends on many factors, e.g., compiler inlining decisions and platform specifics. In this case, the VMThread's stack was too small. >> >> This patch rewrites the DFS heap tracer to be non-recursive. This is mostly textbook stuff, but the devil is in the details. Nevertheless, the algorithm should be a straightforward read. >> >> ### Memory usage of old vs new algorithm: >> >> The new algorithm uses, on average, a bit less memory than the old one. The old algorithm did cost ((avg stackframe size in bytes) * depth). As we have seen, e.g., in JDK-8371630, a depth of 3200 can max out ~1MB of stack space. >> >> The new algorithm costs ((avg number of outgoing refs per instanceKlass oop) * depth * 16. For a depth of 3200, we get typical probe stack sizes of 100KB..200KB. 
But we also cap probestack size, similar to how we cap the max. graph depth. >> >> In any case, these numbers are nothing to worry about. For a more in-depth explanation about memory cost, please see the comment in dfsClosure.cpp. >> >> ### Possible improvements/simplifications in the future: >> >> DFS works perfectly well alone now. It no longer depends on stack size, and its memory usage is typically smaller than BFS. IMHO, it would be perfectly fine to get rid of BFS and rely solely on the non-recursive DFS. The benefit would be a decrease in complexity and fewer tests to run and maintain. It should also be easy to convert into a parallelized version later. >> >> I kept the _max_dfs_depth_ parameter for now, but tbh it is no longer very useful. Before, it prevented stack overflows. Now, it is just an indirect way to limit probe stack size. But we also explicitly cap the probe stack size, so _max_dfs_depth_ is redundant. Removing it would require changing the statically allocated reference stack to be dynamically allocated, but that should not be difficult. >> >> ### Observable differences >> >> There is one observable side effect to the changed a... > > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - remove unnecessary copyright change > - remove debug output > - Erics test suggestions For the future, I think we want to keep BFS. Initially, I only had DFS, but the chains became so weird that I had to implement BFS. Regarding ordering, I think we want an order that makes the most sense to the user. ClassLoader is easier for users to understand than Global Object Handle. I'm not sure if that code is still present, but we had a specific order in which we processed roots. I think BFS will be able to cover most of the heap, and if there is a "linked list" where DFS would need to take over, the order is unlikely to matter at that point. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/29382#issuecomment-3841633851 From kfarrell at openjdk.org Tue Feb 3 14:29:33 2026 From: kfarrell at openjdk.org (Kieran Farrell) Date: Tue, 3 Feb 2026 14:29:33 GMT Subject: RFR: 8359706: Add file descriptor count to VM.info [v9] In-Reply-To: References: Message-ID: > Currently, it is only possible to read the number of open file descriptors of a Java process via the `UnixOperatingSystemMXBean` which is only accessible via JMX enabled tools. To improve serviceability, it would be beneficial to be able to view this information from jcmd VM.info output or hs_err_pid crash logs. This could help diagnose resource exhaustion and troubleshoot "too many open files" errors in Java processes on Unix platforms. > > This PR adds reporting the current open file descriptor count to both jcmd VM.info output and hs_err_pid crash logs by refactoring the native JNI logic from `Java_com_sun_management_internal_OperatingSystemImpl_getOpenFileDescriptorCount0` of the `UnixOperatingSystemMXBean` into hotspot. Apple's API for retrieving open file descriptor count provides an array of the actual FDs to determine the count. To avoid using `malloc` to store this array in a potential signal handling context where stack space may be limited, the apple implementation instead allocates a fixed 32KB struct on the stack to store the open FDs and only reports the result if the struct is less than the max (1024 FDs). This should cover the majority of use cases.
Kieran Farrell has updated the pull request incrementally with one additional commit since the last revision: update test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27971/files - new: https://git.openjdk.org/jdk/pull/27971/files/926fc920..e9ed58ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27971&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27971&range=07-08 Stats: 7 lines in 2 files changed: 5 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27971.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27971/head:pull/27971 PR: https://git.openjdk.org/jdk/pull/27971 From syan at openjdk.org Tue Feb 3 14:31:20 2026 From: syan at openjdk.org (SendaoYan) Date: Tue, 3 Feb 2026 14:31:20 GMT Subject: RFR: 8376023: Reconcile ClassUnloader with ClassUnloadCommon Message-ID: Hi all, This PR merges test/hotspot/jtreg/vmTestbase/nsk/share/ClassUnloader.java into test/lib/jdk/test/lib/classloader/ClassUnloadCommon.java, and makes the vmTestbase tests use ClassUnloadCommon. Additional testing: - [ ] Full jtreg tests include vmTestbase ------------- Commit messages: - Update comment - Remove test/hotspot/jtreg/vmTestbase/nsk/share/ClassUnloader.bak - Add "import jdk.test.lib.classloader.ClassUnloadCommon;" - Add "import jdk.test.lib.classloader.ClassUnloadCommon;" - Replace ClassUnloader as ClassUnloadCommon - Fix test/hotspot/jtreg/vmTestbase/nsk/jdi/ClassUnloadRequest/addClassFilter/filter001/TestDescription.java - Fix vmTestbase/nsk/jvmti/scenarios/events/EM07/em07t002/TestDescription.java - Fix vmTestbase/nsk/jdi/ReferenceType/fields/fields003/TestDescription.java - Fix vmTestbase/nsk/jvmti/ObjectFree/objfree001/TestDescription.java - Fix vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java - ...
and 3 more: https://git.openjdk.org/jdk/compare/e21cb852...458fb127 Changes: https://git.openjdk.org/jdk/pull/29545/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29545&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8376023 Stats: 613 lines in 71 files changed: 244 ins; 275 del; 94 mod Patch: https://git.openjdk.org/jdk/pull/29545.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29545/head:pull/29545 PR: https://git.openjdk.org/jdk/pull/29545 From stuefe at openjdk.org Tue Feb 3 14:45:28 2026 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 3 Feb 2026 14:45:28 GMT Subject: RFR: 8373096: JFR: Path-to-gc-roots search should be non-recursive [v6] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 14:05:51 GMT, Erik Gahlin wrote: > I wonder if you have looked at performance. For example, in which order is it best to check for nullptr, oop has been visited or whether the stack is full? You added a _num_objects_processed variable, but it's never used, so you might want to remove it. Will do. > For the future, I think we want to keep BFS. Initially, I only had DFS, but the chains became so weird that I had to implement BFS. Sure; you are the maintainer, after all. > > Regarding ordering, I think we want an order that makes the most sense to the user. ClassLoader is easier for users to understand than Global Object Handle. I'm not sure if that code is still present, but we had a specific order in which we processed roots. Did you mean this? https://github.com/openjdk/jdk/blob/99bc98357dab78bef2cce7a10c98d13d1e5730e3/src/hotspot/share/jfr/leakprofiler/chains/rootSetClosure.cpp#L87-L97 I can reverse the order in there and thus get the (roughly) reversed order. I already tested that, but refrained from adding it to the patch. What do you think about printing the actual (first, first and second ...) objects that are referenced by the roots? I think that would often help a lot. It helped me a lot in understanding what happened.
In fact, I just traced the whole chain during development, hence the logging added to add_chain(). ------------- PR Comment: https://git.openjdk.org/jdk/pull/29382#issuecomment-3841744021 From alanb at openjdk.org Tue Feb 3 14:48:49 2026 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 3 Feb 2026 14:48:49 GMT Subject: RFR: 8376568: Change Thread::getStackTrace to use handshake op for all cases [v3] In-Reply-To: References: <6WdkzWF-d6yGLKVUP9pCiYE1ghOdL5sTlcBiA1bE4c0=.802606b6-f958-4dea-a6a7-3d8a406c177c@github.com> Message-ID: On Tue, 3 Feb 2026 11:04:46 GMT, David Holmes wrote: >> The proposal is that ThreadSnapshot.of(Thread) can be called with any platform or virtual Thread in any state. With the update, it eagerly tests with isAlive so will filter out NEW and already TERMINATED threads. If/when we change Thread.getStackTrace to use ThreadSnapshot then the isAlive check can be dropped from Thread.getStackTrace. The underlying implementation in get_thread_snapshot does not need to deal with the NEW state. >> >> There is no need for ThreadDumper to do any additional filtering. The thread streams that it consumes filter out Threads that are not alive. > >> There is no need for ThreadDumper to do any additional filtering. The thread streams that it consumes filter out Threads that are not alive. > > Okay that is not at all obvious. But in that case you are never passing a NEW thread to `ThreadSnapshot.of()`. But if that is to be the primary API for getting stacktraces then having it filter on isAlive is reasonable. The thread stream is documented to be a stream of "live" threads and the implementations filter out unstarted or already terminated threads. If I read your latest message correctly then I think you are in agreement with the current change to test isAlive at the "front door". It means get_thread_snapshot is never called with an unstarted/NEW Thread.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/29461#discussion_r2759456977 From erikj at openjdk.org Tue Feb 3 14:56:47 2026 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 3 Feb 2026 14:56:47 GMT Subject: RFR: 8332189: Enable -Wzero-as-null-pointer-constant for gcc/clang In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 00:16:54 GMT, Kim Barrett wrote: > Please review this change which enables `-Wzero-as-null-pointer-constant` > warnings in HotSpot code when building with gcc or clang. > > There are three parts to this change. > > The first part augments the warning flags setup to support adding warning > options that are only applied to HotSpot, rather than the JDK as a whole. > There was previously some unused and possibly incomplete support for this when > using gcc. Note that the Windows/Visual Studio support hasn't been tested > much, and I think might not be working yet. I'm going to investigate that > further in followup work. > > The second part enables `-Wzero-as-null-pointer-constant` for HotSpot code. > This follows the guidance to avoid such in the HotSpot Style Guide. > > The third part removes a note in the HotSpot Style Guide about lingering uses > of literal 0 as a null pointer constant. Those have been removed, and this > change will block backsliding. > > Testing: mach5 tier1, GHA Sanity tests > > Integration of this change needs to wait for JDK-8376758. Marked as reviewed by erikj (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/29497#pullrequestreview-3745761404 From eastigeevich at openjdk.org Tue Feb 3 15:00:31 2026 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 3 Feb 2026 15:00:31 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GCs and JIT performance [v13] In-Reply-To: References: Message-ID: On Wed, 3 Dec 2025 16:11:14 GMT, Aleksey Shipilev wrote: >> Evgeny Astigeevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: >> >> - Fix linux-cross-compile build aarch64 >> - Merge branch 'master' into JDK-8370947 >> - Remove trailing whitespaces >> - Add support of deferred icache invalidation to other GCs and JIT >> - Add UseDeferredICacheInvalidation to defer invalidation on CPU with hardware cache coherence >> - Add jtreg test >> - Fix linux-cross-compile aarch64 build >> - Fix regressions for Java methods without field accesses >> - Fix code style >> - Correct ifdef; Add dsb after ic >> - ... and 9 more: https://git.openjdk.org/jdk/compare/3d54a802...4b04496f > > Interesting work! I was able to look through it very briefly: @shipilev @theRealAph @fisk Ping ------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3841835786 From stuefe at openjdk.org Tue Feb 3 15:13:48 2026 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 3 Feb 2026 15:13:48 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3] In-Reply-To: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> References: <2nI8SoEjkM35uhS-1dUEjvHOVj2RoSFGLzK6Tk4Ck7M=.a164d5e9-47ab-4be6-9f17-d770651b616b@github.com> Message-ID: On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan wrote: >> **Summary** >> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding. 
>> >> **Problem** >> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with: >> >> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes. >> >> >> **Solution** >> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 * 8 bytes) while maintaining backward compatibility with the existing u4 offset fields. >> >> Current: address = base + offset_bytes (max ~2GB) >> Proposed: address = base + (offset_units << 3) (max 32GB) >> >> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero, so we're wasting them! >> >> Current byte offset (aligned to 8 bytes): >> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000 >> (the low 3 bits are always 000) >> >> Scaled offset (shift=3): >> 0x00000200 = Same address, but stored in 29 bits instead of 32 >> Frees up 3 bits -> 8x larger range! >> >> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB. >> >> **Test** >> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>> >> Archive: >> - 300000 simple classes >> - 2000 mega-classes >> - 5000 FieldObject classes >> - Total: 307000 classes >> >> AOT Cache: >> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms >> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms >> >> Static CDS: >> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms >> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1... > > Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision: > > add hotspot_resourcehogs_no_cds test group Maybe I am slow, but I still don't understand how this works with compressed class pointers. Don't we map the file into the address space as one block, or at least per region? Into the encoding range? How can that work with a 32GB CDS file if the encoding range is limited to 4G? If you somehow manage to do that, even if you change narrow Klass encoding to work with shift and offset, there are implicit assumptions that the *shifted* nKlass value must not spill over into the upper half of 64bit. Pretty sure that's at least the case on aarch64. These errors may only surface if you actually run compiled code that works with instances of classes that have very high Klass IDs. Zooming back, strategically, I am not sure this is a good route to go. Adding the ability to load a huge number of classes will introduce significant technical debt, because once it's possible, you'll need to continue supporting it. That may prevent future developments that rely on the number of classes being reasonable. 
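For reference, the scaled-offset arithmetic from the quoted summary (`offset_bytes >> 3` on store, `offset_units << 3` on load) can be sketched as follows. This is only an illustration of the arithmetic, assuming 8-byte alignment as stated in the summary; `kOffsetShift`, `encode_offset`, and `decode_offset` are hypothetical names and not the ArchiveBuilder API. It does not address the narrow Klass encoding range concern raised above.

```cpp
#include <cassert>
#include <cstdint>

// Shift of 3 as in the PR description (each unit is 8 bytes).
// Names below are hypothetical, chosen for this sketch only.
constexpr unsigned kOffsetShift = 3;
constexpr uint64_t kAlignment   = uint64_t(1) << kOffsetShift;

// Converts an 8-byte-aligned byte offset into 32-bit scaled units.
static uint32_t encode_offset(uint64_t offset_bytes) {
  assert((offset_bytes & (kAlignment - 1)) == 0);  // low 3 bits must be zero
  const uint64_t units = offset_bytes >> kOffsetShift;
  assert(units <= UINT32_MAX);                     // must fit below 32GB
  return static_cast<uint32_t>(units);
}

// Converts scaled units back into a byte offset.
static uint64_t decode_offset(uint32_t offset_units) {
  return static_cast<uint64_t>(offset_units) << kOffsetShift;
}
```

With these definitions, the byte offset 0x1000 from the example in the summary is stored as 0x200, and the largest encodable offset is 8 bytes below 32GB.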
------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3841921280 From xuelei at openjdk.org Tue Feb 3 15:49:54 2026 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Tue, 3 Feb 2026 15:49:54 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> References: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> Message-ID: On Tue, 3 Feb 2026 00:38:31 GMT, Alexey Bakhtin wrote: >>> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. I noticed there are only 1-2 instances that use these "unchecked" apis. >> >> Thanks for the suggestion. I looked into this and found that buffer_to_offset() and any_to_offset() serve a different purpose than the _u4 versions. The _u4 versions use scaled encoding (with MetadataOffsetShift) and return a compact u4 for metadata pointer storage. The raw versions return unscaled byte offsets stored in larger types. These usages cannot switch to _u4 versions because they need raw byte offsets (not scaled) and store them in 64-bit types. >> >> However, the comments for the methods may be misleading after introducing the _u4 methods. What do you think to revise the comment as: >> >> // The address p points to an object inside the output buffer. When the archive is mapped >> // at the requested address, what's the byte offset of this object from _requested_static_archive_bottom? 
>> uintx buffer_to_offset(address p) const; >> >> // Same as buffer_to_offset, except that the address p points to either (a) an object >> // inside the output buffer, or (b), an object in the currently mapped static archive. >> uintx any_to_offset(address p) const; >> >> // The reverse of buffer_to_offset_u4() - converts scaled offset units back to buffered address. >> address offset_to_buffered_address(u4 offset_units) const; >> >> >> I am also OK to rename the method names to: `buffer_to_offset_bytes()` and `any_to_offset_bytes()`, if the new names are clearer. >> >> @ashu-mehra What do you think? > > Hi @XueleiFan, > > I've tried the suggested code with an archive size more than 4Gb, but it fails with an assertion: > > # Internal Error (aotMetaspace.cpp:1955), pid=96332, tid=4099 > # guarantee(archive_space_size < max_encoding_range_size - class_space_alignment) failed: Archive too large > > CDS archive was created successfully: > > [187.068s][info ][cds ] Shared file region (rw) 0: 822453584 bytes, addr 0x0000000800004000 file offset 0x00004000 crc 0x132b652e > [189.176s][info ][cds ] Shared file region (ro) 1: 3576115584 bytes, addr 0x0000000831060000 file offset 0x31060000 crc 0x71b020a2 > [197.653s][info ][cds ] Shared file region (ac) 4: 0 bytes > [198.870s][info ][cds ] Shared file region (bm) 2: 56555664 bytes, addr 0x0000000000000000 file offset 0x1062d4000 crc 0xbd87f804 > [199.504s][info ][cds ] Shared file region (hp) 3: 16091256 bytes, addr 0x00000000ff000000 file offset 0x1098c4000 crc 0x7834b7c3 > [199.684s][debug ][cds ] bm space: 56555664 [ 1.3% of total] out of 56555664 bytes [100.0% used] > [199.684s][debug ][cds ] hp space: 16091256 [ 0.4% of total] out of 16091256 bytes [100.0% used] at 0x0000000c6d000000 > [199.684s][debug ][cds ] total : 4471216088 [100.0% of total] out of 4471228536 bytes [100.0% used] @alexeybakhtin Thank you for testing bigger archives (>4GB).
I was wondering if it is OK to support 4GB+ archive when UseCompactObjectHeaders is false. The following prototype works. However, we prefer UseCompactObjectHeaders in practice, and the biggest archive size (5.6M objects) is about 2.1G at this moment. Could we have 4GB archive size limit as a open issue, and address it separately? diff --git a/src/hotspot/share/cds/aotMetaspace.cpp b/src/hotspot/share/cds/aotMetaspace.cpp index 62d76957c0a..676d54cba33 100644 --- a/src/hotspot/share/cds/aotMetaspace.cpp +++ b/src/hotspot/share/cds/aotMetaspace.cpp @@ -1950,8 +1950,12 @@ char* AOTMetaspace::reserve_address_space_for_archives(FileMapInfo* static_mapin const size_t ccs_begin_offset = align_up(archive_space_size, class_space_alignment); const size_t gap_size = ccs_begin_offset - archive_space_size; - // Reduce class space size if it would not fit into the Klass encoding range - constexpr size_t max_encoding_range_size = 4 * G; + // Reduce class space size if it would not fit into the Klass encoding range. 
+ // The max encoding range depends on narrow Klass pointer bits and max shift: + // - With UseCompactObjectHeaders: 22-bit + shift 10 = 4GB + // - Without UseCompactObjectHeaders (legacy): 32-bit + shift 3 = 32GB + const size_t max_encoding_range_size = + nth_bit(CompressedKlassPointers::narrow_klass_pointer_bits() + CompressedKlassPointers::max_shift()); guarantee(archive_space_size < max_encoding_range_size - class_space_alignment, "Archive too large"); if ((archive_space_size + gap_size + class_space_size) > max_encoding_range_size) { class_space_size = align_down(max_encoding_range_size - archive_space_size - gap_size, class_space_alignment); diff --git a/src/hotspot/share/cds/archiveBuilder.cpp b/src/hotspot/share/cds/archiveBuilder.cpp index e1130e7befc..85f6922dd50 100644 --- a/src/hotspot/share/cds/archiveBuilder.cpp +++ b/src/hotspot/share/cds/archiveBuilder.cpp @@ -1122,16 +1122,16 @@ class RelocateBufferToRequested : public BitMapClosure { #ifdef _LP64 int ArchiveBuilder::precomputed_narrow_klass_shift() { - // Legacy Mode: - // We use 32 bits for narrowKlass, which should cover the full 4G Klass range. Shift can be 0. - // CompactObjectHeader Mode: - // narrowKlass is much smaller, and we use the highest possible shift value to later get the maximum - // Klass encoding range. + // We use the highest possible shift value to get the maximum Klass encoding range: + // - Legacy Mode (UseCompactObjectHeaders=false): + // 32-bit narrowKlass + shift 3 = 32GB encoding range + // - CompactObjectHeader Mode (UseCompactObjectHeaders=true): + // 22-bit narrowKlass + shift 10 = 4GB encoding range // // Note that all of this may change in the future, if we decide to correct the pre-calculated // narrow Klass IDs at archive load time. assert(UseCompressedClassPointers, "Only needed for compressed class pointers"); - return UseCompactObjectHeaders ? 
CompressedKlassPointers::max_shift() : 0; + return CompressedKlassPointers::max_shift(); } #endif // _LP64 diff --git a/src/hotspot/share/oops/compressedKlass.cpp b/src/hotspot/share/oops/compressedKlass.cpp index b32d10c74d2..f642a162adb 100644 --- a/src/hotspot/share/oops/compressedKlass.cpp +++ b/src/hotspot/share/oops/compressedKlass.cpp @@ -46,9 +46,10 @@ size_t CompressedKlassPointers::_protection_zone_size = 0; size_t CompressedKlassPointers::max_klass_range_size() { #ifdef _LP64 - const size_t encoding_allows = nth_bit(narrow_klass_pointer_bits() + max_shift()); - constexpr size_t cap = 4 * G; - return MIN2(encoding_allows, cap); + // The max klass range is determined by narrow Klass pointer bits and max shift: + // - With UseCompactObjectHeaders: 22-bit + shift 10 = 4GB + // - Without UseCompactObjectHeaders (legacy): 32-bit + shift 3 = 32GB + return nth_bit(narrow_klass_pointer_bits() + max_shift()); #else // 32-bit: only 32-bit "narrow" Klass pointers allowed. If we ever support smaller narrow // Klass pointers here, coding needs to be revised. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3842105081 From xuelei at openjdk.org Tue Feb 3 17:06:56 2026 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Tue, 3 Feb 2026 17:06:56 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v4] In-Reply-To: References: Message-ID: > **Summary** > This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding. > > **Problem** > Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with: > > [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes. > > > **Solution** > Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). 
This allows addressing up to 32GB (2^32 × 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>
> Current: address = base + offset_bytes (max ~2GB)
> Proposed: address = base + (offset_units << 3) (max 32GB)
>
> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero: we're wasting them!
>
> Current byte offset (aligned to 8 bytes):
> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>                                                   ^^^ Always 000!
>
> Scaled offset (shift=3):
> 0x00000200 = Same address, but stored in 29 bits instead of 32
> Frees up 3 bits → 8x larger range!
>
> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>
> **Test**
> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>
> Archive:
> - 300000 simple classes
> - 2000 mega-classes
> - 5000 FieldObject classes
> - Total: 307000 classes
>
> AOT Cache:
> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>
> Static CDS:
> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1010ms
>
> Base static CDS + Dynamic CDS:
> Times (wall): base_create=157186ms dynamic_...
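
The scaled-offset arithmetic in the description above can be checked with a small standalone sketch. This is not the code from the pull request: `kOffsetShift`, `kMaxScaledRange`, `encode_offset` and `decode_offset` are hypothetical names chosen for illustration, and the only facts taken from the description are the shift of 3, the 8-byte alignment guarantee and the 32-bit (u4) storage width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the shift the description calls OFFSET_SHIFT.
constexpr unsigned kOffsetShift = 3;

// Byte range reachable through a 32-bit scaled offset: 2^32 units * 8 bytes.
constexpr uint64_t kMaxScaledRange = (uint64_t{1} << 32) << kOffsetShift;

// Encode a byte offset as a count of 8-byte units.
// The offset must be 8-byte aligned and lie inside the scaled range,
// otherwise the result would not round-trip through 32 bits.
inline uint32_t encode_offset(uint64_t offset_bytes) {
  assert((offset_bytes & ((uint64_t{1} << kOffsetShift) - 1)) == 0 &&
         "offset must be 8-byte aligned");
  assert(offset_bytes < kMaxScaledRange && "offset outside the scaled range");
  return static_cast<uint32_t>(offset_bytes >> kOffsetShift);
}

// Decode scaled units back to a byte offset.
// Widen to 64 bits before shifting so the high bits are not lost.
inline uint64_t decode_offset(uint32_t offset_units) {
  return static_cast<uint64_t>(offset_units) << kOffsetShift;
}
```

With these definitions, `encode_offset(0x1000)` yields `0x200`, matching the example in the description, and a 5GB offset survives an encode/decode round trip even though it does not fit in 32 bits as a raw byte count.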
Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision: keep cds version ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29494/files - new: https://git.openjdk.org/jdk/pull/29494/files/4257c837..e18293e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29494&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29494&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29494.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29494/head:pull/29494 PR: https://git.openjdk.org/jdk/pull/29494 From kvn at openjdk.org Tue Feb 3 19:10:17 2026 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Feb 2026 19:10:17 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v2] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 12:04:50 GMT, Guanqiang Han wrote: >> Please review this change. Thanks! >> >> **Description:** >> >> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). >> >> **Fix:** >> >> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. >> >> **Test:** >> >> GHA > > Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: > > optimize warning log Hi @hgqxjj The fix is fine but not complete. 
There are more issues down in code. The main is that `FLAG_SET_DEFAULT()` is called under `!FLAG_IS_DEFAULT(UseAESCTRIntrinsics)) ` check which should be used only for `warning()` message. See lines 1162 and 1180-1192. ------------- PR Review: https://git.openjdk.org/jdk/pull/29338#pullrequestreview-3747073765 From kvn at openjdk.org Tue Feb 3 19:22:43 2026 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Feb 2026 19:22:43 GMT Subject: RFR: 8375443: AVX-512: Disabling through UseSHA doesn't affect UseSHA3Intrinsics [v5] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 10:54:33 GMT, Ramkumar Sunderbabu wrote: >> UseSHA flag is not respected while enabling/disabling UseSHA3Intrinsics flag in x86 builds. >> Added UseSHA in the mix. > > Ramkumar Sunderbabu has updated the pull request incrementally with one additional commit since the last revision: > > fixed whitespace issue SHA code here is mess. I would like it be re-written similar to what we have for `UseAES`. Is only x86 code affected or other platforms have it too? Also look on [PR 29338](https://git.openjdk.org/jdk/pull/29338) src/hotspot/cpu/x86/vm_version_x86.cpp line 1346: > 1344: if (FLAG_IS_DEFAULT(UseSHA3Intrinsics)) { > 1345: FLAG_SET_DEFAULT(UseSHA3Intrinsics, true); > 1346: UseSHA3Intrinsics = true; You don't need this line - `FLAG_SET_DEFAULT` sets it. 
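
The flag-gating pattern raised in these two reviews can be sketched outside the VM. The types below are hypothetical stand-ins, not HotSpot's `FLAG_IS_DEFAULT` or `FLAG_SET_DEFAULT` macros: `Flag` and `gate_dependent_flag` are invented for illustration. The sketch only shows the intended shape, assuming the goal is that the "was it set on the command line" check guards the warning alone, while the dependent flag is switched off in every case.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a VM flag: its value plus whether it still has
// its default origin (false once the user sets it explicitly).
struct Flag {
  bool value;
  bool is_default;
};

// Enforce "dependent requires base" and return any warnings produced.
std::vector<std::string> gate_dependent_flag(const Flag& base,
                                             Flag& dependent,
                                             const std::string& name) {
  std::vector<std::string> warnings;
  if (!base.value && dependent.value) {
    if (!dependent.is_default) {
      // Only the warning depends on the user having set the flag.
      warnings.push_back(name + " requires its base flag; disabling it");
    }
    // The flag itself is disabled unconditionally, so a default-enabled
    // dependent flag cannot stay on while the base flag is off.
    dependent.value = false;
  }
  return warnings;
}
```

Calling `gate_dependent_flag` with a disabled base flag turns the dependent flag off in both cases, and produces a warning only when the dependent flag was set explicitly.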
------------- PR Review: https://git.openjdk.org/jdk/pull/29266#pullrequestreview-3747105520 PR Review Comment: https://git.openjdk.org/jdk/pull/29266#discussion_r2760610604 From duke at openjdk.org Tue Feb 3 19:22:45 2026 From: duke at openjdk.org (duke) Date: Tue, 3 Feb 2026 19:22:45 GMT Subject: RFR: 8372942: AArch64: Set JVM flags for Neoverse V3AE core [v2] In-Reply-To: References: Message-ID: On Fri, 30 Jan 2026 22:22:50 GMT, Ruben wrote: >> For Neoverse N1, N2, N3, V1, V2 and V3, the following JVM flags are set: >> - UseSIMDForMemoryOps=true >> - OnSpinWaitInst=isb >> - OnSpinWaitInstCount=1 >> - AlwaysMergeDMB=false >> >> Additionally, for Neoverse V1, V2 and V3 only, these flags are set: >> - UseCryptoPmullForCRC32=true >> - CodeEntryAlignment=32 >> >> Enable the same flags for Neoverse V3AE. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Introduce `model_is_in` @ruben-arm Your change (at version 5705b5a77d14779483c25703161150ada0ee24e4) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28607#issuecomment-3843188947 From abakhtin at openjdk.org Tue Feb 3 21:31:46 2026 From: abakhtin at openjdk.org (Alexey Bakhtin) Date: Tue, 3 Feb 2026 21:31:46 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> References: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> Message-ID: On Tue, 3 Feb 2026 00:38:31 GMT, Alexey Bakhtin wrote: >>> There are two more apis that return "unchecked" offset: `ArchiveBuilder::buffer_to_offset()` and `ArchiveBuilder::any_to_offset()`. These apis are not returning the scaled offset. I think it is better to get rid of these apis and replace their usage with `_u4` version which has the offset range check. 
I noticed there are only 1-2 instances that use these "unchecked" apis. >> >> Thanks for the suggestion. I looked into this and found that buffer_to_offset() and any_to_offset() serve a different purpose than the _u4 versions. The _u4 versions use scaled encoding (with MetadataOffsetShift) and return a compact u4 for metadata pointer storage. The raw versions return unscaled byte offsets stored in larger types. These usages cannot switch to _u4 versions because they need raw byte offsets (not scaled) and store them in 64-bit types. >> >> However, the comments for the methods may be misleading after introducing the _u4 methods. What do you think to revise the comment as: >> >> // The address p points to an object inside the output buffer. When the archive is mapped >> // at the requested address, what's the byte offset of this object from _requested_static_archive_bottom? >> uintx buffer_to_offset(address p) const; >> >> // Same as buffer_to_offset, except that the address p points to either (a) an object >> // inside the output buffer, or (b), an object in the currently mapped static archive. >> uintx any_to_offset(address p) const; >> >> // The reverse of buffer_to_offset_u4() - converts scaled offset units back to buffered address. >> address offset_to_buffered_address(u4 offset_units) const; >> >> >> I am also OK to rename the method names to: `buffer_to_offset_bytes()` and `any_to_offset_bytes()`, if the new names are clearer. >> >> @ashu-mehra What do you think? 
> > Hi @XueleiFan, > > I've tried the suggested code with an archive size more than 4Gb, but it fails with an assertion: > > # Internal Error (aotMetaspace.cpp:1955), pid=96332, tid=4099 > # guarantee(archive_space_size < max_encoding_range_size - class_space_alignment) failed: Archive too large > > CDC archive was created successfully: > > [187.068s][info ][cds ] Shared file region (rw) 0: 822453584 bytes, addr 0x0000000800004000 file offset 0x00004000 crc 0x132b652e > [189.176s][info ][cds ] Shared file region (ro) 1: 3576115584 bytes, addr 0x0000000831060000 file offset 0x31060000 crc 0x71b020a2 > [197.653s][info ][cds ] Shared file region (ac) 4: 0 bytes > [198.870s][info ][cds ] Shared file region (bm) 2: 56555664 bytes, addr 0x0000000000000000 file offset 0x1062d4000 crc 0xbd87f804 > [199.504s][info ][cds ] Shared file region (hp) 3: 16091256 bytes, addr 0x00000000ff000000 file offset 0x1098c4000 crc 0x7834b7c3 > [199.684s][debug ][cds ] bm space: 56555664 [ 1.3% of total] out of 56555664 bytes [100.0% used] > [199.684s][debug ][cds ] hp space: 16091256 [ 0.4% of total] out of 16091256 bytes [100.0% used] at 0x0000000c6d000000 > [199.684s][debug ][cds ] total : 4471216088 [100.0% of total] out of 4471228536 bytes [100.0% used] > @alexeybakhtin Thank you for testing of bigger archives (>4GB). > > I was wondering if it is OK to support 4GB+ archive when UseCompactObjectHeaders is false. The following prototype works. However, we prefer UseCompactObjectHeaders in practice, and the biggest archive size (5.6M objects) is about 2.1G at this moment. Could we have 4GB archive size limit as an open issue, and address it separately if needed? > My test passes with the patch provided and default UseCompactObjectHeaders (false) However, it crashes with UseCompactObjectHeaders=true size_t CompressedClassSpaceSize=18446744073592111104 is outside the allowed range [ 1048576 ... 
4294967296 ] # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (jvmFlagAccess.cpp:117), pid=35685, tid=5891 # fatal error: FLAG_SET_ERGO cannot be used to set an invalid value for CompressedClassSpaceSize Thrown from aotMetaspace.cpp:1963 (FLAG_SET_ERGO(CompressedClassSpaceSize, class_space_size);) In my case, max_encoding_range_size=4294967296, archive_space_size=4400103424, and gap_size=12304384. So, it causes a miscalculation of class_space_size ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3843772344 From xuelei at openjdk.org Tue Feb 3 22:09:22 2026 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Tue, 3 Feb 2026 22:09:22 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes In-Reply-To: References: <0T2Eu5ZqTNlBV3T3wsj11szuDgtVw7yxASYDRbAl5_0=.1d49b3e9-18fe-4763-b0cb-1d15e97f7272@github.com> Message-ID: On Tue, 3 Feb 2026 21:28:30 GMT, Alexey Bakhtin wrote: > However, it crashes with UseCompactObjectHeaders=true That's the expected behavior for the patch, as UseCompactObjectHeaders is limited to 4GB. I am exploring @iklam and @tstuefe's proposal so that archive size could be extended to 32GB with UseCompactObjectHeaders, and will update if I can go through the prototype. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3843927162 From dlong at openjdk.org Tue Feb 3 23:20:07 2026 From: dlong at openjdk.org (Dean Long) Date: Tue, 3 Feb 2026 23:20:07 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v6] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 09:01:19 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). 
As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. > > Harshit470250 has updated the pull request incrementally with four additional commits since the last revision: > > - remove import of barrierSetC2 from runtime > - make shenandoah types private > - move _clone_type_Type to BarrierSetC2 > - remove whitespace src/hotspot/share/opto/type.cpp line 737: > 735: > 736: #if INCLUDE_SHENANDOAHGC > 737: ShenandoahBarrierSetC2::make_write_barrier_pre_Type(); How about replacing these 3 calls with a single call, something like ShenandoahBarrierSetC2::init()? Can these functions then be made private? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27279#discussion_r2761458562 From ghan at openjdk.org Wed Feb 4 01:14:05 2026 From: ghan at openjdk.org (Guanqiang Han) Date: Wed, 4 Feb 2026 01:14:05 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v3] In-Reply-To: References: Message-ID: > Please review this change. Thanks! > > **Description:** > > VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). > > **Fix:** > > Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. 
>
> **Test:**
>
> GHA

Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision:

  Fix improper use of FLAG_SET_DEFAULT() under !FLAG_IS_DEFAULT(UseAESCTRIntrinsics)

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/29338/files
  - new: https://git.openjdk.org/jdk/pull/29338/files/7eb2b386..44189900

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=29338&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29338&range=01-02

  Stats: 8 lines in 1 file changed: 4 ins; 4 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/29338.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29338/head:pull/29338

PR: https://git.openjdk.org/jdk/pull/29338

From ghan at openjdk.org Wed Feb 4 01:17:32 2026
From: ghan at openjdk.org (Guanqiang Han)
Date: Wed, 4 Feb 2026 01:17:32 GMT
Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel()
In-Reply-To:
References:
Message-ID:

On Wed, 28 Jan 2026 11:06:09 GMT, Guanqiang Han wrote:

>> Please review this change. Thanks!
>>
>> **Description:**
>>
>> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel().
>>
>> **Fix:**
>>
>> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states.
>>
>> **Test:**
>>
>> GHA
>
> Hi @vnkozlov and @ascarpino, sorry for the ping. Could you please take a look at this PR when you have a moment?
> Hi @hgqxjj > > The fix is fine but not complete. There are more issues down in code. The main is that `FLAG_SET_DEFAULT()` is called under `!FLAG_IS_DEFAULT(UseAESCTRIntrinsics)) ` check which should be used only for `warning()` message. > > See lines 1162 and 1180-1192. Hi @vnkozlov, thank you for pointing this out. I've updated the PR, please take another look. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/29338#issuecomment-3844683167 From kvn at openjdk.org Wed Feb 4 03:25:24 2026 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Feb 2026 03:25:24 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v3] In-Reply-To: References: Message-ID: On Wed, 4 Feb 2026 01:14:05 GMT, Guanqiang Han wrote: >> Please review this change. Thanks! >> >> **Description:** >> >> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). >> >> **Fix:** >> >> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. >> >> **Test:** >> >> GHA > > Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: > > Fix improper use of FLAG_SET_DEFAULT() under !FLAG_IS_DEFAULT(UseAESCTRIntrinsics) Looks good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/29338#pullrequestreview-3748734975 From duke at openjdk.org Wed Feb 4 03:48:36 2026 From: duke at openjdk.org (Shawn M Emery) Date: Wed, 4 Feb 2026 03:48:36 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v3] In-Reply-To: References: Message-ID: On Wed, 4 Feb 2026 01:14:05 GMT, Guanqiang Han wrote: >> Please review this change. Thanks! >> >> **Description:** >> >> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). >> >> **Fix:** >> >> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. >> >> **Test:** >> >> GHA > > Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: > > Fix improper use of FLAG_SET_DEFAULT() under !FLAG_IS_DEFAULT(UseAESCTRIntrinsics) Updates look good. I also searched for the same logic error in the other architectures and reached the same conclusion as you did, that this was the only affected one. ------------- Marked as reviewed by smemery at github.com (no known OpenJDK username). 
PR Review: https://git.openjdk.org/jdk/pull/29338#pullrequestreview-3748775264 From duke at openjdk.org Wed Feb 4 05:23:26 2026 From: duke at openjdk.org (duke) Date: Wed, 4 Feb 2026 05:23:26 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v3] In-Reply-To: References: Message-ID: On Wed, 4 Feb 2026 01:14:05 GMT, Guanqiang Han wrote: >> Please review this change. Thanks! >> >> **Description:** >> >> VM crashes during startup on x86 when running with -XX:+UseAESCTRIntrinsics -XX:-UseAES. In this configuration, UseAESCTRIntrinsics may remain enabled while UseAES is explicitly disabled, and the VM generates AES-CTR stubs, hitting an assert(UseAES) in generate_counterMode_AESCrypt_Parallel(). >> >> **Fix:** >> >> Update x86 flag initialization to enforce the dependency between UseAESCTRIntrinsics and UseAES. When UseAES is disabled, explicitly disable UseAESCTRIntrinsics (with a warning when it was set on the command line), aligning behavior with the existing UseAES/UseAESIntrinsics gating and avoiding stub generation with inconsistent flag states. >> >> **Test:** >> >> GHA > > Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: > > Fix improper use of FLAG_SET_DEFAULT() under !FLAG_IS_DEFAULT(UseAESCTRIntrinsics) @hgqxjj Your change (at version 44189900c0e8921ea2504d1272ac7b180573539b) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/29338#issuecomment-3845394909 From ghan at openjdk.org Wed Feb 4 05:23:27 2026 From: ghan at openjdk.org (Guanqiang Han) Date: Wed, 4 Feb 2026 05:23:27 GMT Subject: RFR: 8374516: -version asserts with "-XX:+UseAESCTRIntrinsics -XX:-UseAES": "need AES instructions and misaligned SSE support" in generate_counterMode_AESCrypt_Parallel() [v3] In-Reply-To: References: Message-ID: On Wed, 4 Feb 2026 03:22:27 GMT, Vladimir Kozlov wrote: >> Guanqiang Han has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix improper use of FLAG_SET_DEFAULT() under !FLAG_IS_DEFAULT(UseAESCTRIntrinsics) > > Looks good. @vnkozlov @smemery Thank you for the reviews. I've integrated this PR, could you please sponsor it? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/29338#issuecomment-3845405051 From jbhateja at openjdk.org Wed Feb 4 06:59:14 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Feb 2026 06:59:14 GMT Subject: RFR: 8376187: [VectorAPI] Define new lane type constants and pass them to intrinsic entries [v7] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 03:31:52 GMT, Jatin Bhateja wrote: >> As per [discussions ](https://github.com/openjdk/jdk/pull/28002#issuecomment-3789507594) on JDK-8370691 pull request, splitting out portion of PR#28002 into a separate patch in preparation of Float16 vector API support. >> >> Patch add new lane type constants and pass them to vector intrinsic entry points. >> >> All existing Vector API jtreg test are passing with the patch. >> >> Kindly review and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolution Hi @merykitty , looking for your review approval to check-in this ------------- PR Comment: https://git.openjdk.org/jdk/pull/29481#issuecomment-3845705837 From jbhateja at openjdk.org Wed Feb 4 07:01:44 2026 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Feb 2026 07:01:44 GMT Subject: RFR: 8376794: Enable copy and mismatch Partial Inlining for AMD AVX512 targets Message-ID: Partial in-lining handles copy and mismatch for small array sizes less than -XX:ArrayOperationPartialInlineSize bytes through JIT code rather than calling optimized stubs thereby saving costly call overhead. Enabling partial in-lining optimization for AMD EPYC servers supporting AVX-512 feature. Following are the performance numbers on Turin at fixed frequency of 2.1GHz image image Kindly review and share your feedback. Best Regards, Jatin ------------- Commit messages: - Extending micro-benchmark for short array mismatch - 8376794: Enable copy and mismatch Partial Inlining for AMD AVX512 targets Changes: https://git.openjdk.org/jdk/pull/29519/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29519&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8376794 Stats: 75 lines in 2 files changed: 47 ins; 5 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/29519.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29519/head:pull/29519 PR: https://git.openjdk.org/jdk/pull/29519 From xuelei at openjdk.org Wed Feb 4 07:42:31 2026 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Wed, 4 Feb 2026 07:42:31 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v4] In-Reply-To: References: Message-ID: On Tue, 3 Feb 2026 17:06:56 GMT, Xue-Lei Andrew Fan wrote: >> **Summary** >> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding. 
>>
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>>
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>>
>>
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 × 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>>
>> Current: address = base + offset_bytes (max ~2GB)
>> Proposed: address = base + (offset_units << 3) (max 32GB)
>>
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero: we're wasting them!
>>
>> Current byte offset (aligned to 8 bytes):
>> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                   ^^^ Always 000!
>>
>> Scaled offset (shift=3):
>> 0x00000200 = Same address, but stored in 29 bits instead of 32
>> Frees up 3 bits → 8x larger range!
>>
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>>
>> **Test**
>> All tier1 and tier2 tests passed. No visible performance impact. Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>> >> Archive: >> - 300000 simple classes >> - 2000 mega-classes >> - 5000 FieldObject classes >> - Total: 307000 classes >> >> AOT Cache: >> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms >> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms >> >> Static CDS: >> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms >> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1... > > Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision: > > keep cds version I [prototyped](https://github.com/openjdk/jdk/pull/29556) the idea to support large CDS archives with UseCompactObjectHeaders. The test looks positive to me (tier1, tier2, LargeArchive for 3GB and 10GB, no visible performance impact). The [prototype](https://github.com/openjdk/jdk/pull/29556) is based on this pull request, and you may take a look at [this commit](https://github.com/openjdk/jdk/pull/29556/changes/e7a12c372480f405d2a08a75bdabac91c7328346) only. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3845839646 From ayang at openjdk.org Wed Feb 4 07:49:12 2026 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Feb 2026 07:49:12 GMT Subject: RFR: 8377141: G1: Remove unused local declaration in G1BarrierSetC2 Message-ID: Trivial removing dead code. 
Test: tier1 ------------- Commit messages: - g1-trivial-remove-local-var Changes: https://git.openjdk.org/jdk/pull/29561/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29561&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8377141 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/29561.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29561/head:pull/29561 PR: https://git.openjdk.org/jdk/pull/29561 From xgong at openjdk.org Wed Feb 4 07:56:08 2026 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 4 Feb 2026 07:56:08 GMT Subject: RFR: 8372136: VectorAPI: Refactor subword gather load API java implementation [v2] In-Reply-To: References: Message-ID: > The current subword (`byte`/`short`) gather load API implementation is not well-suited for platforms that provide native vector instructions for these operations. As **discussed in PR [1]**, we'd like to re-implement these APIs with a **unified cross-platform** solution. > > The main idea is to re-implement the API at Java-level, by performing multiple sub-gather operations. Each sub-gather operation loads a portion of elements using a specific index vector by calling the HotSpot intrinsic API. The partial results are then merged using vector `slice` and `or` operations. This design simplifies the VM compiler intrinsic implementation and better aligns with the Vector API design principles. > > Key changes: > 1. Re-implement the subword gather load API at the Java level. The HotSpot intrinsic `VectorSupport.loadWithMap` is simplified by reducing the vector index parameters from four (vix1-vix4) to a single parameter. > 2. Adjust the compiler intrinsic implementation to support the new Java API, including updates to the x86 backend implementation. > > The performance impact varies across different scenarios on X86. I tested the performance with different AVX levels on an X86 machine that supports AVX512. 
To achieve optimal performance, I also **applied PR [2]**, which improves the performance of the **`slice()`** API on X86. The following summarizes the performance gains, where: > > - "non masked" means the gather operation is not the masked gather API. > - "masked" means the gather operation is the masked gather API. > - "1 gather cases" means the gather API is implemented with a single gather operation. E.g. Load `Short128Vector` with `MaxVectorSize=256`. > - "2 gather cases" means the gather API is implemented with 2 parts of gather operations. E.g. Load `Short256Vector` with `MaxVectorSize=256`. > - "4 gather cases" means the gather API is implemented with 4 parts of gather operations. E.g. Load `Byte256Vector` with `MaxVectorSize=256`. > - "Un-intrinsified" means the gather operation is not supported to be intrinsified by hotspot. E.g. Load `Byte512Vector` with `MaxVectorSize=256`. The significant performance uplifts come from the Java-level changes, which remove the vector index generation and range checks for such cases. > > > ---------------------------------------------------------------------------- > | UseAVX=3 | UseAVX=2 | > |-----------------------------|-----------------------------| > | non maske... Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Merge 'jdk:master' into JDK-8372136 - 8372136: VectorAPI: Refactor subword gather load API java implementation ------------- Changes: https://git.openjdk.org/jdk/pull/28520/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28520&range=01 Stats: 558 lines in 13 files changed: 383 ins; 78 del; 97 mod Patch: https://git.openjdk.org/jdk/pull/28520.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28520/head:pull/28520 PR: https://git.openjdk.org/jdk/pull/28520 From iklam at openjdk.org Wed Feb 4 08:02:12 2026 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 4 Feb 2026 08:02:12 GMT Subject: RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v4] In-Reply-To: References: Message-ID: <9Hax24ruGv2c1MOfSD_DV7pWuzf1Gd1cSMj8wOucw2I=.b52fc34a-70d6-4f50-8705-ef61b59dc9a3@github.com> On Wed, 4 Feb 2026 07:39:24 GMT, Xue-Lei Andrew Fan wrote: > I [prototyped](https://github.com/openjdk/jdk/pull/29556) the idea to support large CDS archives with UseCompactObjectHeaders. The test looks positive to me (tier1, tier2, LargeArchive for 3GB and 10GB, no visible performance impact). The [prototype](https://github.com/openjdk/jdk/pull/29556) is based on this pull request, and you may look at [this commit](https://github.com/openjdk/jdk/pull/29556/changes/e7a12c372480f405d2a08a75bdabac91c7328346) only. I think we should keep this PR simple -- only introduce the shift encoding and limit max archive size to 3.5GB. The next step, whether to change UseCompactObjectHeaders to allow a larger range, should be done in a follow-up, as there are many more interested parties that may have different opinions. 
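For readers following the thread, the shift encoding referred to here can be sketched roughly as below. This is only an illustration of the idea described in the PR text (8-byte units stored in a 32-bit field), not the code in the pull request; the names `kOffsetShift`, `encode_offset` and `decode_offset` are hypothetical, and the 3.5GB cap mentioned above is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not the identifiers used in the PR.
constexpr unsigned kOffsetShift = 3;  // one stored unit = 8 bytes
constexpr std::uint64_t kAlignment = std::uint64_t{1} << kOffsetShift;

// Encode an 8-byte-aligned byte offset as a 32-bit count of 8-byte units.
inline std::uint32_t encode_offset(std::uint64_t offset_bytes) {
  // The low 3 bits must be zero, otherwise the shift would lose information.
  assert((offset_bytes & (kAlignment - 1)) == 0);
  std::uint64_t units = offset_bytes >> kOffsetShift;
  // The unit count has to fit in the 32-bit field.
  assert(units <= UINT32_MAX);
  return static_cast<std::uint32_t>(units);
}

// Decode a stored unit count back into a byte offset.
inline std::uint64_t decode_offset(std::uint32_t units) {
  return static_cast<std::uint64_t>(units) << kOffsetShift;
}
```

With this sketch, the byte offset 0x1000 from the PR description is stored as 0x200, and the largest representable byte offset is just under 32GB.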
------------- PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3845908335 From tschatzl at openjdk.org Wed Feb 4 08:11:02 2026 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 4 Feb 2026 08:11:02 GMT Subject: RFR: 8377141: G1: Remove unused local declaration in G1BarrierSetC2 In-Reply-To: References: Message-ID: On Wed, 4 Feb 2026 07:37:21 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. > > Test: tier1 Trivial, but copyright update is missing. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/29561#pullrequestreview-3749567979 From ayang at openjdk.org Wed Feb 4 08:15:57 2026 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Feb 2026 08:15:57 GMT Subject: RFR: 8377141: G1: Remove unused local declaration in G1BarrierSetC2 [v2] In-Reply-To: References: Message-ID: <_VXEdaxQL-SDtgfYgtFIDgja7IEMrnUo5ilJlNrxl6s=.22b89b4e-5a7a-4a2e-9915-cdcd55bf0bc9@github.com> > Trivial removing dead code. > > Test: tier1 Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29561/files - new: https://git.openjdk.org/jdk/pull/29561/files/81319b57..0b074d3e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29561&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29561&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/29561.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29561/head:pull/29561 PR: https://git.openjdk.org/jdk/pull/29561 From shade at openjdk.org Wed Feb 4 08:32:16 2026 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Feb 2026 08:32:16 GMT Subject: RFR: 8377141: G1: Remove unused local declaration in G1BarrierSetC2 [v2] In-Reply-To: <_VXEdaxQL-SDtgfYgtFIDgja7IEMrnUo5ilJlNrxl6s=.22b89b4e-5a7a-4a2e-9915-cdcd55bf0bc9@github.com> References: 
<_VXEdaxQL-SDtgfYgtFIDgja7IEMrnUo5ilJlNrxl6s=.22b89b4e-5a7a-4a2e-9915-cdcd55bf0bc9@github.com> Message-ID: On Wed, 4 Feb 2026 08:15:57 GMT, Albert Mingkun Yang wrote: >> Trivial removing dead code. >> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > copyright Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/29561#pullrequestreview-3749658092 From alanb at openjdk.org Wed Feb 4 08:40:12 2026 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 4 Feb 2026 08:40:12 GMT Subject: RFR: 8376568: Change Thread::getStackTrace to use handshake op for all cases [v4] In-Reply-To: References: Message-ID: > JDK-8364343 upgraded the virtual thread transition management to be independent of JVMTI. We can update java_lang_Thread::async_get_stack_trace to use it and remove the suspend + retry code from Thread.getStackTrace. > > A summary of the changes: > > - java_lang_Thread::async_get_stack_trace is changed to use the new handshake op so it can be called to get the stack trace of a started thread in any state > - Thread::getStackTrace is changed to use async_get_stack_trace for all cases > - The SUSPENDED substate in VirtualThread is removed > - JVM_CreateThreadSnapshot is changed to be usable when JVMTI is not compiled in > - ThreadSnapshotFactory::get_thread_snapshot is changed to not upcall to StackTraceElement to complete the init of the stack trace > > The changes mean that Thread::getStackTrace may be slower when sampling a virtual thread in transition. This case should be rare, and it isn't really a performance critical op anyway. I prototyped use a spin loop and an increasing wait time in MountUnmountDisabler::disable_transition_for_one to avoid the wait(10) but decided to leave it out for now. Future work may examine this issue as there may be other cases (with JVMTI) that would benefit from avoiding the wait. 
> > A future PR might propose to change Thread.getStackTrace to use ThreadSnapshot and allow java_lang_Thread::async_get_stack_trace to be removed. This requires more extensive changes to ThreadSnapshotFactory to reduce overhead when only the stack trace is required. > > Testing: tier1-5. The changes have already been tested in the loom repo for a few months. Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'master' into JDK-8376568 - Merge branch 'master' into JDK-8376568 - Review feedback - Improve asserts - Cleanup - Merge branch 'master' into Thread.getStackTrace - Initial commit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/29461/files - new: https://git.openjdk.org/jdk/pull/29461/files/1b7c9ad7..9e7b266a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=29461&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=29461&range=02-03 Stats: 42808 lines in 491 files changed: 19796 ins; 14778 del; 8234 mod Patch: https://git.openjdk.org/jdk/pull/29461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29461/head:pull/29461 PR: https://git.openjdk.org/jdk/pull/29461