From dholmes at openjdk.org Mon Sep 1 00:59:46 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 1 Sep 2025 00:59:46 GMT
Subject: RFR: 8366024: Remove unnecessary InstanceKlass::cast() [v2]
In-Reply-To:
References: <7QAphYNlPFcXmHo86DFEuVPnjjOwjoYMsksNtXFnsl0=.aa09352d-087e-4e23-805b-c05e38bb658d@github.com>
Message-ID: <26NJ6xwmV3vbZdHpM320sGwZHiJ8Mn7d3gtJ--Ygz2U=.551378e5-35ff-4162-8984-ec0b5b585984@github.com>

On Wed, 27 Aug 2025 18:11:33 GMT, Ioi Lam wrote:

>> We have a lot of `InstanceKlass::cast(k)` calls where `k` is statically known to be an `InstanceKlass`. I fixed many instances of this pattern:
>>
>>
>> InstanceKlass* x = ....;
>> Klass* s = x->super(); // should call java_super()
>> InstanceKlass::cast(s)->xyz();
>>
>>
>> The `super()` method has a very confusing API. It has the return type of `Klass*` because for an `ObjArrayKlass` like `[Ljava/lang/String;`:
>>
>> - `super()` returns `[Ljava/lang/Object;`
>> - `java_super()` returns `Ljava/lang/Object;`
>>
>> However, for `[Ljava/lang/Object;`, all `TypeArrayKlasses` and all `InstanceKlasses`, `super()` and `java_super()` return an identical value that always has the actual type `InstanceKlass*`.
>>
>> See here about the difference between `super()` and `java_super()`: https://github.com/openjdk/jdk/blob/7b9969dc8f20989497ff617abb45543d182b684d/src/hotspot/share/oops/klass.hpp#L218-L221
>>
>> Unfortunately, we have a lot of code that incorrectly uses `super()` instead of `java_super()`, which leads to `InstanceKlass::cast()` calls. I tried to fix a bunch of easy ones in this PR, although there are a few more to go.
>>
>> I also fixed some calls to `local_interfaces()->at()` that widen the return type from `InstanceKlass*` to `Klass*`, which may lead to unnecessary `InstanceKlass::cast()` calls.
>>
>> I also removed the `Klass::superklass()` API. This was used only in a few places and all of them can be safely replaced with `Klass::java_super()`.
>>
>> To avoid confusion, I think we should rename `super()` to something more obvious, but let's do that in a future PR.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>
>   @adinn comment - remove InstanceKlass::cast() in edgeUtils.cpp

This seems fine, though I will also comment that `java_super` seems completely misnamed in relation to `super`, as there is nothing more `Java` about it. The implementation of `java_super` for arrays is quite odd - I'm not sure when we don't care about the actual superclass and want to go straight to object.

Thanks

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/26908#pullrequestreview-3171880405

From iklam at openjdk.org Mon Sep 1 04:06:49 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Mon, 1 Sep 2025 04:06:49 GMT
Subject: RFR: 8366024: Remove unnecessary InstanceKlass::cast()
In-Reply-To:
References: <7QAphYNlPFcXmHo86DFEuVPnjjOwjoYMsksNtXFnsl0=.aa09352d-087e-4e23-805b-c05e38bb658d@github.com>
Message-ID:

On Wed, 27 Aug 2025 09:17:31 GMT, Andrew Dinn wrote:

>> We have a lot of `InstanceKlass::cast(k)` calls where `k` is statically known to be an `InstanceKlass`. I fixed many instances of this pattern:
>>
>>
>> InstanceKlass* x = ....;
>> Klass* s = x->super(); // should call java_super()
>> InstanceKlass::cast(s)->xyz();
>>
>>
>> The `super()` method has a very confusing API. It has the return type of `Klass*` because for an `ObjArrayKlass` like `[Ljava/lang/String;`:
>>
>> - `super()` returns `[Ljava/lang/Object;`
>> - `java_super()` returns `Ljava/lang/Object;`
>>
>> However, for `[Ljava/lang/Object;`, all `TypeArrayKlasses` and all `InstanceKlasses`, `super()` and `java_super()` return an identical value that always has the actual type `InstanceKlass*`.
>>
>> See here about the difference between `super()` and `java_super()`: https://github.com/openjdk/jdk/blob/7b9969dc8f20989497ff617abb45543d182b684d/src/hotspot/share/oops/klass.hpp#L218-L221
>>
>> Unfortunately, we have a lot of code that incorrectly uses `super()` instead of `java_super()`, which leads to `InstanceKlass::cast()` calls. I tried to fix a bunch of easy ones in this PR, although there are a few more to go.
>>
>> I also fixed some calls to `local_interfaces()->at()` that widen the return type from `InstanceKlass*` to `Klass*`, which may lead to unnecessary `InstanceKlass::cast()` calls.
>>
>> I also removed the `Klass::superklass()` API. This was used only in a few places and all of them can be safely replaced with `Klass::java_super()`.
>>
>> To avoid confusion, I think we should rename `super()` to something more obvious, but let's do that in a future PR.
>
> I found two more occurrences of casting super() to InstanceKlass:
>
> src/hotspot/share/jfr/leakprofiler/chains/edgeUtils.cpp:79
>
>   ik = (const InstanceKlass*)ik->super();
>
> src/hotspot/share/prims/jni.cpp:209
>
>   Klass* field_klass = k;
>   Klass* super_klass = field_klass->super();
>   // With compressed oops the most super class with nonstatic fields would
>   // be the owner of fields embedded in the header.
>   while (InstanceKlass::cast(super_klass)->has_nonstatic_fields() &&
>          InstanceKlass::cast(super_klass)->contains_field_offset(offset)) {
>     field_klass = super_klass; // super contains the field also
>     super_klass = field_klass->super();
>   }
>
> The first one ought perhaps to be using InstanceKlass::superklass()?
Thanks @adinn @dholmes-ora @coleenp for the review

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26908#issuecomment-3240791556

From iklam at openjdk.org Mon Sep 1 04:06:49 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Mon, 1 Sep 2025 04:06:49 GMT
Subject: RFR: 8366024: Remove unnecessary InstanceKlass::cast() [v2]
In-Reply-To: <26NJ6xwmV3vbZdHpM320sGwZHiJ8Mn7d3gtJ--Ygz2U=.551378e5-35ff-4162-8984-ec0b5b585984@github.com>
References: <7QAphYNlPFcXmHo86DFEuVPnjjOwjoYMsksNtXFnsl0=.aa09352d-087e-4e23-805b-c05e38bb658d@github.com> <26NJ6xwmV3vbZdHpM320sGwZHiJ8Mn7d3gtJ--Ygz2U=.551378e5-35ff-4162-8984-ec0b5b585984@github.com>
Message-ID: <9PsU6uvER6LcD__iFv6n23heIOuIAD3kxgkhAq-rA8A=.398b2031-0c5b-4444-b1e1-8dd32bcac4bc@github.com>

On Mon, 1 Sep 2025 00:56:43 GMT, David Holmes wrote:

> This seems fine though I will also comment that `java_super` seems completely mis-named in relation to `super` as there is nothing more `Java` about it. The implementation of `java_super` for arrays is quite odd - I'm not sure when we don't care about the actual superclass and want to go straight to object.

As we discussed offline, I will try to add a new `InstanceKlass* InstanceKlass::super()` method. Then in most cases people can just call `ik->super()` and it will do "the right thing".
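
To make the `super()` / `java_super()` distinction from this thread concrete, here is a minimal, self-contained C++ sketch. It is a toy model and not HotSpot code: the class shapes, the constructors and the `_object_klass` field are assumptions made purely for illustration, and the narrower `InstanceKlass::super()` shown here is only one possible reading of the method proposed above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model only: the names mirror HotSpot, the implementation does not.
class InstanceKlass;

class Klass {
 public:
  Klass(std::string name, Klass* super)
      : _name(std::move(name)), _super(super) {}
  virtual ~Klass() = default;

  const std::string& name() const { return _name; }

  // VM-level supertype: for an object array this may be another array klass.
  Klass* super() const { return _super; }

  // Java-level superclass: always an instance klass (or nullptr for Object).
  virtual InstanceKlass* java_super() const = 0;

 private:
  std::string _name;
  Klass* _super;
};

class InstanceKlass : public Klass {
 public:
  InstanceKlass(std::string name, InstanceKlass* super)
      : Klass(std::move(name), super) {}

  // Hypothetical narrower super(): hides Klass::super(). The cast is safe
  // because the constructor only ever accepts an InstanceKlass* as super.
  InstanceKlass* super() const {
    return static_cast<InstanceKlass*>(Klass::super());
  }

  InstanceKlass* java_super() const override { return super(); }
};

class ObjArrayKlass : public Klass {
 public:
  // object_klass stands in for java/lang/Object (an assumption of this toy).
  ObjArrayKlass(std::string name, Klass* super, InstanceKlass* object_klass)
      : Klass(std::move(name), super), _object_klass(object_klass) {
    assert(object_klass != nullptr);
  }

  // Arrays go straight to Object, regardless of what super() returns.
  InstanceKlass* java_super() const override { return _object_klass; }

 private:
  InstanceKlass* _object_klass;
};
```

With `Object`, `String`, `Object[]` and `String[]` modelled this way, calling `super()` on the `String[]` klass yields the `Object[]` klass while `java_super()` yields `Object`, and calling `super()` on an `InstanceKlass` returns an `InstanceKlass*` directly, so no `InstanceKlass::cast()` is needed at the call site.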
-------------

PR Comment: https://git.openjdk.org/jdk/pull/26908#issuecomment-3240793456

From iklam at openjdk.org Mon Sep 1 04:06:50 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Mon, 1 Sep 2025 04:06:50 GMT
Subject: Integrated: 8366024: Remove unnecessary InstanceKlass::cast()
In-Reply-To: <7QAphYNlPFcXmHo86DFEuVPnjjOwjoYMsksNtXFnsl0=.aa09352d-087e-4e23-805b-c05e38bb658d@github.com>
References: <7QAphYNlPFcXmHo86DFEuVPnjjOwjoYMsksNtXFnsl0=.aa09352d-087e-4e23-805b-c05e38bb658d@github.com>
Message-ID: <-_Xcpc5liL09SoygjhPedFMbo5EPoBsVlxm4LhEX1Co=.d6515314-7994-4867-848b-5eb43dab2d72@github.com>

On Fri, 22 Aug 2025 23:45:41 GMT, Ioi Lam wrote:

> We have a lot of `InstanceKlass::cast(k)` calls where `k` is statically known to be an `InstanceKlass`. I fixed many instances of this pattern:
>
>
> InstanceKlass* x = ....;
> Klass* s = x->super(); // should call java_super()
> InstanceKlass::cast(s)->xyz();
>
>
> The `super()` method has a very confusing API. It has the return type of `Klass*` because for an `ObjArrayKlass` like `[Ljava/lang/String;`:
>
> - `super()` returns `[Ljava/lang/Object;`
> - `java_super()` returns `Ljava/lang/Object;`
>
> However, for `[Ljava/lang/Object;`, all `TypeArrayKlasses` and all `InstanceKlasses`, `super()` and `java_super()` return an identical value that always has the actual type `InstanceKlass*`.
>
> See here about the difference between `super()` and `java_super()`: https://github.com/openjdk/jdk/blob/7b9969dc8f20989497ff617abb45543d182b684d/src/hotspot/share/oops/klass.hpp#L218-L221
>
> Unfortunately, we have a lot of code that incorrectly uses `super()` instead of `java_super()`, which leads to `InstanceKlass::cast()` calls. I tried to fix a bunch of easy ones in this PR, although there are a few more to go.
>
> I also fixed some calls to `local_interfaces()->at()` that widen the return type from `InstanceKlass*` to `Klass*`, which may lead to unnecessary `InstanceKlass::cast()` calls.
>
> I also removed the `Klass::superklass()` API. This was used only in a few places and all of them can be safely replaced with `Klass::java_super()`.
>
> To avoid confusion, I think we should rename `super()` to something more obvious, but let's do that in a future PR.

This pull request has now been integrated.

Changeset: 2427c901
Author: Ioi Lam
URL: https://git.openjdk.org/jdk/commit/2427c901b31dbdccc6f8f39404875a0140460479
Stats: 99 lines in 16 files changed: 0 ins; 16 del; 83 mod

8366024: Remove unnecessary InstanceKlass::cast()

Reviewed-by: coleenp, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/26908

From tschatzl at openjdk.org Mon Sep 1 13:24:37 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 1 Sep 2025 13:24:37 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v51]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier.
>
> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25.
>
> ### Current situation
>
> With this change, G1 will reduce the post write barrier to much more closely resemble Parallel GC's as described in the JEP. The reason is that G1 lacks throughput compared to Parallel/Serial GC due to its larger barrier.
>
> The main reason for the current barrier is how G1 implements concurrent refinement:
> * G1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations.
> * For correctness, dirty card updates require fine-grained synchronization between mutator and refinement threads.
> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
>
> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code:
>
>
> // Filtering
> if (region(@x.a) == region(y)) goto done; // same region check
> if (y == null) goto done; // null value check
> if (card(@x.a) == young_card) goto done; // write to young gen check
> StoreLoad; // synchronize
> if (card(@x.a) == dirty_card) goto done;
>
> *card(@x.a) = dirty
>
> // Card tracking
> enqueue(card-address(@x.a)) into thread-local-dcq;
> if (thread-local-dcq is not full) goto done;
>
> call runtime to move thread-local-dcq into dcqs
>
> done:
>
>
> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc.
>
> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining.
>
> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links).
>
> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse-grained synchronization based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se...

Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 68 commits:

 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - * fix merge error
 - * forgot to actually save the files
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - * remove unused G1DetachedRefinementStats_lock
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - ... and 58 more: https://git.openjdk.org/jdk/compare/98af1892...4a41b40b

-------------

Changes: https://git.openjdk.org/jdk/pull/23739/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=50
Stats: 7100 lines in 112 files changed: 2584 ins; 3578 del; 938 mod
Patch: https://git.openjdk.org/jdk/pull/23739.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From tschatzl at openjdk.org Mon Sep 1 14:24:34 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 1 Sep 2025 14:24:34 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v52]
In-Reply-To:
References:
Message-ID: <8ppqEuxHuhM5tXXssVXaq-uQlixvWerkx3UxT-XYTmg=.b6a5ee73-33f0-4f58-b14d-808895da6347@github.com>

> Hi all,
>
> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier.
>
> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25.
>
> ### Current situation
>
> With this change, G1 will reduce the post write barrier to much more closely resemble Parallel GC's as described in the JEP. The reason is that G1 lacks throughput compared to Parallel/Serial GC due to its larger barrier.
>
> The main reason for the current barrier is how G1 implements concurrent refinement:
> * G1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations.
> * For correctness, dirty card updates require fine-grained synchronization between mutator and refinement threads.
> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
>
> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code:
>
>
> // Filtering
> if (region(@x.a) == region(y)) goto done; // same region check
> if (y == null) goto done; // null value check
> if (card(@x.a) == young_card) goto done; // write to young gen check
> StoreLoad; // synchronize
> if (card(@x.a) == dirty_card) goto done;
>
> *card(@x.a) = dirty
>
> // Card tracking
> enqueue(card-address(@x.a)) into thread-local-dcq;
> if (thread-local-dcq is not full) goto done;
>
> call runtime to move thread-local-dcq into dcqs
>
> done:
>
>
> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc.
>
> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining.
>
> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links).
>
> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse-grained synchronization based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se...
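
For readers unfamiliar with card marking, the "three or four instruction" style of post-write barrier that the description compares against can be sketched roughly as follows. This is a hypothetical toy in C++, not the code in the PR: the `CardTable` class, the 512-byte card size and the clean/dirty byte values are assumptions chosen for illustration, and addresses are modelled as plain byte offsets into a simulated heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constants (assumed, not taken from the PR).
constexpr std::size_t kCardShift = 9;      // 2^9 = 512-byte cards
constexpr std::uint8_t kCleanCard = 0xff;
constexpr std::uint8_t kDirtyCard = 0x00;

// Toy card table: one byte per card of a simulated heap.
class CardTable {
 public:
  explicit CardTable(std::size_t heap_bytes)
      : _cards((heap_bytes >> kCardShift) + 1, kCleanCard) {}

  // Unconditional card mark for a store into the field at field_offset:
  // shift the address, index the table, store one byte. No filters,
  // no memory fence, no queue.
  void post_write_barrier(std::size_t field_offset) {
    const std::size_t index = field_offset >> kCardShift;
    assert(index < _cards.size() && "offset outside the simulated heap");
    _cards[index] = kDirtyCard;
  }

  // True if the card covering field_offset has been marked dirty.
  bool is_dirty(std::size_t field_offset) const {
    return _cards.at(field_offset >> kCardShift) == kDirtyCard;
  }

  // Number of dirty cards a later scan would have to visit.
  std::size_t dirty_count() const {
    std::size_t count = 0;
    for (std::uint8_t card : _cards) {
      if (card == kDirtyCard) ++count;
    }
    return count;
  }

 private:
  std::vector<std::uint8_t> _cards;
};
```

Compared with the pseudo code quoted above, the mutator side here has no same-region, null or young-card filters, no `StoreLoad` and no thread-local queue; two stores that fall into the same 512-byte card simply dirty the same byte twice.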
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * commit merge changes

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/23739/files
 - new: https://git.openjdk.org/jdk/pull/23739/files/4a41b40b..b3873d66

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=51
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=50-51

Stats: 11 lines in 2 files changed: 0 ins; 11 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/23739.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From fbredberg at openjdk.org Tue Sep 2 11:07:00 2025
From: fbredberg at openjdk.org (Fredrik Bredberg)
Date: Tue, 2 Sep 2025 11:07:00 GMT
Subject: RFR: 8365190: Remove LockingMode related code from share
Message-ID: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>

Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs were integrated that removed all `LockingMode`-related code from all platforms (except Zero, which is done in this PR).

This PR removes `LockingMode`-related code from the shared (non-platform-specific) files. It also removes the `LockingMode` variable itself.

Passes tier1-tier5 with no added problems.
-------------

Commit messages:
 - 8365190: Remove LockingMode related code from share

Changes: https://git.openjdk.org/jdk/pull/27041/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8365190
Stats: 1268 lines in 50 files changed: 6 ins; 1129 del; 133 mod
Patch: https://git.openjdk.org/jdk/pull/27041.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27041/head:pull/27041

PR: https://git.openjdk.org/jdk/pull/27041

From ayang at openjdk.org Tue Sep 2 13:40:41 2025
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Tue, 2 Sep 2025 13:40:41 GMT
Subject: RFR: 8365190: Remove LockingMode related code from share
In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
Message-ID:

On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote:

> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs were integrated that removed all `LockingMode`-related code from all platforms (except Zero, which is done in this PR).
>
> This PR removes `LockingMode`-related code from the shared (non-platform-specific) files. It also removes the `LockingMode` variable itself.
>
> Passes tier1-tier5 with no added problems.

Marked as reviewed by ayang (Reviewer).
-------------

PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3176613129

From rcastanedalo at openjdk.org Tue Sep 2 13:48:45 2025
From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano)
Date: Tue, 2 Sep 2025 13:48:45 GMT
Subject: RFR: 8365190: Remove LockingMode related code from share
In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
Message-ID:

On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote:

> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs were integrated that removed all `LockingMode`-related code from all platforms (except Zero, which is done in this PR).
>
> This PR removes `LockingMode`-related code from the shared (non-platform-specific) files. It also removes the `LockingMode` variable itself.
>
> Passes tier1-tier5 with no added problems.

src/hotspot/share/opto/phaseX.cpp line 1672:

> 1670: // Found (linux x64 only?) with:
> 1671: // serviceability/sa/ClhsdbThreadContext.java
> 1672: // -XX:+UnlockExperimentalVMOptions -XX:LockingMode=1 -XX:+IgnoreUnrecognizedVMOptions -XX:VerifyIterativeGVN=1110

For traceability, I suggest leaving this line untouched and adding a comment on the next line clarifying that `-XX:LockingMode` is no longer available.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2316150051

From kdnilsen at openjdk.org Tue Sep 2 18:37:42 2025
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Tue, 2 Sep 2025 18:37:42 GMT
Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch
In-Reply-To:
References:
Message-ID:

On Fri, 29 Aug 2025 00:02:42 GMT, Cesar Soares Lucas wrote:

> Please review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64.
>
> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%.

Thank you for doing this. Looks good to me.

-------------

Marked as reviewed by kdnilsen (Committer).

PR Review: https://git.openjdk.org/jdk/pull/26999#pullrequestreview-3177693651

From dlong at openjdk.org Tue Sep 2 19:27:37 2025
From: dlong at openjdk.org (Dean Long)
Date: Tue, 2 Sep 2025 19:27:37 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic
Message-ID:

At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, where it is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)).
-------------

Commit messages:
 - Merge branch 'openjdk:master' into 8366461-mh-invoke
 - first pass

Changes: https://git.openjdk.org/jdk/pull/27059/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8366461
Stats: 538 lines in 68 files changed: 7 ins; 487 del; 44 mod
Patch: https://git.openjdk.org/jdk/pull/27059.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059

PR: https://git.openjdk.org/jdk/pull/27059

From wkemper at openjdk.org Tue Sep 2 19:37:42 2025
From: wkemper at openjdk.org (William Kemper)
Date: Tue, 2 Sep 2025 19:37:42 GMT
Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch
In-Reply-To:
References:
Message-ID:

On Fri, 29 Aug 2025 00:02:42 GMT, Cesar Soares Lucas wrote:

> Please review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64.
>
> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%.

LGTM

-------------

Marked as reviewed by wkemper (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/26999#pullrequestreview-3177866470

From dlong at openjdk.org Tue Sep 2 19:55:58 2025
From: dlong at openjdk.org (Dean Long)
Date: Tue, 2 Sep 2025 19:55:58 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v2]
In-Reply-To:
References:
Message-ID:

> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, where it is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)).

Dean Long has updated the pull request incrementally with one additional commit since the last revision:

  arm32 build

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/27059/files
 - new: https://git.openjdk.org/jdk/pull/27059/files/4998cacc..303305ae

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=00-01

Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/27059.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059

PR: https://git.openjdk.org/jdk/pull/27059

From coleenp at openjdk.org Tue Sep 2 20:35:48 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 2 Sep 2025 20:35:48 GMT
Subject: RFR: 8365190: Remove LockingMode related code from share
In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com>
Message-ID:

On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote:

> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs were integrated that removed all `LockingMode`-related code from all platforms (except Zero, which is done in this PR).
>
> This PR removes `LockingMode`-related code from the shared (non-platform-specific) files. It also removes the `LockingMode` variable itself.
>
> Passes tier1-tier5 with no added problems.

A few comments and suggestions for your next RFE.
src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 344:

> 342: volatile_nonstatic_field(ObjectMonitor, _entry_list, ObjectWaiter*) \
> 343: volatile_nonstatic_field(ObjectMonitor, _succ, int64_t) \
> 344: volatile_nonstatic_field(ObjectMonitor, _stack_locker, BasicLock*) \

There are some JVMCI tests that check that the Java side of JVMCI matches, i.e.:

  make test TEST=compiler/jvmci

src/hotspot/share/runtime/basicLock.hpp line 51:

> 49: void set_bad_metadata_deopt() { set_metadata(badDispHeaderDeopt); }
> 50:
> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); }

Also delete line 51?

src/hotspot/share/runtime/javaThread.cpp line 2007:

> 2005: #ifdef SUPPORT_MONITOR_COUNT
> 2006: // Nothing to do. Just do some sanity check.
> 2007: assert(_held_monitor_count == 0, "counter should not be used");

In a further cleanup, can we remove `_held_monitor_count` next?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 769:

> 767:
> 768: // LightweightSynchronizer::inflate_locked_or_imse is used to to get an inflated
> 769: // ObjectMonitor* when lightweight locking is used. It is used from contexts

I guess you don't need the phrase "when lightweight locking is used".

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 823:

> 821: ObjectMonitor* LightweightSynchronizer::inflate_into_object_header(oop object, ObjectSynchronizer::InflateCause cause, JavaThread* locking_thread, Thread* current) {
> 822:
> 823: // The JavaThread* locking_thread parameter is only used by lightweight locking and

Same here. Suggestion:

  // The JavaThread* locking parameter requires that the locking_thread == JavaThread::current, or is suspended
  // throughout the call by some other mechanism.
src/hotspot/share/runtime/synchronizer.cpp line 542:

> 540: }
> 541: ObjectMonitor* monitor;
> 542: monitor = LightweightSynchronizer::inflate_locked_or_imse(obj(), inflate_cause_notify, CHECK);

Declare and initialize on the same line:

  ObjectMonitor* monitor = LightweightSynchronizer::inflate_locked_or_imse(obj...);

src/hotspot/share/runtime/synchronizer.cpp line 557:

> 555:
> 556: ObjectMonitor* monitor;
> 557: monitor = LightweightSynchronizer::inflate_locked_or_imse(obj(), inflate_cause_notify, CHECK);

Same here with `ObjectMonitor* monitor = Light ...`

I think we should have another RFE to look at eliminating the middle call. Call these in LightweightSynchronizer::notify, notifyAll and waitInterruptably directly and remove these functions.

src/hotspot/share/runtime/synchronizer.inline.hpp line 48:

> 46: assert(current == Thread::current(), "must be");
> 47:
> 48: LightweightSynchronizer::enter(obj, lock, current);

In the further RFE, we should remove these dispatch functions too.

-------------

PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3177963667
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317054927
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317063086
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317069783
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317072241
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317077253
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317084869
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317088830
PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317095107

From dlong at openjdk.org Tue Sep 2 20:52:32 2025
From: dlong at openjdk.org (Dean Long)
Date: Tue, 2 Sep 2025 20:52:32 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3]
In-Reply-To:
References:
Message-ID: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>

> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, where it is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)).

Dean Long has updated the pull request incrementally with three additional commits since the last revision:

 - revert whitespace change
 - undo debug changes
 - cleanup

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/27059/files
 - new: https://git.openjdk.org/jdk/pull/27059/files/303305ae..eac482a5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=01-02

Stats: 7 lines in 4 files changed: 1 ins; 6 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/27059.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059

PR: https://git.openjdk.org/jdk/pull/27059

From vlivanov at openjdk.org Tue Sep 2 21:11:46 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 2 Sep 2025 21:11:46 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3]
In-Reply-To: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>
References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>
Message-ID:

On Tue, 2 Sep 2025 20:52:32 GMT, Dean Long wrote:

>> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, where it is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)).
>
> Dean Long has updated the pull request incrementally with three additional commits since the last revision:
>
>  - revert whitespace change
>  - undo debug changes
>  - cleanup

Nice cleanup! Looks good.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27059#pullrequestreview-3178139499

From dlong at openjdk.org Tue Sep 2 22:16:44 2025
From: dlong at openjdk.org (Dean Long)
Date: Tue, 2 Sep 2025 22:16:44 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3]
In-Reply-To:
References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>
Message-ID:

On Tue, 2 Sep 2025 21:09:27 GMT, Vladimir Ivanov wrote:

>> Dean Long has updated the pull request incrementally with three additional commits since the last revision:
>>
>>  - revert whitespace change
>>  - undo debug changes
>>  - cleanup
>
> Nice cleanup! Looks good.

Thanks @iwanowww !

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27059#issuecomment-3246957430

From cslucas at openjdk.org Wed Sep 3 00:23:05 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Wed, 3 Sep 2025 00:23:05 GMT
Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v2]
In-Reply-To:
References:
Message-ID:

> Please review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64.
>
> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%.

Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:

  Change shenandoah nmethod entry barrier fence for RISC-V

  (cherry picked from commit 495b07fe690ef7e3fe828fd2be27c4259c739c23)

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/26999/files
 - new: https://git.openjdk.org/jdk/pull/26999/files/62da55fb..3276b586

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26999&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26999&range=00-01

Stats: 9 lines in 4 files changed: 0 ins; 7 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/26999.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/26999/head:pull/26999

PR: https://git.openjdk.org/jdk/pull/26999

From cslucas at openjdk.org Wed Sep 3 00:37:59 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Wed, 3 Sep 2025 00:37:59 GMT
Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3]
In-Reply-To:
References:
Message-ID:

> Please review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64.
>
> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%.

Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:

  Make PPC backend to also use concurrent code patching.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/26999/files - new: https://git.openjdk.org/jdk/pull/26999/files/3276b586..4da31385 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26999&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26999&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26999.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26999/head:pull/26999 PR: https://git.openjdk.org/jdk/pull/26999 From fyang at openjdk.org Wed Sep 3 01:30:44 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 3 Sep 2025 01:30:44 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. I just checked the RISC-V part. LGTM. Thanks. ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26999#pullrequestreview-3178669201 From dzhang at openjdk.org Wed Sep 3 02:15:50 2025 From: dzhang at openjdk.org (Dingli Zhang) Date: Wed, 3 Sep 2025 02:15:50 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. LGTM, thanks! ------------- Marked as reviewed by dzhang (Author). PR Review: https://git.openjdk.org/jdk/pull/26999#pullrequestreview-3178728761 From cslucas at openjdk.org Wed Sep 3 04:29:50 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 3 Sep 2025 04:29:50 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. 
Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. @TheRealMDoerr - @shipilev mentioned in private that you have access to PPC machines; could you please help testing this patch on PPC? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26999#issuecomment-3247640930 From mhaessig at openjdk.org Wed Sep 3 07:17:44 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 3 Sep 2025 07:17:44 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Tue, 2 Sep 2025 20:52:32 GMT, Dean Long wrote: >> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). > > Dean Long has updated the pull request incrementally with three additional commits since the last revision: > > - revert whitespace change > - undo debug changes > - cleanup Thank you for cleaning this up, @dean-long. I just have a drive-by comment. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/aarch64/AARCH64Frame.java line 372: > 370: // DEBUG_ONLY(verifyDeoptriginalPc(senderNm, raw_unextendedSp)); > 371: } > 372: } `Frame.java adjustUnextendedSP()` do not seem to do anything?
Perhaps these could be cleaned up as well? ------------- PR Review: https://git.openjdk.org/jdk/pull/27059#pullrequestreview-3179245014 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2317990499 From aboldtch at openjdk.org Wed Sep 3 09:51:45 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 3 Sep 2025 09:51:45 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <0CMB0g0_Ru1hF5l2NA194kD1ouNwMXrB1667Uvl9mFQ=.b6834c47-be53-41bb-b726-2e282517d6bc@github.com> On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. Nothing obvious seems to be missing from the removal. And the changes look correct. As @coleenp already mentioned there is even more code now that is effectively unused. Mostly to do with legacy + loom interactions. But I think it is fine to remove that in a follow up RFE. Similarly there are some nomenclature that should be updated, but I know you have expressed wanting to do that in a follow up RFE as well. I think it the main refactoring that are left are cleaning up the Synchronizer APIs, unifying some functions etc. 
_As for unifying LightweightSynchronizer with the ObjectSynchronizer, there might be an opportunity to let ObjectSynchronizer define the general API used by the rest of the VM to interact with the locking subsystem. And let LightweightSynchronizer contain all of the implementation. This could including moving the locking specific implementation details of relocking, deopting etc. behind an interface, decoupling them, and avoiding leaking implementation._ src/hotspot/share/runtime/basicLock.hpp line 51: > 49: void set_bad_metadata_deopt() { set_metadata(badDispHeaderDeopt); } > 50: > 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } The `badDispHeaderDeopt` and `badDispHeaderOSR` constants should also be renamed. src/hotspot/share/runtime/synchronizer.cpp line 634: > 632: } > 633: > 634: static intptr_t install_hash_code(Thread* current, oop obj) { A future RFE could potentially simplify and unify this with `FastHashCode` ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3179315618 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2318059361 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2318041555 From lmesnik at openjdk.org Wed Sep 3 15:49:44 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 3 Sep 2025 15:49:44 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. 
After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. svc part looks good. ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3181274100 From dlong at openjdk.org Thu Sep 4 00:31:42 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 4 Sep 2025 00:31:42 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Wed, 3 Sep 2025 07:12:20 GMT, Manuel Hässig wrote: >> Dean Long has updated the pull request incrementally with three additional commits since the last revision: >> >> - revert whitespace change >> - undo debug changes >> - cleanup > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/aarch64/AARCH64Frame.java line 372: > >> 370: // DEBUG_ONLY(verifyDeoptriginalPc(senderNm, raw_unextendedSp)); >> 371: } >> 372: } > > `Frame.java adjustUnextendedSP()` do not seem to do anything? Perhaps these could be cleaned up as well? Yes, it's tempting to want to clean these up, but I noticed that SA code really tries to mirror the C++ code, so I'm inclined to leave it. Is there a Serviceability expert that would like to see this code cleaned up further? @plummercj, what do you think?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2320526360 From dholmes at openjdk.org Thu Sep 4 02:35:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Sep 2025 02:35:49 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. Looks good. Great cleanup! A couple of nits/suggestions. Thanks src/hotspot/share/c1/c1_LIRGenerator.cpp line 638: > 636: LIR_Opr hdr = lock; > 637: lock = new_hdr; > 638: CodeStub* slow_path = new MonitorExitStub(lock, true, monitor_no); It seems all creators for `MonitorExitStub` now pass `true` so that parameter can be removed. src/hotspot/share/runtime/basicLock.hpp line 40: > 38: // Used as a cache of the ObjectMonitor* used when locking. Must either > 39: // be nullptr or the ObjectMonitor* used when locking. > 40: volatile uintptr_t _metadata; So this could now be a properly typed and named field. Future RFE. src/hotspot/share/runtime/basicLock.hpp line 53: > 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } > 52: > 53: // For lightweight locking If the following are for lightweight locking then what are the two previous for? 
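The suggestion above that `_metadata` "could now be a properly typed and named field" can be illustrated with a minimal sketch. This is not HotSpot code: `BasicLockSketch`, `_monitor_cache`, and the accessor names are hypothetical, `ObjectMonitor` is an empty stand-in, and `std::atomic` is used only to approximate the `volatile` field in a standalone example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for HotSpot's ObjectMonitor; only its identity matters here.
struct ObjectMonitor {};

// Sketch of a BasicLock-like class whose cache slot is typed as ObjectMonitor*
// instead of an untyped uintptr_t. All names are illustrative assumptions.
class BasicLockSketch {
  // Either nullptr or the ObjectMonitor* used when locking.
  std::atomic<ObjectMonitor*> _monitor_cache{nullptr};

 public:
  // Read the cached monitor pointer (nullptr if nothing is cached).
  ObjectMonitor* monitor_cache() const {
    return _monitor_cache.load(std::memory_order_relaxed);
  }

  // Remember the monitor that was used when locking.
  void set_monitor_cache(ObjectMonitor* monitor) {
    _monitor_cache.store(monitor, std::memory_order_relaxed);
  }

  // Reset the slot to the "no monitor cached" state.
  void clear_monitor_cache() { set_monitor_cache(nullptr); }
};
```

With a typed slot, callers get an `ObjectMonitor*` back without casting. One caveat that a real change would have to address: the existing code also stores non-pointer debug values such as `badDispHeaderDeopt` through `set_metadata()`, and those would need a cast or separate handling.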
------------- PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3179255959 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2317998409 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2320634417 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2320643671 From dholmes at openjdk.org Thu Sep 4 02:35:51 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Sep 2025 02:35:51 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 20:15:51 GMT, Coleen Phillimore wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > src/hotspot/share/runtime/basicLock.hpp line 51: > >> 49: void set_bad_metadata_deopt() { set_metadata(badDispHeaderDeopt); } >> 50: >> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } > > Also delete line 51 ? Still appears used in LIRAssembler. But the assert in which it is used doesn't really make sense as it just checks the offset == 0. > src/hotspot/share/runtime/lightweightSynchronizer.cpp line 769: > >> 767: >> 768: // LightweightSynchronizer::inflate_locked_or_imse is used to to get an inflated >> 769: // ObjectMonitor* when lightweight locking is used. It is used from contexts > > I guess you don't need the phrase "when lightweight locking is used". Even calling it "lightweight locking" is no longer needed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2320640694 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2320651595 From fbredberg at openjdk.org Thu Sep 4 09:32:44 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 09:32:44 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 20:11:15 GMT, Coleen Phillimore wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. 
> > src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 344: > >> 342: volatile_nonstatic_field(ObjectMonitor, _entry_list, ObjectWaiter*) \ >> 343: volatile_nonstatic_field(ObjectMonitor, _succ, int64_t) \ >> 344: volatile_nonstatic_field(ObjectMonitor, _stack_locker, BasicLock*) \ > > There are some jvmci tests that check that the java side of jvmci matches, ie: > > make test TEST=compiler/jvmci Tried that and got: `TEST SUCCESS` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321428734 From fbredberg at openjdk.org Thu Sep 4 09:48:45 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 09:48:45 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Thu, 4 Sep 2025 02:16:46 GMT, David Holmes wrote: >> src/hotspot/share/runtime/basicLock.hpp line 51: >> >>> 49: void set_bad_metadata_deopt() { set_metadata(badDispHeaderDeopt); } >>> 50: >>> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } >> >> Also delete line 51 ? > > Still appears used in LIRAssembler. But the assert in which it is used doesn't really make sense as it just checks the offset == 0. This is unfortunately still used by some of the platform files. Since I want to keep this PR clean of platform changes, I have added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321466475 From fbredberg at openjdk.org Thu Sep 4 10:00:47 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 10:00:47 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 20:19:06 GMT, Coleen Phillimore wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > src/hotspot/share/runtime/javaThread.cpp line 2007: > >> 2005: #ifdef SUPPORT_MONITOR_COUNT >> 2006: // Nothing to do. Just do some sanity check. >> 2007: assert(_held_monitor_count == 0, "counter should not be used"); > > In further cleanup, can we now remove _held_monitor_count next? I think so, but I'm not sure. Anyhow, I've added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). > src/hotspot/share/runtime/synchronizer.inline.hpp line 48: > >> 46: assert(current == Thread::current(), "must be"); >> 47: >> 48: LightweightSynchronizer::enter(obj, lock, current); > > In the further RFE, we should remove these dispatch functions too. Added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321483375 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321488805 From fbredberg at openjdk.org Thu Sep 4 10:00:45 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 10:00:45 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: <0CMB0g0_Ru1hF5l2NA194kD1ouNwMXrB1667Uvl9mFQ=.b6834c47-be53-41bb-b726-2e282517d6bc@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <0CMB0g0_Ru1hF5l2NA194kD1ouNwMXrB1667Uvl9mFQ=.b6834c47-be53-41bb-b726-2e282517d6bc@github.com> Message-ID: On Wed, 3 Sep 2025 07:37:34 GMT, Axel Boldt-Christmas wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > src/hotspot/share/runtime/basicLock.hpp line 51: > >> 49: void set_bad_metadata_deopt() { set_metadata(badDispHeaderDeopt); } >> 50: >> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } > > The `badDispHeaderDeopt` and `badDispHeaderOSR` constants should also be renamed. Added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). > src/hotspot/share/runtime/synchronizer.cpp line 634: > >> 632: } >> 633: >> 634: static intptr_t install_hash_code(Thread* current, oop obj) { > > A future RFE could potentially simplify and unify this with `FastHashCode` Added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321495256 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321493463 From fbredberg at openjdk.org Thu Sep 4 10:13:43 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 10:13:43 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <_Ola7zEQaOGFemDMFTFkjmAouaLRxSJwarKbWnEKbDk=.4469db17-0021-450b-968e-b28267c626d4@github.com> On Wed, 3 Sep 2025 07:16:06 GMT, David Holmes wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > src/hotspot/share/c1/c1_LIRGenerator.cpp line 638: > >> 636: LIR_Opr hdr = lock; >> 637: lock = new_hdr; >> 638: CodeStub* slow_path = new MonitorExitStub(lock, true, monitor_no); > > It seems all creators for `MonitorExitStub` now pass `true` so that parameter can be removed. Good find! But this affects platform files, and since I want to keep this PR clean of platform changes, I have added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). > src/hotspot/share/runtime/basicLock.hpp line 40: > >> 38: // Used as a cache of the ObjectMonitor* used when locking. Must either >> 39: // be nullptr or the ObjectMonitor* used when locking. >> 40: volatile uintptr_t _metadata; > > So this could now be a properly typed and named field. Future RFE. 
I have added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321521296 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321526089 From fbredberg at openjdk.org Thu Sep 4 10:20:44 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 10:20:44 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <-yO5kJRg9ghIK9ZHi8auSW22u__Ixsva-A4mvgadZic=.4f31754d-c0a9-4646-a3b7-40aade9678a4@github.com> On Thu, 4 Sep 2025 02:18:38 GMT, David Holmes wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > src/hotspot/share/runtime/basicLock.hpp line 53: > >> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } >> 52: >> 53: // For lightweight locking > > If the following are for lightweight locking then what are the two previous for? For something old and soon forgotten. :) After I had integrated some platforms I realized that this was no longer needed, but since I want to keep this PR clean of platform changes, I have added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321540639 From coleenp at openjdk.org Thu Sep 4 11:21:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Sep 2025 11:21:52 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: <-yO5kJRg9ghIK9ZHi8auSW22u__Ixsva-A4mvgadZic=.4f31754d-c0a9-4646-a3b7-40aade9678a4@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <-yO5kJRg9ghIK9ZHi8auSW22u__Ixsva-A4mvgadZic=.4f31754d-c0a9-4646-a3b7-40aade9678a4@github.com> Message-ID: On Thu, 4 Sep 2025 10:18:27 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/basicLock.hpp line 53: >> >>> 51: static int displaced_header_offset_in_bytes() { return metadata_offset_in_bytes(); } >>> 52: >>> 53: // For lightweight locking >> >> If the following are for lightweight locking then what are the two previous for? > > For something old and soon forgotten. :) > After I had integrated some platforms I realized that this was no longer needed, but since I want to keep this PR clean of platform changes, I have added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). This makes sense to change the displaced header names in BasicLock with the changes to various disp_hdr and other register names in the next cleanup. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321704905 From coleenp at openjdk.org Thu Sep 4 11:21:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Sep 2025 11:21:55 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Thu, 4 Sep 2025 02:26:26 GMT, David Holmes wrote: >> src/hotspot/share/runtime/lightweightSynchronizer.cpp line 769: >> >>> 767: >>> 768: // LightweightSynchronizer::inflate_locked_or_imse is used to to get an inflated >>> 769: // ObjectMonitor* when lightweight locking is used. It is used from contexts >> >> I guess you don't need the phrase "when lightweight locking is used". > > Even calling it "lightweight locking" is no longer needed. I think the name "lightweight locking" is used for the file name and class name, so the name is okay. It does help if you're trying to understand the history of the locking algorithm. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321711418 From coleenp at openjdk.org Thu Sep 4 11:25:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Sep 2025 11:25:53 GMT Subject: RFR: 8365190: Remove LockingMode related code from share In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Thu, 4 Sep 2025 09:55:23 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/synchronizer.inline.hpp line 48: >> >>> 46: assert(current == Thread::current(), "must be"); >>> 47: >>> 48: LightweightSynchronizer::enter(obj, lock, current); >> >> In the further RFE, we should remove these dispatch functions too. > > Added this to o the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). So to be clear, we should probably have 2+ RFEs that follow this. 
One to remove the cpu specific names like displaced header, one to remove the loom interactions and held_monitor_count, and another to remove the dispatches to LightweightSynchronizer and make ObjectSynchronizer's role in object locking clear. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2321723438 From fbredberg at openjdk.org Thu Sep 4 11:36:02 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 11:36:02 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v2] In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. 
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: Update after review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27041/files - new: https://git.openjdk.org/jdk/pull/27041/files/71038b71..3c1b56c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=00-01 Stats: 17 lines in 4 files changed: 1 ins; 4 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/27041.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27041/head:pull/27041 PR: https://git.openjdk.org/jdk/pull/27041 From fbredberg at openjdk.org Thu Sep 4 12:10:01 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 12:10:01 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. 
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: New version for Coleen ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27041/files - new: https://git.openjdk.org/jdk/pull/27041/files/3c1b56c5..f2fc9a5f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=01-02 Stats: 9 lines in 1 file changed: 0 ins; 2 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/27041.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27041/head:pull/27041 PR: https://git.openjdk.org/jdk/pull/27041 From coleenp at openjdk.org Thu Sep 4 12:10:01 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Sep 2025 12:10:01 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> Message-ID: On Thu, 4 Sep 2025 12:07:43 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for Coleen Looks great!! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3184882023 From fbredberg at openjdk.org Thu Sep 4 13:03:50 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 13:03:50 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 13:46:11 GMT, Roberto Castañeda Lozano wrote: >> Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: >> >> New version for Coleen > > src/hotspot/share/opto/phaseX.cpp line 1672: > >> 1670: // Found (linux x64 only?) with: >> 1671: // serviceability/sa/ClhsdbThreadContext.java >> 1672: // -XX:+UnlockExperimentalVMOptions -XX:LockingMode=1 -XX:+IgnoreUnrecognizedVMOptions -XX:VerifyIterativeGVN=1110 > > For traceability, I suggest leaving this line untouched and adding a comment in the next line clarifying that `-XX:LockingMode` is not available anymore. Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2322046338 From fbredberg at openjdk.org Thu Sep 4 13:03:51 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 13:03:51 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Thu, 4 Sep 2025 11:19:14 GMT, Coleen Phillimore wrote: >> Even calling it "lightweight locking" is no longer needed. > > I think the name "lightweight locking" is used for the file name and class name, so the name is okay. It does help if you're trying to understand the history of the locking algorithm.
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2322050243 From fbredberg at openjdk.org Thu Sep 4 13:03:53 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 4 Sep 2025 13:03:53 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Tue, 2 Sep 2025 20:22:45 GMT, Coleen Phillimore wrote: >> Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: >> >> New version for Coleen > > src/hotspot/share/runtime/lightweightSynchronizer.cpp line 823: > >> 821: ObjectMonitor* LightweightSynchronizer::inflate_into_object_header(oop object, ObjectSynchronizer::InflateCause cause, JavaThread* locking_thread, Thread* current) { >> 822: >> 823: // The JavaThread* locking_thread parameter is only used by lightweight locking and > > Same here. suggestion: > > > // The JavaThread* locking parameter requires that the locking_thread == JavaThread::current, or is suspended > // throughout the call by some other mechanism. Fixed > src/hotspot/share/runtime/synchronizer.cpp line 542: > >> 540: } >> 541: ObjectMonitor* monitor; >> 542: monitor = LightweightSynchronizer::inflate_locked_or_imse(obj(), inflate_cause_notify, CHECK); > > Declare and initialize on the same line: > > ObjectMonitor* monitor = LightwightSynchronizer::inflate_locked_or_imse(obj...); Fixed > src/hotspot/share/runtime/synchronizer.cpp line 557: > >> 555: >> 556: ObjectMonitor* monitor; >> 557: monitor = LightweightSynchronizer::inflate_locked_or_imse(obj(), inflate_cause_notify, CHECK); > > same here with > ObjectMonitor* monitor = LIght ... > > I think we should have another RFE to look at eliminating the middle call. Call these in LIghtweightSynchronizer::notify, notifyAll and waitInterruptably directly and remove these functions. 
Added this to the [next cleanup](https://bugs.openjdk.org/browse/JDK-8365191). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2322051302 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2322052236 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2322057900 From rcastanedalo at openjdk.org Thu Sep 4 13:34:52 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Thu, 4 Sep 2025 13:34:52 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> Message-ID: On Thu, 4 Sep 2025 12:10:01 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for Coleen Compiler changes look good, thanks! ------------- Marked as reviewed by rcastanedalo (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3185331487 From tschatzl at openjdk.org Thu Sep 4 15:08:53 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 4 Sep 2025 15:08:53 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v53] In-Reply-To: References: Message-ID: <-6I6w_xqk0J13gqMX8ZQel4elME1K0ZYs55Li2lmgd8=.fdd1d6fb-eaae-4d93-a2ac-74b03d3f4212@github.com> > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 71 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * improve logging for refinement, making it similar to marking logging - * commit merge changes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * fix merge error - * forgot to actually save the files - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - ... and 61 more: https://git.openjdk.org/jdk/compare/e1903557...2a614a2c ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=52 Stats: 7117 lines in 112 files changed: 2592 ins; 3585 del; 940 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From cslucas at openjdk.org Thu Sep 4 17:50:46 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 4 Sep 2025 17:50:46 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. 
> > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. @TheRealMDoerr - gentle nudge. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26999#issuecomment-3254845065 From cslucas at openjdk.org Fri Sep 5 17:27:12 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 5 Sep 2025 17:27:12 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: <2mUEu3_9zBcWAkwSos4lG5VxpzWLuqmVk6iiJlkaLYg=.949860c5-1d21-4634-89f0-df25b2714445@github.com> On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. @dbriemann - do you have access to a PPC machine that you can use to test this patch? TIA. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26999#issuecomment-3259209337 From dholmes at openjdk.org Mon Sep 8 05:38:15 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Sep 2025 05:38:15 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> Message-ID: On Thu, 4 Sep 2025 12:10:01 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for Coleen Marked as reviewed by dholmes (Reviewer). src/hotspot/share/runtime/lightweightSynchronizer.cpp line 768: > 766: } > 767: > 768: // LightweightSynchronizer::inflate_locked_or_imse is used to to get an Suggestion: // LightweightSynchronizer::inflate_locked_or_imse is used to get an src/hotspot/share/runtime/lightweightSynchronizer.cpp line 822: > 820: // The JavaThread* locking parameter requires that the > 821: // locking_thread == JavaThread::current, or is suspended throughout > 822: // the call by some other mechanism. 
Suggestion: // The JavaThread* locking parameter requires that the locking_thread == JavaThread::current, // or is suspended throughout the call by some other mechanism. ------------- PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3195015687 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2329176488 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2329179011 From kbarrett at openjdk.org Mon Sep 8 06:32:08 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Sep 2025 06:32:08 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess Message-ID: Please review this change that renames the all-static class `Atomic` to `AtomicAccess`. The reason for this name change is to allow the introduction of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). The PR has several commits, according to the specific category of change being made. It may be easier to review the PR by studying these individual commits. Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose to not rename the various "atomic_.*" and "atomic__.*" files. There are a number of comments containing the word "Atomic" that I didn't change. They are generically about atomic operations, and will just as well serve as referring to the future `Atomic`. Testing: mach5 tier1, GHA sanity tests. This is one of those changes where successful builds indicate the change is good. 
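For readers unfamiliar with the distinction behind the rename, here is a rough sketch of an all-static access class next to a value-wrapping type. This is not HotSpot code: the thread does not show what the future `Atomic` type looks like, so `ToyAtomicAccess`, `ToyAtomic` and their member names are hypothetical, and the `__atomic_*` builtins assume a GCC- or Clang-compatible compiler.

```cpp
#include <cassert>

// Hypothetical all-static style: no instances, every operation takes a
// pointer to an ordinary variable. Assumes GCC/Clang __atomic builtins.
class ToyAtomicAccess {
public:
  ToyAtomicAccess() = delete;

  template <typename T>
  static T load(const volatile T* p) {
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
  }

  template <typename T>
  static void store(volatile T* p, T v) {
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
  }

  // Adds v to *p and returns the value *p held before the addition.
  template <typename T>
  static T fetch_then_add(volatile T* p, T v) {
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
  }
};

// Hypothetical wrapper style: the variable itself carries the "atomic"
// property, so callers cannot bypass the atomic operations by accident.
template <typename T>
class ToyAtomic {
  volatile T _value;

public:
  explicit ToyAtomic(T v = T()) : _value(v) {}

  T load() const { return ToyAtomicAccess::load(&_value); }
  void store(T v) { ToyAtomicAccess::store(&_value, v); }
  T fetch_then_add(T v) { return ToyAtomicAccess::fetch_then_add(&_value, v); }
};
```

With the static style a call site reads `ToyAtomicAccess::store(&raw, 5)`, while the wrapper style reads `counter.store(5)`. Having both names available at once is only possible if the all-static class gives up the shorter name, which is the stated reason for the rename.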
------------- Commit messages: - update copyrights - misc cleanups - fix indentation from rename - rename Atomic => AtomicAccess in gtests - rename Atomic => AtomicAccess - change includes of atomic.hpp in gtests - change includes of atomic.hpp - rename atomic.hpp => atomicAccess.hpp Changes: https://git.openjdk.org/jdk/pull/27135/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27135&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8367014 Stats: 4894 lines in 427 files changed: 1237 ins; 1235 del; 2422 mod Patch: https://git.openjdk.org/jdk/pull/27135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27135/head:pull/27135 PR: https://git.openjdk.org/jdk/pull/27135 From fbredberg at openjdk.org Mon Sep 8 07:08:31 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Sep 2025 07:08:31 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v4] In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier5 with no added problems. 
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: New version for David ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27041/files - new: https://git.openjdk.org/jdk/pull/27041/files/f2fc9a5f..905ef3fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27041&range=02-03 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/27041.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27041/head:pull/27041 PR: https://git.openjdk.org/jdk/pull/27041 From aboldtch at openjdk.org Mon Sep 8 07:08:31 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 8 Sep 2025 07:08:31 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v4] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Mon, 8 Sep 2025 07:05:47 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for David Marked as reviewed by aboldtch (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3195220478 From fbredberg at openjdk.org Mon Sep 8 07:08:32 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Sep 2025 07:08:32 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v3] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> <44Gctjipr64z9PfAvwdEgRo4pq8sFtmpZOE4JehO4rc=.c6532986-972f-466c-b7c6-75e063613fa9@github.com> Message-ID: On Mon, 8 Sep 2025 05:33:05 GMT, David Holmes wrote: >> Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: >> >> New version for Coleen > > src/hotspot/share/runtime/lightweightSynchronizer.cpp line 768: > >> 766: } >> 767: >> 768: // LightweightSynchronizer::inflate_locked_or_imse is used to to get an > > Suggestion: > > // LightweightSynchronizer::inflate_locked_or_imse is used to get an Fixed > src/hotspot/share/runtime/lightweightSynchronizer.cpp line 822: > >> 820: // The JavaThread* locking parameter requires that the >> 821: // locking_thread == JavaThread::current, or is suspended throughout >> 822: // the call by some other mechanism. > > Suggestion: > > // The JavaThread* locking parameter requires that the locking_thread == JavaThread::current, > // or is suspended throughout the call by some other mechanism. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2329325202 PR Review Comment: https://git.openjdk.org/jdk/pull/27041#discussion_r2329325682 From dholmes at openjdk.org Mon Sep 8 07:24:14 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Sep 2025 07:24:14 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v4] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: On Mon, 8 Sep 2025 07:08:31 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier5 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for David Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27041#pullrequestreview-3195269911 From dholmes at openjdk.org Mon Sep 8 08:20:10 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Sep 2025 08:20:10 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 06:26:03 GMT, Kim Barrett wrote: > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. 
It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. LGTM! Thanks for taking this on! ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27135#pullrequestreview-3195448260 From tschatzl at openjdk.org Mon Sep 8 09:09:49 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 8 Sep 2025 09:09:49 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... 
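As a reading aid for the pseudo code above, the following is a single-threaded toy model of the filtering and card-tracking steps. It is a sketch under stated assumptions, not G1's implementation: `ToyHeap`, the card and region sizes, the card values and the queue capacity are all made up for illustration, and the fence only marks where the `StoreLoad` synchronization would sit.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sizes and card values; HotSpot's real constants may differ.
constexpr std::size_t kCardShift = 9;     // 512-byte cards (assumption)
constexpr std::size_t kRegionShift = 20;  // 1 MiB regions (assumption)
enum CardValue : std::uint8_t { kClean = 0, kDirty = 1, kYoung = 2 };

struct ToyHeap {
  std::vector<std::uint8_t> cards;            // one byte per card
  std::vector<std::size_t> thread_local_dcq;  // queued card indices
  std::size_t dcq_capacity;
  std::size_t runtime_calls = 0;  // hand-offs of a full queue to the "dcqs"

  ToyHeap(std::size_t heap_bytes, std::size_t capacity)
      : cards(heap_bytes >> kCardShift, kClean), dcq_capacity(capacity) {}

  // Models the post-write barrier for `x.a = y`, where field_addr is the
  // address of x.a and value_addr is y (0 stands for null).
  // Returns true if the card was dirtied and enqueued.
  bool post_write_barrier(std::size_t field_addr, std::size_t value_addr) {
    // Filtering
    if ((field_addr >> kRegionShift) == (value_addr >> kRegionShift)) return false;
    if (value_addr == 0) return false;
    const std::size_t card = field_addr >> kCardShift;
    if (cards[card] == kYoung) return false;
    std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad
    if (cards[card] == kDirty) return false;

    cards[card] = kDirty;

    // Card tracking
    thread_local_dcq.push_back(card);
    if (thread_local_dcq.size() < dcq_capacity) return true;

    // Queue is full: stand-in for the runtime call that moves it to the dcqs.
    thread_local_dcq.clear();
    ++runtime_calls;
    return true;
  }
};
```

Even in this reduced form the barrier has four early exits, a fence, a store, an enqueue and a slow path, which gives a feel for why the inlined version is described as 40-50 instructions.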
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * sort includes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/2a614a2c..4601bf88 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=53 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=52-53 Stats: 7 lines in 4 files changed: 4 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From mdoerr at openjdk.org Mon Sep 8 09:34:18 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 8 Sep 2025 09:34:18 GMT Subject: RFR: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch [v3] In-Reply-To: References: Message-ID: On Wed, 3 Sep 2025 00:37:59 GMT, Cesar Soares Lucas wrote: >> Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. >> >> The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Make PPC backend to also use concurrent code patching. This change is a NOP for PPC64 since we're generating the same code for both patching types. GHA cross build has passed, so there's nothing to worry regarding PPC64. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26999#issuecomment-3265416475 From aph at openjdk.org Mon Sep 8 09:35:16 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 8 Sep 2025 09:35:16 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: Message-ID: <12OJX5AFklPfeUixNRHyTHo38EV4dFzLX8Dp-yvMVrI=.bcd6c93a-d462-4ed9-9d34-e8197c7fb04a@github.com> On Mon, 8 Sep 2025 06:26:03 GMT, Kim Barrett wrote: > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. `AtomicAccess` is a bit wordy, and this change is going to mess up diffs for backports terribly, but I can't think of a better way to do it. Thanks. ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27135#pullrequestreview-3195732355 From kbarrett at openjdk.org Mon Sep 8 09:47:14 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Sep 2025 09:47:14 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: <12OJX5AFklPfeUixNRHyTHo38EV4dFzLX8Dp-yvMVrI=.bcd6c93a-d462-4ed9-9d34-e8197c7fb04a@github.com> References: <12OJX5AFklPfeUixNRHyTHo38EV4dFzLX8Dp-yvMVrI=.bcd6c93a-d462-4ed9-9d34-e8197c7fb04a@github.com> Message-ID: On Mon, 8 Sep 2025 09:32:43 GMT, Andrew Haley wrote: > `AtomicAccess` is a bit wordy, and this change is going to mess up diffs for backports terribly, but I can't think of a better way to do it. Thanks. There was a bunch of internal to Oracle bike shedding over the names already. But I'm open to more if someone thinks they have a better idea. Note that once we're all done with switching to `Atomic` where appropriate, I don't expect very many direct uses of `AtomicAccess` to remain (though there will be _some_). Diffs for backports are going to get messed up anyway, since most uses of `AtomicAccess` will eventually be switched over to `Atomic` style usage. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3265464120 From tschatzl at openjdk.org Mon Sep 8 09:50:19 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 8 Sep 2025 09:50:19 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. 
>> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). 
>> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful. Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine. Internally it went through Oracle's tier1-8 without any issue again (that is revision https://github.com/openjdk/jdk/pull/23739/commits/b3873d66cd43518d5dc71e060fc52a13372dbfa5, but the changes since then were very cosmetic). ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3265474466 From aph at openjdk.org Mon Sep 8 10:18:14 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 8 Sep 2025 10:18:14 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: <12OJX5AFklPfeUixNRHyTHo38EV4dFzLX8Dp-yvMVrI=.bcd6c93a-d462-4ed9-9d34-e8197c7fb04a@github.com> Message-ID: On Mon, 8 Sep 2025 09:44:26 GMT, Kim Barrett wrote: > > `AtomicAccess` is a bit wordy, and this change is going to mess up diffs for backports terribly, but I can't think of a better way to do it. Thanks. > > There was a bunch of internal to Oracle bike shedding over the names already. Sure, I imagine there was! It's a shame when decision making happens behind closed doors in a FOSS project, but public list bikeshedding would have been too much for this change. > Diffs for backports are going to get messed up anyway, since most uses of `AtomicAccess` will eventually be switched over to `Atomic` style usage. That's fair. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3265596419 From fbredberg at openjdk.org Mon Sep 8 10:30:29 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Sep 2025 10:30:29 GMT Subject: RFR: 8365190: Remove LockingMode related code from share [v4] In-Reply-To: References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <3pd2IyLkKN1NErQilmGDrxPygkH03OTIelZwKZhDkBs=.7a5176ff-395b-4f82-a0eb-ff50ec1401a2@github.com> On Mon, 8 Sep 2025 07:08:31 GMT, Fredrik Bredberg wrote: >> Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). >> >> This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. >> >> Passes tier1-tier7 with no added problems. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > New version for David Thank you all for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27041#issuecomment-3265647098 From fbredberg at openjdk.org Mon Sep 8 10:30:30 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Sep 2025 10:30:30 GMT Subject: Integrated: 8365190: Remove LockingMode related code from share In-Reply-To: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> References: <4pmrLv9G-kotkKQ_B1wEyXSAJ9Vm3cnxditElz641_E=.23c81f11-ece2-4894-ae2f-8763ad343a4d@github.com> Message-ID: <8z9GgG5Q8z1MaKgjotL4d7FtJLtrPv_7J0XcQd9gSpo=.f15375ac-ed3b-47da-aada-9ffb10f4eb56@github.com> On Tue, 2 Sep 2025 08:24:10 GMT, Fredrik Bredberg wrote: > Since the integration of [JDK-8359437](https://bugs.openjdk.org/browse/JDK-8359437) the `LockingMode` flag can no longer be set by the user. After that, a number of PRs has been integrated which has removed all `LockingMode` related code from all platforms (except from zero, which is done in this PR). > > This PR removes `LockingMode` related code from the shared (non-platform specific) files. It also removes the `LockingMode` variable itself. > > Passes tier1-tier7 with no added problems. This pull request has now been integrated. 
Changeset: a2726968 Author: Fredrik Bredberg URL: https://git.openjdk.org/jdk/commit/a272696813f2e5e896ac9de9985246aaeb9d476c Stats: 1277 lines in 50 files changed: 8 ins; 1137 del; 132 mod 8365190: Remove LockingMode related code from share Reviewed-by: aboldtch, dholmes, ayang, coleenp, lmesnik, rcastanedalo ------------- PR: https://git.openjdk.org/jdk/pull/27041 From cslucas at openjdk.org Mon Sep 8 21:47:21 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 8 Sep 2025 21:47:21 GMT Subject: Integrated: 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch In-Reply-To: References: Message-ID: On Fri, 29 Aug 2025 00:02:42 GMT, Cesar Soares Lucas wrote: > Please, review this patch to make nmethod entry barriers use `conc-instruction-and-data-patch` fence mechanics when ShenandoahGC is being used on AArch64. The patch also removes (including from JVMCI interface) the old constant that was being used only by Shenandoah on AArch64. > > The patch has been tested with functional and performance benchmarks on AArch64. Improvements in DaCapo and Renaissance benchmarks can be as high as 30%. Maximum critical Jops in SPEC improved by ~10%. This pull request has now been integrated. 
Changeset: 81a1e8e1 Author: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/81a1e8e1363446de499a59fc706221efde12dd86 Stats: 28 lines in 12 files changed: 0 ins; 19 del; 9 mod 8364936: Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch Reviewed-by: fyang, dzhang, kdnilsen, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/26999 From fyang at openjdk.org Tue Sep 9 04:09:26 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 9 Sep 2025 04:09:26 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes > We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful. > > Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine. 
> > Internally it went through Oracle's tier1-8 without any issue again (that is revision [b3873d6](https://github.com/openjdk/jdk/commit/b3873d66cd43518d5dc71e060fc52a13372dbfa5), but the changes since then were very cosmetic). Hi @tschatzl : I witnessed some conflicts when applying this on latest jdk head. Maybe you can do another merge and rebase? I can help rerun the tests on RISC-V. (Note that there is one failing test: `test/hotspot/jtreg/gtest/GTestWrapper.java` if I use the same base as in your repo, and that issue has been resolved by [JDK-8366897](https://bugs.openjdk.org/browse/JDK-8366897)) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3268789193 From iklam at openjdk.org Tue Sep 9 05:27:22 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Sep 2025 05:27:22 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods Message-ID: The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls. This PR is intended to be a strict clean-up that preserves existing behaviors. The following helper functions are added to simplify boilerplate code in JNI methods. static Klass* java_lang_Class::as_Klass(jobject java_class); static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); static InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class); Klass* get_klass_considering_redefinition(jclass cls, JavaThread *thread); InstanceKlass* get_instance_klass_considering_redefinition(jclass cls, JavaThread *thread); Notes: [1] Before this PR, we have both patterns: java_lang_Class::as_Klass(JNIHandles::resolve(cls)); java_lang_Class::as_Klass(JNIHandles::resolve_non_null(cls)); If `cls` is null, we would get an assert in both cases (`as_Klass()` requires a non-null input). Therefore, I am using `resolve_non_null()` in the `jobject` versions of `as_Klass()`. 
[2] I refactored `JvmtiThreadState::class_to_verify_considering_redefinition()` so that the caller of this function can avoid using `InstanceKlass::cast()`. This is possible because we ONLY store `InstanceKlass*` in `JvmtiThreadState::set_class_being_redefined()`. I also removed a few cases of unnecessary `InstanceKlass::cast()`. ------------- Commit messages: - more fixes - tmp: Clean up java mirror handling in JNI methods Changes: https://git.openjdk.org/jdk/pull/27158/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27158&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8367142 Stats: 278 lines in 19 files changed: 46 ins; 65 del; 167 mod Patch: https://git.openjdk.org/jdk/pull/27158.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27158/head:pull/27158 PR: https://git.openjdk.org/jdk/pull/27158 From dholmes at openjdk.org Tue Sep 9 06:01:13 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Sep 2025 06:01:13 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods In-Reply-To: References: Message-ID: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> On Tue, 9 Sep 2025 05:21:10 GMT, Ioi Lam wrote: > The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls. > > This PR is intended to be a strict clean-up that preserves existing behaviors. > > The following helper functions are added to simplify boilerplate code in JNI methods.
> > > static Klass* java_lang_Class::as_Klass(jobject java_class); > static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); > static InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class); > > Klass* get_klass_considering_redefinition(jclass cls, JavaThread *thread); > InstanceKlass* get_instance_klass_considering_redefinition(jclass cls, JavaThread *thread); > > > Notes: > > [1] Before this PR, we have both patterns: > > > java_lang_Class::as_Klass(JNIHandles::resolve(cls)); > java_lang_Class::as_Klass(JNIHandles::resolve_non_null(cls)); > > > If `cls` is null, we would get an assert in both cases (`as_Klass()` requires a non-null input). Therefore, I am using `resolve_non_null()` in the `jobject` versions of `as_Klass()`. > > [2] I refactored `JvmtiThreadState::class_to_verify_considering_redefinition()` so that the caller of this function can avoid using `InstanceKlass::cast()`. This is possible because we ONLY store `InstanceKlass*` in `JvmtiThreadState::set_class_being_redefined()`. > > I also removed a few cases of unnecessary `InstanceKlass::cast()`. Sorry but I think this PR is trying to do too many things at once. It is pushing JNI resolving inside internal JVM methods, which I think is a bad thing - we resolve JNI references at the boundary to get the oop that the VM wants to work with. Internal APIs should be oblivious to jobject and friends IMO. Also there may be times that the JNI/JVM method needs to get the oop itself before extracting the klass. You are converting klass to instanceKlass where it has to be instanceKlass e.g. with redefinition. This is a good thing, but it is a very distinct thing that deserves its own cleanup (as per previous changes in that area). You are defining a helper `java_lang_Class::as_InstanceKlass` to internalize the cast - this is fine but again a simple cleanup that would be better standalone IMO.
It is also not clear that JVM/JNI APIs are properly checking that the incoming jobject is in fact a class of the right kind (i.e. not an array class object). ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27158#pullrequestreview-3199393023 From iklam at openjdk.org Tue Sep 9 06:32:14 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Sep 2025 06:32:14 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods In-Reply-To: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> Message-ID: On Tue, 9 Sep 2025 05:58:32 GMT, David Holmes wrote: > Sorry but I think this PR is trying to do too many things at once. > > It is pushing JNI resolving inside internal JVM methods, which I think is a bad thing - we resolve JNI references at the boundary to get the oop that the VM wants to work with. Internal APIs should be oblivious to jobject and friends IMO. Also there may be times that the JNI/JVM method needs to get the oop itself before extracting the klass. There are 92 header files that have the word `jobject` in them. There are 71 cpp/hpp files with the word `JNIHandles::resolve` in them. So I am not sure if we really have that separation anymore. There are 54 cases where we call `as_Klass(JNIHandles::resolve_non_null(x))`, so I think it would be nice to have a way to write less code. > You are converting klass to instanceKlass where it has to be instanceKlass e.g. with redefinition. This is a good thing, but it is a very distinct thing that deserves its own cleanup (as per previous changes in that area). I can move the considering_redefinition changes into a follow-up PR. > You are defining a helper `java_lang_Class::as_InstanceKlass` to internalize the cast - this is fine but again a simple cleanup that would be better standalone IMO.
> > It is also not clear that JVM/JNI API's are properly checking that the incoming jobject is in fact a class of the right kind (ie not an array class object). I am just converting InstanceKlass* ik = InstanceKlass::cast(as_Klass(mirror)); to InstanceKlass* ik = as_InstanceKlass(mirror); The code already assumes that it has an InstanceKlass, and I am not changing that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3269072414 From dholmes at openjdk.org Tue Sep 9 07:41:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Sep 2025 07:41:02 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods In-Reply-To: References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> Message-ID: On Tue, 9 Sep 2025 06:29:51 GMT, Ioi Lam wrote: > So I am not sure if we really have that separation anymore. I think it is more that there are many bits of code that actually form the "boundary" (prims, services, some runtime, jvmci, interpreter-related). But I guess it is hard to argue this makes it markedly worse. > The code already assumes that it has an InstanceKlass, and I am not changing that. Okay. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3269331429 From epeter at openjdk.org Tue Sep 9 07:49:38 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 9 Sep 2025 07:49:38 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. 
>> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). 
>> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java line 1489: > 1487: @Test > 1488: @IR(failOn = IRNode.ALLOC) > 1489: @IR(counts = {IRNode.COUNTED_LOOP, ">1"}) // not fail Can you explain what led to the difference? Can you also set an upper bound? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2332346985 From rcastanedalo at openjdk.org Tue Sep 9 07:55:59 2025 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Tue, 9 Sep 2025 07:55:59 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: <_sI2w99vG5BN7dhrgs-LabreQQG3r-7RaP6ejUls1_w=.ae0c7c90-87db-4136-b96f-d8b29ce8bdcf@github.com> On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. 
Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes The compiler changes (including the x64 and aarch64 platform-specific code) still look good, thanks for this work! ------------- Marked as reviewed by rcastanedalo (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3199884440 From stefank at openjdk.org Tue Sep 9 11:52:03 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Sep 2025 11:52:03 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 06:26:03 GMT, Kim Barrett wrote: > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. I've checked through the patch and it looks good. I found one file that lacked alignment adjustments. > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_." and "atomic__." files. Could you motivate why you chose to not do that? src/hotspot/os_cpu/bsd_x86/atomic_bsd_x86.hpp line 43: > 41: template<> > 42: template > 43: inline D AtomicAccess::PlatformAdd<4>::fetch_then_add(D volatile* dest, I add_value, This file has multiple alignment issues. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27135#pullrequestreview-3200998425 PR Review Comment: https://git.openjdk.org/jdk/pull/27135#discussion_r2333148324 From mdoerr at openjdk.org Tue Sep 9 14:06:25 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 9 Sep 2025 14:06:25 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: <3t1NXG7f_TsOxvst83-A9dDPQ9ZOyl5ybNJDtpkhvAk=.671d5ae9-e32c-42de-a519-f52929495100@github.com> On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes Still works fine on SAP supported platforms (including PPC64). @RealFYang: Three-way merge succeeded. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3270891086 From iklam at openjdk.org Tue Sep 9 22:09:06 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Sep 2025 22:09:06 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2] In-Reply-To: References: Message-ID: > The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls.
>
> This PR is intended to be a strict clean-up that preserves existing behaviors.
>
> The following helper functions are added to simplify boilerplate code in JNI methods.
>
>
> static Klass* java_lang_Class::as_Klass(jobject java_class);
> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class);
> static InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class);
>
> Klass* get_klass_considering_redefinition(jclass cls, JavaThread *thread);
> InstanceKlass* get_instance_klass_considering_redefinition(jclass cls, JavaThread *thread);
>
>
> Notes:
>
> [1] Before this PR, we have both patterns:
>
>
> java_lang_Class::as_Klass(JNIHandles::resolve(cls));
> java_lang_Class::as_Klass(JNIHandles::resolve_non_null(cls));
>
>
> If `cls` is null, we would get an assert in both cases (`as_Klass()` requires a non-null input). Therefore, I am using `resolve_non_null()` in the `jobject` versions of `as_Klass()`.
>
> [2] I refactored `JvmtiThreadState::class_to_verify_considering_redefinition()` so that the caller of this function can avoid using `InstanceKlass::cast()`. This is possible because we ONLY store `InstanceKlass*` in `JvmtiThreadState::set_class_being_redefined()`.
>
> I also removed a few cases of unnecessary `InstanceKlass::cast()`.

Ioi Lam has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains four commits: - Merge branch 'master' into 8367142-simplify-java-mirror-handling-in-jni-methods - @dholmes-ora comments - remove class_to_verify_considering_redefinition() changes, to be done in separate PR - more fixes - tmp: Clean up java mirror handling in JNI methods ------------- Changes: https://git.openjdk.org/jdk/pull/27158/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27158&range=01 Stats: 150 lines in 17 files changed: 20 ins; 32 del; 98 mod Patch: https://git.openjdk.org/jdk/pull/27158.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27158/head:pull/27158 PR: https://git.openjdk.org/jdk/pull/27158 From iklam at openjdk.org Tue Sep 9 22:12:48 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Sep 2025 22:12:48 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2] In-Reply-To: References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> Message-ID: On Tue, 9 Sep 2025 07:38:03 GMT, David Holmes wrote: > > So I am not sure if we really have that separation anymore. > > I think it is more that there are many bits of code that actually form the "boundary" (prims, services, some runtime, jvmci, interpreter-related). But I guess it is hard to argue this makes it markedly worse. Arguably the translation of Java mirrors to Klasses is also a boundary (from Java representation to VM representation) :-) In reality I think because jobjects are easy to use and are just another kind of handle (like Handle and OopHandle), the leakage from JNI code to other parts of VM just happened naturally. > > The code already assumes that it has an InstanceKlass, and I am not changing that. > > Okay. BTW I removed the JVMTI changes from this PR. 
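
For readers following the thread, the `jobject` convenience overload under discussion can be pictured roughly as below. This is only a toy model under stated assumptions: `Klass`, `Oop`, `Handle`, `JObject`, `resolve_non_null` and both `as_klass` overloads are stand-ins invented for illustration and are not HotSpot's declarations. In particular, the toy throws where the real code is described above as asserting.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in types; none of these are HotSpot's real declarations.
struct Klass  { const char* name; };
struct Oop    { Klass* klass; };     // a "mirror" object that remembers its Klass
struct Handle { Oop* target; };      // a slot that points at an object
using JObject = Handle*;             // what callers outside the VM get to hold

// Resolves a handle that the caller promises is non-null.
// The toy throws so that a broken promise is observable in a test.
inline Oop* resolve_non_null(JObject h) {
  if (h == nullptr || h->target == nullptr) {
    throw std::invalid_argument("null handle");
  }
  return h->target;
}

// Object-based conversion: requires a non-null mirror.
inline Klass* as_klass(Oop* mirror) {
  if (mirror == nullptr) {
    throw std::invalid_argument("null mirror");
  }
  return mirror->klass;
}

// Convenience overload taking the handle directly, so call sites
// no longer spell out the resolve step themselves.
inline Klass* as_klass(JObject h) {
  return as_klass(resolve_non_null(h));
}
```

With this shape a call site shrinks from `as_klass(resolve_non_null(h))` to `as_klass(h)`, which is the kind of boilerplate reduction the PR description mentions. Whether such an overload belongs on the class itself or in a file-local helper is exactly what the reviewers are debating.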

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3272437346

From dlong at openjdk.org Wed Sep 10 01:09:16 2025
From: dlong at openjdk.org (Dean Long)
Date: Wed, 10 Sep 2025 01:09:16 GMT
Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3]
In-Reply-To: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>
References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com>
Message-ID:

On Tue, 2 Sep 2025 20:52:32 GMT, Dean Long wrote:

>> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)).
>
> Dean Long has updated the pull request incrementally with three additional commits since the last revision:
>
> - revert whitespace change
> - undo debug changes
> - cleanup

I need one more review for this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27059#issuecomment-3272844515

From dholmes at openjdk.org Wed Sep 10 07:31:19 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 10 Sep 2025 07:31:19 GMT
Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2]
In-Reply-To:
References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com>
Message-ID:

On Tue, 9 Sep 2025 22:10:19 GMT, Ioi Lam wrote:

> Arguably the translation of Java mirrors to Klasses is also a boundary (from Java representation to VM representation) :-)

The mirror is an oop; both oop and klass are internal VM representations.
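
The point that both sides of the translation are VM-internal can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `Kind`, `Klass`, `InstanceKlass`, `Mirror`, `as_klass` and `as_instance_klass` are toy stand-ins, not the real HotSpot types. The assumption that a primitive type's mirror carries a null klass pointer is taken from the review comments elsewhere in this thread.

```cpp
#include <cassert>

// Toy stand-ins; not HotSpot's real class hierarchy.
enum class Kind { Instance, Array };

struct Klass { Kind kind; };
struct InstanceKlass : Klass {
  InstanceKlass() : Klass{Kind::Instance} {}
};

// A mirror is itself a VM-internal object that points at a Klass.
// A null pointer models the mirror of a primitive type such as int.
struct Mirror { Klass* klass; };

// Untyped translation: the caller still has to check or cast.
inline Klass* as_klass(const Mirror& m) {
  return m.klass;
}

// Typed translation: returns nullptr for a primitive mirror and
// otherwise checks once, centrally, that the klass is an instance klass.
inline InstanceKlass* as_instance_klass(const Mirror& m) {
  Klass* k = m.klass;
  if (k == nullptr) {
    return nullptr;
  }
  assert(k->kind == Kind::Instance && "mirror does not denote an instance class");
  return static_cast<InstanceKlass*>(k);
}
```

Keeping the checked downcast in one helper is what lets call sites drop their own cast, at the cost of having to handle the null result for primitive mirrors.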
------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3273683492 From kbarrett at openjdk.org Wed Sep 10 07:34:09 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Sep 2025 07:34:09 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v2] In-Reply-To: References: Message-ID: <8HMzxvTwZd1uSZCs528eM4pHsJVeKmFGtplElc8vXpk=.643b3706-7af2-40aa-835c-c3f8a785dd0e@github.com> > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - rename recently added Atomic:: => AtomicAccess:: - Merge branch 'master' into atomic-access - fix prefiously missed arg misalignments - rename test_atomic.cpp - update copyrights - misc cleanups - fix indentation from rename - rename Atomic => AtomicAccess in gtests - rename Atomic => AtomicAccess - change includes of atomic.hpp in gtests - ... 
and 2 more: https://git.openjdk.org/jdk/compare/af9b9050...11007c45 ------------- Changes: https://git.openjdk.org/jdk/pull/27135/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27135&range=01 Stats: 5577 lines in 430 files changed: 1587 ins; 1585 del; 2405 mod Patch: https://git.openjdk.org/jdk/pull/27135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27135/head:pull/27135 PR: https://git.openjdk.org/jdk/pull/27135 From kbarrett at openjdk.org Wed Sep 10 07:34:11 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Sep 2025 07:34:11 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v2] In-Reply-To: References: Message-ID: On Tue, 9 Sep 2025 11:49:26 GMT, Stefan Karlsson wrote: > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > > to not rename the various "atomic_." and "atomic__." files. > > Could you motivate why you chose to not do that? I thought about it, and waffled back and forth. But I was trying to do as much as possible of this change mechanically. Renaming a file involves multiple steps that weren't all easily scriptable. (And I'd already messed up a part of the renaming of atomic.hpp during patch development.) Also, this change is going to be hard for backports as it is, and I think renamings might make that worse. Renamings can also be annoying for archeology. But if you think it's important... > src/hotspot/os_cpu/bsd_x86/atomic_bsd_x86.hpp line 43: > >> 41: template<> >> 42: template >> 43: inline D AtomicAccess::PlatformAdd<4>::fetch_then_add(D volatile* dest, I add_value, > > This file has multiple alignment issues. Oops, completely missed that file. Will fix. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3273667707 PR Review Comment: https://git.openjdk.org/jdk/pull/27135#discussion_r2335809380 From kbarrett at openjdk.org Wed Sep 10 07:34:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Sep 2025 07:34:12 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 06:26:03 GMT, Kim Barrett wrote: > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. Updated for recent commits adding some new uses of `Atomic::` and merge conflicts. Running GHA sanity tests. mach5 tier1 already passed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3273675624 From fyang at openjdk.org Wed Sep 10 07:35:02 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 10 Sep 2025 07:35:02 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes Just FYI: My local tier1-tier3 tests on linux-riscv64 platform are still good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3273695581 From aph at openjdk.org Wed Sep 10 07:43:37 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 10 Sep 2025 07:43:37 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v2] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:23:58 GMT, Kim Barrett wrote: > Also, this change is > going to be hard for backports as it is, and I think renamings might make that > worse. 
Renamings can also be annoying for archeology. Speaking as an archaeologist and the lead of multiple backport projects, I agree with you, Kim. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3273724381 From tschatzl at openjdk.org Wed Sep 10 07:56:42 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 07:56:42 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * iwalulya: remove confusing comment - * sort includes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * improve logging for refinement, making it similar to marking logging - * commit merge changes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * fix merge error - * forgot to actually save the files - Merge branch 'master' into 8342382-card-table-instead-of-dcq - ... 
and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=54 Stats: 7114 lines in 112 files changed: 2590 ins; 3583 del; 941 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From tschatzl at openjdk.org Wed Sep 10 07:56:44 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 07:56:44 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:47:10 GMT, Thomas Schatzl wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> * sort includes > > We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful. > > Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine. > > Internally it went through Oracle's tier1-8 without any issue again (that is revision https://github.com/openjdk/jdk/pull/23739/commits/b3873d66cd43518d5dc71e060fc52a13372dbfa5, but the changes since then were very cosmetic). > Hi @tschatzl : I witnessed some conflicts when applying this on latest jdk head. Maybe you can do another merge and rebase? I can help rerun the tests on RISC-V. (Note that there is one failing test: `test/hotspot/jtreg/gtest/GTestWrapper.java` if I use the same base as in your repo, and that issue has been resolved by [JDK-8366897](https://bugs.openjdk.org/browse/JDK-8366897)) Merged. 
Thanks for another round of testing (also to @TheRealMDoerr).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3273765120

From tschatzl at openjdk.org Wed Sep 10 08:21:59 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 10 Sep 2025 08:21:59 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54]
In-Reply-To:
References:
Message-ID:

On Tue, 9 Sep 2025 07:47:00 GMT, Emanuel Peter wrote:

>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>>
>> * sort includes
>
> test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java line 1489:
>
>> 1487: @Test
>> 1488: @IR(failOn = IRNode.ALLOC)
>> 1489: @IR(counts = {IRNode.COUNTED_LOOP, ">1"}) // not fail
>
> Can you explain what led to the difference? Can you also set an upper bound?

With the decreased complexity of the barrier, C2 started unrolling that loop. I do not know how to determine a bound here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2335959266

From epeter at openjdk.org Wed Sep 10 08:31:57 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Wed, 10 Sep 2025 08:31:57 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Sep 2025 08:19:15 GMT, Thomas Schatzl wrote:

>> test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java line 1489:
>>
>>> 1487: @Test
>>> 1488: @IR(failOn = IRNode.ALLOC)
>>> 1489: @IR(counts = {IRNode.COUNTED_LOOP, ">1"}) // not fail
>>
>> Can you explain what led to the difference? Can you also set an upper bound?
>
> With the decreased complexity of the barrier, C2 started unrolling that loop. I do not know how to determine a bound here.

Is this going to be GC independent? What if the VM swaps to another GC ergonomically?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2335985122 From stefank at openjdk.org Wed Sep 10 08:56:25 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Sep 2025 08:56:25 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v2] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:23:58 GMT, Kim Barrett wrote: > > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > > > to not rename the various "atomic_." and "atomic__." files. > > > > > > Could you motivate why you chose to not do that? > > I thought about it, and waffled back and forth. But I was trying to do as much as possible of this change mechanically. Renaming a file involves multiple steps that weren't all easily scriptable. (And I'd already messed up a part of the renaming of atomic.hpp during patch development.) Also, this change is going to be hard for backports as it is, and I think renamings might make that worse. Renamings can also be annoying for archeology. But if you think it's important... Sure, renames are annoying. I do think that it is bad to leave inconsistent names in a long-lived, evolving code base. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3273979685 From chagedorn at openjdk.org Wed Sep 10 08:58:52 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 10 Sep 2025 08:58:52 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 08:28:26 GMT, Emanuel Peter wrote: >> With the decreased complexity of the barrier, C2 started unrolling that loop. I do not know how to determine a bound here. > > Is this going to be GC independent? What if the VM swaps to another GC ergonomically? The test runs with `vm.flagless`. But I suggest to just go with `>= 1` instead to be on the safe side. 
The purpose of this IR rule in the context of this test is really just to check that the rule does not fail, not to catch real issues or verify the IR. If we still want to test the improved loop unrolling opportunities, I suggest creating a separate IR test for it, possibly in a separate RFE.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336052099

From amitkumar at openjdk.org Wed Sep 10 09:22:33 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Wed, 10 Sep 2025 09:22:33 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Sep 2025 07:53:15 GMT, Thomas Schatzl wrote:

>> We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful.
>>
>> Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine.
>>
>> Internally it went through Oracle's tier1-8 without any issue again (that is revision https://github.com/openjdk/jdk/pull/23739/commits/b3873d66cd43518d5dc71e060fc52a13372dbfa5, but the changes since then were very cosmetic).
>
>> Hi @tschatzl : I witnessed some conflicts when applying this on latest jdk head. Maybe you can do another merge and rebase? I can help rerun the tests on RISC-V. (Note that there is one failing test: `test/hotspot/jtreg/gtest/GTestWrapper.java` if I use the same base as in your repo, and that issue has been resolved by [JDK-8366897](https://bugs.openjdk.org/browse/JDK-8366897))
>
> Merged. Thanks for another round of testing (also to @TheRealMDoerr).

Hi @tschatzl, I see one build failure on s390x,

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x000003ffaad14a42, pid=1801882, tid=1801914
#
# JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x)
# Problematic frame:
# V [libjvm.so+0x414a46] BarrierSetNMethod::set_guard_value(nmethod*, int, int)+0xfe
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -F%F -- %E" (or dumping to /home/amit/jdk/make/core.1801882)

I am working on this on priority, but just in case you want to merge it, I will open another issue and push the fix.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3274080889

From aph at openjdk.org Wed Sep 10 09:36:49 2025
From: aph at openjdk.org (Andrew Haley)
Date: Wed, 10 Sep 2025 09:36:49 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Sep 2025 07:56:42 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier.
>>
>> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25.
>>
>> ### Current situation
>>
>> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier.
>> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... 
>
> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits:
>
> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
> - * iwalulya: remove confusing comment
> - * sort includes
> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
> - * improve logging for refinement, making it similar to marking logging
> - * commit merge changes
> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
> - * fix merge error
> - * forgot to actually save the files
> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
> - ... and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067

src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp line 95:

> 93: Label loop;
> 94: Label next;
> 95: const Register end = count;

This aliasing of register names is tricky and confusing. A trap for maintainers, of the kind that people have fallen into already.

src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp line 104:

> 102: __ lsr(start, start, CardTable::card_shift());
> 103: __ lsr(end, end, CardTable::card_shift());
> 104: __ sub(count, end, start); // Number of bytes to mark

Because `end` is inclusive, `count` here is the number of bytes to mark - 1.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336167617
PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336164656

From stefank at openjdk.org Wed Sep 10 09:41:38 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 10 Sep 2025 09:41:38 GMT
Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2]
In-Reply-To:
References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com>
Message-ID:

On Tue, 9 Sep 2025 06:29:51 GMT, Ioi Lam wrote:

> There are 54 cases where we call as_Klass(JNIHandles::resolve_non_null(x)), so I think it would be nice to have a way to write less code.
I think you should deal with that by adding a helper function inside jni.cpp instead of extending the *Klass functions to take a jobject. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3274150749 From dholmes at openjdk.org Wed Sep 10 10:34:41 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Sep 2025 10:34:41 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2] In-Reply-To: References: Message-ID: <9KIc7fkGt1OyYGBjvqaE4PWdALnulp2hf1zNIT64lHo=.c7dc978f-9e0a-419c-883c-d0cf43ff9155@github.com> On Tue, 9 Sep 2025 22:09:06 GMT, Ioi Lam wrote: >> The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls. >> >> This PR is intended to be a strict clean-up that preserves existing behaviors. >> >> The following helper functions are added to simplify boilerplate code in JNI methods. >> >> >> static Klass* java_lang_Class::as_Klass(jobject java_class); >> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); >> static InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class); >> >> >> Note: >> >> Before this PR, we have both patterns: >> >> >> java_lang_Class::as_Klass(JNIHandles::resolve(cls)); >> java_lang_Class::as_Klass(JNIHandles::resolve_non_null(cls)); >> >> >> If `cls` is null, we would get an assert in both cases (`as_Klass()` requires a non-null input). Therefore, I am using `resolve_non_null()` in the `jobject` versions of `as_Klass()`. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into 8367142-simplify-java-mirror-handling-in-jni-methods > - @dholmes-ora comments - remove class_to_verify_considering_redefinition() changes, to be done in separate PR > - more fixes > - tmp: Clean up java mirror handling in JNI methods Looks good. A couple of minor nits. 
Thanks src/hotspot/share/prims/jvm.cpp line 844: > 842: ResourceMark rm; > 843: const char * from_name = java_lang_Class::as_Klass(from)->external_name(); > 844: const char * to_name = java_lang_Class::as_Klass(result)->external_name(); Suggestion: const char* from_name = java_lang_Class::as_Klass(from)->external_name(); const char* to_name = java_lang_Class::as_Klass(result)->external_name(); pre-existing nit src/hotspot/share/prims/jvm.cpp line 910: > 908: > 909: InstanceKlass* lookup_k = java_lang_Class::as_InstanceKlass(lookup); > 910: // Lookup class must not be a primitive class (whose mirror null Klass*) Suggestion: // Lookup class must not be a primitive class (whose mirror is a null Klass*) src/hotspot/share/prims/jvm.cpp line 912: > 910: // Lookup class must not be a primitive class (whose mirror null Klass*) > 911: if (lookup_k == nullptr) { > 912: THROW_MSG_NULL(vmSymbols::java_lang_IllegalArgumentException(), "Lookup class is primitive"); This is a behavioural change. src/hotspot/share/prims/whitebox.cpp line 2166: > 2164: > 2165: WB_ENTRY(void, WB_LinkClass(JNIEnv* env, jobject wb, jclass clazz)) > 2166: Klass *k = java_lang_Class::as_Klass(clazz); Suggestion: Klass* k = java_lang_Class::as_Klass(clazz); src/hotspot/share/prims/whitebox.cpp line 2168: > 2166: Klass *k = java_lang_Class::as_Klass(clazz); > 2167: if (k->is_instance_klass()) { > 2168: InstanceKlass *ik = InstanceKlass::cast(k); Suggestion: InstanceKlass* ik = InstanceKlass::cast(k); ------------- Marked as reviewed by dholmes (Reviewer). 
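[Editorial illustration, not part of the original message.] The null check discussed in the review above (a primitive class's mirror carries a null Klass*, so the lookup class has to be checked before it is used) can be sketched in standalone C++ roughly as follows. This is a minimal sketch under stated assumptions: `MockKlass`, `MockInstanceKlass`, `MockMirror`, `as_instance_klass` and `checked_lookup_class` are hypothetical stand-ins rather than HotSpot declarations, and a C++ exception stands in for the VM's `THROW_MSG_NULL` macro.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the VM's class metadata types (NOT HotSpot code).
struct MockKlass {
  virtual ~MockKlass() = default;
  virtual bool is_instance_klass() const { return false; }
};

struct MockInstanceKlass : MockKlass {
  bool is_instance_klass() const override { return true; }
};

// Stand-in for a java.lang.Class mirror. A null klass models a primitive
// class such as int.class, as described in the review comment.
struct MockMirror {
  MockKlass* klass;
};

// Assumed behaviour: returns nullptr for a primitive mirror, otherwise
// downcasts after checking that the klass really is an instance klass.
inline MockInstanceKlass* as_instance_klass(const MockMirror& mirror) {
  if (mirror.klass == nullptr) {
    return nullptr;
  }
  assert(mirror.klass->is_instance_klass() && "not an instance class");
  return static_cast<MockInstanceKlass*>(mirror.klass);
}

// Caller-side check: reject primitive lookup classes before using the result.
// std::invalid_argument is only a stand-in for IllegalArgumentException.
inline MockInstanceKlass* checked_lookup_class(const MockMirror& lookup) {
  MockInstanceKlass* lookup_k = as_instance_klass(lookup);
  if (lookup_k == nullptr) {
    throw std::invalid_argument("Lookup class is primitive");
  }
  return lookup_k;
}
```

The point of the sketch is only that the conversion helper may legitimately return null, so whether the caller checks for it (and what it does then) is observable behaviour.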
PR Review: https://git.openjdk.org/jdk/pull/27158#pullrequestreview-3205473716 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336277816 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336280699 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336282079 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336308779 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336309246 From amitkumar at openjdk.org Wed Sep 10 11:10:00 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 10 Sep 2025 11:10:00 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:53:15 GMT, Thomas Schatzl wrote: >> We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful. >> >> Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine. >> >> Internally it went through Oracle's tier1-8 without any issue again (that is revision https://github.com/openjdk/jdk/pull/23739/commits/b3873d66cd43518d5dc71e060fc52a13372dbfa5, but the changes since then were very cosmetic). > >> Hi @tschatzl : I witnessed some conflicts when applying this on latest jdk head. Maybe you can do another merge and rebase? I can help rerun the tests on RISC-V. (Note that there is one failing test: `test/hotspot/jtreg/gtest/GTestWrapper.java` if I use the same base as in your repo, and that issue has been resolved by [JDK-8366897](https://bugs.openjdk.org/browse/JDK-8366897)) > > Merged. Thanks for another round of testing (also to @TheRealMDoerr). 
> Hi @tschatzl, I see one build failure on s390x, > > ``` > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGILL (0x4) at pc=0x000003ffaad14a42, pid=1801882, tid=1801914 > # > # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) > # Problematic frame: > # V [libjvm.so+0x414a46] BarrierSetNMethod::set_guard_value(nmethod*, int, int)+0xfe > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -F%F -- %E" (or dumping to /home/amit/jdk/make/core.1801882) > ``` > > I am working on this on priority, but just in case you want to merge it, I will open another issue and push the fix. Seems like build failure came from one change merged yesterday- https://github.com/openjdk/jdk/pull/26399, I am working on it and created https://bugs.openjdk.org/browse/JDK-8367325 ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3274462944 From tschatzl at openjdk.org Wed Sep 10 11:31:24 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 11:31:24 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:53:15 GMT, Thomas Schatzl wrote: >> We would like to finally wrap up this PR as we are about to Propose To Target [JEP 522](openjdk.org/jeps/522) to JDK 26, so it would be nice to get re-reviews from the reviewers that already signed it off if you think it is useful. >> >> Nothing much changed actually for months now, particularly in the area of target-specific support, but maybe one more rerun on the more exotic platforms (AIX/PPC, RISC-V) just in case would be fine. 
>> >> Internally it went through Oracle's tier1-8 without any issue again (that is revision https://github.com/openjdk/jdk/pull/23739/commits/b3873d66cd43518d5dc71e060fc52a13372dbfa5, but the changes since then were very cosmetic). > >> Hi @tschatzl : I witnessed some conflicts when applying this on latest jdk head. Maybe you can do another merge and rebase? I can help rerun the tests on RISC-V. (Note that there is one failing test: `test/hotspot/jtreg/gtest/GTestWrapper.java` if I use the same base as in your repo, and that issue has been resolved by [JDK-8366897](https://bugs.openjdk.org/browse/JDK-8366897)) > > Merged. Thanks for another round of testing (also to @TheRealMDoerr). > > Hi @tschatzl, I see one build failure on s390x, > > ``` > > # > > # A fatal error has been detected by the Java Runtime Environment: > > # > > # SIGILL (0x4) at pc=0x000003ffaad14a42, pid=1801882, tid=1801914 > > # > > # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) > > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) > > # Problematic frame: > > # V [libjvm.so+0x414a46] BarrierSetNMethod::set_guard_value(nmethod*, int, int)+0xfe [...] > > > > I am working on this on priority, but just in case you want to merge it, I will open another issue and push the fix. > > Seems like build failure came from one change merged yesterday- #26399, I am working on it and created https://bugs.openjdk.org/browse/JDK-8367325 This should definitely be caused by that other change. Apologies for the trouble. Thank you for working on this. 
Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3274535127 From tschatzl at openjdk.org Wed Sep 10 11:36:26 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 11:36:26 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v56] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 75 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * iwalulya: remove confusing comment - * sort includes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * improve logging for refinement, making it similar to marking logging - * commit merge changes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * fix merge error - * forgot to actually save the files - ... 
and 65 more: https://git.openjdk.org/jdk/compare/edae355e...de1469d6 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=55 Stats: 7114 lines in 112 files changed: 2590 ins; 3583 del; 941 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From tschatzl at openjdk.org Wed Sep 10 11:36:29 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 11:36:29 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:56:42 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits: > > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * iwalulya: remove confusing comment > - * sort includes > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * improve logging for refinement, making it similar to marking logging > - * commit merge changes > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * fix merge error > - * forgot to actually save the files > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - ... 
and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067 That windows-x64 failure also seems to be an issue caused by another change merged in today :( Already fixed in [JDK-8367309](https://bugs.openjdk.org/browse/JDK-8367309). ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3274545641 From coleenp at openjdk.org Wed Sep 10 11:44:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 10 Sep 2025 11:44:29 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2] In-Reply-To: References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> Message-ID: <4YATeT1_12Md5mCSr3ICRHSCfXBu0Ps4ha-rdMz4u_o=.a7bb56a2-b020-49cd-ad25-7618f9ec672e@github.com> On Wed, 10 Sep 2025 09:39:28 GMT, Stefan Karlsson wrote: >> There are 54 cases where we call as_Klass(JNIHandles::resolve_non_null(x)), so I think it would be nice to have a way to write less code. > I think you should deal with that by adding a helper function inside jni.cpp instead of extending the *Klass functions to take a jobject. I agree with this. I don't think you should move any more jobjects into the runtime code. The jobjects should stop at jni/jvm. They don't everywhere but that's something that over time we should fix. For instance, the ci creates jobjects but it should use OopHandles instead, except per-thread OopStorage isn't implemented yet. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3274573970 From tschatzl at openjdk.org Wed Sep 10 11:44:30 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 11:44:30 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v57] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. 
> > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. 
> > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * aph review, fix some comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/de1469d6..d0ca9062 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=56 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=55-56 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From tschatzl at openjdk.org Wed Sep 10 11:44:34 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 11:44:34 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 09:33:42 GMT, Andrew Haley wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 74 commits: >> >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * iwalulya: remove confusing comment >> - * sort includes >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * improve logging for refinement, making it similar to marking logging >> - * commit merge changes >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * fix merge error >> - * forgot to actually save the files >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - ... and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067 > > src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp line 95: > >> 93: Label loop; >> 94: Label next; >> 95: const Register end = count; > > This aliasing of register names is tricky and confusing. A trap for maintainers, of the kind that people have fallen into already. I can argue I was following precedent :) I see your point though. What do you suggest doing here? Use `count` throughout instead? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336468727 From coleenp at openjdk.org Wed Sep 10 11:48:37 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 10 Sep 2025 11:48:37 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v2] In-Reply-To: References: Message-ID: <3-WDWqpmB2N5SCpzxV0IbpeluWxHBm6RhFE5mkuiiRw=.137799b4-9854-4719-ac9f-85de4937bff4@github.com> On Tue, 9 Sep 2025 22:09:06 GMT, Ioi Lam wrote: >> The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls. >> >> This PR is intended to be a strict clean-up that preserves existing behaviors. >> >> The following helper functions are added to simplify boilerplate code in JNI methods.
>> >> static Klass* java_lang_Class::as_Klass(jobject java_class); >> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); >> static InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class); >> >> >> Note: >> >> Before this PR, we have both patterns: >> >> >> java_lang_Class::as_Klass(JNIHandles::resolve(cls)); >> java_lang_Class::as_Klass(JNIHandles::resolve_non_null(cls)); >> >> >> If `cls` is null, we would get an assert in both cases (`as_Klass()` requires a non-null input). Therefore, I am using `resolve_non_null()` in the `jobject` versions of `as_Klass()`. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into 8367142-simplify-java-mirror-handling-in-jni-methods > - @dholmes-ora comments - remove class_to_verify_considering_redefinition() changes, to be done in separate PR > - more fixes > - tmp: Clean up java mirror handling in JNI methods JFR is also a boundary between Java and native code. It can have jobjects too, but it should resolve them before calling into javaClasses. src/hotspot/share/classfile/javaClasses.hpp line 1905: > 1903: > 1904: > 1905: InstanceKlass* klass() const { return vmClasses::klass_at(klass_id); } Can you fix the indentation? src/hotspot/share/classfile/javaClasses.inline.hpp line 297: > 295: inline Klass* java_lang_Class::as_Klass(jobject java_class) { > 296: return as_Klass(JNIHandles::resolve_non_null(java_class)); > 297: } `JNIHandles` shouldn't be imported into this file. The resolve should happen in the caller. src/hotspot/share/classfile/javaClasses.inline.hpp line 305: > 303: } > 304: > 305: inline InstanceKlass* java_lang_Class::as_InstanceKlass(jobject java_class) { Same here. ------------- Changes requested by coleenp (Reviewer).
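[Editorial illustration, not part of the original message.] The layering suggested in this thread (keep handle resolution in a small helper at the JNI/JVM boundary, and let the runtime-side API accept only already-resolved mirrors) might look roughly like the standalone C++ sketch below. Everything here is an assumption made for illustration: `MockKlass`, `MockMirror`, `MockHandle` and the `mock_runtime` and `mock_jni` namespaces are hypothetical stand-ins, not HotSpot's `Klass`, `oop`, `jobject` or `JNIHandles`, and real JNI handles are opaque rather than plain pointers.

```cpp
#include <cassert>

// Hypothetical stand-ins (NOT HotSpot declarations).
struct MockKlass {
  const char* name;
};

// Stand-in for a resolved java.lang.Class mirror. A null klass models a
// primitive class.
struct MockMirror {
  MockKlass* klass;
};

// Stand-in for a JNI handle; modelled as a plain pointer for simplicity.
using MockHandle = MockMirror*;

namespace mock_runtime {
// Runtime-side API: accepts only resolved mirrors and knows nothing about
// handles.
inline MockKlass* as_klass(const MockMirror* mirror) {
  assert(mirror != nullptr && "as_klass requires a non-null mirror");
  return mirror->klass;
}
}  // namespace mock_runtime

namespace mock_jni {
// Boundary-side resolution of a handle that the caller guarantees is non-null.
inline const MockMirror* resolve_non_null(MockHandle handle) {
  assert(handle != nullptr && "handle must not be null");
  return handle;
}

// Boundary-local helper that removes the repeated resolve-then-convert
// boilerplate without widening the runtime API to take handles.
inline MockKlass* klass_from_handle(MockHandle handle) {
  return mock_runtime::as_klass(resolve_non_null(handle));
}
}  // namespace mock_jni
```

With this split, callers at the boundary write `mock_jni::klass_from_handle(h)`, while code on the runtime side never needs to include the handle machinery.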
PR Review: https://git.openjdk.org/jdk/pull/27158#pullrequestreview-3205765862 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336482267 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336483940 PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2336484858 From iwalulya at openjdk.org Wed Sep 10 12:03:47 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 10 Sep 2025 12:03:47 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 09:09:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * sort includes src/hotspot/share/gc/g1/g1Arguments.cpp line 250: > 248: // Verify that the maximum parallelism isn't too high to eventually overflow > 249: // the refcount in G1CardSetContainer. > 250: uint max_parallel_refinement_threads = G1ConcRefinementThreads; The local variable `max_parallel_refinement_threads` is no longer needed; `G1ConcRefinementThreads` can be used directly.
src/hotspot/share/gc/g1/g1CardTableClaimTable.inline.hpp line 95: > 93: i_card++; > 94: } > 95: assert(false, "should have early-returned"); Maybe use `ShouldNotReachHere();` src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 216: > 214: class G1SwapThreadCardTableClosure : public HandshakeClosure { > 215: public: > 216: G1SwapThreadCardTableClosure() : HandshakeClosure("G1 Swap JT card table") { } Above on L206 we use "Refine Java Thread CT"; here we use "Swap JT card table". I am not sure which is better, "Java Thread CT" or "JT card table", but let's pick one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2332539864 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2332881834 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2333281435 From iwalulya at openjdk.org Wed Sep 10 12:03:42 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 10 Sep 2025 12:03:42 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v57] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 11:44:30 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards.
Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * aph review, fix some comment Changes requested by iwalulya (Reviewer). 
src/hotspot/share/gc/g1/g1ConcurrentRefine.hpp line 154: > 152: void assert_state(State expected); > 153: > 154: static void snapshot_heap_into(G1CardTableClaimTable* sweep_table); Any reason why this is static? There would be no need to pass `_sweep_table` as an argument if it weren't static. ------------- PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3200132750 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336517161 From iwalulya at openjdk.org Wed Sep 10 12:03:51 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 10 Sep 2025 12:03:51 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 07:56:42 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits: > > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * iwalulya: remove confusing comment > - * sort includes > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * improve logging for refinement, making it similar to marking logging > - * commit merge changes > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * fix merge error > - * forgot to actually save the files > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - ... 
and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067 src/hotspot/share/gc/g1/g1CardTableClaimTable.hpp line 43: > 41: // Claiming works on full region (all cards in region) or a range of contiguous cards > 42: // (chunk). Chunk size is given at construction time. > 43: class G1CardTableClaimTable : public CHeapObj { Do we need the `Table` in the `G1CardTableClaimTable` or can just calling it `G1CardTableClaimer` suffice? src/hotspot/share/gc/g1/g1ConcurrentRefine.hpp line 301: > 299: // Indicate that last refinement adjustment had been deferred due to not > 300: // obtaining the heap lock. > 301: bool wait_for_heap_lock() const { return _heap_was_locked; } `wait_for_heap_lock()` does not do any waiting, maybe just maintain `heap_was_locked` as the method name. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336340738 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336332933 From tschatzl at openjdk.org Wed Sep 10 12:21:33 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 12:21:33 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 10:44:20 GMT, Ivan Walulya wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits: >> >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * iwalulya: remove confusing comment >> - * sort includes >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * improve logging for refinement, making it similar to marking logging >> - * commit merge changes >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * fix merge error >> - * forgot to actually save the files >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - ... 
and 64 more: https://git.openjdk.org/jdk/compare/9e3fa321...e7c3a067 > > src/hotspot/share/gc/g1/g1CardTableClaimTable.hpp line 43: > >> 41: // Claiming works on full region (all cards in region) or a range of contiguous cards >> 42: // (chunk). Chunk size is given at construction time. >> 43: class G1CardTableClaimTable : public CHeapObj { > > Do we need the `Table` in the `G1CardTableClaimTable` or can just calling it `G1CardTableClaimer` suffice? This is the table the claimer below works on. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336568870 From tschatzl at openjdk.org Wed Sep 10 12:40:11 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 12:40:11 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v58] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... 
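The barrier pseudocode in the description above can be modelled as a small, self-contained C++ toy, which may help when reasoning about which stores take the fast exits and which reach the enqueue path. This is only a sketch under stated assumptions: `ToyHeap`, `ToyThread`, `post_write_barrier`, the region and card sizes, the card values and the tiny queue capacity are all invented for illustration and are not HotSpot's actual types or constants. The sequentially consistent fence stands in for the `StoreLoad` step.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: every name, size and card value here is hypothetical.
enum Card : uint8_t { kClean = 0, kDirty = 1, kYoung = 2 };

struct ToyHeap {
  static constexpr size_t kRegionShift = 20;  // assumed 1 MiB regions
  static constexpr size_t kCardShift = 9;     // assumed 512-byte cards
  std::vector<uint8_t> cards;                 // one byte per card

  explicit ToyHeap(size_t bytes) : cards(bytes >> kCardShift, kClean) {}

  static size_t region(uintptr_t addr) { return addr >> kRegionShift; }

  uint8_t& card(uintptr_t addr) {
    assert((addr >> kCardShift) < cards.size());
    return cards[addr >> kCardShift];
  }
};

struct ToyThread {
  static constexpr size_t kQueueCapacity = 4; // tiny, so the slow path is easy to hit
  std::vector<size_t> local_dcq;              // thread-local dirty card queue
  std::vector<size_t> shared_dcqs;            // stands in for the global queue set
  size_t runtime_calls = 0;                   // how often the slow path ran
};

// Models the post-write barrier for `x.a = y`.
// field_addr plays the role of @x.a, value_addr the role of y (0 means null).
// Returns true if the card was newly dirtied and enqueued.
bool post_write_barrier(ToyHeap& heap, ToyThread& t,
                        uintptr_t field_addr, uintptr_t value_addr) {
  // Filtering
  if (ToyHeap::region(field_addr) == ToyHeap::region(value_addr)) return false; // same region
  if (value_addr == 0) return false;                                            // null value
  if (heap.card(field_addr) == kYoung) return false;                            // write to young gen
  std::atomic_thread_fence(std::memory_order_seq_cst);                          // StoreLoad
  if (heap.card(field_addr) == kDirty) return false;                            // already dirty

  heap.card(field_addr) = kDirty;

  // Card tracking
  t.local_dcq.push_back(field_addr >> ToyHeap::kCardShift);
  if (t.local_dcq.size() < ToyThread::kQueueCapacity) return true;

  // "Call runtime": hand the full thread-local queue over to the shared set.
  t.shared_dcqs.insert(t.shared_dcqs.end(), t.local_dcq.begin(), t.local_dcq.end());
  t.local_dcq.clear();
  ++t.runtime_calls;
  return true;
}
```

In this toy, only a cross-region, non-null store to a clean old-generation card gets past the filters, and only every `kQueueCapacity`-th such store pays for the hand-over to the shared set.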
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * walulyai review * tried to remove "logged card" terminology for the current "pending card" one ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/d0ca9062..b47c7b01 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=57 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=56-57 Stats: 55 lines in 11 files changed: 0 ins; 1 del; 54 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From epeter at openjdk.org Wed Sep 10 12:55:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 10 Sep 2025 12:55:29 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v54] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 08:53:13 GMT, Christian Hagedorn wrote: >> Is this going to be GC independent? What if the VM swaps to another GC ergonomically? > > The test runs with `vm.flagless`. But I suggest to just go with `>= 1` instead to be on the safe side. The purpose of this IR rule in the context of this test is really just that it does not fail and not about catching real issues/verifying the IR. > > If we still want to test the improved loop unrolling opportunities, I suggest to create a separate IR test for it, possibly in a separate RFE. I suppose that's probably not possible, as far as I know we always run with G1GC, so it should be fine :) We could put an upper bound of 3, maybe 4 here though. Just see what passes. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336663260 From iwalulya at openjdk.org Wed Sep 10 14:54:29 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 10 Sep 2025 14:54:29 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v58] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 12:40:11 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * walulyai review > * tried to remove "logged card" terminology for the current "pending card" one src/hotspot/share/gc/g1/g1ConcurrentRefineThread.hpp line 36: > 34: class G1ConcurrentRefine; > 35: > 36: // Concurrent refinement control thread watching card mark accrual on the card Suggestion: // Concurrent refinement control thread watching card mark accrual on the card table src/hotspot/share/gc/g1/g1GCPhaseTimes.hpp line 182: > 180: double _cur_optional_merge_heap_roots_time_ms; > 181: // Included in above merge and optional-merge time. 
> 182: double _cur_distribute_log_buffers_time_ms; No longer used. src/hotspot/share/gc/g1/g1HeapRegion.hpp line 41: > 39: class G1CardSet; > 40: class G1CardSetConfiguration; > 41: class G1CardTable; Do we need the Forward declaration here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336962845 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2336992289 PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2337004685 From tschatzl at openjdk.org Wed Sep 10 15:28:39 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Sep 2025 15:28:39 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... 
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * walulyai review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/b47c7b01..c469c137 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=58 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=57-58 Stats: 9 lines in 3 files changed: 0 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From iklam at openjdk.org Wed Sep 10 22:46:00 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Sep 2025 22:46:00 GMT Subject: RFR: 8367142: Simplify java mirror handling in JNI methods [v3] In-Reply-To: References: Message-ID: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> > The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls by adding a new function: > > > static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); > > > This PR is intended to be a strict clean-up that preserves existing behaviors. 
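The idea behind a typed accessor such as the `as_InstanceKlass()` function named above can be sketched with toy classes. This is a hypothetical model, not HotSpot code: `Klass`, `InstanceKlass`, `Mirror` and both free functions below are simplified stand-ins, and letting a null klass (as for a primitive mirror) pass through unchanged is an assumption made for the sketch, not something this thread confirms about the real function.

```cpp
#include <cassert>

// Toy stand-ins; the real HotSpot classes are far richer.
struct Klass {
  virtual ~Klass() = default;
  virtual bool is_instance_klass() const { return false; }
};

struct InstanceKlass : Klass {
  int field_count = 0;

  bool is_instance_klass() const override { return true; }

  // Checked downcast, modelled loosely on the InstanceKlass::cast() idiom.
  static InstanceKlass* cast(Klass* k) {
    assert(k != nullptr && k->is_instance_klass());
    return static_cast<InstanceKlass*>(k);
  }
};

// Stands in for a java.lang.Class mirror; a null klass models a primitive mirror.
struct Mirror {
  Klass* klass = nullptr;
};

// Generic accessor: a caller that needs an InstanceKlass* must cast at each call site.
inline Klass* as_Klass(const Mirror& m) { return m.klass; }

// Typed accessor: the type check lives in one place and callers get the
// narrower type directly. Null is passed through (an assumption of this sketch).
inline InstanceKlass* as_InstanceKlass(const Mirror& m) {
  Klass* k = m.klass;
  assert(k == nullptr || k->is_instance_klass());
  return static_cast<InstanceKlass*>(k);
}
```

With the typed accessor, a call site that previously read `InstanceKlass::cast(as_Klass(m))` can shrink to `as_InstanceKlass(m)`, which is the kind of clean-up the description is after.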
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  Removed the (jobject) version of as_Klass/as_InstanceKlass

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/27158/files
  - new: https://git.openjdk.org/jdk/pull/27158/files/d943d2fe..f8634eff

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27158&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27158&range=01-02

  Stats: 70 lines in 9 files changed: 4 ins; 13 del; 53 mod
  Patch: https://git.openjdk.org/jdk/pull/27158.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27158/head:pull/27158

PR: https://git.openjdk.org/jdk/pull/27158

From iklam at openjdk.org Wed Sep 10 22:50:30 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Wed, 10 Sep 2025 22:50:30 GMT
Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3]
In-Reply-To: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com>
References: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com>
Message-ID:

On Wed, 10 Sep 2025 22:46:00 GMT, Ioi Lam wrote:

>> The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls by adding a new function:
>>
>>
>> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class);
>>
>>
>> This PR is intended to be a strict clean-up that preserves existing behaviors.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>
>   Removed the (jobject) version of as_Klass/as_InstanceKlass

I didn't realize that my attempt to remove the `JNIHandles::resolve()` boilerplate can be controversial. I can't put a helper function in jni.cpp because this pattern is used in several files. I've reverted to the old code that makes the explicit calls to `JNIHandles::resolve()`.
I updated the JBS issue text as this PR is now only for reducing the number of InstanceKlass::cast() calls. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3276767203 From iklam at openjdk.org Wed Sep 10 22:50:33 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Sep 2025 22:50:33 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v2] In-Reply-To: <3-WDWqpmB2N5SCpzxV0IbpeluWxHBm6RhFE5mkuiiRw=.137799b4-9854-4719-ac9f-85de4937bff4@github.com> References: <3-WDWqpmB2N5SCpzxV0IbpeluWxHBm6RhFE5mkuiiRw=.137799b4-9854-4719-ac9f-85de4937bff4@github.com> Message-ID: On Wed, 10 Sep 2025 11:43:33 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' into 8367142-simplify-java-mirror-handling-in-jni-methods >> - @dholmes-ora comments - remove class_to_verify_considering_redefinition() changes, to be done in separate PR >> - more fixes >> - tmp: Clean up java mirror handling in JNI methods > > src/hotspot/share/classfile/javaClasses.hpp line 1905: > >> 1903: >> 1904: >> 1905: InstanceKlass* klass() const { return vmClasses::klass_at(klass_id); } > > Can you fix the indentation? I removed all indentation alignments from this class, as they no longer seem warranted. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2338071887 From iklam at openjdk.org Wed Sep 10 22:54:44 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Sep 2025 22:54:44 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v2] In-Reply-To: <9KIc7fkGt1OyYGBjvqaE4PWdALnulp2hf1zNIT64lHo=.c7dc978f-9e0a-419c-883c-d0cf43ff9155@github.com> References: <9KIc7fkGt1OyYGBjvqaE4PWdALnulp2hf1zNIT64lHo=.c7dc978f-9e0a-419c-883c-d0cf43ff9155@github.com> Message-ID: On Wed, 10 Sep 2025 10:18:45 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' into 8367142-simplify-java-mirror-handling-in-jni-methods >> - @dholmes-ora comments - remove class_to_verify_considering_redefinition() changes, to be done in separate PR >> - more fixes >> - tmp: Clean up java mirror handling in JNI methods > > src/hotspot/share/prims/jvm.cpp line 912: > >> 910: // Lookup class must not be a primitive class (whose mirror null Klass*) >> 911: if (lookup_k == nullptr) { >> 912: THROW_MSG_NULL(vmSymbols::java_lang_IllegalArgumentException(), "Lookup class is primitive"); > > This is a behavioural change. I reverted the change to the error message. I don't know how we will ever get a primitive class in there and who would be reading the error message. I added a comment saying the error message is wrong, so people reading this code will not get confused. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27158#discussion_r2338076531 From dholmes at openjdk.org Thu Sep 11 00:41:22 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Sep 2025 00:41:22 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3] In-Reply-To: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> References: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> Message-ID: On Wed, 10 Sep 2025 22:46:00 GMT, Ioi Lam wrote: >> The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls by adding a new function: >> >> >> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); >> >> >> This PR is intended to be a strict clean-up that preserves existing behaviors. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Removed the (jobject) version of as_Klass/as_InstanceKlass Reduced set of changes still looks good. I was prepared to be accommodating on the broader change, but seems others agreed with my initial position. :) ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27158#pullrequestreview-3208294346 From kbarrett at openjdk.org Thu Sep 11 04:23:16 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 11 Sep 2025 04:23:16 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v2] In-Reply-To: <8HMzxvTwZd1uSZCs528eM4pHsJVeKmFGtplElc8vXpk=.643b3706-7af2-40aa-835c-c3f8a785dd0e@github.com> References: <8HMzxvTwZd1uSZCs528eM4pHsJVeKmFGtplElc8vXpk=.643b3706-7af2-40aa-835c-c3f8a785dd0e@github.com> Message-ID: On Wed, 10 Sep 2025 07:34:09 GMT, Kim Barrett wrote: >> Please review this change that renames the all-static class `Atomic` to >> `AtomicAccess`. 
The reason for this name change is to allow the introduction >> of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). >> >> The PR has several commits, according to the specific category of change being >> made. It may be easier to review the PR by studying these individual commits. >> >> Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose >> to not rename the various "atomic_.*" and "atomic__.*" files. >> >> There are a number of comments containing the word "Atomic" that I didn't >> change. They are generically about atomic operations, and will just as well >> serve as referring to the future `Atomic`. >> >> Testing: mach5 tier1, GHA sanity tests. >> This is one of those changes where successful builds indicate the change is good. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - rename recently added Atomic:: => AtomicAccess:: > - Merge branch 'master' into atomic-access > - fix prefiously missed arg misalignments > - rename test_atomic.cpp > - update copyrights > - misc cleanups > - fix indentation from rename > - rename Atomic => AtomicAccess in gtests > - rename Atomic => AtomicAccess > - change includes of atomic.hpp in gtests > - ... and 2 more: https://git.openjdk.org/jdk/compare/af9b9050...11007c45 Needs re-review because of updates to deal with merge conflicts and to update newly merged code that needed the renamings applied. @stefank @theRealAph @dholmes-ora Hm, now says there is a new merge conflict. Guess you folks should hold off on re-review. src/hotspot/cpu/aarch64/gc/shared/barrierSetNMethod_aarch64.cpp line 118: > 116: } > 117: > 118: void set_value(int value, int bit_mask) { This and similar changes for arm and riscv came from a merged changeset. I took that new version and updated it for `Atomic::` => `AtomicAccess::` renaming. 
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 394: > 392: last_recompute_check = os::javaTimeNanos(); > 393: } > 394: DEBUG_ONLY(if (AtomicAccess::load_acquire(&_out_of_stack_walking_enabled)) {) Another case of taking a merge update and then adjusting for `Atomic::` => `AtomicAccess::`. src/hotspot/share/runtime/basicLock.inline.hpp line 32: > 30: #include "runtime/objectMonitor.inline.hpp" > 31: > 32: inline markWord BasicLock::displaced_header() const { These functions were removed by a merge update. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3277397927 PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3277401578 PR Review Comment: https://git.openjdk.org/jdk/pull/27135#discussion_r2338471419 PR Review Comment: https://git.openjdk.org/jdk/pull/27135#discussion_r2338472405 PR Review Comment: https://git.openjdk.org/jdk/pull/27135#discussion_r2338473094 From kbarrett at openjdk.org Thu Sep 11 05:30:11 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 11 Sep 2025 05:30:11 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v3] In-Reply-To: References: Message-ID: <2uwnMiCfWvw9BT15Us6UZqTE_JD4-OGNIsjjzZotu9Y=.7efc037c-5d91-4b7b-88be-c48adb659ff6@github.com> > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. 
They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'master' into atomic-access - rename recently added Atomic:: => AtomicAccess:: - Merge branch 'master' into atomic-access - fix prefiously missed arg misalignments - rename test_atomic.cpp - update copyrights - misc cleanups - fix indentation from rename - rename Atomic => AtomicAccess in gtests - rename Atomic => AtomicAccess - ... and 3 more: https://git.openjdk.org/jdk/compare/134c3ef4...00ecd55c ------------- Changes: https://git.openjdk.org/jdk/pull/27135/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27135&range=02 Stats: 5577 lines in 430 files changed: 1587 ins; 1585 del; 2405 mod Patch: https://git.openjdk.org/jdk/pull/27135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27135/head:pull/27135 PR: https://git.openjdk.org/jdk/pull/27135 From dholmes at openjdk.org Thu Sep 11 06:00:20 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Sep 2025 06:00:20 GMT Subject: RFR: 8367014: Rename class Atomic to AtomicAccess [v3] In-Reply-To: <2uwnMiCfWvw9BT15Us6UZqTE_JD4-OGNIsjjzZotu9Y=.7efc037c-5d91-4b7b-88be-c48adb659ff6@github.com> References: <2uwnMiCfWvw9BT15Us6UZqTE_JD4-OGNIsjjzZotu9Y=.7efc037c-5d91-4b7b-88be-c48adb659ff6@github.com> Message-ID: On Thu, 11 Sep 2025 05:30:11 GMT, Kim Barrett wrote: >> Please review this change that renames the all-static class `Atomic` to >> `AtomicAccess`. The reason for this name change is to allow the introduction >> of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). >> >> The PR has several commits, according to the specific category of change being >> made. 
It may be easier to review the PR by studying these individual commits.
>>
>> Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose
>> to not rename the various "atomic_.*" and "atomic__.*" files.
>>
>> There are a number of comments containing the word "Atomic" that I didn't
>> change. They are generically about atomic operations, and will just as well
>> serve as referring to the future `Atomic`.
>>
>> Testing: mach5 tier1, GHA sanity tests.
>> This is one of those changes where successful builds indicate the change is good.
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
>
>  - Merge branch 'master' into atomic-access
>  - rename recently added Atomic:: => AtomicAccess::
>  - Merge branch 'master' into atomic-access
>  - fix prefiously missed arg misalignments
>  - rename test_atomic.cpp
>  - update copyrights
>  - misc cleanups
>  - fix indentation from rename
>  - rename Atomic => AtomicAccess in gtests
>  - rename Atomic => AtomicAccess
>  - ... and 3 more: https://git.openjdk.org/jdk/compare/134c3ef4...00ecd55c

Still good.

Thanks

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27135#pullrequestreview-3208979512

From stefank at openjdk.org Thu Sep 11 06:47:16 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Thu, 11 Sep 2025 06:47:16 GMT
Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3]
In-Reply-To:
References: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com>
Message-ID: <5c8zxQ36O0VJk5X7lu5QRuh7LhM49ZdpUTmDV-qc2l4=.9a4c2d95-9a0a-4ba6-8f73-b05e90030f6e@github.com>

On Wed, 10 Sep 2025 22:47:27 GMT, Ioi Lam wrote:

> I didn't realize that my attempt to remove the JNIHandles::resolve() boilerplate can be controversial.

Removing boilerplate wasn't controversial.
Spreading the j* types can be seen as controversial given that we have various efforts to push those types out to the boundaries of the JVM. Adding new convenience functions that accept j* goes in the opposite direction.

> I can't put a helper function in jni.cpp because this pattern is used in several files.

But almost all are in jni.cpp and jvm.cpp and you can get rid of most of the boilerplate code by adding local helpers there. The handfulish of other places could keep their explicit usage of JNIHandles::resolve* calls.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3278518160

From tschatzl at openjdk.org Thu Sep 11 07:26:25 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Thu, 11 Sep 2025 07:26:25 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Sep 2025 15:28:39 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier.
>>
>> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25.
>>
>> ### Current situation
>>
>> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier.
>>
>> The main reason for the current barrier is how g1 implements concurrent refinement:
>> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations.
>> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * walulyai review That test failure in windows-x64 is a shenandoah timeout that looks unrelated. 
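For readers following the barrier discussion: the quoted description contrasts the current 40-50 instruction post-write barrier with the "three or four" instructions used by Parallel and Serial GC. A minimal sketch of what such an unconditional card mark amounts to is below. It is only an illustration, not code from the PR; the names `CardTableSketch`, `kCardShift`, `kCleanCard` and `kDirtyCard`, the 512-byte card size and the card values are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constants for illustration only; not taken from the PR.
constexpr unsigned kCardShift = 9;      // assume one card covers 512 bytes
constexpr uint8_t  kCleanCard = 0xff;   // assumed "clean" marker
constexpr uint8_t  kDirtyCard = 0x00;   // assumed "dirty" marker

// A toy card table: one byte per card, covering [heap_base, heap_base + heap_bytes).
struct CardTableSketch {
  uintptr_t heap_base;
  std::vector<uint8_t> cards;

  CardTableSketch(uintptr_t base, std::size_t heap_bytes)
    : heap_base(base), cards((heap_bytes >> kCardShift) + 1, kCleanCard) {}

  // Map the address of the written field to its card index.
  std::size_t index_for(uintptr_t field_addr) const {
    return (field_addr - heap_base) >> kCardShift;
  }

  // Unconditional card mark for `x.a = y`: shift the field address, store a byte.
  // No filtering, no StoreLoad, no enqueueing into a thread-local queue.
  void post_write_barrier(uintptr_t field_addr) {
    cards[index_for(field_addr)] = kDirtyCard;
  }

  bool is_dirty(uintptr_t field_addr) const {
    return cards[index_for(field_addr)] == kDirtyCard;
  }
};
```

In this shape the mutator never synchronizes with anyone on the fast path, which is why the description pairs the smaller barrier with coarse-grained switching of card tables instead of per-card synchronization.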
------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3278828076 From aph at openjdk.org Thu Sep 11 10:28:16 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Sep 2025 10:28:16 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 11:38:38 GMT, Thomas Schatzl wrote: > I can argue I was following precedent :) You were, but it's a precedent that needs to die. > I see your point though. What do you suggest to do here? Use `count` throughout instead? Yes, although it might need a couple more comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2340016401 From iklam at openjdk.org Thu Sep 11 18:34:06 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 11 Sep 2025 18:34:06 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3] In-Reply-To: <5c8zxQ36O0VJk5X7lu5QRuh7LhM49ZdpUTmDV-qc2l4=.9a4c2d95-9a0a-4ba6-8f73-b05e90030f6e@github.com> References: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> <5c8zxQ36O0VJk5X7lu5QRuh7LhM49ZdpUTmDV-qc2l4=.9a4c2d95-9a0a-4ba6-8f73-b05e90030f6e@github.com> Message-ID: On Thu, 11 Sep 2025 06:44:51 GMT, Stefan Karlsson wrote: > > I didn't realize that my attempt to remove the JNIHandles::resolve() boilerplate can be conversional. > > Removing boilerplate wasn't controversial. Spreading the j* types can be seen as controversial give that we have various efforts to push those types out to the boundaries of the JVM. Adding new convenience functions that accept j* goes in the opposite direction. > > > I can't put a helper function in jni.cpp because this pattern is used in several files. > > But almost all are in jni.cpp and jvm.cpp and you can get rid of most of the boilerplate code by adding local helpers there. 
The handfulish of other places could keep their explicit usage of JNIHandles::resolve* calls. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3278518160 From tschatzl at openjdk.org Thu Sep 11 07:26:25 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 11 Sep 2025 07:26:25 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 15:28:39 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations.
Thanks for reviews @theRealAph , @stefank , and @dholmes-ora ------------- PR Comment: https://git.openjdk.org/jdk/pull/27135#issuecomment-3283901596 From kbarrett at openjdk.org Fri Sep 12 06:39:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 12 Sep 2025 06:39:35 GMT Subject: Integrated: 8367014: Rename class Atomic to AtomicAccess In-Reply-To: References: Message-ID: On Mon, 8 Sep 2025 06:26:03 GMT, Kim Barrett wrote: > Please review this change that renames the all-static class `Atomic` to > `AtomicAccess`. The reason for this name change is to allow the introduction > of the new type `Atomic` ([JDK-8367013](https://bugs.openjdk.org/browse/JDK-8367013)). > > The PR has several commits, according to the specific category of change being > made. It may be easier to review the PR by studying these individual commits. > > Although the file "atomic.hpp" is being renamed to "atomicAccess.hpp", I chose > to not rename the various "atomic_.*" and "atomic__.*" files. > > There are a number of comments containing the word "Atomic" that I didn't > change. They are generically about atomic operations, and will just as well > serve as referring to the future `Atomic`. > > Testing: mach5 tier1, GHA sanity tests. > This is one of those changes where successful builds indicate the change is good. This pull request has now been integrated. 
Changeset: 9e843f56 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/9e843f56ec0e4126e8256dff44f47c56e5282d20 Stats: 5577 lines in 430 files changed: 1587 ins; 1585 del; 2405 mod 8367014: Rename class Atomic to AtomicAccess Reviewed-by: dholmes, aph, stefank ------------- PR: https://git.openjdk.org/jdk/pull/27135 From tschatzl at openjdk.org Fri Sep 12 07:26:30 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 Sep 2025 07:26:30 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: On Thu, 11 Sep 2025 10:24:57 GMT, Andrew Haley wrote: >> I can argue I was following precedent :) I see your point though. What do you suggest to do here? Use `count` throughout instead? > >> I can argue I was following precedent :) > > You were, but it's a precedent that needs to die. > >> I see your point though. What do you suggest to do here? Use `count` throughout instead? > > Yes, although it might need a couple more comments. What do you think of https://github.com/openjdk/jdk/commit/74e9240ba275986375d3e6f0ac9bfa4b5fbb78ce ? (not committed in this branch yet because I do not want all the back-and-forth here) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2343254077 From tschatzl at openjdk.org Fri Sep 12 08:29:59 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 Sep 2025 08:29:59 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v60] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. 
> > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). 
> > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 80 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * therealaph suggestion for avoiding the register aliasin in gen_write_ref_array_post - * walulyai review - * walulyai review * tried to remove "logged card" terminology for the current "pending card" one - * aph review, fix some comment - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * iwalulya: remove confusing comment - * sort includes - Merge branch 'master' into 8342382-card-table-instead-of-dcq - ... and 70 more: https://git.openjdk.org/jdk/compare/9e843f56...1ced9f98 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=59 Stats: 7162 lines in 112 files changed: 2594 ins; 3588 del; 980 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From mdoerr at openjdk.org Fri Sep 12 08:30:01 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 12 Sep 2025 08:30:01 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: References: Message-ID: <63y80KoZ7oXdnFLRmK28Z8wcOSSAMXHx91akDn0tcLc=.981add6e-0b3f-423f-9b5f-87bb7d9fad9a@github.com> On Fri, 12 Sep 2025 07:23:23 GMT, Thomas Schatzl wrote: >>> I can argue I was following precedent :) >> >> You were, but it's a precedent that needs to die. >> >>> I see your point though. What do you suggest to do here? 
Use `count` throughout instead? >> >> Yes, although it might need a couple more comments. > > What do you think of https://github.com/openjdk/jdk/commit/74e9240ba275986375d3e6f0ac9bfa4b5fbb78ce ? (not committed in this branch yet because I do not want all the back-and-forth here) Other idea: set count = noreg to prevent usage after it is used under the other name. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2343415421 From tschatzl at openjdk.org Mon Sep 15 07:18:14 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 15 Sep 2025 07:18:14 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v61] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... 
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * iwalulya review * documentation for a few PSS members * rename some member variables to contain _ct and _rt suffixes in remembered set verification ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/1ced9f98..bf8cab33 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=60 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=59-60 Stats: 25 lines in 3 files changed: 10 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From coleenp at openjdk.org Mon Sep 15 13:38:27 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 13:38:27 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace Message-ID: This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. Tested with tier1-4. 
------------- Commit messages: - 8365823: Revert storing abstract and interface Klasses to non-class metaspace - 8365823: Revert storing abstract and interface Klasses to non-class metaspace Changes: https://git.openjdk.org/jdk/pull/27295/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365823 Stats: 85 lines in 18 files changed: 6 ins; 55 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/27295.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27295/head:pull/27295 PR: https://git.openjdk.org/jdk/pull/27295 From coleenp at openjdk.org Mon Sep 15 13:38:28 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 13:38:28 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 13:28:45 GMT, Coleen Phillimore wrote: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. to review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3292173729 From shade at openjdk.org Mon Sep 15 13:46:34 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Sep 2025 13:46:34 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 13:28:45 GMT, Coleen Phillimore wrote: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. 
First pass comments below. Also, I looked at original change, and I wonder if we want to revert other changes as well. It looks to me they are fairly innocuous, TBH, so I have no strong opinion about them. Changing `final` -> `abstract` in `InvokerBytecodeGenerator`: https://github.com/openjdk/jdk/commit/ad104932e6c26806c353ad048ce5cff7d2b4c29a?diff=unified#diff-3b05b61400e7766115409b3f508d839fb51e450423822252ab2e18543427c764L249-R249 JFR: https://github.com/openjdk/jdk/commit/ad104932e6c26806c353ad048ce5cff7d2b4c29a?diff=unified#diff-d58d6d9783cb29084a15c42ecd7f59860a48be8bcfd9be0ee15a9d50209b576fR1-R160 And to the test: https://github.com/openjdk/jdk/commit/ad104932e6c26806c353ad048ce5cff7d2b4c29a?diff=unified#diff-00138acd973f46c5f91674e5388ee82d2e7ed1b788ed551f34120cc761d228b7L1-R166 src/hotspot/share/memory/metaspace.cpp line 885: > 883: MetaspaceCriticalAllocation::block_if_concurrent_purge(); > 884: > 885: MetadataType mdtype = type == MetaspaceObj::ClassType ? ClassType: NonClassType; This matches the style of original hunk: Suggestion: MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : NonClassType; (see https://github.com/openjdk/jdk/commit/ad104932e6c26806c353ad048ce5cff7d2b4c29a?diff=unified#diff-d22e3e58e52d574f4277c0f89304d775b68833148a57c5af6760395b002b2b86L843-R843) src/hotspot/share/memory/metaspace.cpp line 920: > 918: > 919: if (result == nullptr) { > 920: MetadataType mdtype = type == MetaspaceObj::ClassType ? ClassType: NonClassType; Suggestion: MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : NonClassType; src/hotspot/share/oops/klass.cpp line 281: > 279: #ifdef _LP64 > 280: if (UseCompactObjectHeaders) { > 281: precond(CompressedKlassPointers::is_encodable(kls)); Sounds like we want to leave this comment in: // With compact object headers, the narrow Klass ID is part of the mark word. // We therefore seed the mark word with the narrow Klass ID. 
------------- PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3224650389 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349034149 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349035742 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349020024 From liach at openjdk.org Mon Sep 15 14:17:51 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 15 Sep 2025 14:17:51 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace In-Reply-To: References: Message-ID: <-fecZtztO6hiM81X7rvfgiuqYRSx3JI6wdj_XvK7G4Q=.08a72d12-8b54-46a4-a2ae-b5af9febe652@github.com> On Mon, 15 Sep 2025 13:28:45 GMT, Coleen Phillimore wrote: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. Re InvokerBytecodeGenerator: it was rolled back almost immediately after the original patch due to performance regressions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3292388736 From coleenp at openjdk.org Mon Sep 15 14:34:45 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 14:34:45 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v2] In-Reply-To: References: Message-ID: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. 
Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: - Update src/hotspot/share/memory/metaspace.cpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/memory/metaspace.cpp Co-authored-by: Aleksey Shipilëv - Restore comment. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27295/files - new: https://git.openjdk.org/jdk/pull/27295/files/864e938d..097c6b33 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=00-01 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27295.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27295/head:pull/27295 PR: https://git.openjdk.org/jdk/pull/27295 From coleenp at openjdk.org Mon Sep 15 14:34:47 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 14:34:47 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 13:33:47 GMT, Aleksey Shipilev wrote: >> Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update src/hotspot/share/memory/metaspace.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/memory/metaspace.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Restore comment. > > src/hotspot/share/oops/klass.cpp line 281: > >> 279: #ifdef _LP64 >> 280: if (UseCompactObjectHeaders) { >> 281: precond(CompressedKlassPointers::is_encodable(kls)); > > Sounds like we want to leave this comment in: > > > // With compact object headers, the narrow Klass ID is part of the mark word. > // We therefore seed the mark word with the narrow Klass ID. Little too aggressive with the delete button.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349206231 From coleenp at openjdk.org Mon Sep 15 14:38:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 14:38:03 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove CFP.is_abstract(). ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27295/files - new: https://git.openjdk.org/jdk/pull/27295/files/097c6b33..fe44431f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27295.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27295/head:pull/27295 PR: https://git.openjdk.org/jdk/pull/27295 From coleenp at openjdk.org Mon Sep 15 14:38:04 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 14:38:04 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace In-Reply-To: References: Message-ID: <4zUQMNnJKfrNy91JGdPH8u6Ev2pQOlgKMb-NBE4Qrok=.95b41cbd-1a75-45b7-a539-b804458afcee@github.com> On Mon, 15 Sep 2025 13:28:45 GMT, Coleen Phillimore wrote: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. 
This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. The JFR changes were a slight refactoring, so I don't think they need to be changed back. The ClassFileParser.is_abstract() function is now unused. Well spotted. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3292499021 From shade at openjdk.org Mon Sep 15 15:09:38 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Sep 2025 15:09:38 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 14:38:03 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove CFP.is_abstract(). This looks good to me, I have a few more questions: src/hotspot/share/ci/ciKlass.hpp line 112: > 110: assert(is_in_encoding_range, "sanity"); > 111: return is_in_encoding_range; > 112: } I thought about this method, and it _looks_ to be just: bool is_in_encoding_range() { assert(CompressedKlassPointers::is_encodable(get_Klass()), "sanity"); return true; } But I think your version is better, since it is more paranoid and actually checks if we are dealing with the class within the class range.
src/hotspot/share/memory/allocation.cpp line 76: > 74: // Klass has its own operator new > 75: assert(type != ClassType, "class has its own operator new"); > 76: return Metaspace::allocate(loader_data, word_size, type, /*use_class_space*/ false, THREAD); Question: now that `ArrayKlass` and `InstanceKlass` do not have `::new`, the assert above is still valid? I am guessing all these inherit `Klass::new`? ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3225077178 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349317958 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349286698 From coleenp at openjdk.org Mon Sep 15 15:45:25 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 15:45:25 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3] In-Reply-To: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> References: <3Xh3sS0I7pp0fslgQndt4CLqddxga7W8c11tuZkNgd8=.b5a28dba-05f1-49d9-80f8-6260a9772287@github.com> Message-ID: On Wed, 10 Sep 2025 22:46:00 GMT, Ioi Lam wrote: >> The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls by adding a new function: >> >> >> static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); >> >> >> This PR is intended to be a strict clean-up that preserves existing behaviors. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Removed the (jobject) version of as_Klass/as_InstanceKlass This looks great. Thanks! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27158#pullrequestreview-3225277530 From coleenp at openjdk.org Mon Sep 15 15:59:39 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Sep 2025 15:59:39 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: <7v3qSjT6T3crtTXTRCwHuire5p-HFHDp6LyO3YfuHvs=.5969244a-ec88-4e9f-af2f-8263711dc010@github.com> On Mon, 15 Sep 2025 15:06:04 GMT, Aleksey Shipilev wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove CFP.is_abstract(). > > src/hotspot/share/ci/ciKlass.hpp line 112: > >> 110: assert(is_in_encoding_range, "sanity"); >> 111: return is_in_encoding_range; >> 112: } > > I thought about this method, and it _looks_ to be just: > > > bool is_in_encoding_range() { > assert(CompressedKlassPointers::is_encodable(get_Klass()), "sanity"); > return true; > } > > > But I think your version is better, since it is more paranoid and actually checks if we are dealing with the class within the class range. I was on the verge of removing this but that snowballed into removing all sorts of other things that maybe can be removed later. This was sort of where I cut it off and I did like the variable name. > src/hotspot/share/memory/allocation.cpp line 76: > >> 74: // Klass has its own operator new >> 75: assert(type != ClassType, "class has its own operator new"); >> 76: return Metaspace::allocate(loader_data, word_size, type, /*use_class_space*/ false, THREAD); > > Question: now that `ArrayKlass` and `InstanceKlass` do not have `::new`, the assert above is still valid? I am guessing all these inherit `Klass::new`? Yes, all Klass allocation should call Klass::operator new(). Klass is derived from Metadata that's derived from MetaspaceObj. The Klass operator new _hides_ the one in MetaspaceObj. 
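The hiding behaviour described in the reply above is ordinary C++ name lookup: once a class declares any `operator new`, the overloads inherited from its bases are no longer found for that class or its subclasses. A small self-contained sketch follows. The `*Sketch` types, the tag structs and the `last_arena` variable are made up for the illustration and do not mirror the real HotSpot signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Records which allocator ran last; purely for demonstration.
enum class Arena { None, NonClassSpace, ClassSpace };
static Arena last_arena = Arena::None;

// Hypothetical tag types standing in for the real allocation arguments.
struct GenericTag {};
struct KlassTag {};

struct MetaspaceObjSketch {
  // Generic allocator, standing in for the MetaspaceObj one.
  static void* operator new(std::size_t size, GenericTag) {
    last_arena = Arena::NonClassSpace;
    return std::malloc(size);   // sketch only: no out-of-memory handling
  }
  static void operator delete(void* p) { std::free(p); }
};

struct MetadataSketch : MetaspaceObjSketch {};

struct KlassSketch : MetadataSketch {
  // Declaring operator new here hides the inherited GenericTag overload
  // for KlassSketch and everything derived from it.
  static void* operator new(std::size_t size, KlassTag) {
    last_arena = Arena::ClassSpace;
    return std::malloc(size);   // sketch only: no out-of-memory handling
  }
};

// Neither subclass declares operator new; both inherit the closest one.
struct InstanceKlassSketch : KlassSketch { int vtable_len = 0; };
struct SymbolSketch : MetaspaceObjSketch { int length = 0; };
```

With these definitions, `new (KlassTag{}) InstanceKlassSketch()` goes through the `KlassSketch` allocator, while `new (GenericTag{}) InstanceKlassSketch()` does not compile because the base overload is hidden. That is the property the assert in allocation.cpp relies on, as far as the thread describes it.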
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349433048 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2349449119 From iklam at openjdk.org Tue Sep 16 01:08:35 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 16 Sep 2025 01:08:35 GMT Subject: RFR: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass [v3] In-Reply-To: References: <30w65N-VMzjkn9Tpy6cXrKz2F-gyoGSyCMpzLc2H8HI=.82760ff5-84b2-4adb-897e-ee8f94be6dad@github.com> Message-ID: On Wed, 10 Sep 2025 07:28:42 GMT, David Holmes wrote: >>> > So I am not sure if we really have that separation anymore. >>> >>> I think it is more that there are many bits of code that actually form the "boundary" (prims, services, some runtime, jvmci, interpreter-related). But I guess it is hard to argue this makes it markedly worse. >> >> Arguably the translation of Java mirrors to Klasses is also a boundary (from Java representation to VM representation) :-) >> >> In reality I think because jobjects are easy to use and are just another kind of handle (like Handle and OopHandle), the leakage from JNI code to other parts of VM just happened naturally. >> >>> > The code already assumes that it has an InstanceKlass, and I am not changing that. >>> >>> Okay. >> >> BTW I removed the JVMTI changes from this PR. > >> Arguably the translation of Java mirrors to Klasses is also a boundary (from Java representation to VM representation) :-) > > The mirror is an oop, both oop and klass are internal VM representations. 
Thanks @dholmes-ora @coleenp for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/27158#issuecomment-3294479460 From iklam at openjdk.org Tue Sep 16 01:08:36 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 16 Sep 2025 01:08:36 GMT Subject: Integrated: 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass In-Reply-To: References: Message-ID: On Tue, 9 Sep 2025 05:21:10 GMT, Ioi Lam wrote: > The purpose of this PR is to simplify JNI code and also to avoid unnecessary `InstanceKlass::cast()` calls by adding a new function: > > > static InstanceKlass* java_lang_Class::as_InstanceKlass(oop java_class); > > > This PR is intended to be a strict clean-up that preserves existing behaviors. This pull request has now been integrated. Changeset: 24255848 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/242558484985cb954b0e658776fd59cbca1be1db Stats: 110 lines in 15 files changed: 9 ins; 30 del; 71 mod 8367142: Avoid InstanceKlass::cast when converting java mirror to InstanceKlass Reviewed-by: dholmes, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/27158 From fyang at openjdk.org Tue Sep 16 06:06:23 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 16 Sep 2025 06:06:23 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: <-D3Oiz-fmDudrEt2zZDa40Xff-Z4DP5GBWhts-CxPsY=.ea2ea9d6-49e6-4973-b4d2-cc1e0201f3c8@github.com> On Thu, 11 Sep 2025 07:24:13 GMT, Thomas Schatzl wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> * walulyai review > > That test failure in windows-x64 is a shenandoah timeout that looks unrelated. @tschatzl : Hi, would you mind adding a small cleanup change for riscv? This also adds back the assertion about the registers. Still test good on linux-riscv64 platform. 
[riscv-addon.diff.txt](https://github.com/user-attachments/files/22356611/riscv-addon.diff.txt) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3295662846 From tschatzl at openjdk.org Tue Sep 16 07:36:40 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 16 Sep 2025 07:36:40 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: On Thu, 11 Sep 2025 07:24:13 GMT, Thomas Schatzl wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> * walulyai review > > That test failure in windows-x64 is a shenandoah timeout that looks unrelated. > @tschatzl : Hi, would you mind adding a small cleanup change for riscv? This also adds back the assertion about the registers. Still test good on linux-riscv64 platform. [riscv-addon.diff.txt](https://github.com/user-attachments/files/22356611/riscv-addon.diff.txt) This is the `end` -> `count` transformation in the barrier I suggested earlier for RISC-V, isn't it? Thanks for contributing that, but would you mind me holding off this until @theRealAph acks that similar change for aarch64? It would be unfortunate imo if the implementations diverge too much. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3296253540 From fyang at openjdk.org Tue Sep 16 07:41:46 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 16 Sep 2025 07:41:46 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 07:33:18 GMT, Thomas Schatzl wrote: > > @tschatzl : Hi, would you mind adding a small cleanup change for riscv? This also adds back the assertion about the registers. Still test good on linux-riscv64 platform. 
[riscv-addon.diff.txt](https://github.com/user-attachments/files/22356611/riscv-addon.diff.txt) > > This is the `end` -> `count` transformation in the barrier I suggested earlier for RISC-V, isn't it? Thanks for contributing that, but would you mind me holding off this until @theRealAph acks that similar change for aarch64? It would be unfortunate imo if the implementations diverge too much. Yes, sure! The purpose is to minimize the difference to avoid possible issues in the future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3296286316 From fandreuzzi at openjdk.org Tue Sep 16 16:47:30 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 16 Sep 2025 16:47:30 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields Message-ID: #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: CiEnv* _env CompileTask* _task ciMethod* _method I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
------------- Commit messages: - ops - remove indent - bring back fields Changes: https://git.openjdk.org/jdk/pull/27318/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8367689 Stats: 108 lines in 15 files changed: 68 ins; 0 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/27318.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27318/head:pull/27318 PR: https://git.openjdk.org/jdk/pull/27318 From sparasa at openjdk.org Tue Sep 16 18:12:36 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 16 Sep 2025 18:12:36 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present Message-ID: The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. 
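[Annotation] The enabling rule stated in the description (turn on UseAPX only when both feature bits are present) can be sketched as follows. This is a simplified illustration, not the code in the PR: `ApxCpuidBits`, `cpu_supports_apx()` and `resolve_use_apx()` are hypothetical names, and the sketch assumes the two CPUID bits have already been read into booleans.

```cpp
#include <cassert>

// Hypothetical, simplified view of the two CPUID feature bits discussed above.
struct ApxCpuidBits {
  bool apx_f;           // APX_F enumerated
  bool apx_nci_ndd_nf;  // APX_NCI_NDD_NF enumerated (reported via CPUID leaf 0x29)
};

// Hardware support is reported only when both bits are set.
bool cpu_supports_apx(const ApxCpuidBits& bits) {
  return bits.apx_f && bits.apx_nci_ndd_nf;
}

// The effective flag is the user's request gated by hardware support.
bool resolve_use_apx(bool requested, const ApxCpuidBits& bits) {
  return requested && cpu_supports_apx(bits);
}

// Walks the combinations to show that a single missing bit disables APX.
void check_apx_truth_table() {
  assert(resolve_use_apx(true, ApxCpuidBits{true, true}));
  assert(!resolve_use_apx(true, ApxCpuidBits{true, false}));
  assert(!resolve_use_apx(true, ApxCpuidBits{false, true}));
  assert(!resolve_use_apx(true, ApxCpuidBits{false, false}));
  assert(!resolve_use_apx(false, ApxCpuidBits{true, true}));
}
```

Folding both bits into one support predicate keeps the rest of the code dependent on a single answer, so callers never need to test the sub-feature separately.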
------------- Commit messages: - clean up - 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present Changes: https://git.openjdk.org/jdk/pull/27320/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8367780 Stats: 40 lines in 4 files changed: 33 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/27320.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27320/head:pull/27320 PR: https://git.openjdk.org/jdk/pull/27320 From sparasa at openjdk.org Tue Sep 16 19:14:12 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 16 Sep 2025 19:14:12 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v2] In-Reply-To: References: Message-ID: > The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. > > As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. > > This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. Srinivas Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains four commits: - integrate latest changes - merge master - clean up - 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present ------------- Changes: https://git.openjdk.org/jdk/pull/27320/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=01 Stats: 40 lines in 4 files changed: 33 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/27320.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27320/head:pull/27320 PR: https://git.openjdk.org/jdk/pull/27320 From sparasa at openjdk.org Tue Sep 16 19:21:28 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 16 Sep 2025 19:21:28 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v3] In-Reply-To: References: Message-ID: > The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. > > As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. > > This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. 
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: update KNL check ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27320/files - new: https://git.openjdk.org/jdk/pull/27320/files/87e96cca..21560d0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27320.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27320/head:pull/27320 PR: https://git.openjdk.org/jdk/pull/27320 From fandreuzzi at openjdk.org Tue Sep 16 22:25:08 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 16 Sep 2025 22:25:08 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: > #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: > > CiEnv* _env > CompileTask* _task > ciMethod* _method > > > I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: - note - nn - nn - cc - nn - nn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27318/files - new: https://git.openjdk.org/jdk/pull/27318/files/5bc13c6c..e8b7849d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=00-01 Stats: 95 lines in 14 files changed: 0 ins; 54 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/27318.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27318/head:pull/27318 PR: https://git.openjdk.org/jdk/pull/27318 From coleenp at openjdk.org Tue Sep 16 22:25:10 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Sep 2025 22:25:10 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 22:22:13 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: > > - note > - nn > - nn > - cc > - nn > - nn Changes requested by coleenp (Reviewer). 
src/hotspot/share/runtime/vmStructs.cpp line 677:

> 675: nonstatic_field(CompilerThread, _env, ciEnv*) \
> 676: nonstatic_field(ciEnv, _task, CompileTask*) \
> 677: c2_nonstatic_field(Compile, _method, ciMethod*) \

It would be really easy for someone to notice that these aren't used and delete them again, and it's unfortunate that this macro has to be propagated throughout all the vmStructs macros. I can see some external tool needing CompilerThread env and ciEnv task (the current compile task) but maybe this can not need Compile::_method?

Instead of adding the c2 macros, can you change the code to have:

    COMPILER2_PRESENT(nonstatic_field(Compile, _method, ciMethod*))
    COMPILER2_PRESENT(declare_toplevel_type(Compile))
    etc.

-------------

PR Review: https://git.openjdk.org/jdk/pull/27318#pullrequestreview-3231331939
PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353411773

From coleenp at openjdk.org Tue Sep 16 22:25:11 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 16 Sep 2025 22:25:11 GMT
Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Sep 2025 19:12:35 GMT, Coleen Phillimore wrote:

>> Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision:
>>
>> - note
>> - nn
>> - nn
>> - cc
>> - nn
>> - nn
>
> src/hotspot/share/runtime/vmStructs.cpp line 677:
>
>> 675: nonstatic_field(CompilerThread, _env, ciEnv*) \
>> 676: nonstatic_field(ciEnv, _task, CompileTask*) \
>> 677: c2_nonstatic_field(Compile, _method, ciMethod*) \
>
> It would be really easy for someone to notice that these aren't used and delete them again, and it's unfortunate that this macro has to be propagated throughout all the vmStructs macros. I can see some external tool needing CompilerThread env and ciEnv task (the current compile task) but maybe this can not need Compile::_method?
>
> Instead of adding the c2 macros, can you change the code to have:
>
> COMPILER2_PRESENT(nonstatic_field(Compile, _method, ciMethod*))
> COMPILER2_PRESENT(declare_toplevel_type(Compile))
> etc.

Add a comment that these fields are needed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353414090

From fandreuzzi at openjdk.org Tue Sep 16 22:25:11 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Tue, 16 Sep 2025 22:25:11 GMT
Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Sep 2025 19:13:39 GMT, Coleen Phillimore wrote:

>> src/hotspot/share/runtime/vmStructs.cpp line 677:
>>
>>> 675: nonstatic_field(CompilerThread, _env, ciEnv*) \
>>> 676: nonstatic_field(ciEnv, _task, CompileTask*) \
>>> 677: c2_nonstatic_field(Compile, _method, ciMethod*) \
>>
>> It would be really easy for someone to notice that these aren't used and delete them again, and it's unfortunate that this macro has to be propagated throughout all the vmStructs macros. I can see some external tool needing CompilerThread env and ciEnv task (the current compile task) but maybe this can not need Compile::_method?
>>
>> Instead of adding the c2 macros, can you change the code to have:
>>
>> COMPILER2_PRESENT(nonstatic_field(Compile, _method, ciMethod*))
>> COMPILER2_PRESENT(declare_toplevel_type(Compile))
>> etc.
>
> Add a comment that these fields are needed.

> I can see some external tool needing CompilerThread env and ciEnv task (the current compile task) but maybe this can not need Compile::_method?
This is how we use `Compile::_method` in Async-Profiler:

    VMMethod* compiledMethod() {
        // Follow CompilerThread::_env
        const char* env = *(const char**) at(_comp_env_offset);
        if (env != NULL) {
            // Follow ciEnv::_task
            const char* task = *(const char**) (env + _comp_task_offset);
            if (task != NULL) {
                // Read the method field of the current compile task
                return *(VMMethod**) (task + _comp_method_offset);
            }
        }
        return NULL;
    }

so we can get a `jmethodID` to the method being compiled. That's the only use case we have for these fields.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353458447

From apangin at openjdk.org Tue Sep 16 22:25:12 2025
From: apangin at openjdk.org (Andrei Pangin)
Date: Tue, 16 Sep 2025 22:25:12 GMT
Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Sep 2025 19:13:39 GMT, Coleen Phillimore wrote:

>> src/hotspot/share/runtime/vmStructs.cpp line 677:
>>
>>> 675: nonstatic_field(CompilerThread, _env, ciEnv*) \
>>> 676: nonstatic_field(ciEnv, _task, CompileTask*) \
>>> 677: c2_nonstatic_field(Compile, _method, ciMethod*) \
>>
>> It would be really easy for someone to notice that these aren't used and delete them again, and it's unfortunate that this macro has to be propagated throughout all the vmStructs macros. I can see some external tool needing CompilerThread env and ciEnv task (the current compile task) but maybe this can not need Compile::_method?
>>
>> Instead of adding the c2 macros, can you change the code to have:
>>
>> COMPILER2_PRESENT(nonstatic_field(Compile, _method, ciMethod*))
>> COMPILER2_PRESENT(declare_toplevel_type(Compile))
>> etc.
>
> Add a comment that these fields are needed.

@coleenp Good catch, thanks. `Compile::_method` is redundant, indeed. `CompileTask::_method` is what async-profiler needs, but it's still in place.
@fandreuz I think this PR can do without platform-dependent changes.
For the context, it's [this feature](https://github.com/async-profiler/async-profiler/blob/master/docs/AdvancedStacktraceFeatures.md#display-jit-compilation-task) that embeds method name of the current compile task right in the C1/C2 stack traces. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353489167 From fandreuzzi at openjdk.org Tue Sep 16 22:25:13 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 16 Sep 2025 22:25:13 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 19:42:16 GMT, Andrei Pangin wrote: > Good catch, thanks. Compile::_method is redundant, I see, thanks! 3a263b3843f605eac286ec1c8611479da87f6b93 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353748289 From fandreuzzi at openjdk.org Tue Sep 16 22:25:13 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 16 Sep 2025 22:25:13 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 21:58:31 GMT, Francesco Andreuzzi wrote: >> @coleenp Good catch, thanks. `Compile::_method` is redundant, indeed. `CompileTask::_method` is what async-profiler needs, but it's still in place. >> @fandreuz I think this PR can do without platform-dependent changes. >> >> For the context, it's [this feature](https://github.com/async-profiler/async-profiler/blob/master/docs/AdvancedStacktraceFeatures.md#display-jit-compilation-task) that embeds method name of the current compile task right in the C1/C2 stack traces. > >> Good catch, thanks. Compile::_method is redundant, > > I see, thanks! 3a263b3843f605eac286ec1c8611479da87f6b93 > I think this PR can do without platform-dependent changes. Right, many things turned out to be unnecessary. 
Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2353778633 From coleenp at openjdk.org Tue Sep 16 22:34:18 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Sep 2025 22:34:18 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 22:25:08 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: > > - note > - nn > - nn > - cc > - nn > - nn Oh that is much better! Thanks. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27318#pullrequestreview-3231896761 From sviswanathan at openjdk.org Wed Sep 17 00:28:50 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 17 Sep 2025 00:28:50 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v3] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 19:21:28 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. 
>> >> As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. >> >> This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > update KNL check src/hotspot/cpu/x86/vm_version_x86.cpp line 1083: > 1081: if (!UseAPX) { > 1082: _features.clear_feature(CPU_APX_F); > 1083: _features.clear_feature(CPU_APX_NCI_NDD_NF); We don't need separate CPU_APX_NCI_NDD_NF feature and the related changes as CPU_APX_F feature is set only when both bits (sefsl1_cpuid7_edx.bits.apx_f and std_cpuid29_ebx.bits.apx_nci_ndd_nf) are set. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27320#discussion_r2353925684 From fandreuzzi at openjdk.org Wed Sep 17 08:22:46 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 17 Sep 2025 08:22:46 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 22:25:08 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
> > Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: > > - note > - nn > - nn > - cc > - nn > - nn Test failure seems unrelated: https://github.com/fandreuz/jdk/actions/runs/17780614141/job/50541594076#step:10:2426 ------------- PR Comment: https://git.openjdk.org/jdk/pull/27318#issuecomment-3301871690 From ayang at openjdk.org Wed Sep 17 08:31:58 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 17 Sep 2025 08:31:58 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 22:25:08 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: > > - note > - nn > - nn > - cc > - nn > - nn src/hotspot/share/runtime/vmStructs.cpp line 670: > 668: \ > 669: /**************/ \ > 670: /* CI (NOTE: these fields should not be removed, they can be used by external tools) */ \ Can you also clarify what "external tools" are in the comment so that when/if those external tools stop using them, we can re-evaluate? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2354737188 From fandreuzzi at openjdk.org Wed Sep 17 08:43:06 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 17 Sep 2025 08:43:06 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Wed, 17 Sep 2025 08:28:53 GMT, Albert Mingkun Yang wrote: >> Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: >> >> - note >> - nn >> - nn >> - cc >> - nn >> - nn > > src/hotspot/share/runtime/vmStructs.cpp line 670: > >> 668: \ >> 669: /**************/ \ >> 670: /* CI (NOTE: these fields should not be removed, they can be used by external tools) */ \ > > Can you also clarify what "external tools" are in the comment so that when/if those external tools stop using them, we can re-evaluate? Do you mean I should explicitly mention Async-Profiler? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2354766869 From kevinw at openjdk.org Wed Sep 17 09:43:43 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 17 Sep 2025 09:43:43 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 22:25:08 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
> > Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: > > - note > - nn > - nn > - cc > - nn > - nn We should clarify: These fields are not a public interface, but the JVM has to maintain them around due to other software's expectation? Async-Profiler is great, but is this change the right solution? Are there other fields which other software would have liked to stay around, and what else can not be removed? Curious how did Async-Profiler break when these are removed? Is it a crash that Async-Profiler needs to workaround, or is there less information in the collected profile. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27318#issuecomment-3302169803 From ayang at openjdk.org Wed Sep 17 12:55:53 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 17 Sep 2025 12:55:53 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v61] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 07:18:14 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
>> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads,
>> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
>>
>> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code:
>>
>>
>> // Filtering
>> if (region(@x.a) == region(y)) goto done; // same region check
>> if (y == null) goto done; // null value check
>> if (card(@x.a) == young_card) goto done; // write to young gen check
>> StoreLoad; // synchronize
>> if (card(@x.a) == dirty_card) goto done;
>>
>> *card(@x.a) = dirty
>>
>> // Card tracking
>> enqueue(card-address(@x.a)) into thread-local-dcq;
>> if (thread-local-dcq is not full) goto done;
>>
>> call runtime to move thread-local-dcq into dcqs
>>
>> done:
>>
>>
>> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc.
>>
>> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining.
>>
>> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links).
>>
>> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c...
>
> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>
> * iwalulya review
> * documentation for a few PSS members
> * rename some member variables to contain _ct and _rt suffixes in remembered set verification

Marked as reviewed by ayang (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3234308768 From apangin at openjdk.org Wed Sep 17 13:01:31 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 17 Sep 2025 13:01:31 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Wed, 17 Sep 2025 09:41:00 GMT, Kevin Walls wrote: >> Francesco Andreuzzi has updated the pull request incrementally with six additional commits since the last revision: >> >> - note >> - nn >> - nn >> - cc >> - nn >> - nn > > We should clarify: > These fields are not a public interface, but the JVM has to maintain them due to other software's expectations? > > Async-Profiler is great, but is this change the right solution? Are there other fields which other software would have liked to stay around, and what else can not be removed? > > Curious how did Async-Profiler break when these are removed? Is it a crash that Async-Profiler needs to workaround, or is there less information in the collected profile. @kevinjwalls Right, VMStructs is not a public supported interface, the JVM is not obliged to maintain it. Yet, some external tools (async-profiler and eBPF based profilers) softly rely on that in the lack of standard supported alternatives. At the same time, we are improving OpenJDK built-in capabilities that can eventually serve as such an alternative. A recent example is JEP 509: JFR CPU-Time Profiling. It's still a long path until JFR or other built-in tools can satisfy today's demand; in the meantime, VMStructs provides a temporary solution. VMStructs is a reasonable trade-off, very cheap from the maintenance perspective as opposed to AsyncGetCallTrace (async-profiler no longer depends on the latter). To emphasize, we do not expect others to maintain VM structures that async-profiler relies on, that is our (profiler + Corretto) responsibility. 
Infrequent changes will be small, localized and safe, like in this PR. As for the question of how async-profiler breaks: if it is an optional feature (like here), it gracefully reduces functionality. Async-profiler attempts to detect essential VM changes and fail fast rather than crash at runtime. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27318#issuecomment-3302889663 From kevinw at openjdk.org Wed Sep 17 14:05:35 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 17 Sep 2025 14:05:35 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v2] In-Reply-To: References: Message-ID: On Wed, 17 Sep 2025 12:59:13 GMT, Andrei Pangin wrote: >> We should clarify: >> These fields are not a public interface, but the JVM has to maintain them due to other software's expectations? >> >> Async-Profiler is great, but is this change the right solution? Are there other fields which other software would have liked to stay around, and what else can not be removed? >> >> Curious how did Async-Profiler break when these are removed? Is it a crash that Async-Profiler needs to workaround, or is there less information in the collected profile. > > @kevinjwalls Right, VMStructs is not a public supported interface, the JVM is not obliged to maintain it. Yet, some external tools (async-profiler and eBPF based profilers) softly rely on that in the lack of standard supported alternatives. At the same time, we are improving OpenJDK built-in capabilities that can eventually serve as such an alternative. A recent example is JEP 509: JFR CPU-Time Profiling. It's still a long path until JFR or other built-in tools can satisfy today's demand; in the meantime, VMStructs provides a temporary solution. > > VMStructs is a reasonable trade-off, very cheap from the maintenance perspective as opposed to AsyncGetCallTrace (async-profiler no longer depends on the latter).
To emphasize, we do not expect others to maintain VM structures that async-profiler relies on, that is our (profiler + Corretto) responsibility. Infrequent changes will be small, localized and safe, like in this PR. > > As for the question how async-profiler breaks: if it is an optional feature (like here), it gracefully reduces functionality. Async-profiler attempts to detect essential VM changes and fail fast to prevent from crashing at runtime. Thanks @apangin Andrei I thought we should be open and explicit about this, good to have that exchange in the review. Async-Profiler knew what it was getting into. 8-) In the comment being added above, what is it appropriate to say... Saying "should not removed... they can be used by external tools" as an instruction seems like an overreach. Maybe more like: "These CI fields are retained in VMStructs for the benefit of external tools, to ease their migration to a future alternative." ------------- PR Comment: https://git.openjdk.org/jdk/pull/27318#issuecomment-3303157325 From kvn at openjdk.org Wed Sep 17 15:17:57 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Sep 2025 15:17:57 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 14:38:03 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove CFP.is_abstract(). Good. This is not 1-to-1 back-out of #19157 . I assume it is because there were several additional follow up fixes. Will 25u backport be the same? 
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3234989383 From coleenp at openjdk.org Wed Sep 17 17:36:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Sep 2025 17:36:56 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 14:38:03 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove CFP.is_abstract(). Yes, this wasn't just a backout. I stayed away from the JFR code refactoring, and there were other fixes for hidden classes that are already in JDK 25. We also added and then backported a diagnostic flag, which this change removes. Yes, this change should backport cleanly to JDK 25 pending approval. Thanks for reviewing, Aleksey and Vladimir. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3303955655 PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3303957791 From sparasa at openjdk.org Wed Sep 17 17:54:49 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Wed, 17 Sep 2025 17:54:49 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: References: Message-ID: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> > The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present.
> > As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. > > This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. Srinivas Vamsi Parasa has updated the pull request incrementally with two additional commits since the last revision: - Update CPUInfoTest.java - Remove APX_NCI_NDD_NF as an explicit feature ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27320/files - new: https://git.openjdk.org/jdk/pull/27320/files/21560d0b..91f589ab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27320&range=02-03 Stats: 12 lines in 4 files changed: 0 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/27320.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27320/head:pull/27320 PR: https://git.openjdk.org/jdk/pull/27320 From sparasa at openjdk.org Wed Sep 17 17:54:51 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Wed, 17 Sep 2025 17:54:51 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v3] In-Reply-To: References: Message-ID: On Wed, 17 Sep 2025 00:26:32 GMT, Sandhya Viswanathan wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> update KNL check > > src/hotspot/cpu/x86/vm_version_x86.cpp line 1083: > >> 1081: if (!UseAPX) { >> 1082: _features.clear_feature(CPU_APX_F); >> 1083: _features.clear_feature(CPU_APX_NCI_NDD_NF); > > We don't need separate CPU_APX_NCI_NDD_NF feature and the 
related changes as CPU_APX_F feature is set only when both bits (sefsl1_cpuid7_edx.bits.apx_f and std_cpuid29_ebx.bits.apx_nci_ndd_nf) are set. Please see the updated changes which removes CPU_APX_NCI_NDD_NF as an explicit feature. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27320#discussion_r2356297066 From fandreuzzi at openjdk.org Wed Sep 17 18:54:19 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 17 Sep 2025 18:54:19 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v3] In-Reply-To: References: Message-ID: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> > #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: > > CiEnv* _env > CompileTask* _task > ciMethod* _method > > > I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
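As a rough illustration of why exported fields like these matter to external tools, the sketch below shows one way a tool might look up fields by name and switch off an optional feature when any of them is missing. This is not Async-Profiler's code and not the real VMStructs layout: `FieldEntry`, `find_offset`, `can_show_compiled_method`, the owning type names and all offsets are assumptions made up for the example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, simplified stand-in for one row of a VMStructs-style table.
struct FieldEntry {
  const char* type_name;   // owning type (assumed for illustration)
  const char* field_name;  // field name as discussed in the thread
  long        offset;      // invented offset, not a real HotSpot value
};

// Illustrative table only: a real tool would read this from the running VM.
static const FieldEntry kExampleTable[] = {
  {"CompilerThread", "_env",    0x10},
  {"ciEnv",          "_task",   0x20},
  {"CompileTask",    "_method", 0x30},
};
static const std::size_t kExampleCount =
    sizeof(kExampleTable) / sizeof(kExampleTable[0]);

// Returns the offset of type::field, or -1 if the table does not export it.
long find_offset(const FieldEntry* table, std::size_t count,
                 const char* type, const char* field) {
  for (std::size_t i = 0; i < count; ++i) {
    if (std::strcmp(table[i].type_name, type) == 0 &&
        std::strcmp(table[i].field_name, field) == 0) {
      return table[i].offset;
    }
  }
  return -1;
}

// Optional feature: enabled only if every field it needs is present,
// so a missing field reduces functionality instead of causing a crash.
bool can_show_compiled_method(const FieldEntry* table, std::size_t count) {
  return find_offset(table, count, "CompilerThread", "_env") >= 0 &&
         find_offset(table, count, "ciEnv", "_task") >= 0 &&
         find_offset(table, count, "CompileTask", "_method") >= 0;
}
```

With all three rows present the feature stays on; dropping any one row makes `can_show_compiled_method` return false, which is the "gracefully reduces functionality" behaviour described earlier in the thread.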
Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27318/files - new: https://git.openjdk.org/jdk/pull/27318/files/e8b7849d..194a9d6f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=01-02 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/27318.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27318/head:pull/27318 PR: https://git.openjdk.org/jdk/pull/27318 From iwalulya at openjdk.org Thu Sep 18 05:25:46 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 18 Sep 2025 05:25:46 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v61] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 07:18:14 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
>> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * iwalulya review > * documentation for a few PSS members > * rename some member variables to contain _ct and _rt suffixes in remembered set verification LGTM! ------------- Marked as reviewed by iwalulya (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3237327044 From tschatzl at openjdk.org Thu Sep 18 07:39:27 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 18 Sep 2025 07:39:27 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v59] In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 07:38:56 GMT, Fei Yang wrote: >>> @tschatzl : Hi, would you mind adding a small cleanup change for riscv? This also adds back the assertion about the registers. Still test good on linux-riscv64 platform. [riscv-addon.diff.txt](https://github.com/user-attachments/files/22356611/riscv-addon.diff.txt) >> >> This is the `end` -> `count` transformation in the barrier I suggested earlier for RISC-V, isn't it? Thanks for contributing that, but would you mind me holding off this until @theRealAph acks that similar change for aarch64? It would be unfortunate imo if the implementations diverge too much. > >> > @tschatzl : Hi, would you mind adding a small cleanup change for riscv? This also adds back the assertion about the registers. Still test good on linux-riscv64 platform. [riscv-addon.diff.txt](https://github.com/user-attachments/files/22356611/riscv-addon.diff.txt) >> >> This is the `end` -> `count` transformation in the barrier I suggested earlier for RISC-V, isn't it? Thanks for contributing that, but would you mind me holding off this until @theRealAph acks that similar change for aarch64? It would be unfortunate imo if the implementations diverge too much. > > Yes, sure! The purpose is to minimize the difference to avoid possible issues in the future. @RealFYang : going to wait for the response of @theRealAph about the `end->count` matter until early next week, otherwise I'll move this change to cleanups/further enhancements like [JDK-8352069](https://bugs.openjdk.org/browse/JDK-8352069) already planned.
Just want to be "done" at some point with this change :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3305851845 From aph at openjdk.org Thu Sep 18 08:03:17 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 18 Sep 2025 08:03:17 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v61] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 07:18:14 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * iwalulya review > * documentation for a few PSS members > * rename some member variables to contain _ct and _rt suffixes in remembered set verification Marked as reviewed by aph (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3237979731 From kevinw at openjdk.org Thu Sep 18 08:03:04 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 18 Sep 2025 08:03:04 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v3] In-Reply-To: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> References: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> Message-ID: On Wed, 17 Sep 2025 18:54:19 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > comment src/hotspot/share/runtime/vmStructs.cpp line 673: > 671: /* to ease their migration to a future alternative. 
*/ \ > 672: /******************************************************************************************/ \ > 673: \ oops just a missing close ), but yes other than that I think we're done 8-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2357942557 From aph at openjdk.org Thu Sep 18 08:03:18 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 18 Sep 2025 08:03:18 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v55] In-Reply-To: <63y80KoZ7oXdnFLRmK28Z8wcOSSAMXHx91akDn0tcLc=.981add6e-0b3f-423f-9b5f-87bb7d9fad9a@github.com> References: <63y80KoZ7oXdnFLRmK28Z8wcOSSAMXHx91akDn0tcLc=.981add6e-0b3f-423f-9b5f-87bb7d9fad9a@github.com> Message-ID: On Fri, 12 Sep 2025 08:27:01 GMT, Martin Doerr wrote: > Other idea: set count = noreg to prevent usage after it is used under the other name. That wouldn't have solved the aliasing problem, because count and end were being used as aliases for a register in _the same instruction_! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2357941045 From fandreuzzi at openjdk.org Thu Sep 18 08:18:49 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 18 Sep 2025 08:18:49 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v4] In-Reply-To: References: Message-ID: > #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: > > CiEnv* _env > CompileTask* _task > ciMethod* _method > > > I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: missing close ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27318/files - new: https://git.openjdk.org/jdk/pull/27318/files/194a9d6f..d7c4db00 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27318&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27318.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27318/head:pull/27318 PR: https://git.openjdk.org/jdk/pull/27318 From kevinw at openjdk.org Thu Sep 18 08:18:49 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 18 Sep 2025 08:18:49 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v4] In-Reply-To: References: Message-ID: On Thu, 18 Sep 2025 08:15:26 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > missing close OK, thanks! ------------- Marked as reviewed by kevinw (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27318#pullrequestreview-3238053151 From duke at openjdk.org Thu Sep 18 08:18:51 2025 From: duke at openjdk.org (duke) Date: Thu, 18 Sep 2025 08:18:51 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v3] In-Reply-To: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> References: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> Message-ID: On Wed, 17 Sep 2025 18:54:19 GMT, Francesco Andreuzzi wrote: >> #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: >> >> CiEnv* _env >> CompileTask* _task >> ciMethod* _method >> >> >> I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > comment @fandreuz Your change (at version d7c4db008e15d00889ebacf9acc2d48689d2402e) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27318#issuecomment-3306071264 From fandreuzzi at openjdk.org Thu Sep 18 08:18:53 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 18 Sep 2025 08:18:53 GMT Subject: RFR: 8367689: Revert removal of several compilation-related vmStructs fields [v3] In-Reply-To: References: <7zcjZi3Kb4EoOhhH5hDHdAkJ_YvwbDNwn0VjveCwmno=.c8a860fa-ef49-474e-b7cb-cb82ed7f229b@github.com> Message-ID: On Thu, 18 Sep 2025 08:00:37 GMT, Kevin Walls wrote: >> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: >> >> comment > > src/hotspot/share/runtime/vmStructs.cpp line 673: > >> 671: /* to ease their migration to a future alternative. */ \ >> 672: /******************************************************************************************/ \ >> 673: \ > > oops just a missing close ), but yes other than that I think we're done 8-) Thanks, fixed in d7c4db008e15d00889ebacf9acc2d48689d2402e ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27318#discussion_r2357974342 From fandreuzzi at openjdk.org Thu Sep 18 08:28:41 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 18 Sep 2025 08:28:41 GMT Subject: Integrated: 8367689: Revert removal of several compilation-related vmStructs fields In-Reply-To: References: Message-ID: <3VJwDsTsSPLGMQYO_0KlqXnL8HAttwGAQUm-1u315Uc=.b2ca96b1-c7ca-4a9e-890a-19b4283688b2@github.com> On Tue, 16 Sep 2025 16:40:05 GMT, Francesco Andreuzzi wrote: > #23782 ([JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488)) removed several vmStructs fields. A small subset was used in Async-Profiler: > > CiEnv* _env > CompileTask* _task > ciMethod* _method > > > I propose a small patch to bring them back. It helps third-party tools in building useful features on the JDK. For example, Async-Profiler uses these fields to display the current method being compiled in a compiler thread. 
This pull request has now been integrated. Changeset: 4c5e901c Author: Francesco Andreuzzi Committer: Kevin Walls URL: https://git.openjdk.org/jdk/commit/4c5e901c96dee3885e1b29a53d3400174f9bba09 Stats: 15 lines in 2 files changed: 15 ins; 0 del; 0 mod 8367689: Revert removal of several compilation-related vmStructs fields Reviewed-by: kevinw, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/27318 From stuefe at openjdk.org Thu Sep 18 14:29:41 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 18 Sep 2025 14:29:41 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v3] In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 14:38:03 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove CFP.is_abstract(). This looks good. In hindsight kind of scary that there are no tests to change back. But all Metaspace tests sit either on a layer below Metaspace::allocate or test class loading, which sits above this layer. Some tests that check expected memory levels in class space and non-class metaspace could now run into problems, since the ratio between these numbers should shift (total consumption should be about identical). Maybe not. We'll see it when we see it. src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java line 102: > 100: // in compressible metaspace. > 101: return !t.isInterface() && !t.isAbstract(); > 102: } Can we remove this completely? ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3240079100 PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2359574924 From coleenp at openjdk.org Thu Sep 18 17:50:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Sep 2025 17:50:54 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v4] In-Reply-To: References: Message-ID: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert JFR changes from JDK-8338526 also. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27295/files - new: https://git.openjdk.org/jdk/pull/27295/files/fe44431f..c333b8a1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=02-03 Stats: 8 lines in 1 file changed: 1 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/27295.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27295/head:pull/27295 PR: https://git.openjdk.org/jdk/pull/27295 From mgronlun at openjdk.org Thu Sep 18 17:55:24 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 18 Sep 2025 17:55:24 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v4] In-Reply-To: References: Message-ID: <-F-_kJ5tErKjYKADQ4XDAxLsGKG5I7WqGpHdSErUSnE=.6cf5d41a-61b4-482f-a71d-0f69746e08d9@github.com> On Thu, 18 Sep 2025 17:50:54 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. 
Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert JFR changes from JDK-8338526 also. Commit: "Revert JFR changes from JDK-8338526 also." looks good. Thanks ------------- PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3241262669 From coleenp at openjdk.org Thu Sep 18 18:04:08 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Sep 2025 18:04:08 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v5] In-Reply-To: References: Message-ID: On Thu, 18 Sep 2025 14:21:48 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Some limited additional cleanup to HotSpotMetaspaceConstantImpl.java on jvmci. Reran jvmci tests. > > src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java line 102: > >> 100: // in compressible metaspace. >> 101: return !t.isInterface() && !t.isAbstract(); >> 102: } > > Can we remove this completely? I suppose but we don't really test this code well. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2360528357 From coleenp at openjdk.org Thu Sep 18 18:04:09 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Sep 2025 18:04:09 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v5] In-Reply-To: References: Message-ID: <1Onnx3yG5dKlycjRqKePin3Lqy_U6-HFSHBGYxrsZm8=.0fde1dbd-fd4f-47b4-9c11-4de165f5fa6a@github.com> On Thu, 18 Sep 2025 17:54:11 GMT, Coleen Phillimore wrote: >> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java line 102: >> >>> 100: // in compressible metaspace. >>> 101: return !t.isInterface() && !t.isAbstract(); >>> 102: } >> >> Can we remove this completely? > > I suppose but we don't really test this code well. I removed this and cleaned up or fixed the callers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27295#discussion_r2360553785 From coleenp at openjdk.org Thu Sep 18 18:04:05 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Sep 2025 18:04:05 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v5] In-Reply-To: References: Message-ID: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Some limited additional cleanup to HotSpotMetaspaceConstantImpl.java on jvmci. Reran jvmci tests. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/27295/files - new: https://git.openjdk.org/jdk/pull/27295/files/c333b8a1..f1d08d63 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27295&range=03-04 Stats: 11 lines in 1 file changed: 0 ins; 10 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27295.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27295/head:pull/27295 PR: https://git.openjdk.org/jdk/pull/27295 From sviswanathan at openjdk.org Thu Sep 18 18:08:29 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 18 Sep 2025 18:08:29 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Wed, 17 Sep 2025 17:54:49 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. >> >> As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. >> >> This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. > > Srinivas Vamsi Parasa has updated the pull request incrementally with two additional commits since the last revision: > > - Update CPUInfoTest.java > - Remove APX_NCI_NDD_NF as an explicit feature Looks good to me. 
------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27320#pullrequestreview-3241321050 From kvn at openjdk.org Thu Sep 18 21:41:01 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Sep 2025 21:41:01 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v5] In-Reply-To: References: Message-ID: On Thu, 18 Sep 2025 18:04:05 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Some limited additional cleanup to HotSpotMetaspaceConstantImpl.java on jvmci. Reran jvmci tests. Still good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27295#pullrequestreview-3242181368 From vpaprotski at openjdk.org Thu Sep 18 21:57:59 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Thu, 18 Sep 2025 21:57:59 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Wed, 17 Sep 2025 17:54:49 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. 
>> >> As per the latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ), when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. >> >> This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. > > Srinivas Vamsi Parasa has updated the pull request incrementally with two additional commits since the last revision: > > - Update CPUInfoTest.java > - Remove APX_NCI_NDD_NF as an explicit feature Thanks for answering my questions.. things we checked: - double-checked reg parameter values of cpuid against the spec - double-checked endianness of bitset variables in C grammar - double-checked how offset to the field std_cpuid29_ebx is computed Change looks good to me src/hotspot/cpu/x86/vm_version_x86.cpp line 2928: > 2926: if (sefsl1_cpuid7_edx.bits.apx_f != 0 && > 2927: xem_xcr0_eax.bits.apx_f != 0 && > 2928: std_cpuid29_ebx.bits.apx_nci_ndd_nf != 0) { was confused why the previous implementation was 'wrong'.. Please clarify that this was triggered "because" of the update to the spec (in the PR description). ------------- Marked as reviewed by vpaprotski (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/27320#pullrequestreview-3242099089 PR Review Comment: https://git.openjdk.org/jdk/pull/27320#discussion_r2361188118 From coleenp at openjdk.org Fri Sep 19 11:57:13 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Sep 2025 11:57:13 GMT Subject: RFR: 8365823: Revert storing abstract and interface Klasses to non-class metaspace [v5] In-Reply-To: References: Message-ID: On Thu, 18 Sep 2025 18:04:05 GMT, Coleen Phillimore wrote: >> This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Some limited additional cleanup to HotSpotMetaspaceConstantImpl.java on jvmci. Reran jvmci tests. Thanks Vladimir for re-reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27295#issuecomment-3311902694 From coleenp at openjdk.org Fri Sep 19 11:57:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Sep 2025 11:57:15 GMT Subject: Integrated: 8365823: Revert storing abstract and interface Klasses to non-class metaspace In-Reply-To: References: Message-ID: On Mon, 15 Sep 2025 13:28:45 GMT, Coleen Phillimore wrote: > This change removes the optimization to not store abstract and interface Klass metadata to non-class metaspace. Now all Klass metadata is in the Klass metaspace. This is simpler and less bug prone, and didn't help with the limitation of classes that can be stored in class metaspace materially. > Tested with tier1-4. This pull request has now been integrated. 
Changeset: fa00b249 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/fa00b24954d63abed0093b696e5971c1918eec4d Stats: 104 lines in 19 files changed: 7 ins; 66 del; 31 mod 8365823: Revert storing abstract and interface Klasses to non-class metaspace Reviewed-by: kvn, shade, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/27295 From rcastanedalo at openjdk.org Fri Sep 19 13:12:36 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Fri, 19 Sep 2025 13:12:36 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Tue, 9 Sep 2025 11:27:50 GMT, Roland Westrelin wrote: >> An `Initialize` node for an `Allocate` node is created with a memory >> `Proj` of adr type raw memory. In order for stores to be captured, the >> memory state out of the allocation is a `MergeMem` with slices for the >> various object fields/array element set to the raw memory `Proj` of >> the `Initialize` node. If `Phi`s need to be created during later >> transformations from this memory state, the `Phi` for a particular >> slice gets its adr type from the type of the `Proj` which is raw >> memory. If during macro expansion, the `Allocate` is found to have no >> use and so can be removed, the `Proj` out of the `Initialize` is >> replaced by the memory state on input to the `Allocate`. A `Phi` for >> some slice for a field of an object will end up with the raw memory >> state on input to the `Allocate` node. As a result, memory state at >> the `Phi` is incorrect and incorrect execution can happen. >> >> The fix I propose is, rather than have a single `Proj` for the memory >> state out of the `Initialize` with adr type raw memory, to use one >> `Proj` per slice added to the memory state after the `Initialize`.
Each >> of the `Proj` should return the right adr type for its slice. For that >> I propose having a new type of `Proj`: `NarrowMemProj` that captures >> the right adr type. >> >> Logic for the construction of the `Allocate`/`Initialize` subgraph is >> tweaked so the right adr type captured in its own `NarrowMemProj` is >> added to the memory subgraph. Code that removes an allocation or moves >> it also has to be changed so it correctly takes the multiple memory >> projections out of the `Initialize` node into account. >> >> One tricky issue is that when EA splits types for a scalar replaceable >> `Allocate` node: >> >> 1- the adr type captured in the `NarrowMemProj` becomes out of sync >> with the type of the slices for the allocation >> >> 2- before EA, the memory state for one particular field out of the >> `Initialize` node can be used for a `Store` to the just allocated >> object or some other. So we can have a chain of `Store`s, some to >> the newly allocated object, some to some other objects, all of them >> using the state of `NarrowMemProj` out of the `Initialize`. After >> split unique types, the `NarrowMemProj` is for the slice of a >> particular allocation. So `Store`s to some other objects shouldn't >> use that memory state but the memory state before the `Allocate`. >> >> For that, I added logic to update the adr type of `NarrowMemProj` >> during split uni... > > Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: > > - more > - Merge branch 'master' into JDK-8327963 > - more > - more > - Merge branch 'master' into JDK-8327963 > - more > - more > - lambda return > - lambda clean up > - Merge branch 'master' into JDK-8327963 > - ... and 35 more: https://git.openjdk.org/jdk/compare/e16c5100...b701d03e Changes requested by rcastanedalo (Reviewer).
src/hotspot/share/opto/escape.hpp line 567: > 565: // MemNode - new memory input for this node > 566: // CheckCastPP - allocation that this is a cast of > 567: // allocation - CheckCastPP of the allocation Please add a new entry here explaining how `_node_map` is used for `NarrowMemProjNode` nodes. src/hotspot/share/opto/graphKit.cpp line 3645: > 3643: assert(minit_out->is_Proj() && minit_out->in(0) == init, ""); > 3644: int mark_idx = C->get_alias_index(oop_type->add_offset(oopDesc::mark_offset_in_bytes())); > 3645: // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state Suggestion: // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state. src/hotspot/share/opto/graphKit.cpp line 3647: > 3645: // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state > 3646: // Use one NarrowMemProjNode per slice to properly record the adr type of each slice. The Initialize node will have > 3647: // multiple projection as a result. Suggestion: // multiple projections as a result. src/hotspot/share/opto/macro.cpp line 1606: > 1604: // elimination. Simply add the MemBarStoreStore after object > 1605: // initialization. > 1606: MemBarNode* mb = MemBarNode::make(C, Op_MemBarStoreStore, Compile::AliasIdxRaw); Does the same argument as below apply for relaxing the scope of this memory barrier? Please clarify in a similar comment for this case (if the same argument applies, a reference to the comment below would be enough). 
src/hotspot/share/opto/macro.cpp line 1623: > 1621: Node* init_ctrl = init->proj_out_or_null(TypeFunc::Control); > 1622: > 1623: // What we want is to prevent the compiler and the cpu from re-ordering the stores that initialize this object Suggestion: // What we want is to prevent the compiler and the CPU from re-ordering the stores that initialize this object src/hotspot/share/opto/macro.cpp line 1628: > 1626: // only captures/produces a partial memory state making it complicated to insert such a MemBar. Because > 1627: // re-ordering by the compiler can't happen by construction (a later Store that publishes the just allocated > 1628: // object reference is indirectly control dependent on the Initialize node), preventing reordering by the cpu is Suggestion: // object reference is indirectly control dependent on the Initialize node), preventing reordering by the CPU is src/hotspot/share/opto/memnode.hpp line 1383: > 1381: bool already_has_narrow_mem_proj_with_adr_type(const TypePtr* adr_type) const; > 1382: > 1383: MachProjNode* mem_mach_proj() const; Please add a brief comment above this function, possibly clarifying that we do not expect to find more than one Mach memory projection. src/hotspot/share/opto/multnode.cpp line 73: > 71: }; > 72: return apply_to_projs(filter, which_proj); > 73: } Consider moving this implementation to `multnode.hpp`, perhaps next to that of `MultiNode::apply_to_projs(DUIterator_Fast& imax, DUIterator_Fast& i, Callback callback, uint which_proj)`, for consistency. src/hotspot/share/opto/multnode.cpp line 279: > 277: void NarrowMemProjNode::dump_spec(outputStream *st) const { > 278: ProjNode::dump_spec(st); > 279: dump_adr_type(st); Do we need to define a special version of `NarrowMemProjNode::dump_adr_type` or could we just have the same effect calling `MemNode::dump_adr_type(this, _adr_type, st)` here? 
src/hotspot/share/opto/multnode.cpp line 284: > 282: void NarrowMemProjNode::dump_compact_spec(outputStream *st) const { > 283: ProjNode::dump_compact_spec(st); > 284: dump_adr_type(st); Same here. src/hotspot/share/opto/multnode.hpp line 71: > 69: } > 70: Node* current() { > 71: return _node->fast_out(_i);; Suggestion: return _node->fast_out(_i); src/hotspot/share/opto/multnode.hpp line 90: > 88: } > 89: Node* current() { > 90: return _node->out(_i);; Suggestion: return _node->out(_i); src/hotspot/share/opto/phaseX.cpp line 2621: > 2619: add_users_to_worklist0(proj, worklist); > 2620: return MultiNode::CONTINUE; > 2621: }; Consider defining `enqueue` only once and reusing it in both cases. test/hotspot/jtreg/compiler/escapeAnalysis/TestIterativeEA.java line 53: > 51: analyzer.shouldContain("++++ Eliminated: 26 Allocate"); > 52: analyzer.shouldContain("++++ Eliminated: 51 Allocate"); > 53: analyzer.shouldContain("++++ Eliminated: 84 Allocate"); Did you analyze why there are more allocations removed than before in this test case? I did not expect this changeset to have an effect on the number of removed allocations. test/hotspot/jtreg/compiler/macronodes/TestEarlyEliminationOfAllocationWithoutUse.java line 1: > 1: /* Please add a package declaration (and make the corresponding class names fully qualified in the `@run` directives). test/hotspot/jtreg/compiler/macronodes/TestEliminationOfAllocationWithoutUse.java line 30: > 28: * Now that array slice depends on the rawslice. And then when the Initialize MemBar gets > 29: * removed in expand_allocate_common, the rawslice sees that it has now no effect, looks > 30: * through the MergeMem and sees the initial stae. That way, also the linked array slice Suggestion: * through the MergeMem and sees the initial state. 
That way, also the linked array slice ------------- PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3244667543 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362830370 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362759304 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362760441 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362798596 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362800147 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362800934 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362782847 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362757140 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362743051 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362743403 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362746650 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362750245 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362767659 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362816473 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362810978 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362745517 From sparasa at openjdk.org Fri Sep 19 16:27:06 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 19 Sep 2025 16:27:06 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Thu, 18 Sep 2025 21:55:13 GMT, Volodymyr Paprotski wrote: > Change looks good to me > Thanks Vlad for going through the changes and reviewing the PR! 
> src/hotspot/cpu/x86/vm_version_x86.cpp line 2928: > >> 2926: if (sefsl1_cpuid7_edx.bits.apx_f != 0 && >> 2927: xem_xcr0_eax.bits.apx_f != 0 && >> 2928: std_cpuid29_ebx.bits.apx_nci_ndd_nf != 0) { > > was confused why the previous implementation was 'wrong'.. Please clarify that this was triggered "because" of the update to the spec (in the PR description). Please see the updated PR description which clarifies that this PR was triggered because of the update to the Intel APX spec. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27320#issuecomment-3312848532 PR Review Comment: https://git.openjdk.org/jdk/pull/27320#discussion_r2363522977 From sparasa at openjdk.org Fri Sep 19 18:21:57 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 19 Sep 2025 18:21:57 GMT Subject: Integrated: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present In-Reply-To: References: Message-ID: On Tue, 16 Sep 2025 18:04:33 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to enable APX on Intel CPUs (i.e. enable UseAPX) only when both the APX_F and APX_NCI_NDD_NF cpuid feature flags are present. > > The latest update to the Intel APX specification (https://www.intel.com/content/www/us/en/content-details/861610/intel-advanced-performance-extensions-intel-apx-architecture-specification.html ) has changed how APX features are detected on Intel CPUs. Because of this change, we need to update how the JVM enumerates CPU features. > > As per the new update, when APX_F is set, processors also provide CPUID leaf 0x29 (APX Advanced Performance Extensions Leaf). Any Intel processor that enumerates APX_F also enumerates APX_NCI_NDD_NF. > > This PR enhances the HotSpot x86 CPU feature detection to recognize the APX_NCI_NDD_NF sub-feature of Intel APX and update the enabling logic for UseAPX VM flag. This pull request has now been integrated. 
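The enabling condition discussed in this thread can be sketched in isolation as follows. This is a minimal illustration rather than the HotSpot code: `CpuidSnapshot` and `supports_apx` are hypothetical names, and the three booleans merely stand in for the `sefsl1_cpuid7_edx.bits.apx_f`, `xem_xcr0_eax.bits.apx_f` and `std_cpuid29_ebx.bits.apx_nci_ndd_nf` fields quoted from `vm_version_x86.cpp`. No actual CPUID bit positions are assumed.

```cpp
// Hypothetical stand-in for the CPUID/XCR0 data collected at VM startup.
// Field names only mirror the HotSpot fields quoted in the review comment.
struct CpuidSnapshot {
  bool cpuid7_edx_apx_f;        // stands in for sefsl1_cpuid7_edx.bits.apx_f
  bool xcr0_apx_f;              // stands in for xem_xcr0_eax.bits.apx_f
  bool cpuid29_ebx_nci_ndd_nf;  // stands in for std_cpuid29_ebx.bits.apx_nci_ndd_nf
};

// APX is treated as usable only when all three conditions hold,
// matching the shape of the condition in the quoted diff.
constexpr bool supports_apx(const CpuidSnapshot& s) {
  return s.cpuid7_edx_apx_f && s.xcr0_apx_f && s.cpuid29_ebx_nci_ndd_nf;
}

// All three set: APX may be enabled.
static_assert(supports_apx(CpuidSnapshot{true, true, true}),
              "all three flags present");
// APX_F enumerated and OS support present, but no APX_NCI_NDD_NF: stay off.
static_assert(!supports_apx(CpuidSnapshot{true, true, false}),
              "missing APX_NCI_NDD_NF");
```

In this sketch a missing `APX_NCI_NDD_NF` bit alone is enough to keep APX disabled, which is the behavioural change the PR title describes.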
Changeset: 3d4e0491 Author: Srinivas Vamsi Parasa URL: https://git.openjdk.org/jdk/commit/3d4e0491940c4b4a05ac84006933d939370e7e2b Stats: 29 lines in 2 files changed: 26 ins; 0 del; 3 mod 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present Reviewed-by: sviswanathan, vpaprotski ------------- PR: https://git.openjdk.org/jdk/pull/27320 From vlivanov at openjdk.org Fri Sep 19 19:41:52 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 19 Sep 2025 19:41:52 GMT Subject: RFR: 8290892: C2: Intrinsify Reference.reachabilityFence [v13] In-Reply-To: References: Message-ID: > This PR introduces C2 support for `Reference.reachabilityFence()`. > > After [JDK-8199462](https://bugs.openjdk.org/browse/JDK-8199462) went in, it was discovered that C2 may break the invariant the fix relied upon [1]. So, this is an attempt to introduce proper support for `Reference.reachabilityFence()` in C2. C1 is left intact for now, because there are no signs yet it is affected. > > `Reference.reachabilityFence()` can be used in performance critical code, so the primary goal for C2 is to reduce its runtime overhead as much as possible. The ultimate goal is to ensure liveness information is attached to interfering safepoints, but it takes multiple steps to properly propagate the information through compilation pipeline without negatively affecting generated code quality. > > Also, I don't consider this fix as complete. It does fix the reported problem, but it doesn't provide any strong guarantees yet. In particular, since `ReachabilityFence` is CFG-only node, nothing explicitly forbids memory operations to float past `Reference.reachabilityFence()` and potentially reaching some other safepoints current analysis treats as non-interfering. Representing `ReachabilityFence` as memory barrier (e.g., `MemBarCPUOrder`) would solve the issue, but performance costs are prohibitively high. 
Alternatively, the optimization proposed in this PR can be improved to conservatively extend referent's live range beyond `ReachabilityFence` nodes associated with it. It would meet performance criteria, but I prefer to implement it as a followup fix. > > Another known issue relates to reachability fences on constant oops. If such constant is GCed (most likely, due to a bug in Java code), similar reachability issues may arise. For now, RFs on constants are treated as no-ops, but there's a diagnostic flag `PreserveReachabilityFencesOnConstants` to keep the fences. I plan to address it separately. > > [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/ref/Reference.java#L667 > "HotSpot JVM retains the ref and does not GC it before a call to this method, because the JIT-compilers do not have GC-only safepoints." > > Testing: > - [x] hs-tier1 - hs-tier8 > - [x] hs-tier1 - hs-tier6 w/ -XX:+StressReachabilityFences -XX:+VerifyLoopOptimizations > - [x] java/lang/foreign microbenchmarks Vladimir Ivanov has updated the pull request incrementally with one additional commit since the last revision: Remove comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25315/files - new: https://git.openjdk.org/jdk/pull/25315/files/dc37ccad..68150cc6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25315&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25315&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25315/head:pull/25315 PR: https://git.openjdk.org/jdk/pull/25315 From tschatzl at openjdk.org Mon Sep 22 08:57:23 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Sep 2025 08:57:23 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v62] In-Reply-To: References: Message-ID: 
<9QFGKuKT_g9DUQCDaZ3yMJv-SNXBULg_c5zVQxA3p5U=.9ad6d425-4168-46a2-9d9d-129690017725@github.com> > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness, dirty card updates require fine-grained synchronization between mutator and refinement threads. > * Finally, there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!)
for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 82 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * iwalulya review * documentation for a few PSS members * rename some member variables to contain _ct and _rt suffixes in remembered set verification - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * therealaph suggestion for avoiding the register aliasin in gen_write_ref_array_post - * walulyai review - * walulyai review * tried to remove "logged card" terminology for the current "pending card" one - * aph review, fix some comment - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * iwalulya: remove confusing comment - ... 
and 72 more: https://git.openjdk.org/jdk/compare/5efaa997...b5d22d52 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=61 Stats: 7178 lines in 113 files changed: 2606 ins; 3588 del; 984 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From tschatzl at openjdk.org Mon Sep 22 09:31:58 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Sep 2025 09:31:58 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v63] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness, dirty card updates require fine-grained synchronization between mutator and refinement threads. > * Finally, there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... 
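The pseudo code for the current post-write barrier quoted above can be modelled as a small single-threaded C++ simulation. This is only a sketch under stated assumptions: the `Heap` struct, the region and card sizes, the card values and the queue capacity are all invented for illustration and do not correspond to HotSpot's data structures, addresses are plain offsets with `0` standing for `null`, and the `StoreLoad` is represented symbolically by `std::atomic_thread_fence`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative constants only; not HotSpot's real values.
constexpr std::uintptr_t kRegionShift = 20;  // hypothetical 1 MiB regions
constexpr std::uintptr_t kCardShift = 9;     // hypothetical 512-byte cards
constexpr std::size_t kQueueCapacity = 4;    // tiny, to exercise the slow path

enum Card : std::uint8_t { kClean = 0, kDirty = 1, kYoung = 2 };

struct Heap {
  std::vector<std::uint8_t> cards;              // one byte per card
  std::vector<std::size_t> local_queue;         // models the thread-local dcq
  std::vector<std::vector<std::size_t>> dcqs;   // models the global dcq set

  explicit Heap(std::size_t heap_bytes)
      : cards(heap_bytes >> kCardShift, kClean) {}

  // Models the post-write barrier for "x.a = y":
  // field_addr plays the role of @x.a, value the role of y.
  void post_write_barrier(std::uintptr_t field_addr, std::uintptr_t value) {
    // Filtering
    if ((field_addr >> kRegionShift) == (value >> kRegionShift)) return;  // same region check
    if (value == 0) return;                                              // null value check
    const std::size_t card = field_addr >> kCardShift;
    assert(card < cards.size());
    if (cards[card] == kYoung) return;                                   // write to young gen check
    std::atomic_thread_fence(std::memory_order_seq_cst);                 // StoreLoad (symbolic here)
    if (cards[card] == kDirty) return;                                   // already dirty

    cards[card] = kDirty;

    // Card tracking
    local_queue.push_back(card);
    if (local_queue.size() < kQueueCapacity) return;                     // queue not full yet

    // Stand-in for the runtime call: hand the full buffer to the global set.
    dcqs.push_back(std::move(local_queue));
    local_queue.clear();
  }
};
```

Even in this toy form the barrier has four filter branches, a fence, a store and a conditional slow path, which gives a feel for why the description contrasts it with the much shorter Parallel and Serial barriers.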
Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * improved gen_write_ref_array_post_barrier() for riscv, contributed by @realfyang

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23739/files
  - new: https://git.openjdk.org/jdk/pull/23739/files/b5d22d52..53ef008a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=62
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=61-62

  Stats: 24 lines in 1 file changed: 13 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From tschatzl at openjdk.org Mon Sep 22 10:23:51 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 22 Sep 2025 10:23:51 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v64]
In-Reply-To:
References:
Message-ID:

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * iwalulya: "Amount of" -> "Number of" in new flag description

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23739/files
  - new: https://git.openjdk.org/jdk/pull/23739/files/53ef008a..6e37f8de

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=63
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=62-63

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From tschatzl at openjdk.org Mon Sep 22 10:40:53 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 22 Sep 2025 10:40:53 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v65]
In-Reply-To:
References:
Message-ID:

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * walulyai: remove cost_per_pending_card_ms_default array since we only use one value

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23739/files
  - new: https://git.openjdk.org/jdk/pull/23739/files/6e37f8de..311bb3e1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=64
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=63-64

  Stats: 5 lines in 1 file changed: 0 ins; 3 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From tschatzl at openjdk.org Mon Sep 22 11:10:28 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 22 Sep 2025 11:10:28 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v66]
In-Reply-To:
References:
Message-ID:

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * walulyai: remove unnecessarily introduced newline

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23739/files
  - new: https://git.openjdk.org/jdk/pull/23739/files/311bb3e1..d80d6902

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=65
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=64-65

  Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From tschatzl at openjdk.org Mon Sep 22 11:40:01 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 22 Sep 2025 11:40:01 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v67]
In-Reply-To:
References:
Message-ID:

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  * walulyai: bufferNodeList can be removed

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23739/files
  - new: https://git.openjdk.org/jdk/pull/23739/files/d80d6902..3c889e9f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=66
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=65-66

  Stats: 81 lines in 3 files changed: 0 ins; 81 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From roland at openjdk.org Mon Sep 22 12:07:34 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Mon, 22 Sep 2025 12:07:34 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v13]
In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

> An `Initialize` node for an `Allocate` node is created with a memory
> `Proj` of adr type raw memory. In order for stores to be captured, the
> memory state out of the allocation is a `MergeMem` with slices for the
> various object fields/array element set to the raw memory `Proj` of
> the `Initialize` node. If `Phi`s need to be created during later
> transformations from this memory state, the `Phi` for a particular
> slice gets its adr type from the type of the `Proj`, which is raw
> memory. If during macro expansion, the `Allocate` is found to have no
> use and so can be removed, the `Proj` out of the `Initialize` is
> replaced by the memory state on input to the `Allocate`. A `Phi` for
> some slice for a field of an object will end up with the raw memory
> state on input to the `Allocate` node. As a result, memory state at
> the `Phi` is incorrect and incorrect execution can happen.
>
> The fix I propose is, rather than have a single `Proj` for the memory
> state out of the `Initialize` with adr type raw memory, to use one
> `Proj` per slice added to the memory state after the `Initialize`. Each
> of the `Proj` should return the right adr type for its slice. For that
> I propose having a new type of `Proj`: `NarrowMemProj` that captures
> the right adr type.
>
> Logic for the construction of the `Allocate`/`Initialize` subgraph is
> tweaked so the right adr type captured in its own `NarrowMemProj` is
> added to the memory subgraph. Code that removes an allocation or moves
> it also has to be changed so it correctly takes the multiple memory
> projections out of the `Initialize` node into account.
>
> One tricky issue is that when EA splits types for a scalar replaceable
> `Allocate` node:
>
> 1- the adr type captured in the `NarrowMemProj` becomes out of sync
> with the type of the slices for the allocation
>
> 2- before EA, the memory state for one particular field out of the
> `Initialize` node can be used for a `Store` to the just allocated
> object or some other. So we can have a chain of `Store`s, some to
> the newly allocated object, some to some other objects, all of them
> using the state of `NarrowMemProj` out of the `Initialize`. After
> split unique types, the `NarrowMemProj` is for the slice of a
> particular allocation. So `Store`s to some other objects shouldn't
> use that memory state but the memory state before the `Allocate`.
>
> For that, I added logic to update the adr type of `NarrowMemProj`
> during split unique types and update the memory input of `Store`s that
> don't depend on the memory state ...
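
The difference between a single raw-memory projection and one projection per slice can be pictured with a deliberately tiny C++ model. None of the types below are real C2 classes: `ToyProj`, `ToyMergeMem`, `AdrType` and the helper functions are hypothetical stand-ins, and the sketch only mirrors the statement in the description that a `Phi` takes its adr type from the projection feeding its slice.

```cpp
#include <cassert>
#include <map>

// Toy stand-ins for C2 concepts; these are NOT the real HotSpot classes.
enum class AdrType { Raw, FieldA, FieldB };

struct ToyProj {
  AdrType adr_type;  // address type reported by this memory projection
};

// Memory state after the allocation: the projection feeding each slice.
using ToyMergeMem = std::map<AdrType, const ToyProj*>;

// A Phi created for `slice` takes its adr type from the projection behind that slice.
AdrType phi_adr_type(const ToyMergeMem& mem, AdrType slice) {
  return mem.at(slice)->adr_type;
}

// Current shape: every slice points at the one raw-memory projection.
ToyMergeMem build_with_single_raw_proj(const ToyProj& raw) {
  return {{AdrType::FieldA, &raw}, {AdrType::FieldB, &raw}};
}

// Proposed shape: each slice gets its own narrow projection with the matching adr type.
ToyMergeMem build_with_narrow_projs(const ToyProj& a, const ToyProj& b) {
  return {{AdrType::FieldA, &a}, {AdrType::FieldB, &b}};
}
```

In the first shape a `Phi` on the `FieldA` slice ends up typed as raw memory, which is the mismatch the description identifies; in the second shape the type follows the slice.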
Roland Westrelin has updated the pull request incrementally with seven additional commits since the last revision:

 - Update src/hotspot/share/opto/macro.cpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update src/hotspot/share/opto/macro.cpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update src/hotspot/share/opto/graphKit.cpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update src/hotspot/share/opto/graphKit.cpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update src/hotspot/share/opto/multnode.hpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update src/hotspot/share/opto/multnode.hpp

   Co-authored-by: Roberto Castañeda Lozano
 - Update test/hotspot/jtreg/compiler/macronodes/TestEliminationOfAllocationWithoutUse.java

   Co-authored-by: Roberto Castañeda Lozano

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24570/files
  - new: https://git.openjdk.org/jdk/pull/24570/files/b701d03e..6ea8c811

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=11-12

  Stats: 7 lines in 4 files changed: 0 ins; 0 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/24570.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570

PR: https://git.openjdk.org/jdk/pull/24570

From iwalulya at openjdk.org Mon Sep 22 12:18:30 2025
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Mon, 22 Sep 2025 12:18:30 GMT
Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v67]
In-Reply-To:
References:
Message-ID:

On Mon, 22 Sep 2025 11:40:01 GMT, Thomas Schatzl wrote:

> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>
>   * walulyai: bufferNodeList can be removed

Still Good!

-------------

Marked as reviewed by iwalulya (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23739#pullrequestreview-3252343247

From roland at openjdk.org Mon Sep 22 13:29:27 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Mon, 22 Sep 2025 13:29:27 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12]
In-Reply-To:
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

On Fri, 19 Sep 2025 13:02:43 GMT, Roberto Castañeda Lozano wrote:

>> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits:
>>
>>  - more
>>  - Merge branch 'master' into JDK-8327963
>>  - more
>>  - more
>>  - Merge branch 'master' into JDK-8327963
>>  - more
>>  - more
>>  - lambda return
>>  - lambda clean up
>>  - Merge branch 'master' into JDK-8327963
>>  - ... and 35 more: https://git.openjdk.org/jdk/compare/e16c5100...b701d03e
>
> test/hotspot/jtreg/compiler/escapeAnalysis/TestIterativeEA.java line 53:
>
>> 51: analyzer.shouldContain("++++ Eliminated: 26 Allocate");
>> 52: analyzer.shouldContain("++++ Eliminated: 51 Allocate");
>> 53: analyzer.shouldContain("++++ Eliminated: 84 Allocate");
>
> Did you analyze why there are more allocations removed than before in this test case? I did not expect this changeset to have an effect on the number of removed allocations.

There are not more allocations removed. The message is confusing. "Eliminated: 84 Allocate" logs that node number 84 was eliminated (and not 84 nodes). This patch changes the number of nodes required at allocations so it also has an impact on node numbering.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2368395832

From roland at openjdk.org Mon Sep 22 13:37:54 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Mon, 22 Sep 2025 13:37:54 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v14]
In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:

  review

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24570/files
  - new: https://git.openjdk.org/jdk/pull/24570/files/6ea8c811..9fd8dc1c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=13
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=12-13

  Stats: 42 lines in 10 files changed: 10 ins; 21 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/24570.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570

PR: https://git.openjdk.org/jdk/pull/24570

From roland at openjdk.org Mon Sep 22 13:38:02 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Mon, 22 Sep 2025 13:38:02 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12]
In-Reply-To:
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

On Fri, 19 Sep 2025 12:41:06 GMT, Roberto Castañeda Lozano wrote:

>> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits:
>>
>>  - more
>>  - Merge branch 'master' into JDK-8327963
>>  - more
>>  - more
>>  - Merge branch 'master' into JDK-8327963
>>  - more
>>  - more
>>  - lambda return
>>  - lambda clean up
>>  - Merge branch 'master' into JDK-8327963
>>  - ... and 35 more: https://git.openjdk.org/jdk/compare/e16c5100...b701d03e
>
> src/hotspot/share/opto/multnode.cpp line 73:
>
>> 71: };
>> 72: return apply_to_projs(filter, which_proj);
>> 73: }
>
> Consider moving this implementation to `multnode.hpp`, perhaps next to that of `MultiNode::apply_to_projs(DUIterator_Fast& imax, DUIterator_Fast& i, Callback callback, uint which_proj)`, for consistency.

Isn't it better practice to leave the implementation in the cpp file? It's not always possible because of templates, so some of the related methods' implementations are in the hpp file, but wouldn't we want to keep that to a minimum?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2368422480

From roland at openjdk.org Mon Sep 22 13:37:55 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Mon, 22 Sep 2025 13:37:55 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8]
In-Reply-To:
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com> <4374L3lkQK90wLxxOA7POBmIKNX2DFK-4pO4vj1bkuQ=.5b8d7825-a7f1-497f-ab66-02a85a266659@github.com>
Message-ID:

On Thu, 11 Sep 2025 07:48:10 GMT, Roberto Castañeda Lozano wrote:

>>> @rose00 @robcasloz I updated the change with a new way to avoid redundant projections. At matching time, before a `NarrowMemProj` is matched into a `MachProj`, new logic checks whether a `MachProj` already exists. That guarantees that no redundant `MachProj` are ever added. It also performs the new normalization at a major cut-point. What do you think?
>>
>> That sounds good to me, thank you for enforcing this Roland! I will re-run testing and have a new look at the changeset within the next days.
>
>> That sounds good to me, thank you for enforcing this Roland! I will re-run testing and have a new look at the changeset within the next days.
>
> Test results of b701d03ed335286587c4d2539dde715b091d30bd on top of jdk-26+14 look good. Will have a look at the code within the next days.

@robcasloz thanks for the review. New commit addresses most of your comments.

------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-3319071258 From tschatzl at openjdk.org Mon Sep 22 13:45:56 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Sep 2025 13:45:56 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v67] In-Reply-To: References: Message-ID: On Mon, 22 Sep 2025 11:40:01 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
>> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * walulyai: bufferNodeList can be removed Aaand, off it goes... Thanks @walulyai @albertnetymk @theRealAph @TheRealMDoerr @robcasloz @RealFYang @offamitkumar @tarsa @tstuefe for your help to complete this change. 
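For readers following the quoted pseudo code, the filtering and card-tracking steps of that pre-change barrier can be modelled roughly as below. This is only a toy sketch under stated assumptions, not HotSpot code: `ToyHeap`, `post_write_barrier`, the card values, the 512-byte card size and the 1 MiB region size are all hypothetical names and constants chosen for illustration, addresses are plain offsets into an imaginary heap, and the hand-off of a full thread-local queue to the runtime is left out.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical card states for this toy model (not HotSpot's encoding).
enum CardValue : uint8_t { kCleanCard = 0, kDirtyCard = 1, kYoungCard = 2 };

struct ToyHeap {
  static constexpr unsigned kCardShift = 9;     // assumed 512-byte cards
  static constexpr unsigned kRegionShift = 20;  // assumed 1 MiB regions
  std::vector<uint8_t> cards;                   // one byte per card
  std::vector<size_t> thread_local_dcq;         // enqueued card indices

  explicit ToyHeap(size_t heap_bytes)
      : cards(heap_bytes >> kCardShift, kCleanCard) {}
};

// Models the post-write barrier for `x.a = y`, where field_addr is the
// address of x.a and new_value is y (0 stands for null).
// Returns true only if the card was dirtied and enqueued.
bool post_write_barrier(ToyHeap& heap, uintptr_t field_addr, uintptr_t new_value) {
  // Filtering
  if ((field_addr >> ToyHeap::kRegionShift) ==
      (new_value >> ToyHeap::kRegionShift)) return false;  // same region check
  if (new_value == 0) return false;                        // null value check
  const size_t card = field_addr >> ToyHeap::kCardShift;
  if (heap.cards[card] == kYoungCard) return false;        // write to young gen check
  std::atomic_thread_fence(std::memory_order_seq_cst);     // stands in for StoreLoad
  if (heap.cards[card] == kDirtyCard) return false;        // already dirty

  heap.cards[card] = kDirtyCard;

  // Card tracking (moving a full queue to the global set is omitted here).
  heap.thread_local_dcq.push_back(card);
  return true;
}
```

Even in this reduced form, every reference store runs up to four conditional checks plus a full fence before it can mark a card, which illustrates why the description calls the barrier large compared to the few instructions needed by Parallel and Serial GC.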
------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3319127975 From tschatzl at openjdk.org Mon Sep 22 13:51:01 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Sep 2025 13:51:01 GMT Subject: Integrated: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization In-Reply-To: References: Message-ID: <9Q7wwYABAQAZ5qlfJ_hzlDw45cI3ckG2TGJU2SdzJBk=.82ed13fa-23fa-4941-876d-f4a9bc73f582@github.com> On Sun, 23 Feb 2025 18:53:33 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... This pull request has now been integrated. 
Changeset: 8d5c0056 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/8d5c0056420731cbbd83f2d23837bbb5cdc9e4cc Stats: 7273 lines in 115 files changed: 2616 ins; 3672 del; 985 mod 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization Co-authored-by: Amit Kumar Co-authored-by: Martin Doerr Co-authored-by: Carlo Refice Co-authored-by: Fei Yang Reviewed-by: iwalulya, rcastanedalo, aph, ayang ------------- PR: https://git.openjdk.org/jdk/pull/23739 From mdoerr at openjdk.org Mon Sep 22 14:50:40 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 22 Sep 2025 14:50:40 GMT Subject: RFR: 8342382: Implement JEP 522: G1 GC: Improve Throughput by Reducing Synchronization [v67] In-Reply-To: References: Message-ID: <6SnHJ_DA3BBYNyb5po8OWnRa38wnA-k_MeANsUpT21U=.2e9dad9d-9d53-4eb9-ae83-d740e5d1eaff@github.com> On Mon, 22 Sep 2025 11:40:01 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
>> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. >> >> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad; // synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. >> >> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching c... > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * walulyai: bufferNodeList can be removed Congratulations! And thanks for updating it for such a long time! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23739#issuecomment-3319517386 From jbhateja at openjdk.org Tue Sep 23 10:41:25 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 23 Sep 2025 10:41:25 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Fri, 19 Sep 2025 16:24:44 GMT, Srinivas Vamsi Parasa wrote: >> Thanks for answering my questions.. things we checked: >> - double-checked reg parameter values of cpuid against the spec >> - double-checked endianness of bitset variables in C grammar >> - double-checked how offset to the field std_cpuid29_ebx is computed >> >> Change looks good to me > >> Change looks good to me >> > Thanks Vlad for going through the changes and reviewing the PR! Hi @vamsi-parasa , Before this PR, we could validate APX support using the publicly available latest version 9.58 of Intel software development emulator. EMR>sde64 --version Intel(R) Software Development Emulator. Version: 9.58.0 external (0) Copyright (C) 2008-2025, Intel Corporation. All rights reserved. EMR>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep APX bool UseAPX = true {ARCH experimental} {command line} After this PR, UseAPX support is false, I think we should only upstream the support which can be validated. 
EMR>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep APX OpenJDK 64-Bit Server VM warning: UseAPX is not supported on this CPU, setting it to false bool UseAPX = false {ARCH experimental} {command line} Best Regards, Jatin ------------- PR Comment: https://git.openjdk.org/jdk/pull/27320#issuecomment-3323431153 From fbredberg at openjdk.org Tue Sep 23 12:17:28 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 23 Sep 2025 12:17:28 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code Message-ID: This is a general cleanup after removing `LockingMode` related code. It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). It includes: - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. - Removing or rewriting comments, arguments or functions that are related to displaced headers. - Remove "always true" parameter from `MonitorExitStub`. - Re-type/name metadata in `BasicLock`. Tier1-5 passes okay on supported platforms. All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. 
------------- Commit messages: - Merge branch 'master' into 8365191_lockingmode_cleanup - 8365191: Cleanup after removing LockingMode related code Changes: https://git.openjdk.org/jdk/pull/27448/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27448&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365191 Stats: 168 lines in 34 files changed: 2 ins; 43 del; 123 mod Patch: https://git.openjdk.org/jdk/pull/27448.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27448/head:pull/27448 PR: https://git.openjdk.org/jdk/pull/27448 From fbredberg at openjdk.org Tue Sep 23 13:16:30 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 23 Sep 2025 13:16:30 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. @bulasevich, @TheRealMDoerr, @RealFYang, @offamitkumar I've run rudimentary tests using QEMU, but it would be nice if you guys (or any of your friends) could take it for a spin on real hardware. Thanks in advance. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27448#issuecomment-3323970915 From mdoerr at openjdk.org Tue Sep 23 13:59:43 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 23 Sep 2025 13:59:43 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. PPC64 and hotspot/share/c1 changes LGTM. Thanks for cleaning it up! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3258165227 From ayang at openjdk.org Tue Sep 23 16:47:09 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Sep 2025 16:47:09 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. 
> > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3258855405 From coleenp at openjdk.org Tue Sep 23 17:33:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Sep 2025 17:33:54 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. This looks really good with a couple of minor comments. src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 171: > 169: nonstatic_field(Array, _data[0], Klass*) \ > 170: \ > 171: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ I don't see any references to this in the JVMCI code either. I assume the compiler/jvmci tests all passed with this change without any change to jvmci code. Maybe @mur47x111 can confirm. src/hotspot/share/runtime/vmStructs.cpp line 685: > 683: volatile_nonstatic_field(ObjectMonitor, _owner, int64_t) \ > 684: volatile_nonstatic_field(ObjectMonitor, _next_om, ObjectMonitor*) \ > 685: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ Since nothing now refers to this, you can delete it from vmStructs. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3258983275 PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2373014053 PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2373006396 From dholmes at openjdk.org Tue Sep 23 22:54:04 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 23 Sep 2025 22:54:04 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: <4qcEZYuk3_U4nvbEh5EBhAtdJvJxjvvUwWiTlgaTaKk=.91630c2c-57c7-47e9-b4d9-55ea80b0a3f3@github.com> On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. LGTM2 ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3259822820 From fyang at openjdk.org Wed Sep 24 02:22:57 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 24 Sep 2025 02:22:57 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. 
> - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. @fbredber : Thanks for the ping. Tier1 test passes on linux-riscv64 platform. The RISC-V part of the change seems fine modulo one minor nit. src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.hpp line 71: > 69: // basic_lock: must be x10 & must point to the basic lock, contents destroyed > 70: // temp : temporary register, must not be scratch register t0 or t1 > 71: void unlock_object(Register swap, Register obj, Register lock, Register temp, Label& slow_case); You might want to rename the third param `lock` to `basic_lock`. void unlock_object(Register swap, Register obj, Register basic_lock, Register temp, Label& slow_case); ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3260355920 PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2373865694 From amitkumar at openjdk.org Wed Sep 24 06:22:16 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 24 Sep 2025 06:22:16 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. 
> > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. The s390x part looks good. Tier1 tests with fastdebug and release VMs seem stable as well. ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3261185513 From yzheng at openjdk.org Wed Sep 24 11:16:52 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 24 Sep 2025 11:16:52 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 17:30:05 GMT, Coleen Phillimore wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 171: > >> 169: nonstatic_field(Array, _data[0], Klass*) \ >> 170: \ >> 171: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ > > I don't see any references to this in the JVMCI code either. I assume the compiler/jvmci tests all passed with this change without any change to jvmci code. Maybe @mur47x111 can confirm. Correct, I don't think JVMCI tests will be affected. We only use this field (offset) in the actual monitorenter implementation to write the ObjectMonitor cache.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375418781 From fbredberg at openjdk.org Wed Sep 24 11:47:49 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 11:47:49 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: <0xLQ9GAdAiDIXkO9rqULMOLK4oLe1_9nGwNKhxK-_7M=.57a07007-60e5-4d63-9bae-2b9442272a91@github.com> On Tue, 23 Sep 2025 17:26:36 GMT, Coleen Phillimore wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > src/hotspot/share/runtime/vmStructs.cpp line 685: > >> 683: volatile_nonstatic_field(ObjectMonitor, _owner, int64_t) \ >> 684: volatile_nonstatic_field(ObjectMonitor, _next_om, ObjectMonitor*) \ >> 685: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ > > Since nothing now refers to this, you can delete it from vmStructs. According to @mur47x111, they still need this line for their fast locking implementation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375512919 From rcastanedalo at openjdk.org Wed Sep 24 11:55:46 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 24 Sep 2025 11:55:46 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: <2rgLRKD7peDnD-efre0nNmYy_7xONt3R0jbnQ7Se47Q=.1df361f5-6c9d-4f4a-b93f-fa6fbbdf93a1@github.com> On Mon, 22 Sep 2025 13:32:13 GMT, Roland Westrelin wrote: >> src/hotspot/share/opto/multnode.cpp line 73: >> >>> 71: }; >>> 72: return apply_to_projs(filter, which_proj); >>> 73: } >> >> Consider moving this implementation to `multnode.hpp`, perhaps next to that of `MultiNode::apply_to_projs(DUIterator_Fast& imax, DUIterator_Fast& i, Callback callback, uint which_proj)`, for consistency. > > Isn't it better practice to leave the implementation in the cpp file? It's not always possible because of templates so some of the related methods' implementation is in the hpp file but wouldn't we want to keep that to a minimum? Fair enough.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2375531130 From mhaessig at openjdk.org Wed Sep 24 12:00:20 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 24 Sep 2025 12:00:20 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Tue, 2 Sep 2025 20:52:32 GMT, Dean Long wrote: >> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, where it is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). > > Dean Long has updated the pull request incrementally with three additional commits since the last revision: > > - revert whitespace change > - undo debug changes > - cleanup Thank you again for this extensive cleanup. I did another, more thorough, pass and have a few questions and suggestions. src/hotspot/cpu/arm/arm_32.ad line 436: > 434: bool far = (_method == nullptr) ? maybe_far_call(this) : !cache_reachable(); > 435: return (far ? 3 : 1) * NativeInstruction::instruction_size; > 436: } Why do we still need the `instruction_size` offset? Are all static Java calls now method handles? src/hotspot/cpu/arm/frame_arm.cpp line 365: > 363: DEBUG_ONLY(verify_deopt_original_pc(sender_nm, _unextended_sp)); > 364: } > 365: } All of this could be `NOT_PRODUCT` and the method `const` if I did not miss any side effects. src/hotspot/cpu/arm/frame_arm.hpp line 1: > 1: /* Please update the copyright year.
src/hotspot/cpu/arm/register_arm.hpp line 1: > 1: /* Please update the copyright year. src/hotspot/share/code/debugInfoRec.hpp line 1: > 1: /* Please update the copyright year. src/hotspot/share/code/nmethod.inline.hpp line 1: > 1: /* Please update the copyright year. src/hotspot/share/code/pcDesc.hpp line 1: > 1: /* Please update the copyright year. src/hotspot/share/jvmci/jvmciCodeInstaller.hpp line 1: > 1: /* Please update the copyright year. src/hotspot/share/opto/matcher.hpp line 1: > 1: /* Please update the copyright year. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/PCDesc.java line 1: > 1: /* Please update the copyright year. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/aarch64/AARCH64Frame.java line 1: > 1: /* Please update the copyright year. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/riscv64/RISCV64Frame.java line 1: > 1: /* Please update the copyright year. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java line 1: > 1: /* Please update the copyright year. ------------- Changes requested by mhaessig (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/27059#pullrequestreview-3262358336 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375411757 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375419504 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375518959 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375519168 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375519398 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375523797 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375524042 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375524330 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375524675 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375525018 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375525797 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375526227 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2375527000 From rcastanedalo at openjdk.org Wed Sep 24 12:01:28 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 24 Sep 2025 12:01:28 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Mon, 22 Sep 2025 13:26:05 GMT, Roland Westrelin wrote: >> test/hotspot/jtreg/compiler/escapeAnalysis/TestIterativeEA.java line 53: >> >>> 51: analyzer.shouldContain("++++ Eliminated: 26 Allocate"); >>> 52: analyzer.shouldContain("++++ Eliminated: 51 Allocate"); >>> 53: analyzer.shouldContain("++++ Eliminated: 84 Allocate"); >> >> Did you analyze why there are more allocations removed than before in this test
case? I did not expect this changeset to have an effect on the number of removed allocations. > > There are not more allocations removed. The message is confusing. > "Eliminated: 84 Allocate" logs that node number 84 was eliminated (and not 84 nodes). > This patch changes the number of nodes required at allocations so it also has an impact on node numbering. I see, thanks. Expecting specific C2 node identifiers seems fragile. I understand it is a pre-existing issue, but since this changeset needs to address it anyway, please consider making it more robust by e.g. using regular expression matching. Here is a suggestion, feel free to incorporate it: https://github.com/openjdk/jdk/commit/9fd6378156187e497b1e4233d57282cad9ede29f. The ultimate improvement would be using the IR test framework, but that is out of scope here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2375544305 From fbredberg at openjdk.org Wed Sep 24 12:01:41 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 12:01:41 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v2] In-Reply-To: References: Message-ID: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu.
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: Update after review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27448/files - new: https://git.openjdk.org/jdk/pull/27448/files/b5d57851..85457638 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27448&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27448&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27448.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27448/head:pull/27448 PR: https://git.openjdk.org/jdk/pull/27448 From fbredberg at openjdk.org Wed Sep 24 12:01:43 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 12:01:43 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v2] In-Reply-To: References: Message-ID: On Wed, 24 Sep 2025 02:09:02 GMT, Fei Yang wrote: >> Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after review > > src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.hpp line 71: > >> 69: // basic_lock: must be x10 & must point to the basic lock, contents destroyed >> 70: // temp : temporary register, must not be scratch register t0 or t1 >> 71: void unlock_object(Register swap, Register obj, Register lock, Register temp, Label& slow_case); > > You might want to rename the third param `lock` to `basic_lock`. 
> > > void unlock_object(Register swap, Register obj, Register basic_lock, Register temp, Label& slow_case); Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375539612 From fbredberg at openjdk.org Wed Sep 24 12:01:46 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 12:01:46 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v2] In-Reply-To: References: Message-ID: <8fpIz-2CpbqESb-a1kb_o8io13ZPzLgyWF1acLUn-A0=.b4a88a33-c938-40c9-aab2-cae6a28b0a45@github.com> On Wed, 24 Sep 2025 11:13:42 GMT, Yudi Zheng wrote: >> src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 171: >> >>> 169: nonstatic_field(Array, _data[0], Klass*) \ >>> 170: \ >>> 171: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ >> >> I don't see any references to this in the JVMCI code either. I assume the compiler/jvmci tests all passed with this change without any change to jvmci code. Maybe @mur47x111 can confirm. > > Correct, I dont think JVMCI tests will be affected. We only use this field (offset) in the actual monitorenter implementation to write ObjectMonitor cache. > edit: Sorry I missed the `delete it from vmStructs` context. We need this line for our fast locking implementation Removed line 171 from `vmStructs_jvmci.cpp`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375544175 From rcastanedalo at openjdk.org Wed Sep 24 12:22:52 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 24 Sep 2025 12:22:52 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v14] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: <8-Rrpyw2hYDMyFFmFreO9lCQhCIH7oiqxxO3yUeDyI0=.5edf4b3d-6b35-4338-a053-6aba56a95133@github.com> On Mon, 22 Sep 2025 13:37:54 GMT, Roland Westrelin wrote: >> An `Initialize` node for an `Allocate` node is created with a memory >> `Proj` of adr type raw memory. In order for stores to be captured, the >> memory state out of the allocation is a `MergeMem` with slices for the >> various object fields/array element set to the raw memory `Proj` of >> the `Initialize` node. If `Phi`s need to be created during later >> transformations from this memory state, The `Phi` for a particular >> slice gets its adr type from the type of the `Proj` which is raw >> memory. If during macro expansion, the `Allocate` is found to have no >> use and so can be removed, the `Proj` out of the `Initialize` is >> replaced by the memory state on input to the `Allocate`. A `Phi` for >> some slice for a field of an object will end up with the raw memory >> state on input to the `Allocate` node. As a result, memory state at >> the `Phi` is incorrect and incorrect execution can happen. >> >> The fix I propose is, rather than have a single `Proj` for the memory >> state out of the `Initialize` with adr type raw memory, to use one >> `Proj` per slice added to the memory state after the `Initalize`. Each >> of the `Proj` should return the right adr type for its slice.
For that >> I propose having a new type of `Proj`: `NarrowMemProj` that captures >> the right adr type. >> >> Logic for the construction of the `Allocate`/`Initialize` subgraph is >> tweaked so the right adr type captured in is own `NarrowMemProj` is >> added to the memory sugraph. Code that removes an allocation or moves >> it also has to be changed so it correctly takes the multiple memory >> projections out of the `Initialize` node into account. >> >> One tricky issue is that when EA split types for a scalar replaceable >> `Allocate` node: >> >> 1- the adr type captured in the `NarrowMemProj` becomes out of sync >> with the type of the slices for the allocation >> >> 2- before EA, the memory state for one particular field out of the >> `Initialize` node can be used for a `Store` to the just allocated >> object or some other. So we can have a chain of `Store`s, some to >> the newly allocated object, some to some other objects, all of them >> using the state of `NarrowMemProj` out of the `Initialize`. After >> split unique types, the `NarrowMemProj` is for the slice of a >> particular allocation. So `Store`s to some other objects shouldn't >> use that memory state but the memory state before the `Allocate`. >> >> For that, I added logic to update the adr type of `NarrowMemProj` >> during split uni... > > Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for addressing my comments, Roland. I have a couple of follow-up questions. I also realized that we need to adjust IGV's custom logic to schedule the new projection nodes more accurately and combine them into their parent nodes when using the "Condense graph" filter. Please consider incorporating the following patch into this changeset: https://github.com/openjdk/jdk/commit/63a536a1f83aaa10b938eff2d25aac3c68ed57a1. ------------- Changes requested by rcastanedalo (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3262606037 From rcastanedalo at openjdk.org Wed Sep 24 12:23:00 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 24 Sep 2025 12:23:00 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Fri, 19 Sep 2025 12:55:55 GMT, Roberto Castañeda Lozano wrote: >> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: >> >> - more >> - Merge branch 'master' into JDK-8327963 >> - more >> - more >> - Merge branch 'master' into JDK-8327963 >> - more >> - more >> - lambda return >> - lambda clean up >> - Merge branch 'master' into JDK-8327963 >> - ... and 35 more: https://git.openjdk.org/jdk/compare/e16c5100...b701d03e > > src/hotspot/share/opto/macro.cpp line 1606: > >> 1604: // elimination. Simply add the MemBarStoreStore after object >> 1605: // initialization. >> 1606: MemBarNode* mb = MemBarNode::make(C, Op_MemBarStoreStore, Compile::AliasIdxRaw); > > Does the same argument as below apply for relaxing the scope of this memory barrier? Please clarify in a similar comment for this case (if the same argument applies, a reference to the comment below would be enough). Thanks for adding the comment. A follow-up question: the full comment below makes the argument that _re-ordering by the compiler can't happen by construction_ because _a later Store that publishes the just allocated object reference is indirectly control dependent on the Initialize node_. However, in this case, there may be no such Initialize node (`init == nullptr || init->req() < InitializeNode::RawStores`).
I assume the memory barrier relaxation is still OK in this scenario because we cannot have later, publishing stores of the allocated object reference? That is, if there exists such a store then there must necessarily exist an Initialize node? Or is there any other reason I am missing? It would be good to clarify this point in the comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2375585788 From coleenp at openjdk.org Wed Sep 24 12:50:50 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Sep 2025 12:50:50 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v2] In-Reply-To: <0xLQ9GAdAiDIXkO9rqULMOLK4oLe1_9nGwNKhxK-_7M=.57a07007-60e5-4d63-9bae-2b9442272a91@github.com> References: <0xLQ9GAdAiDIXkO9rqULMOLK4oLe1_9nGwNKhxK-_7M=.57a07007-60e5-4d63-9bae-2b9442272a91@github.com> Message-ID: On Wed, 24 Sep 2025 11:44:58 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/vmStructs.cpp line 685: >> >>> 683: volatile_nonstatic_field(ObjectMonitor, _owner, int64_t) \ >>> 684: volatile_nonstatic_field(ObjectMonitor, _next_om, ObjectMonitor*) \ >>> 685: volatile_nonstatic_field(BasicLock, _monitor, ObjectMonitor*) \ >> >> Since nothing now refers to this, you can delete it from vmStructs. > > According to @mur47x111, they still need this line for their fast locking implementation. Oh you were supposed to leave the field in vmStructs_jvmci.cpp and remove it from this one. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375681060 From fbredberg at openjdk.org Wed Sep 24 12:58:57 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 12:58:57 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v2] In-Reply-To: References: <0xLQ9GAdAiDIXkO9rqULMOLK4oLe1_9nGwNKhxK-_7M=.57a07007-60e5-4d63-9bae-2b9442272a91@github.com> Message-ID: On Wed, 24 Sep 2025 12:48:00 GMT, Coleen Phillimore wrote: >> According to @mur47x111, they still need this line for their fast locking implementation. > > Oh you were supposed to leave the field in vmStructs_jvmci.cpp and remove it from this one. Yea, I got it all mixed up in my head. Will fix it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375703603 From fbredberg at openjdk.org Wed Sep 24 13:05:23 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 13:05:23 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: Message-ID: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. 
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: Fixed a mixup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27448/files - new: https://git.openjdk.org/jdk/pull/27448/files/85457638..43f9c0af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27448&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27448&range=01-02 Stats: 3 lines in 2 files changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27448.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27448/head:pull/27448 PR: https://git.openjdk.org/jdk/pull/27448 From fbredberg at openjdk.org Wed Sep 24 13:05:24 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 13:05:24 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: <8fpIz-2CpbqESb-a1kb_o8io13ZPzLgyWF1acLUn-A0=.b4a88a33-c938-40c9-aab2-cae6a28b0a45@github.com> References: <8fpIz-2CpbqESb-a1kb_o8io13ZPzLgyWF1acLUn-A0=.b4a88a33-c938-40c9-aab2-cae6a28b0a45@github.com> Message-ID: On Wed, 24 Sep 2025 11:58:43 GMT, Fredrik Bredberg wrote: >> Correct, I dont think JVMCI tests will be affected. We only use this field (offset) in the actual monitorenter implementation to write ObjectMonitor cache. >> edit: Sorry I missed the `delete it from vmStructs` context. We need this line for our fast locking implementation > > Removed line 171 from `vmStructs_jvmci.cpp`. Reinstalled line 171 from vmStructs_jvmci.cpp. Sorry for the mixup. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375719236 From fbredberg at openjdk.org Wed Sep 24 13:05:25 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 24 Sep 2025 13:05:25 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: <0xLQ9GAdAiDIXkO9rqULMOLK4oLe1_9nGwNKhxK-_7M=.57a07007-60e5-4d63-9bae-2b9442272a91@github.com> Message-ID: On Wed, 24 Sep 2025 12:55:51 GMT, Fredrik Bredberg wrote: >> Oh you were supposed to leave the field in vmStructs_jvmci.cpp and remove it from this one. > > Yea, I got it all mixed up in my head. Will fix it. Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27448#discussion_r2375717181 From coleenp at openjdk.org Wed Sep 24 13:07:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Sep 2025 13:07:00 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: Message-ID: On Wed, 24 Sep 2025 13:05:23 GMT, Fredrik Bredberg wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > Fixed a mixup Looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3262810994 From yzheng at openjdk.org Wed Sep 24 13:13:53 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 24 Sep 2025 13:13:53 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: Message-ID: <0-k1XZdfzcUtKM2YhXUXW7ySBnZUxyU4DZ-kUpzWSPM=.0cf78c01-e8db-4f55-94b1-a499b5b2fc85@github.com> On Wed, 24 Sep 2025 13:05:23 GMT, Fredrik Bredberg wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > Fixed a mixup LGTM ------------- Marked as reviewed by yzheng (Committer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3262839280 From dlong at openjdk.org Wed Sep 24 21:25:08 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Sep 2025 21:25:08 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v4] In-Reply-To: References: Message-ID: > At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. 
Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). Dean Long has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'openjdk:master' into 8366461-mh-invoke - revert whitespace change - undo debug changes - cleanup - arm32 build - Merge branch 'openjdk:master' into 8366461-mh-invoke - first pass ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/eac482a5..a4f2383c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=02-03 Stats: 179654 lines in 2239 files changed: 141573 ins; 24044 del; 14037 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From dlong at openjdk.org Wed Sep 24 21:36:30 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Sep 2025 21:36:30 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v5] In-Reply-To: References: Message-ID: > At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). 
Dean Long has updated the pull request incrementally with one additional commit since the last revision: copyright year update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/a4f2383c..81d56860 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=03-04 Stats: 10 lines in 10 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From dlong at openjdk.org Wed Sep 24 21:40:44 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Sep 2025 21:40:44 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Wed, 24 Sep 2025 11:10:22 GMT, Manuel Hässig wrote: >> Dean Long has updated the pull request incrementally with three additional commits since the last revision: >> >> - revert whitespace change >> - undo debug changes >> - cleanup > > src/hotspot/cpu/arm/arm_32.ad line 436: > >> 434: bool far = (_method == nullptr) ? maybe_far_call(this) : !cache_reachable(); >> 435: return (far ? 3 : 1) * NativeInstruction::instruction_size; >> 436: } > > Why do we still need the `instruction_size` offset? Are all static java calls now method handles? The offset is in bytes, so we still need to convert from instruction count to bytes with instruction_size. This change adjusts for the fact that method handle calls have 1 fewer instruction on arm32 now, because preserve_SP was removed. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java line 1: > >> 1: /* > > Please update the copyright year. Done.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2377135523 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2377136479 From dlong at openjdk.org Wed Sep 24 21:47:29 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Sep 2025 21:47:29 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v6] In-Reply-To: References: Message-ID: <1ChaRPjjHEx-yM_RJR8JDfbq3v8mhNkkM4WomJstN_o=.ac7eeeed-250e-4511-bdeb-ae720aa5cc20@github.com> > At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). Dean Long has updated the pull request incrementally with one additional commit since the last revision: copyright year update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/81d56860..10668bd7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From dlong at openjdk.org Wed Sep 24 22:19:03 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Sep 2025 22:19:03 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Wed, 24 Sep 2025 11:14:01 GMT, Manuel Hässig wrote: >>
Dean Long has updated the pull request incrementally with three additional commits since the last revision: >> >> - revert whitespace change >> - undo debug changes >> - cleanup > > src/hotspot/cpu/arm/frame_arm.cpp line 365: > >> 363: DEBUG_ONLY(verify_deopt_original_pc(sender_nm, _unextended_sp)); >> 364: } >> 365: } > > All of this could be `NOT_PRODUCT` and the method `const` if I did not miss any side effects. Right, there is no adjustment anymore on any platform. I think this function and verify_deopt_original_pc only ever existed to support code that is now getting removed. So I could change the name to verify_unextended_sp() and make it const, but it might make more sense to remove both this function and verify_deopt_original_pc now. What do you think? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2377202692 From sparasa at openjdk.org Thu Sep 25 00:48:37 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Thu, 25 Sep 2025 00:48:37 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Tue, 23 Sep 2025 10:37:53 GMT, Jatin Bhateja wrote: > EMR>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep APX > OpenJDK 64-Bit Server VM warning: UseAPX is not supported on this CPU, setting it to false > Hi Jatin, Thanks for letting me know about this issue! Sorry about the SDE issue. I will inform you once a public version of SDE that supports this feature is available.
Thanks, Vamsi ------------- PR Comment: https://git.openjdk.org/jdk/pull/27320#issuecomment-3331286352 From dholmes at openjdk.org Thu Sep 25 02:09:35 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 25 Sep 2025 02:09:35 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: Message-ID: On Wed, 24 Sep 2025 13:05:23 GMT, Fredrik Bredberg wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > Fixed a mixup Still good. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27448#pullrequestreview-3265275100 From mhaessig at openjdk.org Thu Sep 25 06:51:09 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Thu, 25 Sep 2025 06:51:09 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Wed, 24 Sep 2025 22:16:17 GMT, Dean Long wrote: >> src/hotspot/cpu/arm/frame_arm.cpp line 365: >> >>> 363: DEBUG_ONLY(verify_deopt_original_pc(sender_nm, _unextended_sp)); >>> 364: } >>> 365: } >> >> All of this could be `NOT_PRODUCT` and the method `const` if I did not miss any side effects.
> > Right, there is no adjustment anymore on any platform. I think this function and verify_deopt_original_pc only ever existed to support code that is now getting removed. So I could change the name to verify_unextended_sp() and make it const, but it might make more sense to remove both this function and verify_deopt_original_pc now. What do you think? I would rather keep this code as a debug only sanity check, but I would refactor it into a single function. Then the question remains what to do with the SA code, that still does nothing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2377932365 From fbredberg at openjdk.org Thu Sep 25 08:19:21 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 25 Sep 2025 08:19:21 GMT Subject: RFR: 8365191: Cleanup after removing LockingMode related code [v3] In-Reply-To: References: Message-ID: <1jgUzSlIIf8uo55h2Ai9cPZjmqQ6OyYs61j2TtgfH8w=.09872546-a1c0-44bc-afe8-d5b26fcf3889@github.com> On Wed, 24 Sep 2025 13:05:23 GMT, Fredrik Bredberg wrote: >> This is a general cleanup after removing `LockingMode` related code. >> It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). >> It includes: >> - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. >> - Removing or rewriting comments, arguments or functions that are related to displaced headers. >> - Remove "always true" parameter from `MonitorExitStub`. >> - Re-type/name metadata in `BasicLock`. >> >> Tier1-5 passes okay on supported platforms. >> >> All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > Fixed a mixup Thank you all for the reviews. Now let's... 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27448#issuecomment-3332749700 From fbredberg at openjdk.org Thu Sep 25 08:19:22 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 25 Sep 2025 08:19:22 GMT Subject: Integrated: 8365191: Cleanup after removing LockingMode related code In-Reply-To: References: Message-ID: On Tue, 23 Sep 2025 09:29:57 GMT, Fredrik Bredberg wrote: > This is a general cleanup after removing `LockingMode` related code. > It's a sub-task of [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261). > It includes: > - Removing asserts that are no longer necessary, since we removed legacy locking and monitor locking. > - Removing or rewriting comments, arguments or functions that are related to displaced headers. > - Remove "always true" parameter from `MonitorExitStub`. > - Re-type/name metadata in `BasicLock`. > > Tier1-5 passes okay on supported platforms. > > All other platforms (arm, ppc, riscv and s390) has been sanity checked using Qemu. This pull request has now been integrated. Changeset: 847b107d Author: Fredrik Bredberg URL: https://git.openjdk.org/jdk/commit/847b107df821e0c1d347383f1858d505137eb724 Stats: 169 lines in 34 files changed: 2 ins; 44 del; 123 mod 8365191: Cleanup after removing LockingMode related code Reviewed-by: coleenp, dholmes, yzheng, mdoerr, ayang, fyang, amitkumar ------------- PR: https://git.openjdk.org/jdk/pull/27448 From duke at openjdk.org Thu Sep 25 09:17:16 2025 From: duke at openjdk.org (erifan) Date: Thu, 25 Sep 2025 09:17:16 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v8] In-Reply-To: References: Message-ID: On Wed, 20 Aug 2025 10:11:47 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. 
>> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. 
>> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update callGenerator.hpp copyright year src/hotspot/share/classfile/vmIntrinsics.hpp line 1178: > 1176: "Ljdk/internal/vm/vector/VectorSupport$Vector;" \ > 1177: "Ljdk/internal/vm/vector/VectorSupport$VectorSliceOp;)" \ > 1178: "Ljdk/internal/vm/vector/VectorSupport$Vector;") \ Seems this `` is not aligned ? 
src/hotspot/share/classfile/vmIntrinsics.hpp line 1179: > 1177: "Ljdk/internal/vm/vector/VectorSupport$VectorSliceOp;)" \ > 1178: "Ljdk/internal/vm/vector/VectorSupport$Vector;") \ > 1179: do_name(vector_slice_name, "sliceOp") \ ditto test/hotspot/jtreg/compiler/vectorapi/TestSliceOptValueTransforms.java line 45: > 43: public static final VectorSpecies SSP = ShortVector.SPECIES_PREFERRED; > 44: public static final VectorSpecies ISP = IntVector.SPECIES_PREFERRED; > 45: public static final VectorSpecies LSP = LongVector.SPECIES_PREFERRED; The implementation supports floating point types, but why doesn't the test include fp types? test/hotspot/jtreg/compiler/vectorapi/TestSliceOptValueTransforms.java line 122: > 120: .intoArray(bdst, i); > 121: } > 122: } Since this optimization also benefits the slice variant with mask, could you add some tests for it as well? test/micro/org/openjdk/bench/jdk/incubator/vector/VectorSliceBenchmark.java line 59: > 57: static final VectorSpecies sspecies = ShortVector.SPECIES_PREFERRED; > 58: static final VectorSpecies ispecies = IntVector.SPECIES_PREFERRED; > 59: static final VectorSpecies lspecies = LongVector.SPECIES_PREFERRED; Ditto, no fp types ? test/micro/org/openjdk/bench/jdk/incubator/vector/VectorSliceBenchmark.java line 133: > 131: .intoArray(bdst, i); > 132: } > 133: } Ditto, add a benchmark for the slice variant with mask ? 
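For reference, here is a rough scalar sketch of the slice semantics that such masked tests and benchmarks could be checked against. It is not taken from the PR: the class and method names are hypothetical, it models lanes as plain `int` arrays instead of using `jdk.incubator.vector`, and it assumes that the masked variant zeroes the lanes whose mask bit is unset, which is my reading of the Vector API javadoc.

```java
import java.util.Arrays;

// Hypothetical scalar reference model for Vector.slice; not code from the PR.
public class SliceReferenceModel {

    // Unmasked slice: result lane i is taken from a[origin + i] while that index
    // is in range, and continues into b once the end of a is reached.
    public static int[] slice(int[] a, int[] b, int origin) {
        int vlen = a.length;
        if (b.length != vlen || origin < 0 || origin > vlen) {
            throw new IllegalArgumentException("bad vector length or origin");
        }
        int[] result = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int j = origin + i;
            result[i] = (j < vlen) ? a[j] : b[j - vlen];
        }
        return result;
    }

    // Masked slice (assumption: lanes with an unset mask bit become zero).
    public static int[] slice(int[] a, int[] b, int origin, boolean[] mask) {
        int[] result = slice(a, b, origin);
        for (int i = 0; i < result.length; i++) {
            if (!mask[i]) {
                result[i] = 0;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 2, 3};
        int[] b = {4, 5, 6, 7};
        boolean[] mask = {true, false, true, false};
        System.out.println(Arrays.toString(slice(a, b, 1)));       // [1, 2, 3, 4]
        System.out.println(Arrays.toString(slice(a, b, 3, mask))); // [3, 0, 5, 0]
    }
}
```

A jtreg test for the masked variant could compare the vector result lane by lane against a model like this one, and the same model would carry over to float and double lanes.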
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378092410 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378093047 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378310217 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378337340 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378312763 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2378342519 From duke at openjdk.org Thu Sep 25 12:44:33 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 12:44:33 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v15] In-Reply-To: References: Message-ID: > This PR will consolidate fixes of the following bugs: > > https://bugs.openjdk.org/browse/JDK-8361188 > https://bugs.openjdk.org/browse/JDK-8361189 > https://bugs.openjdk.org/browse/JDK-8361190 > https://bugs.openjdk.org/browse/JDK-8361191 > https://bugs.openjdk.org/browse/JDK-8361192 > https://bugs.openjdk.org/browse/JDK-8361193 > https://bugs.openjdk.org/browse/JDK-8361195 > > This PR depends on https://github.com/openjdk/jdk/pull/25971 > > For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. 
And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 Khalid Boulanouare has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 74 commits: - Merge branch 'openjdk:master' into jdk-8360498 - Removes not needed changes - Revert "Removes not needed changes" This reverts commit e76780d50cc390e35443dccb193cfbc9a1cec1cb. - Removes not needed changes - Removes extra white lines - Merge branch 'pr/25971' into jdk-8360498 - Merge branch 'openjdk:master' into jdk-8158801 - Merge branch 'pr/25971' into jdk-8360498 - Merge branch 'openjdk:master' into jdk-8158801 - Centers missed frames in the middle of screen - ... 
and 64 more: https://git.openjdk.org/jdk/compare/26b5708c...e5753d14 ------------- Changes: https://git.openjdk.org/jdk/pull/26625/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=14 Stats: 185 lines in 14 files changed: 96 ins; 44 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/26625.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26625/head:pull/26625 PR: https://git.openjdk.org/jdk/pull/26625 From duke at openjdk.org Thu Sep 25 13:03:51 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 13:03:51 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v16] In-Reply-To: References: Message-ID: <5YzcczRe9VgQqc7POMiumcCnf6esXSQO6PpUbIuuxhA=.efcb29a6-3705-475d-901c-515838317bfe@github.com> > This PR will consolidate fixes of the following bugs: > > https://bugs.openjdk.org/browse/JDK-8361188 > https://bugs.openjdk.org/browse/JDK-8361189 > https://bugs.openjdk.org/browse/JDK-8361190 > https://bugs.openjdk.org/browse/JDK-8361191 > https://bugs.openjdk.org/browse/JDK-8361192 > https://bugs.openjdk.org/browse/JDK-8361193 > https://bugs.openjdk.org/browse/JDK-8361195 > > This PR depends on https://github.com/openjdk/jdk/pull/25971 > > For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. 
For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 Khalid Boulanouare has updated the pull request incrementally with one additional commit since the last revision: Resolves confict for when there is a merge with jdk-8158801 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26625/files - new: https://git.openjdk.org/jdk/pull/26625/files/e5753d14..8794db9a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=14-15 Stats: 55 lines in 1 file changed: 36 ins; 3 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/26625.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26625/head:pull/26625 PR: https://git.openjdk.org/jdk/pull/26625 From alanb at openjdk.org Thu Sep 25 13:30:34 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 25 Sep 2025 13:30:34 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v16] In-Reply-To: 
<5YzcczRe9VgQqc7POMiumcCnf6esXSQO6PpUbIuuxhA=.efcb29a6-3705-475d-901c-515838317bfe@github.com> References: <5YzcczRe9VgQqc7POMiumcCnf6esXSQO6PpUbIuuxhA=.efcb29a6-3705-475d-901c-515838317bfe@github.com> Message-ID: On Thu, 25 Sep 2025 13:03:51 GMT, Khalid Boulanouare wrote: >> This PR will consolidate fixes of the following bugs: >> >> https://bugs.openjdk.org/browse/JDK-8361188 >> https://bugs.openjdk.org/browse/JDK-8361189 >> https://bugs.openjdk.org/browse/JDK-8361190 >> https://bugs.openjdk.org/browse/JDK-8361191 >> https://bugs.openjdk.org/browse/JDK-8361192 >> https://bugs.openjdk.org/browse/JDK-8361193 >> https://bugs.openjdk.org/browse/JDK-8361195 >> >> This PR depends on https://github.com/openjdk/jdk/pull/25971 >> >> For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) >> >> For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 >> >> For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 >> >> For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. 
For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 >> >> For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 > > Khalid Boulanouare has updated the pull request incrementally with one additional commit since the last revision: > > Resolves confict for when there is a merge with jdk-8158801 What is this about? The PR suggests 500+ commits and 300+ files changed but I think it's just a change to some AWT tests. Can you sync up the branch so that it only contains the changes to the AWT tests that you want to change, and remove all the labels except "client" as it will otherwise broadcast to all the mailing lists. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26625#issuecomment-3334044893 From duke at openjdk.org Thu Sep 25 14:00:10 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 14:00:10 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v16] In-Reply-To: References: <5YzcczRe9VgQqc7POMiumcCnf6esXSQO6PpUbIuuxhA=.efcb29a6-3705-475d-901c-515838317bfe@github.com> Message-ID: On Thu, 25 Sep 2025 13:26:46 GMT, Alan Bateman wrote: >> Khalid Boulanouare has updated the pull request incrementally with one additional commit since the last revision: >> >> Resolves confict for when there is a merge with jdk-8158801 > > What is this about? The PR suggests 500+ commits and 300+ files changed but I think it's just a change to some AWT tests. Can you sync up the branch so that it only contains the changes to the AWT tests that you want to change, and remove all the labels except "client" as it will otherwise broadcast to all the mailing lists. @AlanBateman This PR is created based on PR https://github.com/openjdk/jdk/tree/pr/25971. 
My branch https://github.com/kboulanou/jdk/tree/jdk-8360498 is only 2 commits behind master. I am waiting for approval for PR https://github.com/openjdk/jdk/tree/pr/25971 for this PR to follow. Please let me know if there is anything I need to do. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26625#issuecomment-3334251324 From duke at openjdk.org Thu Sep 25 14:00:08 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 14:00:08 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v17] In-Reply-To: References: Message-ID: > This PR will consolidate fixes of the following bugs: > > https://bugs.openjdk.org/browse/JDK-8361188 > https://bugs.openjdk.org/browse/JDK-8361189 > https://bugs.openjdk.org/browse/JDK-8361190 > https://bugs.openjdk.org/browse/JDK-8361191 > https://bugs.openjdk.org/browse/JDK-8361192 > https://bugs.openjdk.org/browse/JDK-8361193 > https://bugs.openjdk.org/browse/JDK-8361195 > > This PR depends on https://github.com/openjdk/jdk/pull/25971 > > For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. 
For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 Khalid Boulanouare has updated the pull request incrementally with three additional commits since the last revision: - Merge branch 'openjdk:master' into jdk-8360498 - 8359378: aarch64: crash when using -XX:+UseFPUForSpilling Reviewed-by: aph, rcastanedalo - 8367103: RISC-V: store cpu features in a bitmap Reviewed-by: fyang, luhenry ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26625/files - new: https://git.openjdk.org/jdk/pull/26625/files/8794db9a..69c087a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=15-16 Stats: 214 lines in 3 files changed: 142 ins; 4 del; 68 mod Patch: https://git.openjdk.org/jdk/pull/26625.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26625/head:pull/26625 PR: https://git.openjdk.org/jdk/pull/26625 From duke at openjdk.org Thu Sep 25 14:37:52 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 14:37:52 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v18] In-Reply-To: References: Message-ID: > This PR will consolidate fixes of the following bugs: > > https://bugs.openjdk.org/browse/JDK-8361188 > https://bugs.openjdk.org/browse/JDK-8361189 > https://bugs.openjdk.org/browse/JDK-8361190 > https://bugs.openjdk.org/browse/JDK-8361191 > 
https://bugs.openjdk.org/browse/JDK-8361192 > https://bugs.openjdk.org/browse/JDK-8361193 > https://bugs.openjdk.org/browse/JDK-8361195 > > This PR depends on https://github.com/openjdk/jdk/pull/25971 > > For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 > > For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 > > For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 Khalid Boulanouare has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 37 commits: - Merge branch 'pr/25971' into jdk-8360498 - Merge branch 'openjdk:master' into jdk-8360498 - Resolves confict for when there is a merge with jdk-8158801 - Merge branch 'openjdk:master' into jdk-8360498 - Removes not needed changes - Revert "Removes not needed changes" This reverts commit e76780d50cc390e35443dccb193cfbc9a1cec1cb. - Removes not needed changes - Removes extra white lines - Merge branch 'pr/25971' into jdk-8360498 - Merge branch 'pr/25971' into jdk-8360498 - ... and 27 more: https://git.openjdk.org/jdk/compare/59a937ac...900a7943 ------------- Changes: https://git.openjdk.org/jdk/pull/26625/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26625&range=17 Stats: 149 lines in 9 files changed: 129 ins; 6 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/26625.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26625/head:pull/26625 PR: https://git.openjdk.org/jdk/pull/26625 From rriggs at openjdk.org Thu Sep 25 14:37:56 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Thu, 25 Sep 2025 14:37:56 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v17] In-Reply-To: References: Message-ID: <_mcHKcaqJnPlGC2UstGSgIAigoWLW1hG4g_mMhDNbEU=.0ffd876c-1d19-455e-ac2a-fa39c8c33703@github.com> On Thu, 25 Sep 2025 14:00:08 GMT, Khalid Boulanouare wrote: >> This PR will consolidate fixes of the following bugs: >> >> https://bugs.openjdk.org/browse/JDK-8361188 >> https://bugs.openjdk.org/browse/JDK-8361189 >> https://bugs.openjdk.org/browse/JDK-8361190 >> https://bugs.openjdk.org/browse/JDK-8361191 >> https://bugs.openjdk.org/browse/JDK-8361192 >> https://bugs.openjdk.org/browse/JDK-8361193 >> https://bugs.openjdk.org/browse/JDK-8361195 >> >> This PR depends on https://github.com/openjdk/jdk/pull/25971 >> >> For test : java/awt/Mixing/AWT_Mixing/JComboBoxOverlapping.java, the fix suggested is to return false in method isValidForPixelCheck for embedded frame, in which case the component is set 
to null. For more details see bug: [JDK-8361188](https://bugs.openjdk.org/browse/JDK-8361188) >> >> For test : test/jdk/java/awt/Mixing/AWT_Mixing/MixingPanelsResizing.java, I had to create a a tolerance color matching method for mac for the tests to pass. Also, the jbuttons needed to have different color than the color of the background frame, in order for test to pass. For more detail see bug: https://bugs.openjdk.org/browse/JDK-8361193 >> >> For test : test/jdk/java/awt/Mixing/AWT_Mixing/JSplitPaneOverlapping.java, it seems that color selected for lightweight component matches the background color of the frame. And this will cause the test to fail when matching colors. Choosing any color different than the background color will get the test to pass. For more details, see bug: https://bugs.openjdk.org/browse/JDK-8361192 >> >> For test test/jdk/java/awt/Mixing/AWT_Mixing/JPopupMenuOverlapping.java, it looks like the frame when visible, the popup test does not work properly. The frame needs to be hidden for the test to click on popup. For more details see bug: https://bugs.openjdk.org/browse/JDK-8361191 >> >> For test test/jdk/java/awt/Mixing/AWT_Mixing/JMenuBarOverlapping.java, the test runs successfully but it times out after the default 2 minutes of jtreg. increasing the timeout to 3 minutes get the test to pass. 
For more details please refer to bug: https://bugs.openjdk.org/browse/JDK-8361190 > > Khalid Boulanouare has updated the pull request incrementally with three additional commits since the last revision: > > - Merge branch 'openjdk:master' into jdk-8360498 > - 8359378: aarch64: crash when using -XX:+UseFPUForSpilling > > Reviewed-by: aph, rcastanedalo > - 8367103: RISC-V: store cpu features in a bitmap > > Reviewed-by: fyang, luhenry (typo fix) ------------- PR Comment: https://git.openjdk.org/jdk/pull/26625#issuecomment-3334445631 From duke at openjdk.org Thu Sep 25 14:37:53 2025 From: duke at openjdk.org (Khalid Boulanouare) Date: Thu, 25 Sep 2025 14:37:53 GMT Subject: RFR: 8360498: [TEST_BUG] Some Mixing test continue to fail [v16] In-Reply-To: References: <5YzcczRe9VgQqc7POMiumcCnf6esXSQO6PpUbIuuxhA=.efcb29a6-3705-475d-901c-515838317bfe@github.com> Message-ID: On Thu, 25 Sep 2025 13:26:46 GMT, Alan Bateman wrote: >> Khalid Boulanouare has updated the pull request incrementally with one additional commit since the last revision: >> >> Resolves confict for when there is a merge with jdk-8158801 > > What is this about? The PR suggests 500+ commits and 300+ files changed but I think it's just a change to some AWT tests. Can you sync up the branch so that it only contains the changes to the AWT tests that you want to change, and remove all the labels except "client" as it will otherwise broadcast to all the mailing lists. @AlanBateman There was a merge conflict that I have not noticed. I have resolved it and now the PR is back to 37 commits only. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26625#issuecomment-3334382946 From jbhateja at openjdk.org Thu Sep 25 17:48:14 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 25 Sep 2025 17:48:14 GMT Subject: RFR: 8367780: Enable UseAPX on Intel CPUs only when both APX_F and APX_NCI_NDD_NF cpuid features are present [v4] In-Reply-To: References: <-cYOL5wwp8oSisK5utj0B7mHi0D_Ne0i_N_RI-bsbLk=.87c1bc5f-a6a3-4d4e-9530-fc91e676656f@github.com> Message-ID: On Thu, 25 Sep 2025 00:45:37 GMT, Srinivas Vamsi Parasa wrote: > > EMR>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep APX > > OpenJDK 64-Bit Server VM warning: UseAPX is not supported on this CPU, setting it to false > > Hi Jatin, > > Thank you for informing me about this issue! Sorry about the SDE issue. I will inform you once the public version of SDE which supports this feature is available. > > Thanks, Vamsi Thanks @vamsi-parasa, BTW, there should not be any urgency to push a patch for future enhancement in the absence of software emulation :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/27320#issuecomment-3335233685 From dlong at openjdk.org Thu Sep 25 22:08:00 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 25 Sep 2025 22:08:00 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Thu, 25 Sep 2025 06:48:52 GMT, Manuel Hässig wrote: > I would rather keep this code as a debug-only sanity check, but I would refactor it into a single function. Then the question remains what to do with the SA code, that still does nothing. I think we can do even better and get rid of adjust_unextended_sp, moving the debug check into get_deopt_original_pc. 
This moves the SA changes into shared code, and fixes the anomaly that s390 never called adjust_unextended_sp and thus never called the debug code to do the sanity check. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2380422777 From dlong at openjdk.org Fri Sep 26 00:23:28 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 26 Sep 2025 00:23:28 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v7] In-Reply-To: References: Message-ID: > At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). Dean Long has updated the pull request incrementally with one additional commit since the last revision: more cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/10668bd7..814693fa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=05-06 Stats: 361 lines in 18 files changed: 21 ins; 338 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From dlong at openjdk.org Fri Sep 26 01:22:50 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 26 Sep 2025 01:22:50 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v8] In-Reply-To: References: Message-ID: > At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. 
The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). Dean Long has updated the pull request incrementally with one additional commit since the last revision: copyright year update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/814693fa..6a1062c3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=06-07 Stats: 8 lines in 8 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From duke at openjdk.org Fri Sep 26 03:13:52 2025 From: duke at openjdk.org (erifan) Date: Fri, 26 Sep 2025 03:13:52 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v8] In-Reply-To: References: Message-ID: On Wed, 20 Aug 2025 10:11:47 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. 
Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. >> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... 
> > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update callGenerator.hpp copyright year test/micro/org/openjdk/bench/jdk/incubator/vector/VectorSliceBenchmark.java line 137: > 135: @Benchmark > 136: public void shortVectorSliceWithConstantIndex1() { > 137: for (int i = 0; i < sspecies.loopBound(sdst.length); i += bspecies.length()) { Typo ? `bspecies` -> `sspecies` and the following cases. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2380745327 From mhaessig at openjdk.org Fri Sep 26 07:17:20 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Fri, 26 Sep 2025 07:17:20 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v8] In-Reply-To: References: Message-ID: On Fri, 26 Sep 2025 01:22:50 GMT, Dean Long wrote: >> At one time, JSR292 support needed special logic to save and restore SP across method handle instrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > copyright year update I have one last nit below. Otherwise, this looks good to me. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Frame.java line 85: > 83: > 84: protected void adjustForDeopt() { > 85: if ( pc != null) { Suggestion: if (pc != null) { ------------- Marked as reviewed by mhaessig (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/27059#pullrequestreview-3270357612 PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2381017332 From mhaessig at openjdk.org Fri Sep 26 07:17:21 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 26 Sep 2025 07:17:21 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v3] In-Reply-To: References: <_pqvEs0LIlAc7RjFUwg-bpxS3D2v5U7c6In2sG8XLhQ=.57e3aead-6ac4-4a42-89d2-385d7e6ecedf@github.com> Message-ID: On Thu, 25 Sep 2025 22:05:10 GMT, Dean Long wrote: >> I would rather keep this code as a debug only sanity check, but I would refactor it into a single function. Then the question remains what to do with the SA code, that still does nothing. > >> I would rather keep this code as a debug only sanity check, but I would refactor it into a single function. Then the question remains what to do with the SA code, that still does nothing. > > I think we can do even better and get rid of adjust_unextended_sp, moving the debug check into get_deopt_original_pc. This moves the SA changes into shared code, and fixes the anomaly that s390 never called adjust_unextended_sp and thus never called the debug code to do the sanity check. That is indeed even better. Nice. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27059#discussion_r2381117772 From dlong at openjdk.org Fri Sep 26 20:07:36 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 26 Sep 2025 20:07:36 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v9] In-Reply-To: References: Message-ID: > At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic.
Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). Dean Long has updated the pull request incrementally with one additional commit since the last revision: Update src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Frame.java Co-authored-by: Manuel Hässig ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27059/files - new: https://git.openjdk.org/jdk/pull/27059/files/6a1062c3..d25872bc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27059&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27059/head:pull/27059 PR: https://git.openjdk.org/jdk/pull/27059 From dlong at openjdk.org Fri Sep 26 20:07:38 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 26 Sep 2025 20:07:38 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v8] In-Reply-To: References: Message-ID: <1CWxUjvyzz0_hlfrZW59H39VziO7_AyhCw4ZfIyUZbA=.1b88957c-2a85-4082-b210-c193340b2e9b@github.com> On Fri, 26 Sep 2025 07:14:08 GMT, Manuel Hässig wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> copyright year update > > I have one last nit below. Otherwise, this looks good to me. Thanks @mhaessig for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27059#issuecomment-3340311259 From duke at openjdk.org Mon Sep 29 03:27:17 2025 From: duke at openjdk.org (erifan) Date: Mon, 29 Sep 2025 03:27:17 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v8] In-Reply-To: References: Message-ID: On Wed, 20 Aug 2025 10:11:47 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector.
slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. 
>> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... 
> > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update callGenerator.hpp copyright year test/hotspot/jtreg/compiler/vectorapi/TestSliceOptValueTransforms.java line 101: > 99: .slice(0, ByteVector.fromArray(BSP, bsrc2, i)) > 100: .intoArray(bdst, i); > 101: } Would you mind adding a correctness check for these tests, for byte type, like: @DontInline static void verifyVectorSliceByte(int origin) { for (int i = 0; i < BSP.loopBound(SIZE); i += BSP.length()) { int index = i; for (int j = i + origin; j < i + BSP.length(); j++) { Asserts.assertEquals(bsrc1[j], bdst[index++]); } for (int j = i; j < i + origin; j++) { Asserts.assertEquals(bsrc2[j], bdst[index++]); } } } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2386593970 From duke at openjdk.org Mon Sep 29 04:09:19 2025 From: duke at openjdk.org (erifan) Date: Mon, 29 Sep 2025 04:09:19 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v8] In-Reply-To: References: Message-ID: On Wed, 20 Aug 2025 10:11:47 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. 
The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. >> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... 
> > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update callGenerator.hpp copyright year test/hotspot/jtreg/compiler/vectorapi/TestSliceOptValueTransforms.java line 130: > 128: for (int i = 0; i < BSP.loopBound(SIZE); i += BSP.length()) { > 129: ByteVector.fromArray(BSP, bsrc1, i) > 130: .slice(16, ByteVector.fromArray(BSP, bsrc2, i)) `16` may out of bounds when this test is run with option `-XX:MaxVectorSize=8` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2386633254 From jbhateja at openjdk.org Mon Sep 29 06:06:17 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 29 Sep 2025 06:06:17 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v8] In-Reply-To: References: Message-ID: On Wed, 20 Aug 2025 10:11:47 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). 
>> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. >> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update callGenerator.hpp copyright year Hi @erifan , Thanks for your comments. 
I will address them soon, please keep reviewing in the meantime :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24104#issuecomment-3345152738 From mhaessig at openjdk.org Mon Sep 29 06:07:19 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Mon, 29 Sep 2025 06:07:19 GMT Subject: RFR: 8366461: Remove obsolete method handle invoke logic [v9] In-Reply-To: References: Message-ID: On Fri, 26 Sep 2025 20:07:36 GMT, Dean Long wrote: >> At one time, JSR292 support needed special logic to save and restore SP across method handle intrinsic calls, but that is no longer the case. The only platform that still does the save/restore is arm32, which is no longer necessary. The save/restore can be removed along with related APIs and logic. Note that the arm32 port is largely based on the x86 port, which stopped doing the save/restore in jdk9 ([JDK-8068945](https://bugs.openjdk.org/browse/JDK-8068945)). > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > Update src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Frame.java > > Co-authored-by: Manuel Hässig Marked as reviewed by mhaessig (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27059#pullrequestreview-3278052544 From roland at openjdk.org Mon Sep 29 08:44:51 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 29 Sep 2025 08:44:51 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v15] In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: > An `Initialize` node for an `Allocate` node is created with a memory > `Proj` of adr type raw memory.
In order for stores to be captured, the > memory state out of the allocation is a `MergeMem` with slices for the > various object fields/array element set to the raw memory `Proj` of > the `Initialize` node. If `Phi`s need to be created during later > transformations from this memory state, the `Phi` for a particular > slice gets its adr type from the type of the `Proj` which is raw > memory. If during macro expansion, the `Allocate` is found to have no > use and so can be removed, the `Proj` out of the `Initialize` is > replaced by the memory state on input to the `Allocate`. A `Phi` for > some slice for a field of an object will end up with the raw memory > state on input to the `Allocate` node. As a result, memory state at > the `Phi` is incorrect and incorrect execution can happen. > > The fix I propose is, rather than have a single `Proj` for the memory > state out of the `Initialize` with adr type raw memory, to use one > `Proj` per slice added to the memory state after the `Initialize`. Each > of the `Proj` should return the right adr type for its slice. For that > I propose having a new type of `Proj`: `NarrowMemProj` that captures > the right adr type. > > Logic for the construction of the `Allocate`/`Initialize` subgraph is > tweaked so the right adr type captured in its own `NarrowMemProj` is > added to the memory subgraph. Code that removes an allocation or moves > it also has to be changed so it correctly takes the multiple memory > projections out of the `Initialize` node into account. > > One tricky issue is that when EA splits types for a scalar replaceable > `Allocate` node: > > 1- the adr type captured in the `NarrowMemProj` becomes out of sync > with the type of the slices for the allocation > > 2- before EA, the memory state for one particular field out of the > `Initialize` node can be used for a `Store` to the just allocated > object or some other.
So we can have a chain of `Store`s, some to > the newly allocated object, some to some other objects, all of them > using the state of `NarrowMemProj` out of the `Initialize`. After > split unique types, the `NarrowMemProj` is for the slice of a > particular allocation. So `Store`s to some other objects shouldn't > use that memory state but the memory state before the `Allocate`. > > For that, I added logic to update the adr type of `NarrowMemProj` > during split unique types and update the memory input of `Store`s that > don't depend on the memory state ... Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision: - review - Roberto's patches ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24570/files - new: https://git.openjdk.org/jdk/pull/24570/files/9fd8dc1c..48257c91 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=13-14 Stats: 14 lines in 4 files changed: 9 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/24570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570 PR: https://git.openjdk.org/jdk/pull/24570 From roland at openjdk.org Mon Sep 29 08:44:52 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 29 Sep 2025 08:44:52 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com> <4374L3lkQK90wLxxOA7POBmIKNX2DFK-4pO4vj1bkuQ=.5b8d7825-a7f1-497f-ab66-02a85a266659@github.com> Message-ID: 
<4UN1z9fhxeUqUGagnZVEIFOyDb_mP8WaWUBwWO2HjFA=.93b7c9ad-443c-4fff-810d-7fe805ccbfaa@github.com> On Thu, 11 Sep 2025 07:48:10 GMT, Roberto Casta?eda Lozano wrote: >>> @rose00 @robcasloz I updated the change with a new way to avoid redundant projections. At matching time, before a `NarrowMemProj` is matched into a `MachProj`, new logic checks whether a `MachProj` already exists. That guarantees that no redundant `MachProj` are ever added. It also performs the new normalization at a major cut-point. What do you think? >> >> That sounds good to me, thank you for enforcing this Roland! I will re-run testing and have a new look at the changeset within the next days. > >> That sounds good to me, thank you for enforcing this Roland! I will re-run testing and have a new look at the changeset within the next days. > > Test results of b701d03ed335286587c4d2539dde715b091d30bd on top of jdk-26+14 look good. Will have a look at the code within the next days. @robcasloz Thanks for the patches. I added them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-3345711479 From roland at openjdk.org Mon Sep 29 08:44:54 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 29 Sep 2025 08:44:54 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Wed, 24 Sep 2025 12:14:50 GMT, Roberto Casta?eda Lozano wrote: >> src/hotspot/share/opto/macro.cpp line 1606: >> >>> 1604: // elimination. Simply add the MemBarStoreStore after object >>> 1605: // initialization. >>> 1606: MemBarNode* mb = MemBarNode::make(C, Op_MemBarStoreStore, Compile::AliasIdxRaw); >> >> Does the same argument as below apply for relaxing the scope of this memory barrier? 
Please clarify in a similar comment for this case (if the same argument applies, a reference to the comment below would be enough). > > Thanks for adding the comment. A follow-up question: the full comment below makes the argument that _re-ordering by the compiler can't happen by construction_ because _a later Store that publishes the just allocated object reference is indirectly control dependent on the Initialize node_. However, in this case, there may be no such Initialize node (`init == nullptr || init->req() < InitializeNode::RawStores`). I assume the memory barrier relaxation is still OK in this scenario because we cannot have later, publishing stores of the allocated object reference? That is, if there exists such a store then there must necessarily exist an Initialize node? Or is there any other reason I am missing? It would be good to clarify this point in the comment. I updated the comment. Can you have a look? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2387147877 From myankelevich at openjdk.org Mon Sep 29 10:28:19 2025 From: myankelevich at openjdk.org (Mikhail Yankelevich) Date: Mon, 29 Sep 2025 10:28:19 GMT Subject: RFR: 8365072: Refactor tests to use PEM API (Phase 2) [v3] In-Reply-To: <_Qf2f6cwWoaNPHpm8TfYeWQTiiqhn-z291PeGY7uP6U=.8e77e560-d233-4232-86e8-4e0da5180947@github.com> References: <_Qf2f6cwWoaNPHpm8TfYeWQTiiqhn-z291PeGY7uP6U=.8e77e560-d233-4232-86e8-4e0da5180947@github.com> Message-ID: > Tests changed in this PR: > 1. test/jdk/java/security/cert/CertPathBuilder/selfIssued/StatusLoopDependency.java > 2. test/jdk/java/security/cert/CertPathValidator/indirectCRL/CircularCRLTwoLevel.java > 3. test/jdk/java/security/cert/CertPathValidator/indirectCRL/CircularCRLTwoLevelRevoked.java > 6. test/jdk/sun/security/ssl/ClientHandshaker/RSAExport.java > 7. test/jdk/javax/net/ssl/ServerName/SSLSocketSNISensitive.java > 9. test/jdk/sun/security/ssl/X509TrustManagerImpl/BasicConstraints.java > 10. 
test/jdk/sun/security/ssl/X509TrustManagerImpl/ComodoHacker.java > 11. test/jdk/javax/net/ssl/interop/ClientHelloInterOp.java > 12. test/jdk/sun/security/rsa/InvalidBitString.java > 14. test/jdk/java/security/cert/CertPathBuilder/NoExtensions.java > 17. test/jdk/sun/security/provider/certpath/DisabledAlgorithms/CPValidatorTrustAnchor.java > 19. test/jdk/sun/security/x509/X509CRLImpl/Verify.java > > PEMRecord tests will be done under a subtask [JDK-8367326](https://bugs.openjdk.org/browse/JDK-8367326) Mikhail Yankelevich has updated the pull request incrementally with 364 additional commits since the last revision: - removed pemrecord usage - 8365190: Remove LockingMode related code from share Reviewed-by: aboldtch, dholmes, ayang, coleenp, lmesnik, rcastanedalo - 8367025: zIndexDistributor.hpp uses angle-bracket inclusion of globalDefinitions.hpp Reviewed-by: aboldtch, tschatzl, jwaters - 8360219: [AIX] assert(locals_base >= l2) failed: bad placement Reviewed-by: dlong, mdoerr - 8366864: Sort os/linux includes Reviewed-by: ayang, dholmes - 8366874: Test gc/arguments/TestParallelGCErgo.java fails with UseTransparentHugePages Reviewed-by: ayang, shade, stefank, tschatzl - 8351260: java.lang.AssertionError: Unexpected type tree: (ERROR) = (ERROR) Reviewed-by: vromero - 8365776: Convert JShell tests to use JUnit instead of TestNG Reviewed-by: vromero - 8354871: Replace stack map frame type magics with constants Reviewed-by: liach - 8361533: Apply java.io.Serial annotations in java.logging Reviewed-by: rriggs - ... 
and 354 more: https://git.openjdk.org/jdk/compare/d44cb277...e0be5eaa ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27194/files - new: https://git.openjdk.org/jdk/pull/27194/files/d44cb277..e0be5eaa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27194&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27194&range=01-02 Stats: 53711 lines in 1891 files changed: 31453 ins; 14034 del; 8224 mod Patch: https://git.openjdk.org/jdk/pull/27194.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27194/head:pull/27194 PR: https://git.openjdk.org/jdk/pull/27194 From rcastanedalo at openjdk.org Mon Sep 29 14:17:44 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Mon, 29 Sep 2025 14:17:44 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Mon, 29 Sep 2025 08:39:26 GMT, Roland Westrelin wrote: >> Thanks for adding the comment. A follow-up question: the full comment below makes the argument that _re-ordering by the compiler can't happen by construction_ because _a later Store that publishes the just allocated object reference is indirectly control dependent on the Initialize node_. However, in this case, there may be no such Initialize node (`init == nullptr || init->req() < InitializeNode::RawStores`). I assume the memory barrier relaxation is still OK in this scenario because we cannot have later, publishing stores of the allocated object reference? That is, if there exists such a store then there must necessarily exist an Initialize node? Or is there any other reason I am missing? It would be good to clarify this point in the comment. > > I updated the comment. Can you have a look?
Thanks for the clarification and the pointer to the comment in `initialize_object`, that helps. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2388155433 From rcastanedalo at openjdk.org Mon Sep 29 14:17:41 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Mon, 29 Sep 2025 14:17:41 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v15] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Mon, 29 Sep 2025 08:44:51 GMT, Roland Westrelin wrote: >> An `Initialize` node for an `Allocate` node is created with a memory >> `Proj` of adr type raw memory. In order for stores to be captured, the >> memory state out of the allocation is a `MergeMem` with slices for the >> various object fields/array element set to the raw memory `Proj` of >> the `Initialize` node. If `Phi`s need to be created during later >> transformations from this memory state, the `Phi` for a particular >> slice gets its adr type from the type of the `Proj` which is raw >> memory. If during macro expansion, the `Allocate` is found to have no >> use and so can be removed, the `Proj` out of the `Initialize` is >> replaced by the memory state on input to the `Allocate`. A `Phi` for >> some slice for a field of an object will end up with the raw memory >> state on input to the `Allocate` node. As a result, memory state at >> the `Phi` is incorrect and incorrect execution can happen. >> >> The fix I propose is, rather than have a single `Proj` for the memory >> state out of the `Initialize` with adr type raw memory, to use one >> `Proj` per slice added to the memory state after the `Initialize`. Each >> of the `Proj` should return the right adr type for its slice.
For that >> I propose having a new type of `Proj`: `NarrowMemProj` that captures >> the right adr type. >> >> Logic for the construction of the `Allocate`/`Initialize` subgraph is >> tweaked so the right adr type captured in its own `NarrowMemProj` is >> added to the memory subgraph. Code that removes an allocation or moves >> it also has to be changed so it correctly takes the multiple memory >> projections out of the `Initialize` node into account. >> >> One tricky issue is that when EA splits types for a scalar replaceable >> `Allocate` node: >> >> 1- the adr type captured in the `NarrowMemProj` becomes out of sync >> with the type of the slices for the allocation >> >> 2- before EA, the memory state for one particular field out of the >> `Initialize` node can be used for a `Store` to the just allocated >> object or some other. So we can have a chain of `Store`s, some to >> the newly allocated object, some to some other objects, all of them >> using the state of `NarrowMemProj` out of the `Initialize`. After >> split unique types, the `NarrowMemProj` is for the slice of a >> particular allocation. So `Store`s to some other objects shouldn't >> use that memory state but the memory state before the `Allocate`. >> >> For that, I added logic to update the adr type of `NarrowMemProj` >> during split uni... > > Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision: > > - review > - Roberto's patches Thanks for addressing all comments and questions, Roland. Looks good! ------------- Marked as reviewed by rcastanedalo (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3280097136 From fbredberg at openjdk.org Tue Sep 30 12:44:11 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 30 Sep 2025 12:44:11 GMT Subject: RFR: 8367601: Remove held_monitor_count Message-ID: Since we have removed all other locking modes than lightweight locking (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)), we no longer need: - `_held_monitor_count` - `_parent_held_monitor_count` - `_jni_monitor_count` This PR removes them from shared code as well as from `X86`, `AArch64`, `PowerPC` and `RISC-V`. They are not present in other platforms. Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. `PowerPC` and `RISC-V` has been sanity checked using QEMU. ------------- Commit messages: - 8367601: Remove held_monitor_count Changes: https://git.openjdk.org/jdk/pull/27570/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27570&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8367601 Stats: 385 lines in 25 files changed: 0 ins; 378 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/27570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27570/head:pull/27570 PR: https://git.openjdk.org/jdk/pull/27570 From fbredberg at openjdk.org Tue Sep 30 12:44:11 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 30 Sep 2025 12:44:11 GMT Subject: RFR: 8367601: Remove held_monitor_count In-Reply-To: References: Message-ID: On Tue, 30 Sep 2025 09:43:51 GMT, Fredrik Bredberg wrote: > Since we have removed all other locking modes than lightweight locking (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)), we no longer need: > - `_held_monitor_count` > - `_parent_held_monitor_count` > - `_jni_monitor_count` > > This PR removes them from shared code as well as from `X86`, `AArch64`, `PowerPC` and `RISC-V`. > They are not present in other platforms. 
> > Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. > `PowerPC` and `RISC-V` has been sanity checked using QEMU. @TheRealMDoerr, @RealFYang I've run rudimentary tests using QEMU, but it would be nice if you guys (or any of your friends) could take it for a spin on real hardware. Thanks in advance. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27570#issuecomment-3351884174 From mdoerr at openjdk.org Tue Sep 30 13:35:28 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 30 Sep 2025 13:35:28 GMT Subject: RFR: 8367601: Remove held_monitor_count In-Reply-To: References: Message-ID: On Tue, 30 Sep 2025 09:43:51 GMT, Fredrik Bredberg wrote: > Since we have removed all other locking modes than lightweight locking (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)), we no longer need: > - `_held_monitor_count` > - `_parent_held_monitor_count` > - `_jni_monitor_count` > > This PR removes them from shared code as well as from `X86`, `AArch64`, `PowerPC` and `RISC-V`. > They are not present in other platforms. > > Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. > `PowerPC` and `RISC-V` has been sanity checked using QEMU. Looks correct (PPC64 and shared code changes) and tier1 has passed. 
Would be nice to clean up unused temp registers:

diff --git a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
index 0bcc24a23bf..9fe7e1f22ff 100644
--- a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
+++ b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
@@ -1639,7 +1639,6 @@ static void fill_continuation_entry(MacroAssembler* masm, Register reg_cont_obj,
   assert_different_registers(reg_cont_obj, reg_flags);
   Register zero = R8_ARG6;
   Register tmp2 = R9_ARG7;
-  Register tmp3 = R10_ARG8;

   DEBUG_ONLY(__ block_comment("fill {"));
 #ifdef ASSERT
@@ -1678,7 +1677,6 @@ static void fill_continuation_entry(MacroAssembler* masm, Register reg_cont_obj,
 static void continuation_enter_cleanup(MacroAssembler* masm) {
   Register tmp1 = R8_ARG6;
   Register tmp2 = R9_ARG7;
-  Register tmp3 = R10_ARG8;

 #ifdef ASSERT
   __ block_comment("clean {");
@@ -1689,8 +1687,8 @@ static void continuation_enter_cleanup(MacroAssembler* masm) {
   __ ld_ptr(tmp1, ContinuationEntry::parent_cont_fastpath_offset(), R1_SP);
   __ st_ptr(tmp1, JavaThread::cont_fastpath_offset(), R16_thread);

-  __ ld_ptr(tmp3, ContinuationEntry::parent_offset(), R1_SP);
-  __ st_ptr(tmp3, JavaThread::cont_entry_offset(), R16_thread);
+  __ ld_ptr(tmp2, ContinuationEntry::parent_offset(), R1_SP);
+  __ st_ptr(tmp2, JavaThread::cont_entry_offset(), R16_thread);
   DEBUG_ONLY(__ block_comment("} clean"));
 }

Thanks for doing it for all platforms!
-------------

PR Comment: https://git.openjdk.org/jdk/pull/27570#issuecomment-3352190916

From pchilanomate at openjdk.org Tue Sep 30 15:53:49 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Tue, 30 Sep 2025 15:53:49 GMT
Subject: RFR: 8367601: Remove held_monitor_count
In-Reply-To:
References:
Message-ID:

On Tue, 30 Sep 2025 09:43:51 GMT, Fredrik Bredberg wrote:

> Since we have removed all locking modes other than lightweight locking (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)), we no longer need:
> - `_held_monitor_count`
> - `_parent_held_monitor_count`
> - `_jni_monitor_count`
>
> This PR removes them from shared code as well as from `X86`, `AArch64`, `PowerPC` and `RISC-V`.
> They are not present on other platforms.
>
> Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change.
> `PowerPC` and `RISC-V` have been sanity checked using QEMU.

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1742:

> 1740:     log_develop_debug(continuations)("PINNED due to critical section");
> 1741:     verify_continuation(cont.continuation());
> 1742:     freeze_result res = entry->is_pinned() ? freeze_pinned_cs : freeze_pinned_monitor;

We can remove this and always return freeze_pinned_cs. We should also remove freeze_pinned_monitor (there is a matching definition in Continuation.java).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27570#discussion_r2392098950
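
The simplification suggested in the review comment above can be illustrated with a small, self-contained C++ sketch. This is not the HotSpot code: `Entry`, `freeze_ok`, and the helper functions are hypothetical stand-ins, and only the names `freeze_pinned_cs`, `freeze_pinned_monitor`, and `is_pinned()` come from the quoted lines. It assumes that, once the monitor counters are gone, the enclosing branch is entered only when `entry->is_pinned()` is true, so the conditional can never select `freeze_pinned_monitor`.

```cpp
#include <cassert>

// Hypothetical model of the result codes; freeze_ok is an assumed
// placeholder for "not pinned", not taken from the thread.
enum freeze_result { freeze_ok, freeze_pinned_cs, freeze_pinned_monitor };

// Hypothetical stand-in for the continuation entry.
struct Entry {
  bool pinned;
  bool is_pinned() const { return pinned; }
};

// Shape of the line under review: a conditional on is_pinned().
freeze_result pinned_result_before(const Entry& entry) {
  return entry.is_pinned() ? freeze_pinned_cs : freeze_pinned_monitor;
}

// Suggested shape: always report the critical-section pin.
freeze_result pinned_result_after(const Entry&) {
  return freeze_pinned_cs;
}

// Assumed enclosing logic: the pinned path is taken only if is_pinned().
freeze_result freeze(const Entry& entry, bool simplified) {
  if (entry.is_pinned()) {
    return simplified ? pinned_result_after(entry)
                      : pinned_result_before(entry);
  }
  return freeze_ok;
}

// Both shapes agree on every input that can reach the pinned path.
void check_pinned_paths() {
  const Entry pinned{true};
  const Entry unpinned{false};
  assert(freeze(pinned, false) == freeze(pinned, true));
  assert(freeze(unpinned, false) == freeze(unpinned, true));
}
```

Under that assumption the two variants are indistinguishable to callers, which is why the `freeze_pinned_monitor` value becomes dead and can be dropped together with its Java-side counterpart.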