From roland at openjdk.org Mon Nov 3 15:43:50 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 3 Nov 2025 15:43:50 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v17] In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: > An `Initialize` node for an `Allocate` node is created with a memory > `Proj` of adr type raw memory. In order for stores to be captured, the > memory state out of the allocation is a `MergeMem` with slices for the > various object fields/array element set to the raw memory `Proj` of > the `Initialize` node. If `Phi`s need to be created during later > transformations from this memory state, the `Phi` for a particular > slice gets its adr type from the type of the `Proj` which is raw > memory. If during macro expansion, the `Allocate` is found to have no > use and so can be removed, the `Proj` out of the `Initialize` is > replaced by the memory state on input to the `Allocate`. A `Phi` for > some slice for a field of an object will end up with the raw memory > state on input to the `Allocate` node. As a result, memory state at > the `Phi` is incorrect and incorrect execution can happen. > > The fix I propose is, rather than have a single `Proj` for the memory > state out of the `Initialize` with adr type raw memory, to use one > `Proj` per slice added to the memory state after the `Initialize`. Each > of the `Proj` should return the right adr type for its slice. For that > I propose having a new type of `Proj`: `NarrowMemProj` that captures > the right adr type. > > Logic for the construction of the `Allocate`/`Initialize` subgraph is > tweaked so the right adr type captured in its own `NarrowMemProj` is > added to the memory subgraph.
Code that removes an allocation or moves > it also has to be changed so it correctly takes the multiple memory > projections out of the `Initialize` node into account. > > One tricky issue is that when EA split types for a scalar replaceable > `Allocate` node: > > 1- the adr type captured in the `NarrowMemProj` becomes out of sync > with the type of the slices for the allocation > > 2- before EA, the memory state for one particular field out of the > `Initialize` node can be used for a `Store` to the just allocated > object or some other. So we can have a chain of `Store`s, some to > the newly allocated object, some to some other objects, all of them > using the state of `NarrowMemProj` out of the `Initialize`. After > split unique types, the `NarrowMemProj` is for the slice of a > particular allocation. So `Store`s to some other objects shouldn't > use that memory state but the memory state before the `Allocate`. > > For that, I added logic to update the adr type of `NarrowMemProj` > during split unique types and update the memory input of `Store`s that > don't depend on the memory state ... 
Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24570/files - new: https://git.openjdk.org/jdk/pull/24570/files/957be06e..755bb766 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=15-16 Stats: 50 lines in 3 files changed: 3 ins; 39 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570 PR: https://git.openjdk.org/jdk/pull/24570 From roland at openjdk.org Mon Nov 3 15:43:50 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 3 Nov 2025 15:43:50 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v16] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Tue, 28 Oct 2025 17:10:14 GMT, Emanuel Peter wrote: > I had a few minutes to look over the `apply_..` solutions. I left a few comments, and hope that we can make the code just a little slicker still ;) @eme64 I pushed an update based on your comments. Can you have another look? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-3481179679 From fbredberg at openjdk.org Mon Nov 3 16:24:32 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 3 Nov 2025 16:24:32 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: References: Message-ID: On Thu, 30 Oct 2025 15:54:18 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. 
Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. 
The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment and assert msg A truly useful enhancement! Just had a few questions / suggestions. src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 762: > 760: load_const_optimized(bad, 0xbad0101babe00000); > 761: for (uint32_t i = 1; i < (sizeof(regs) / sizeof(Register)); i++) { > 762: addi(regs[i], regs[0], regs[i]->encoding()); Guess it's a question for @reinrich, but why set up `bad = regs[0]` and then still use `regs[0]` instead of `bad`? I think using `bad` would make the code easier to understand than using `regs[0]`. Suggestion: addi(regs[i], bad, regs[i]->encoding()); src/hotspot/share/interpreter/linkResolver.cpp line 1689: > 1687: EXCEPTION_MARK; > 1688: CallInfo info; > 1689: resolve_static_call(info, link_info, ClassInitMode::dont_init, THREAD); Couldn't you just do `CHECK_AND_CLEAR_NULL` and skip the following `if (HAS_PENDING_EXCEPTION)` statement? Suggestion: resolve_static_call(info, link_info, ClassInitMode::dont_init, CHECK_AND_CLEAR_NULL); I see the same in functions both above and below this one, is there any reason for this? ------------- Marked as reviewed by fbredberg (Reviewer). 
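For readers unfamiliar with the pattern discussed in the `linkResolver.cpp` comment above: as I understand it, a `CHECK_AND_CLEAR_*` style macro is passed in the thread argument position, closes the call, and then tests and clears the pending exception. The sketch below is only a toy model of that idea. `ToyThread`, `TOY_CHECK_AND_CLEAR_NULL`, `toy_resolve` and both lookup functions are hypothetical names invented for illustration, not HotSpot code.

```cpp
#include <cassert>

// Toy stand-in for a thread that can carry a pending exception (hypothetical).
struct ToyThread {
  bool pending = false;
};

// Toy macro modelled on the "check and clear" idea: it supplies the thread
// argument, closes the call, then clears a pending exception and returns null.
// It assumes a variable named `thread` is in scope at the call site.
#define TOY_CHECK_AND_CLEAR_NULL \
  thread); if (thread->pending) { thread->pending = false; return nullptr; } (void)(0

static int g_resolved = 42;

// Hypothetical callee: on failure it leaves an exception pending.
int* toy_resolve(bool fail, ToyThread* thread) {
  assert(thread != nullptr);
  if (fail) {
    thread->pending = true;
    return nullptr;
  }
  return &g_resolved;
}

// Explicit form: call, then test and clear the pending exception by hand.
int* lookup_with_manual_check(bool fail, ToyThread* thread) {
  int* result = toy_resolve(fail, thread);
  if (thread->pending) {
    thread->pending = false;
    return nullptr;
  }
  return result;
}

// Macro form: the same control flow, folded into the call's last argument.
int* lookup_with_macro(bool fail, ToyThread* thread) {
  int* result = toy_resolve(fail, TOY_CHECK_AND_CLEAR_NULL);
  return result;
}
```

Under these assumptions both functions return the same pointer on success and return null with no exception left pending on failure, which is why the shorter form reads as a pure cleanup.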
PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3411212227 PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487050754 PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2486618754 From rrich at openjdk.org Mon Nov 3 18:33:52 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 3 Nov 2025 18:33:52 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 16:11:38 GMT, Fredrik Bredberg wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment and assert msg > > src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 762: > >> 760: load_const_optimized(bad, 0xbad0101babe00000); >> 761: for (uint32_t i = 1; i < (sizeof(regs) / sizeof(Register)); i++) { >> 762: addi(regs[i], regs[0], regs[i]->encoding()); > > Guess it's a question for @reinrich, but why set up `bad = regs[0]` and then still use `regs[0]` instead of `bad`? > I think using `bad` would make the code easier to understand than using `regs[0]`. > Suggestion: > > addi(regs[i], bad, regs[i]->encoding()); Thanks for looking at the ppc part @fbredber Your suggestion is good. I think the loop should be reversed too. Then the addi after it can be removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487475002 From rrich at openjdk.org Mon Nov 3 18:38:06 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 3 Nov 2025 18:38:06 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: References: Message-ID: On Thu, 30 Oct 2025 15:54:18 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. 
Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. 
The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment and assert msg src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 764: > 762: addi(regs[i], regs[0], regs[i]->encoding()); > 763: } > 764: addi(regs[0], regs[0], regs[0]->encoding()); Based on @fbredber's suggestion: Suggestion: load_const_optimized(bad, 0xbad0101babe00000); for (int i = (sizeof(regs) / sizeof(Register)) - 1; i >= 0; i--) { addi(regs[i], bad, regs[i]->encoding()); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487483352 From pchilanomate at openjdk.org Mon Nov 3 18:54:15 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 3 Nov 2025 18:54:15 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: References: Message-ID: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com> On Mon, 3 Nov 2025 16:22:13 GMT, Fredrik Bredberg wrote: > A truly useful enhancement! Just had a few questions / suggestions. > Thanks for the review Fred! > src/hotspot/share/interpreter/linkResolver.cpp line 1689: > >> 1687: EXCEPTION_MARK; >> 1688: CallInfo info; >> 1689: resolve_static_call(info, link_info, ClassInitMode::dont_init, THREAD); > > Couldn't you just do `CHECK_AND_CLEAR_NULL` and skip the following `if (HAS_PENDING_EXCEPTION)` statement? 
> > Suggestion: > > resolve_static_call(info, link_info, ClassInitMode::dont_init, CHECK_AND_CLEAR_NULL); > > I see the same in functions both above and below this one, is there any reason for this? Yes, I agree. I see there are a couple of instances of this pattern in this file as you point out, so if you are okay I'd prefer to file a separate bug to clean them all up together. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3482021139 PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487524814 From pchilanomate at openjdk.org Mon Nov 3 19:03:07 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 3 Nov 2025 19:03:07 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. > > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes.
In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. > > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... 
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Suggested fix in macroAssembler_ppc.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27802/files - new: https://git.openjdk.org/jdk/pull/27802/files/ffcd92a6..4dff05a8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=11-12 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802 PR: https://git.openjdk.org/jdk/pull/27802 From pchilanomate at openjdk.org Mon Nov 3 19:03:08 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 3 Nov 2025 19:03:08 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 18:34:59 GMT, Richard Reingruber wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment and assert msg > > src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 764: > >> 762: addi(regs[i], regs[0], regs[i]->encoding()); >> 763: } >> 764: addi(regs[0], regs[0], regs[0]->encoding()); > > Based on @fbredber's suggestion: > Suggestion: > > load_const_optimized(bad, 0xbad0101babe00000); > for (int i = (sizeof(regs) / sizeof(Register)) - 1; i >= 0; i--) { > addi(regs[i], bad, regs[i]->encoding()); > } Done. 
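The reason the reversed loop makes the trailing `addi` unnecessary can be seen in a small model. This is a sketch under stated assumptions, not the PPC `MacroAssembler`: registers are modelled as slots of an array, `bad` aliases slot 0 as in the review comment, and the register count and the values in `kEncodings` are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical register file: slot k stands for regs[k], and kEncodings[k]
// stands for regs[k]->encoding(). The concrete numbers are assumptions.
constexpr std::size_t kNumRegs = 4;
using RegFile = std::array<std::uint64_t, kNumRegs>;
constexpr std::array<std::uint64_t, kNumRegs> kEncodings = {3, 4, 5, 6};
constexpr std::uint64_t kBadPattern = 0xbad0101babe00000ULL;

// Original shape: forward loop over regs[1..], then a separate add for regs[0].
RegFile fill_forward_with_trailing_add() {
  RegFile regs{};
  std::uint64_t& bad = regs[0];      // `bad` is the same register as regs[0]
  bad = kBadPattern;                 // models load_const_optimized(bad, ...)
  for (std::size_t i = 1; i < kNumRegs; i++) {
    regs[i] = bad + kEncodings[i];   // models addi(regs[i], bad, encoding)
  }
  regs[0] = bad + kEncodings[0];     // the extra addi after the loop
  return regs;
}

// Suggested shape: iterate downwards so regs[0] (== bad) is overwritten last.
RegFile fill_reversed() {
  RegFile regs{};
  std::uint64_t& bad = regs[0];
  bad = kBadPattern;
  for (int i = static_cast<int>(kNumRegs) - 1; i >= 0; i--) {
    const std::size_t k = static_cast<std::size_t>(i);
    regs[k] = bad + kEncodings[k];
  }
  assert(regs[0] == kBadPattern + kEncodings[0]);
  return regs;
}

// For contrast: a forward loop starting at 0 clobbers `bad` on the first
// iteration, so every later register is built from the wrong base value.
RegFile fill_forward_from_zero() {
  RegFile regs{};
  std::uint64_t& bad = regs[0];
  bad = kBadPattern;
  for (std::size_t i = 0; i < kNumRegs; i++) {
    regs[i] = bad + kEncodings[i];
  }
  return regs;
}
```

In this model the first two functions produce identical register contents, while the third does not, which is the ordering constraint the reversed loop encodes.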
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487538760 From epeter at openjdk.org Tue Nov 4 07:38:44 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Nov 2025 07:38:44 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v17] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Mon, 3 Nov 2025 15:43:50 GMT, Roland Westrelin wrote: >> An `Initialize` node for an `Allocate` node is created with a memory >> `Proj` of adr type raw memory. In order for stores to be captured, the >> memory state out of the allocation is a `MergeMem` with slices for the >> various object fields/array element set to the raw memory `Proj` of >> the `Initialize` node. If `Phi`s need to be created during later >> transformations from this memory state, The `Phi` for a particular >> slice gets its adr type from the type of the `Proj` which is raw >> memory. If during macro expansion, the `Allocate` is found to have no >> use and so can be removed, the `Proj` out of the `Initialize` is >> replaced by the memory state on input to the `Allocate`. A `Phi` for >> some slice for a field of an object will end up with the raw memory >> state on input to the `Allocate` node. As a result, memory state at >> the `Phi` is incorrect and incorrect execution can happen. >> >> The fix I propose is, rather than have a single `Proj` for the memory >> state out of the `Initialize` with adr type raw memory, to use one >> `Proj` per slice added to the memory state after the `Initalize`. Each >> of the `Proj` should return the right adr type for its slice. For that >> I propose having a new type of `Proj`: `NarrowMemProj` that captures >> the right adr type. 
>> >> Logic for the construction of the `Allocate`/`Initialize` subgraph is >> tweaked so the right adr type captured in is own `NarrowMemProj` is >> added to the memory sugraph. Code that removes an allocation or moves >> it also has to be changed so it correctly takes the multiple memory >> projections out of the `Initialize` node into account. >> >> One tricky issue is that when EA split types for a scalar replaceable >> `Allocate` node: >> >> 1- the adr type captured in the `NarrowMemProj` becomes out of sync >> with the type of the slices for the allocation >> >> 2- before EA, the memory state for one particular field out of the >> `Initialize` node can be used for a `Store` to the just allocated >> object or some other. So we can have a chain of `Store`s, some to >> the newly allocated object, some to some other objects, all of them >> using the state of `NarrowMemProj` out of the `Initialize`. After >> split unique types, the `NarrowMemProj` is for the slice of a >> particular allocation. So `Store`s to some other objects shouldn't >> use that memory state but the memory state before the `Allocate`. >> >> For that, I added logic to update the adr type of `NarrowMemProj` >> during split uni... > > Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: > > review @rwestrel Thanks a lot for the updates! It now looks much better to me. I'll run internal testing again before approval :) src/hotspot/share/opto/memnode.hpp line 1418: > 1416: } > 1417: return res->as_NarrowMemProj(); > 1418: } Nit, optional: Could we not remove some fluff here with a `isa_NarrowMemProj`? You would not have to check for `res == nullptr`. src/hotspot/share/opto/memnode.hpp line 1422: > 1420: public: > 1421: > 1422: template void for_each_narrow_mem_proj_with_new_uses(Callback callback) const { Can you add a code comment what the "with new uses" part means? 
Probably that if we add more uses during iteration, we will eventually also visit those? ------------- PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3414372906 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2488962166 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2488966664 From epeter at openjdk.org Tue Nov 4 07:38:48 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Nov 2025 07:38:48 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v15] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Fri, 24 Oct 2025 13:18:46 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision: >> >> - review >> - Roberto's patches > > src/hotspot/share/opto/multnode.hpp line 215: > >> 213: } >> 214: public: >> 215: NarrowMemProjNode(Node* src, const TypePtr* adr_type) > > Can you feed it any other `src` than a `InitializeNode*`? > Suggestion: > > NarrowMemProjNode(InitializeNode* src, const TypePtr* adr_type) @rwestrel Do you not like this suggestion? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2488969574 From duke at openjdk.org Tue Nov 4 08:34:34 2025 From: duke at openjdk.org (walkertest) Date: Tue, 4 Nov 2025 08:34:34 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12] In-Reply-To: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com> References: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com> Message-ID: On Mon, 3 Nov 2025 18:51:52 GMT, Patricio Chilano Mateo wrote: > > A truly useful enhancement! Just had a few questions / suggestions. 
> > Thanks for the review Fred! Hello, I have run into a similar problem, described here: [https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env](https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env) I want to know whether this is the same issue as https://bugs.openjdk.org/browse/JDK-8369238 or not. Is there a temporary workaround for this problem? ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3484149341 From roland at openjdk.org Tue Nov 4 08:35:52 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 08:35:52 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v17] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Tue, 4 Nov 2025 07:17:02 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/opto/memnode.hpp line 1418: > >> 1416: } >> 1417: return res->as_NarrowMemProj(); >> 1418: } > > Nit, optional: Could we not remove some fluff here with a `isa_NarrowMemProj`? You would not have to check for `res == nullptr`. I don't think we can. Wouldn't we have to call a method on a possibly `null` res?
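The point about the null check can be illustrated with a toy model. The classes below are hypothetical stand-ins, not the HotSpot `Node` hierarchy, and the elided code before line 1416 is not reconstructed here. The sketch only assumes that `as_X` is an asserting cast and `isa_X` is a cast that returns null on a type mismatch, both implemented as member functions.

```cpp
#include <cassert>

struct NarrowMemProj;

// Toy node type (an assumption for illustration, not the real class).
struct Node {
  virtual ~Node() = default;
  virtual bool is_NarrowMemProj() const { return false; }
  NarrowMemProj* as_NarrowMemProj();   // asserting cast
  NarrowMemProj* isa_NarrowMemProj();  // cast that yields nullptr on mismatch
};

struct NarrowMemProj : Node {
  bool is_NarrowMemProj() const override { return true; }
};

NarrowMemProj* Node::as_NarrowMemProj() {
  assert(is_NarrowMemProj());
  return static_cast<NarrowMemProj*>(this);
}

NarrowMemProj* Node::isa_NarrowMemProj() {
  return is_NarrowMemProj() ? static_cast<NarrowMemProj*>(this) : nullptr;
}

// Hypothetical helper: because isa_NarrowMemProj() is a member function, it
// needs a valid object to be called on. The null test therefore cannot be
// dropped; isa_ only replaces the type test, not the null test.
NarrowMemProj* narrow_or_null(Node* res) {
  if (res == nullptr) {
    return nullptr;
  }
  return res->isa_NarrowMemProj();
}
```

Calling either cast through a null `res` would be undefined behaviour in this model, which matches the objection raised in the reply above.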
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2489235039 From roland at openjdk.org Tue Nov 4 08:47:47 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 08:47:47 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v18] In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: > An `Initialize` node for an `Allocate` node is created with a memory > `Proj` of adr type raw memory. In order for stores to be captured, the > memory state out of the allocation is a `MergeMem` with slices for the > various object fields/array element set to the raw memory `Proj` of > the `Initialize` node. If `Phi`s need to be created during later > transformations from this memory state, the `Phi` for a particular > slice gets its adr type from the type of the `Proj` which is raw > memory. If during macro expansion, the `Allocate` is found to have no > use and so can be removed, the `Proj` out of the `Initialize` is > replaced by the memory state on input to the `Allocate`. A `Phi` for > some slice for a field of an object will end up with the raw memory > state on input to the `Allocate` node. As a result, memory state at > the `Phi` is incorrect and incorrect execution can happen. > > The fix I propose is, rather than have a single `Proj` for the memory > state out of the `Initialize` with adr type raw memory, to use one > `Proj` per slice added to the memory state after the `Initialize`. Each > of the `Proj` should return the right adr type for its slice. For that > I propose having a new type of `Proj`: `NarrowMemProj` that captures > the right adr type.
> > Logic for the construction of the `Allocate`/`Initialize` subgraph is > tweaked so the right adr type captured in its own `NarrowMemProj` is > added to the memory subgraph. Code that removes an allocation or moves > it also has to be changed so it correctly takes the multiple memory > projections out of the `Initialize` node into account. > > One tricky issue is that when EA split types for a scalar replaceable > `Allocate` node: > > 1- the adr type captured in the `NarrowMemProj` becomes out of sync > with the type of the slices for the allocation > > 2- before EA, the memory state for one particular field out of the > `Initialize` node can be used for a `Store` to the just allocated > object or some other. So we can have a chain of `Store`s, some to > the newly allocated object, some to some other objects, all of them > using the state of `NarrowMemProj` out of the `Initialize`. After > split unique types, the `NarrowMemProj` is for the slice of a > particular allocation. So `Store`s to some other objects shouldn't > use that memory state but the memory state before the `Allocate`. > > For that, I added logic to update the adr type of `NarrowMemProj` > during split unique types and update the memory input of `Store`s that > don't depend on the memory state ...
Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24570/files - new: https://git.openjdk.org/jdk/pull/24570/files/755bb766..7da2da1e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=16-17 Stats: 10 lines in 3 files changed: 6 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570 PR: https://git.openjdk.org/jdk/pull/24570 From roland at openjdk.org Tue Nov 4 08:47:50 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 08:47:50 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v17] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Tue, 4 Nov 2025 07:19:17 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/opto/memnode.hpp line 1422: > >> 1420: public: >> 1421: >> 1422: template void for_each_narrow_mem_proj_with_new_uses(Callback callback) const { > > Can you add a code comment what the "with new uses" part means? Probably that if we add more uses during iteration, we will eventually also visit those? Done in new commit. 
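The thread does not quote the comment that was added, so the meaning of "with new uses" is not confirmed here. One possible reading, the one guessed at in the review question, is an iteration that also reaches elements appended while it runs. The sketch below shows that general technique on a plain `std::vector`. `for_each_with_new_items` and `demo_visit_order` are hypothetical names, and nothing here is taken from `memnode.hpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visits every element of `items`, including elements the callback appends
// during the walk. The loop re-reads size() on each step and indexes by
// position, because push_back may reallocate and invalidate iterators.
template <class Callback>
void for_each_with_new_items(std::vector<int>& items, Callback callback) {
  for (std::size_t i = 0; i < items.size(); i++) {
    const int current = items[i];  // copy first: the callback may reallocate
    callback(current);
  }
}

// Small demonstration: each value below 10 adds a new item ten times larger.
std::vector<int> demo_visit_order() {
  std::vector<int> items = {1, 2, 3};
  std::vector<int> visited;
  for_each_with_new_items(items, [&](int value) {
    visited.push_back(value);
    if (value < 10) {
      items.push_back(value * 10);
    }
  });
  // Every item, including the appended ones, was visited exactly once.
  assert(visited.size() == items.size());
  return visited;
}
```

A range-based `for` over the same vector would not be safe in this situation, which is one reason an iteration helper with an explicit name and comment can be worth having.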
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2489256617 From roland at openjdk.org Tue Nov 4 08:47:52 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 08:47:52 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v15] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: <2LRgq5mLyp30XxD4td3sgnXN30VNkh06xRofZBV68i8=.1435d91d-cbaa-4a8e-ad77-ae4ff0506037@github.com> On Tue, 4 Nov 2025 07:20:49 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/multnode.hpp line 215: >> >>> 213: } >>> 214: public: >>> 215: NarrowMemProjNode(Node* src, const TypePtr* adr_type) >> >> Can you feed it any other `src` than a `InitializeNode*`? >> Suggestion: >> >> NarrowMemProjNode(InitializeNode* src, const TypePtr* adr_type) > > @rwestrel Do you not like this suggestion? I thought I took care of that one but obviously not. Done now. It requires moving the definition in the cpp file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2489266971 From epeter at openjdk.org Tue Nov 4 08:47:49 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Nov 2025 08:47:49 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v17] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: <7_ThGEBEjmk6FUxRgLbkuKe3HHDcG20VKUBfQtqPMZI=.5876df6c-e036-4525-bd2c-0257086c12e4@github.com> On Tue, 4 Nov 2025 08:32:25 GMT, Roland Westrelin wrote: >> src/hotspot/share/opto/memnode.hpp line 1418: >> >>> 1416: } >>> 1417: return res->as_NarrowMemProj(); >>> 1418: } >> >> Nit, optional: Could we not remove some fluff here with a `isa_NarrowMemProj`? 
You would not have to check for `res == nullptr`. > > I don't think we can. Wouldn't we have to call a method on a possibly `null` res? You are right. Ignore. My post-jogging brain saw things that were not there. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2489261245 From epeter at openjdk.org Tue Nov 4 11:21:01 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Nov 2025 11:21:01 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v18] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Tue, 4 Nov 2025 08:47:47 GMT, Roland Westrelin wrote: >> An `Initialize` node for an `Allocate` node is created with a memory >> `Proj` of adr type raw memory. In order for stores to be captured, the >> memory state out of the allocation is a `MergeMem` with slices for the >> various object fields/array element set to the raw memory `Proj` of >> the `Initialize` node. If `Phi`s need to be created during later >> transformations from this memory state, the `Phi` for a particular >> slice gets its adr type from the type of the `Proj` which is raw >> memory. If during macro expansion, the `Allocate` is found to have no >> use and so can be removed, the `Proj` out of the `Initialize` is >> replaced by the memory state on input to the `Allocate`. A `Phi` for >> some slice for a field of an object will end up with the raw memory >> state on input to the `Allocate` node. As a result, memory state at >> the `Phi` is incorrect and incorrect execution can happen. >> >> The fix I propose is, rather than have a single `Proj` for the memory >> state out of the `Initialize` with adr type raw memory, to use one >> `Proj` per slice added to the memory state after the `Initialize`. Each >> of the `Proj` should return the right adr type for its slice.
> > Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: > > review Nice, thanks for the updates @rwestrel! And thanks for working on this issue, it was a tricky one :) ------------- Marked as reviewed by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3415820558 From roland at openjdk.org Tue Nov 4 11:21:02 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 11:21:02 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com> <4374L3lkQK90wLxxOA7POBmIKNX2DFK-4pO4vj1bkuQ=.5b8d7825-a7f1-497f-ab66-02a85a266659@github.com> <4UN1z9fhxeUqUGagnZVEIFOyDb_mP8WaWUBwWO2HjFA=.93b7c9ad-443c-4fff-810d-7fe805ccbfaa@github.com> Message-ID: <3eK8oOLGNpXu9jhHYACrh_FmWErkLN6IbWiHxSGYeqc=.82d0dc91-970a-4c24-8742-db4e2656a418@github.com> On Tue, 21 Oct 2025 13:49:12 GMT, Roberto Castañeda Lozano wrote: >>> Hi @rwestrel, could you please have a look at the merge conflicts of this PR so that we can progress further with the review of this work? >> >> The conflict is caused by the integration of [JDK-8360031](https://bugs.openjdk.org/browse/JDK-8360031), which relaxes the assertion in https://github.com/openjdk/jdk/blob/430041d366ddf450c2480c81608dde980dfa6d41/src/hotspot/share/opto/memnode.cpp#L4232 which is also touched by this changeset. Is the current assertion in mainline (after JDK-8360031) still valid in the context of this changeset? > >> Is the current assertion in mainline (after JDK-8360031) still valid in the context of this changeset? > > I did a bit of testing and updating the asserted invariant to `(outcnt() > 0 && outcnt() <= 2) || Opcode() == Op_Initialize` seems to work.
@robcasloz @eme64 thanks for the reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-3485415006 From roland at openjdk.org Tue Nov 4 11:21:04 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 4 Nov 2025 11:21:04 GMT Subject: Integrated: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: On Thu, 10 Apr 2025 11:39:36 GMT, Roland Westrelin wrote: > An `Initialize` node for an `Allocate` node is created with a memory > `Proj` of adr type raw memory. In order for stores to be captured, the > memory state out of the allocation is a `MergeMem` with slices for the > various object fields/array element set to the raw memory `Proj` of > the `Initialize` node. If `Phi`s need to be created during later > transformations from this memory state, the `Phi` for a particular > slice gets its adr type from the type of the `Proj` which is raw > memory. If during macro expansion, the `Allocate` is found to have no > use and so can be removed, the `Proj` out of the `Initialize` is > replaced by the memory state on input to the `Allocate`. A `Phi` for > some slice for a field of an object will end up with the raw memory > state on input to the `Allocate` node. As a result, memory state at > the `Phi` is incorrect and incorrect execution can happen. > > The fix I propose is, rather than have a single `Proj` for the memory > state out of the `Initialize` with adr type raw memory, to use one > `Proj` per slice added to the memory state after the `Initialize`. Each > of the `Proj` should return the right adr type for its slice. For that > I propose having a new type of `Proj`: `NarrowMemProj` that captures > the right adr type.
This pull request has now been integrated.
Changeset: e6546683 Author: Roland Westrelin URL: https://git.openjdk.org/jdk/commit/e6546683a8dd9a64255ce4c5606089068ec92e5d Stats: 922 lines in 24 files changed: 831 ins; 25 del; 66 mod 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed Co-authored-by: Emanuel Peter Co-authored-by: Roberto Castañeda Lozano Reviewed-by: epeter, rcastanedalo ------------- PR: https://git.openjdk.org/jdk/pull/24570 From pchilanomate at openjdk.org Tue Nov 4 15:47:31 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:47:31 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. > > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes.
In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. > > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... 
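The control flow in the summary above (mark the thread with a pending preemption, unwind by early returns, clear at the VM entry point, then redo the original call) can be sketched in a few lines. This is a simplified illustration under loud assumptions: `SketchThread`, `initialize_class`, `resolve`, `vm_entry`, and `attempts_until_done` are hypothetical names, and the explicit `if (...) return;` checks only imitate what the `TRAPS/CHECK` macros do in HotSpot.

```cpp
#include <cassert>

// Hypothetical miniature of a thread with a "pending exception" slot.
struct SketchThread {
  bool pending_preempt = false;  // stands in for the pre-allocated exception
  bool class_ready     = false;  // true once the other thread finished initialization
  int  init_attempts   = 0;      // how often the innermost call was entered
};

// Innermost callee: cannot proceed while another thread initializes the class.
inline void initialize_class(SketchThread* t) {
  t->init_attempts++;
  if (!t->class_ready) {
    t->pending_preempt = true;   // "throw": only marks the thread
  }
}

// Intermediate frame: after each call, return early if something is pending.
inline void resolve(SketchThread* t) {
  initialize_class(t);
  if (t->pending_preempt) return;  // manual equivalent of a CHECK-style early return
  // ... the rest of the resolution would run here ...
}

// Entry point: returns false when the caller should unmount and redo the call later.
inline bool vm_entry(SketchThread* t) {
  resolve(t);
  if (t->pending_preempt) {
    t->pending_preempt = false;    // cleared before returning to the caller
    return false;
  }
  return true;
}

// First attempt is preempted; after the class becomes ready the call is redone.
inline int attempts_until_done() {
  SketchThread t;
  bool done = vm_entry(&t);
  t.class_ready = true;            // initializer finished while we were "unmounted"
  if (!done) done = vm_entry(&t);  // redo the original call
  assert(done);
  return t.init_attempts;
}
```

In this sketch the innermost call runs twice: once for the preempted attempt and once for the redo. The design point it tries to show is that no frame between the blocking site and the entry point needs to know about preemption beyond propagating the pending state.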
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: fix to JvmtiHideSingleStepping ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27802/files - new: https://git.openjdk.org/jdk/pull/27802/files/4dff05a8..55c89ad0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=12-13 Stats: 182 lines in 4 files changed: 164 ins; 5 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/27802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802 PR: https://git.openjdk.org/jdk/pull/27802 From pchilanomate at openjdk.org Tue Nov 4 15:47:33 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:47:33 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: <1m9Kbr2qq1hRl4Sc4YG39hxJ0WFS5aAx-BDmiAaZ_Xk=.d9ba0f21-f06d-46b3-8b08-cf5cc3520906@github.com> On Mon, 3 Nov 2025 19:03:07 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Suggested fix in macroAssembler_ppc.cpp > Hello, I have met a similar question to: https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env > > I want to know if this question is the same as https://bugs.openjdk.org/browse/JDK-8369238 or not. How to temporarily solve this problem? > In the stacktrace posted, virtual thread #157 is pinned not because of the `static synchronized` but because there are native frames in the stack due to initializing class `CoyoteOutputStream`. This PR doesn't address that pinning case. It addresses the case of virtual threads pinned waiting for a class to be initialized by another thread. Feel free to send any related questions to the loom-dev mailing list instead, it's the best place to discuss this. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3486642618 From pchilanomate at openjdk.org Tue Nov 4 15:48:00 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:48:00 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 19:03:07 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized.
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Suggested fix in macroAssembler_ppc.cpp I pushed a small fix to `JvmtiHideSingleStepping` for an issue found during pre-integration testing where we hit this assert: https://github.com/openjdk/jdk/blob/50bb92a33b32778a96b1823ff995889892bef890/src/hotspot/share/prims/jvmtiThreadState.hpp#L337 The problem is that for a preempting vthread, the `JvmtiThreadState` of the current `JavaThread` has already been rebound to the state of the carrier when executing `~JvmtiHideSingleStepping`. The fix is to remember the `JvmtiThreadState` used originally in the `JvmtiHideSingleStepping` constructor. The commit includes a new test that reproduces the issue. @sspitsyn maybe you could take a look at this please? ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3486666711 From sspitsyn at openjdk.org Tue Nov 4 20:45:37 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:45:37 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: <0Z0x2IAQ1TayXTP7kAm9U3Yyx6A4rw88m7Kjgen6bAY=.15663b42-96db-444f-9e2d-2efcbe4dd94d@github.com> On Tue, 4 Nov 2025 15:45:19 GMT, Patricio Chilano Mateo wrote: > @sspitsyn maybe you could take a look at this please? It looks good in general. I'm still looking at it.
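The `JvmtiHideSingleStepping` fix described in this thread is an instance of a general scoped-guard rule: undo on the object captured in the constructor, not on whatever a lookup returns at destruction time. The sketch below is not the HotSpot code; `ThreadState`, `CurrentThread`, `hide_count`, and both guard classes are hypothetical stand-ins chosen only to show why re-reading the state in the destructor goes wrong once the thread's state pointer has been rebound.

```cpp
#include <cassert>

// Hypothetical stand-in for per-thread JVMTI state.
struct ThreadState { int hide_count = 0; };

// Hypothetical stand-in for the current thread; `state` may be rebound,
// for example from a virtual thread's state to its carrier's state.
struct CurrentThread { ThreadState* state; };

// Problematic variant: the destructor looks the state up again via the thread.
class HideSteppingLookup {
  CurrentThread* _thread;
 public:
  explicit HideSteppingLookup(CurrentThread* t) : _thread(t) { _thread->state->hide_count++; }
  ~HideSteppingLookup() { _thread->state->hide_count--; }
};

// Fixed variant: remember the state that the constructor actually modified.
class HideSteppingRemembered {
  ThreadState* _state;
 public:
  explicit HideSteppingRemembered(CurrentThread* t) : _state(t->state) { _state->hide_count++; }
  ~HideSteppingRemembered() { _state->hide_count--; }
};

// Rebinds the thread's state while the guard is alive, then reports the
// virtual thread's counter after the guard has been destroyed.
template <class Guard>
int vthread_count_after_rebind() {
  ThreadState vthread_state, carrier_state;
  CurrentThread thread{&vthread_state};
  {
    Guard guard(&thread);
    thread.state = &carrier_state;  // rebinding happens during preemption
  }
  return vthread_state.hide_count;
}
```

With the lookup variant the increment lands on one state and the decrement on another, so the virtual thread's counter stays unbalanced; with the remembered variant it returns to zero.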
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3487943741 From sspitsyn at openjdk.org Tue Nov 4 20:49:19 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:49:19 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 15:47:31 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. 
This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > fix to JvmtiHideSingleStepping test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: > 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, > 37: jmethodID method, jlocation location) { > 38: } Q: Would it make sense to verify that `SingleStep` events are posted or not? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2491981248 From sspitsyn at openjdk.org Tue Nov 4 21:12:51 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 21:12:51 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 15:47:31 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. 
Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. 
The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > fix to JvmtiHideSingleStepping I've reviewed the latest incremental SVC related update. It is good and nice to have in general. Thank you for adding the test! ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3418644332 From pchilanomate at openjdk.org Tue Nov 4 21:28:14 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 21:28:14 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 22 commits: - Add check for SingleStep events - Merge branch 'master' into JDK-8369238 - fix to JvmtiHideSingleStepping - Suggested fix in macroAssembler_ppc.cpp - Improve comment and assert msg - More fixes from David's comments - Merge branch 'master' into JDK-8369238 - add const to references - Improve comment in anchor_mark_set_pd - More comments from Coleen - ... and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 ------------- Changes: https://git.openjdk.org/jdk/pull/27802/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=14 Stats: 2373 lines in 102 files changed: 1928 ins; 125 del; 320 mod Patch: https://git.openjdk.org/jdk/pull/27802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802 PR: https://git.openjdk.org/jdk/pull/27802 From pchilanomate at openjdk.org Tue Nov 4 21:28:15 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 21:28:15 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:10:03 GMT, Serguei Spitsyn wrote: > I've reviewed the latest incremental SVC related update. It is good and nice to have in general. Thank you for adding the test! > Great, thanks for the review Serguei! > test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: > >> 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, >> 37: jmethodID method, jlocation location) { >> 38: } > > Q: Would it make sense to verify that `SingleStep` events are posted or not? Good idea, I added a check for it. Let me know if that works. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3488063416 PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2492084640 From sspitsyn at openjdk.org Tue Nov 4 22:41:18 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:41:18 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:28:14 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. 
>> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Add check for SingleStep events > - Merge branch 'master' into JDK-8369238 > - fix to JvmtiHideSingleStepping > - Suggested fix in macroAssembler_ppc.cpp > - Improve comment and assert msg > - More fixes from David's comments > - Merge branch 'master' into JDK-8369238 > - add const to references > - Improve comment in anchor_mark_set_pd > - More comments from Coleen > - ... and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 Marked as reviewed by sspitsyn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3418852425 From sspitsyn at openjdk.org Tue Nov 4 22:41:20 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:41:20 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:23:06 GMT, Patricio Chilano Mateo wrote: >> test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: >> >>> 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, >>> 37: jmethodID method, jlocation location) { >>> 38: } >> >> Q: Would it make sense to verify that `SingleStep` events are posted or not? > > Good idea, I added a check for it. Let me know if that works. Thanks! It is good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2492228658 From pchilanomate at openjdk.org Tue Nov 4 23:35:55 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 23:35:55 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:28:14 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. 
Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 22 commits: > > - Add check for SingleStep events > - Merge branch 'master' into JDK-8369238 > - fix to JvmtiHideSingleStepping > - Suggested fix in macroAssembler_ppc.cpp > - Improve comment and assert msg > - More fixes from David's comments > - Merge branch 'master' into JDK-8369238 > - add const to references > - Improve comment in anchor_mark_set_pd > - More comments from Coleen > - ... and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 Thanks everyone for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3488372018 From pchilanomate at openjdk.org Tue Nov 4 23:35:58 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 23:35:58 GMT Subject: Integrated: 8369238: Allow virtual thread preemption on some common class initialization paths In-Reply-To: References: Message-ID: On Tue, 14 Oct 2025 13:42:18 GMT, Patricio Chilano Mateo wrote: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
> > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. > > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... This pull request has now been integrated. 
Changeset: c6a88155 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/c6a88155b519a5d0b22f6009e75a0e6388601756 Stats: 2373 lines in 102 files changed: 1928 ins; 125 del; 320 mod 8369238: Allow virtual thread preemption on some common class initialization paths Co-authored-by: Alan Bateman Co-authored-by: Fei Yang Co-authored-by: Richard Reingruber Reviewed-by: sspitsyn, dholmes, coleenp, fbredberg ------------- PR: https://git.openjdk.org/jdk/pull/27802 From duke at openjdk.org Wed Nov 5 05:08:57 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:08:57 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains nine additional commits since the last revision: - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - Fix test failed - 8344116: C2: remove slice parameter from LoadNode::make ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/ea83736e..6d122039 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=06-07 Stats: 526337 lines in 7522 files changed: 349612 ins; 122587 del; 54138 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Wed Nov 5 05:09:05 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:09:05 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v6] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 13:04:12 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > src/hotspot/share/gc/shared/c2/barrierSetC2.cpp line 223: > >> 221: MergeMemNode* mm = opt_access.mem(); >> 222: PhaseGVN& gvn = opt_access.gvn(); >> 223: Node* mem = mm->memory_at(gvn.C->get_alias_index(access.addr().type())); > > Can we get rid of all uses of `access.addr().type()`? 
Done, I got rid of all uses of `access.addr().type()`. > src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp line 105: > >> 103: // stores. In theory we could relax the load from ctrl() to >> 104: // no_ctrl, but that doesn't buy much latitude. >> 105: Node* card_val = __ load( __ ctrl(), card_adr, TypeInt::BYTE, T_BYTE); > > We could assert that `C->get_alias_index(kit->type(card_adr)) == Compile::AliasIdxRaw`, that is, that the computed slice is the same as the hardcoded slice. Similar asserts could be added for every location where a slice/address type is removed in this patch. Sure, I added more asserts for this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2484816831 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2492987998 From roland at openjdk.org Wed Nov 5 13:23:18 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 5 Nov 2025 13:23:18 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 05:08:57 GMT, Zihao Lin wrote: >> This patch remove slice parameter from LoadNode::make >> >> I have done more work which remove slice paramater from StoreNode::make. >> >> Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 >> >> Hi team, I am new, I'd appreciate any guidance. Thank a lot! > > Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains nine additional commits since the last revision: > > - fix assert > - add more assert > - rid of access.addr().type() > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Fix build > - Fix test failed > - 8344116: C2: remove slice parameter from LoadNode::make Can we remove `C2AccessValuePtr` entirely and use: `Node* _addr;` where, currently, there's: `C2AccessValuePtr& _addr;`? src/hotspot/share/opto/callnode.cpp line 1740: > 1738: Node* klass_node = in(AllocateNode::KlassNode); > 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); > 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); We could assert that `C->get_alias_index(kit->type(card_adr)) == Compile::AliasIdxRaw`. ------------- PR Review: https://git.openjdk.org/jdk/pull/24258#pullrequestreview-3421940817 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2494424924 From duke at openjdk.org Thu Nov 6 13:43:33 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 6 Nov 2025 13:43:33 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v9] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot!
Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: remove C2AccessValuePtr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/6d122039..e89910c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07-08 Stats: 58 lines in 8 files changed: 0 ins; 21 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Thu Nov 6 13:58:53 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 6 Nov 2025 13:58:53 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v10] In-Reply-To: References: Message-ID: <1zyQq98OPsZ-2nzYz21X_5v2RgKhWaZrZaJQevDMzo4=.138599b1-4797-42b0-a48a-829a112dfbe7@github.com> > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - ... 
and 2 more: https://git.openjdk.org/jdk/compare/c173d416...36e024db ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=09 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Tue Nov 11 06:12:28 2025 From: duke at openjdk.org (Zihao Lin) Date: Tue, 11 Nov 2025 06:12:28 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v11] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'openjdk:master' into 8344116 - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - ... 
and 3 more: https://git.openjdk.org/jdk/compare/76a1109d...42b17827 ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=10 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From duke at openjdk.org Tue Nov 11 17:38:35 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Tue, 11 Nov 2025 17:38:35 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC Message-ID: [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. --- #### 1. Test Bug It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`. After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. --- #### 2. Implementation Bug `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC.
This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. The fix ensures that all call sites are patched **before** the `nmethod` is registered. In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. ------------- Commit messages: - Clear inline caches before calling post_init - Fix relocations before registering nmethod - Add is_unloading() check before aquiring ic lock Changes: https://git.openjdk.org/jdk/pull/28241/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28241&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371046 Stats: 53 lines in 6 files changed: 28 ins; 21 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28241.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28241/head:pull/28241 PR: https://git.openjdk.org/jdk/pull/28241 From epeter at openjdk.org Wed Nov 12 08:33:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:29 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: On Thu, 2 Oct 2025 09:08:06 GMT, Roland Westrelin wrote: >> This is a variant of 8332827. In 8332827, an array access becomes >> dependent on a range check `CastII` for another array access. When, >> after loop opts are over, that RC `CastII` was removed, the array >> access could float and an out of bound access happened. With the fix >> for 8332827, RC `CastII`s are no longer removed. >> >> With this one what happens is that some transformations applied after >> loop opts are over widen the type of the RC `CastII`. As a result, the >> type of the RC `CastII` is no longer narrower than that of its input, >> the `CastII` is removed and the dependency is lost. 
>> >> There are 2 transformations that cause this to happen: >> >> - after loop opts are over, the types of the `CastII` nodes are widened >> so nodes that have the same inputs but a slightly different type can >> be commoned. >> >> - When pushing a `CastII` through an `Add`, if the types of both inputs >> of the `Add` are non-constant, then we end up widening the type >> (the resulting `Add` has a type that's wider than that of the >> initial `CastII`). >> >> There are already 3 types of `Cast` nodes depending on the >> optimizations that are allowed. Either the `Cast` is floating >> (`depends_only_on_test()` returns `true`) or pinned. Either the `Cast` >> can be removed if it no longer narrows the type of its input or >> not. We already have variants of the `CastII`: >> >> - if the Cast can float and be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can't be removed when it doesn't narrow >> the type of its input. >> >> What we need here, I think, is the 4th combination: >> >> - if the Cast can float and can't be removed when it doesn't narrow >> the type of its input. >> >> Anyway, things are becoming confusing with all these different >> variants named in ways that don't always help figure out what >> constraints each of them operates under. So I refactored this and that's >> the biggest part of this change. The fix consists in marking `Cast` >> nodes when their type is widened in a way that prevents them from being >> optimized out. >> >> Tobias ran performance testing with a slightly different version of >> this change and there was no regression. > > Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: > > - review > - infinite loop in gvn fix > - renaming @rwestrel Sorry I dropped the review on this one for a long time :/ I left quite a few comments.
But on the whole I'm really happy with the direction you are taking. It's getting much clearer. I would still like to see some clearer explanations/comments. That way, we can make our previously implicit assumptions even more explicit :) src/hotspot/share/opto/castnode.cpp line 47: > 45: Node* ConstraintCastNode::Identity(PhaseGVN* phase) { > 46: if (!_dependency.narrows_type()) { > 47: return this; Can you please add a code comment? I don't understand it right away :/ src/hotspot/share/opto/castnode.cpp line 153: > 151: if (!_dependency.narrows_type()) { > 152: return nullptr; > 153: } Interesting, we already check that at least at some of the use sites. If it turns out we already do it at all use sites, why not just assert? (maybe not possible or desirable, just an idea) A comment here would also be great. src/hotspot/share/opto/castnode.cpp line 277: > 275: > 276: CastIINode* CastIINode::pin_array_access_node() const { > 277: assert(depends_only_on_test(), "already pinned"); Would this not be more readable? Suggestion: assert(is_dependency_floating(), "already pinned"); src/hotspot/share/opto/castnode.cpp line 588: > 586: > 587: // If both inputs are not constant then, with the Cast pushed through the Add/Sub, the cast gets less precised types, > 588: // and the resulting Add/Sub's type is wider than that of the Cast before pushing. I find this long sentence a bit complicated to read. Can you reformulate and maybe break it into smaller sentences? It would also be good to explicitly say why that may require changing the dependency constraint. src/hotspot/share/opto/castnode.cpp line 615: > 613: // Widening the type of the Cast (to allow some commoning) causes the Cast to change how it can be optimized (if > 614: // type of its input is narrower than the Cast's type, we can't remove it to not loose the dependency).
> 615: return make_with(in(1), wide_t, _dependency.widen_type_dependency()); Suggestion: return make_with(in(1), wide_t, _dependency.with_non_narrowing()); This may be clearer here, since non-narrowing prevents folding the cast away if the input is narrower. I like the code comment you already have though :) src/hotspot/share/opto/castnode.cpp line 625: > 623: if (!phase->C->post_loop_opts_phase()) { > 624: return this_type; > 625: } Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. What do you think? src/hotspot/share/opto/castnode.hpp line 46: > 44: // 1- and 2- are not always applied depending on what constraint are applied to the Cast: there are cases where 1- > 45: // and 2- apply, where neither 1- nor 2- apply and where one or the other apply. This class abstract away these > 46: // details. Can you spell it out a little more? Right now it feels a little bit like an "exercise for the reader". For each optimization, what is required of the constraints? I think that would help the reader. Equally: you could name why those constraints are required in the first place. Or is there some other place we could link to that already has those explanations? src/hotspot/share/opto/castnode.hpp line 53: > 51: _narrows_type(narrows_type), > 52: _desc(desc) { > 53: } Could you make the constructor private, and only expose the 4 static fields? That way, nobody comes to the strange idea to construct one of these themselves ;) src/hotspot/share/opto/castnode.hpp line 62: > 60: bool narrows_type() const { > 61: return _narrows_type; > 62: } Nits about naming: I would prefer `is_` for boolean queries. Otherwise, if I look at the names `floating` and `pinned_dependency`, I don't immediately know which one converts to a floating/non-floating, and which one is a boolean query. Maybe `pinned_dependency` should be renamed to `with_pinned_dependency`. 
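The encapsulation and naming suggestions above (private constructor, only the four static instances exposed, fields declared first, an `is_` prefix for boolean queries and a `with_` prefix for conversions) could look roughly like the stand-alone sketch below. It is not the HotSpot code: only the four instance names come from the patch under review, while the class body, `is_floating`, `is_narrowing`, `desc`, `with_pinned_dependency`, `with_non_narrowing`, and the description strings are assumptions made up for illustration.

```cpp
// Hypothetical stand-alone model of the dependency descriptor; not HotSpot code.
class DependencyType {
 private:
  const bool _floating;      // May the Cast float, or is it pinned at its control input?
  const bool _narrows_type;  // May the Cast be removed once it no longer narrows its input type?
  const char* const _desc;   // Human-readable name (assumed wording).

  // Private constructor: only the four predefined instances below can exist.
  DependencyType(bool floating, bool narrows_type, const char* desc)
    : _floating(floating), _narrows_type(narrows_type), _desc(desc) {}

 public:
  // The four combinations named in the patch.
  static const DependencyType FloatingNarrowingDependency;
  static const DependencyType FloatingNonNarrowingDependency;
  static const DependencyType NonFloatingNarrowingDependency;
  static const DependencyType NonFloatingNonNarrowingDependency;

  // Copies are disallowed, so identity comparison by address is sufficient.
  DependencyType(const DependencyType&) = delete;
  DependencyType& operator=(const DependencyType&) = delete;

  // Boolean queries carry an is_ prefix (assumed names).
  bool is_floating() const { return _floating; }
  bool is_narrowing() const { return _narrows_type; }
  const char* desc() const { return _desc; }

  // Conversions carry a with_ prefix and return one of the four instances (assumed names).
  const DependencyType& with_pinned_dependency() const {
    return _narrows_type ? NonFloatingNarrowingDependency
                         : NonFloatingNonNarrowingDependency;
  }
  const DependencyType& with_non_narrowing() const {
    return _floating ? FloatingNonNarrowingDependency
                     : NonFloatingNonNarrowingDependency;
  }
};

const DependencyType DependencyType::FloatingNarrowingDependency(true, true, "floating narrowing");
const DependencyType DependencyType::FloatingNonNarrowingDependency(true, false, "floating non-narrowing");
const DependencyType DependencyType::NonFloatingNarrowingDependency(false, true, "non-floating narrowing");
const DependencyType DependencyType::NonFloatingNonNarrowingDependency(false, false, "non-floating non-narrowing");
```

In this model, `DependencyType::FloatingNarrowingDependency.with_non_narrowing()` refers to `FloatingNonNarrowingDependency`, which corresponds to the fourth combination the PR description asks for: a Cast that can still float but must not be removed when it stops narrowing the type of its input.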
src/hotspot/share/opto/castnode.hpp line 65: > 63: void dump_on(outputStream *st) const { > 64: st->print("%s", _desc); > 65: } Suggestion: bool narrows_type() const { return _narrows_type; } void dump_on(outputStream *st) const { st->print("%s", _desc); } Newline for consistency with surrounding code. src/hotspot/share/opto/castnode.hpp line 92: > 90: const bool _floating; // Does this Cast depends on its control input or is it pinned? > 91: const bool _narrows_type; // Does this Cast narrows the type i.e. if input type is narrower can it be removed? > 92: const char* _desc; I thought the hotspot convention was to usually put the fields first, at the top of the class? src/hotspot/share/opto/castnode.hpp line 104: > 102: // NonFloatingNarrowingDependency is used when an array access is no longer dependent on a single range check (range > 103: // check smearing for instance) > 104: // FloatingNonNarrowingDependency is used after loop opts when Cast nodes' types are widen so Casts that only differ Suggestion: // FloatingNonNarrowingDependency is used after loop opts when Cast nodes' types are widened so Casts that only differ src/hotspot/share/opto/castnode.hpp line 110: > 108: static const DependencyType FloatingNonNarrowingDependency; > 109: static const DependencyType NonFloatingNarrowingDependency; > 110: static const DependencyType NonFloatingNonNarrowingDependency; Why not put the example at each definition? Would prevent repeating the names :) It would be good if we could have this section earlier up, so the code comments of the `DependencyType` class and this form a unit. At least link them. `NonFloatingNonNarrowingDependency` example: can you spell out the why? What could go wrong otherwise? Would the node float back into the loop maybe? What's wrong with that? `NonFloatingNarrowingDependency` more detail would be helpful. I would like to know why non floating, and why narrowing? Because that's what these examples are for, right? 
`FloatingNonNarrowingDependency` ah, maybe that answers one of my questions further up somewhere. If we don't have narrowing, then we should not fold away the cast because of the type, right? I think if we spell out which optimizations require which constraints, that could help a lot here. src/hotspot/share/opto/castnode.hpp line 122: > 120: ShouldNotReachHere(); > 121: return nullptr; > 122: } This always smells like a messed up class hierarchy, when I see default methods with "not implemented". But maybe we can't do much better, and I've done similar things recently. A short code comment could be helpful though. Suggestion: virtual ConstraintCastNode* make_with(Node* parent, const TypeInteger* type, const DependencyType& dependency) const { ShouldNotReachHere(); // Only implemented for CastII and CastLL return nullptr; } src/hotspot/share/opto/castnode.hpp line 146: > 144: virtual uint ideal_reg() const = 0; > 145: bool carry_dependency() const { return !_dependency.cmp(FloatingNarrowingDependency); } > 146: virtual bool depends_only_on_test() const { return _dependency.floating(); } Why not rename it to `is_dependency_floating`? That may be more helpful at the use site. test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java line 95: > 93: j += Objects.checkIndex(i - 1, length); > 94: return j; > 95: } Why not add an additional IR rule that checks that there are more casts before they get commoned? Just for completeness ;) ------------- Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24575#pullrequestreview-3451986831 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517197209 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517271796 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517301300 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517315011 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517336133 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517344615 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517236142 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517203781 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517366170 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517205971 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517200829 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517251068 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517260839 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517355725 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517299467 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517370224 From epeter at openjdk.org Wed Nov 12 08:33:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:29 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> On Wed, 12 Nov 2025 07:24:01 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: >> >> - review >> - infinite loop in gvn fix >> - renaming > > 
src/hotspot/share/opto/castnode.cpp line 47: > >> 45: Node* ConstraintCastNode::Identity(PhaseGVN* phase) { >> 46: if (!_dependency.narrows_type()) { >> 47: return this; > > Can you please add a code comment? I don't understand it right away :/ Maybe I'm slowly starting to understand... but a code comment would still help a lot here. We are trying to find a dominating cast that has the same or narrower type, and replace with that one. We are only allowed to do that if we have a narrowing cast, because ... > src/hotspot/share/opto/castnode.cpp line 277: > >> 275: >> 276: CastIINode* CastIINode::pin_array_access_node() const { >> 277: assert(depends_only_on_test(), "already pinned"); > > Would this not be more readable? > > Suggestion: > > assert(is_dependency_floating(), "already pinned"); Because it seems we are talking about floating vs pinned here. Adding yet another concept of "depending only on test" would require further explanation / definition. > src/hotspot/share/opto/castnode.cpp line 588: > >> 586: >> 587: // If both inputs are not constant then, with the Cast pushed through the Add/Sub, the cast gets less precised types, >> 588: // and the resulting Add/Sub's type is wider than that of the Cast before pushing. > > I find this long sentence a bit complicated to read. Can you reformulate and maybe break it into smaller sentences? > It would also be good to explicitly say why that may require changing the dependency constraint. I wonder if you renamed `widen_type_dependency` to `with_non_narrowing`, and explained that this now prevents folding away the cast if input types are narrower, etc... that would maybe be more straight forward? I suppose your approach was to just "notify" the dependency that we have widened the type, and then the dependency manages what the implications are. But I find that approach a bit less straight forward, because we are not talking about widening the exact same cast, but a cast that has been pushed through an add/sub. 
Maybe you can manage to make a coherent argument though, up to you. > src/hotspot/share/opto/castnode.cpp line 625: > >> 623: if (!phase->C->post_loop_opts_phase()) { >> 624: return this_type; >> 625: } > > Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. What do you think? But maybe that is a refactoring for a separate RFE, and then not really worth it. > src/hotspot/share/opto/castnode.hpp line 53: > >> 51: _narrows_type(narrows_type), >> 52: _desc(desc) { >> 53: } > > Could you make the constructor private, and only expose the 4 static fields? That way, nobody comes to the strange idea to construct one of these themselves ;) That would probably require moving the 4 static fields into this class here. Example: `ConstraintCastNode::DependencyType::FloatingNarrowing` Just an idea. Maybe you have a different solution. But a private constructor would be great for sure. > src/hotspot/share/opto/castnode.hpp line 146: > >> 144: virtual uint ideal_reg() const = 0; >> 145: bool carry_dependency() const { return !_dependency.cmp(FloatingNarrowingDependency); } >> 146: virtual bool depends_only_on_test() const { return _dependency.floating(); } > > Why not rename it to `is_dependency_floating`? That may be more helpful at the use site. Otherwise you have to give an explanation/code comment about the concept "depending on test", and define it in terms of floating / non-floating. 
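To make the private-constructor idea a bit more concrete, here is a minimal standalone sketch. It is not the code from the patch: the class is a simplified stand-in for `DependencyType`, the instance names follow the `ConstraintCastNode::DependencyType::FloatingNarrowing` example above, and `is_floating()`, `desc()` and `pinned_variant()` are hypothetical names picked only for illustration.

```cpp
#include <cassert>

// Simplified stand-in for the DependencyType discussed above (assumption:
// only the two boolean properties and a description string matter here).
class DependencyType {
 public:
  // The only four instances that can exist; the names are illustrative.
  static const DependencyType FloatingNarrowing;
  static const DependencyType FloatingNonNarrowing;
  static const DependencyType NonFloatingNarrowing;
  static const DependencyType NonFloatingNonNarrowing;

  bool is_floating() const { return _floating; }
  bool narrows_type() const { return _narrows_type; }
  const char* desc() const { return _desc; }

 private:
  // Private: code outside the class cannot invent a fifth combination.
  DependencyType(bool floating, bool narrows_type, const char* desc)
    : _floating(floating), _narrows_type(narrows_type), _desc(desc) {}

  const bool _floating;
  const bool _narrows_type;
  const char* const _desc;
};

// Definitions of static members are in class scope, so they may call the
// private constructor.
const DependencyType DependencyType::FloatingNarrowing(true, true, "floating narrowing dependency");
const DependencyType DependencyType::FloatingNonNarrowing(true, false, "floating non-narrowing dependency");
const DependencyType DependencyType::NonFloatingNarrowing(false, true, "non-floating narrowing dependency");
const DependencyType DependencyType::NonFloatingNonNarrowing(false, false, "non-floating non-narrowing dependency");

// Hypothetical use site: pinning keeps the narrowing property and drops the
// floating one. The assert reads naturally in terms of floating vs pinned.
const DependencyType& pinned_variant(const DependencyType& d) {
  assert(d.is_floating() && "already pinned");
  return d.narrows_type() ? DependencyType::NonFloatingNarrowing
                          : DependencyType::NonFloatingNonNarrowing;
}
```

With this shape, use sites can only name one of the four predefined instances, and a query such as `is_floating()` says directly what is being asked without a separate "depends only on test" concept.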
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517268181 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517304372 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517331973 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517345703 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517217941 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517358981 From epeter at openjdk.org Wed Nov 12 08:33:31 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:31 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> References: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> Message-ID: On Wed, 12 Nov 2025 08:19:21 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/castnode.cpp line 625: >> >>> 623: if (!phase->C->post_loop_opts_phase()) { >>> 624: return this_type; >>> 625: } >> >> Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. What do you think? > > But maybe that is a refactoring for a separate RFE, and then not really worth it. But conceptually, we want to say: if we are in post loop opts, then widen the types. Now it looks like we want to widen always ... but then we check for post loop opts inside the method and bail out anyway. Not very transparent. Another idea: rename the method to `widen_type_in_post_loop_opts`. Totally up to you though. 
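A small standalone sketch of what I mean by doing the check outside of `widen_type`. Everything here is a toy model: `Range` stands in for an integer type, `Phase` for the compilation state, `value_type` is a made-up caller, and the widening rule itself is only a placeholder, not the real logic in castnode.cpp.

```cpp
#include <cassert>
#include <climits>

// Toy stand-ins (assumptions, not HotSpot types).
struct Range { int lo; int hi; };        // plays the role of an integer type
struct Phase { bool post_loop_opts; };   // plays the role of the phase state

// Widens unconditionally. The rule is a placeholder: keep only the sign
// information of the range.
Range widen_type(Range t) {
  assert(t.lo <= t.hi && "malformed range");
  if (t.lo >= 0) return Range{0, INT_MAX};
  if (t.hi < 0)  return Range{INT_MIN, -1};
  return Range{INT_MIN, INT_MAX};
}

// The caller states the condition: widen only once loop opts are over.
Range value_type(const Phase& phase, Range t) {
  if (phase.post_loop_opts) {
    return widen_type(t);
  }
  return t;
}
```

Read this way, the call site says "if we are in post loop opts, then widen", and `widen_type` has no hidden bail-out.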
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517350982 From mdoerr at openjdk.org Thu Nov 13 16:56:36 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 13 Nov 2025 16:56:36 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation Message-ID: This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. ------------- Commit messages: - 8371820: Further AES performance improvements for key schedule generation Changes: https://git.openjdk.org/jdk/pull/28299/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371820 Stats: 33 lines in 2 files changed: 12 ins; 8 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/28299.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299 PR: https://git.openjdk.org/jdk/pull/28299 From mdoerr at openjdk.org Thu Nov 13 16:56:37 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 13 Nov 2025 16:56:37 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 16:48:28 GMT, Martin Doerr wrote: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. @smemery: I've seen your recent improvements and performance measurements. It would be great if you could take a look at this proposal and check the performance results in your environment. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3528746436

From duke at openjdk.org Fri Nov 14 07:19:06 2025
From: duke at openjdk.org (Shawn M Emery)
Date: Fri, 14 Nov 2025 07:19:06 GMT
Subject: RFR: 8371820: Further AES performance improvements for key schedule generation
In-Reply-To:
References:
Message-ID: <7ilzxthduL8v18I-SAyihrSyNxkep_mEYkxRBL3lHAY=.41ce1a68-928d-4fb3-a312-f2b8e4b907fd@github.com>

On Thu, 13 Nov 2025 16:49:34 GMT, Martin Doerr wrote:

>> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption.
>>
>> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64.
>
> @smemery: I've seen your recent improvements and performance measurements. It would be great if you could take a look at this proposal and check the performance results in your environment.

@TheRealMDoerr: I've run your update of the init key schedule w/intrinsics logic and obtained the following results for AESReinit:

x86_64: 19.51% improvement
arm64: 3.11% improvement

Performance of the other AES-related benchmarks (AES[Decrypt].testBaseline and AESBench) changed only nominally, as expected. AES regression tests (Cipher/AES and hotspot/*/aes) have passed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3531238034

From duke at openjdk.org Fri Nov 14 07:31:13 2025
From: duke at openjdk.org (Shawn M Emery)
Date: Fri, 14 Nov 2025 07:31:13 GMT
Subject: RFR: 8371820: Further AES performance improvements for key schedule generation
In-Reply-To:
References:
Message-ID: <6lsTW4mcptKcVAuFHu3h39LMajICZZVDhHwrkxM6Rl8=.787cc24a-ae13-49ed-bc56-9c71ad8659b0@github.com>

On Thu, 13 Nov 2025 16:48:28 GMT, Martin Doerr wrote:

> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption.
We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 61: > 59: // used for everything else. > 60: private int[] sessionKe = null; // key for encryption > 61: private int[] sessionKd = null; // preprocessed key for decryption We really don't need sessionKd, since it's just assigned to K, but I'm fine leaving it as is. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2526136351 From duke at openjdk.org Fri Nov 14 07:35:09 2025 From: duke at openjdk.org (Shawn M Emery) Date: Fri, 14 Nov 2025 07:35:09 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 16:48:28 GMT, Martin Doerr wrote: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. src/hotspot/share/opto/library_call.cpp line 7483: > 7481: // However, ppc64 vncipher processes MixColumns and requires the same round keys with encryption. > 7482: // The ppc64 and riscv64 stubs of encryption and decryption use the same round keys (sessionK[0]). > 7483: Node* objSessionK = load_field_from_object(aescrypt_object, "sessionK", "[[I"); Good catch, as I didn't see that intrinsics wasn't using the second array element (inverse key schedule) for these platforms all this time! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2526153490 From duke at openjdk.org Fri Nov 14 07:39:05 2025 From: duke at openjdk.org (Shawn M Emery) Date: Fri, 14 Nov 2025 07:39:05 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 16:48:28 GMT, Martin Doerr wrote: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 941: > 939: if (decrypting) { > 940: if (sessionKd == null) { > 941: sessionKd = genInvRoundKeys(sessionKe, rounds); Good catch, as this is more efficient given that the inverse key schedule is dependent upon the (encryption) key schedule in the code's current state. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2526173217 From duke at openjdk.org Fri Nov 14 07:46:31 2025 From: duke at openjdk.org (Shawn M Emery) Date: Fri, 14 Nov 2025 07:46:31 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 16:48:28 GMT, Martin Doerr wrote: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. Good catch in eliminating the unnecessary construction of both key schedules on the PPC64, S390, and RISCV64 architectures. 
src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 59: > 57: // Following attribute is specific to Intrinsics where the unprocessed > 58: // key is used for PPC64, S390, and RISCV64 architectures, whereas K is > 59: // used for everything else. I would change this to: // Following attributes (sessionKe and K) are specific to Intrinsics, where sessionKe // is the unprocessed key that is used for PPC64, S390, and RISCV64 architectures, // whereas K is used for everything else. ------------- PR Review: https://git.openjdk.org/jdk/pull/28299#pullrequestreview-3463343453 PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2526196244 From mdoerr at openjdk.org Fri Nov 14 12:13:25 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 14 Nov 2025 12:13:25 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v2] In-Reply-To: References: Message-ID: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Improve comment and minor cleanup. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28299/files - new: https://git.openjdk.org/jdk/pull/28299/files/477a3dda..b03e6b43 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=00-01 Stats: 8 lines in 2 files changed: 2 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28299.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299 PR: https://git.openjdk.org/jdk/pull/28299 From mdoerr at openjdk.org Fri Nov 14 12:13:27 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 14 Nov 2025 12:13:27 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v2] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 07:41:05 GMT, Shawn M Emery wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment and minor cleanup. > > src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 59: > >> 57: // Following attribute is specific to Intrinsics where the unprocessed >> 58: // key is used for PPC64, S390, and RISCV64 architectures, whereas K is >> 59: // used for everything else. > > I would change this to: > // Following attributes (sessionKe and K) are specific to Intrinsics, where sessionKe > // is the unprocessed key that is used for PPC64, S390, and RISCV64 architectures, > // whereas K is used for everything else. Updated. I have also cleaned up the hotspot part a bit. > src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 61: > >> 59: // used for everything else. >> 60: private int[] sessionKe = null; // key for encryption >> 61: private int[] sessionKd = null; // preprocessed key for decryption > > We really don't need sessionKd, since it's just assigned to K, but I'm fine leaving it as is. 
Currently, `sessionKd` is needed if we switch between encryption and decryption while using the same key. It would be easier to remove `K` and instead tell `LibraryCallKit::get_key_start_from_aescrypt_object` whether we are doing encryption or decryption. I can change that if you want, but I'm not sure if it's worth the effort.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2527275801
PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2527271643

From mdoerr at openjdk.org Fri Nov 14 17:21:50 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Fri, 14 Nov 2025 17:21:50 GMT
Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3]
In-Reply-To:
References:
Message-ID: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com>

> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption.
>
> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  More minor cleanup.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28299/files
  - new: https://git.openjdk.org/jdk/pull/28299/files/b03e6b43..621616a4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=01-02

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/28299.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299

PR: https://git.openjdk.org/jdk/pull/28299

From eastigeevich at openjdk.org Mon Nov 17 12:29:08 2025
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Mon, 17 Nov 2025 12:29:08 GMT
Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC
In-Reply-To:
References:
Message-ID:

On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote:

> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)
>
> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.
>
> ---
>
> #### 1. Test Bug
>
> It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually).
>
> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
>
> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`
>
>
> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.
>
> ---
>
> #### 2. Implementation Bug
>
> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets.
> > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. LGTM ------------- Marked as reviewed by eastigeevich (Committer). PR Review: https://git.openjdk.org/jdk/pull/28241#pullrequestreview-3472469212 From mdoerr at openjdk.org Mon Nov 17 15:12:46 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 17 Nov 2025 15:12:46 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <6lsTW4mcptKcVAuFHu3h39LMajICZZVDhHwrkxM6Rl8=.787cc24a-ae13-49ed-bc56-9c71ad8659b0@github.com> References: <6lsTW4mcptKcVAuFHu3h39LMajICZZVDhHwrkxM6Rl8=.787cc24a-ae13-49ed-bc56-9c71ad8659b0@github.com> Message-ID: On Fri, 14 Nov 2025 07:28:10 GMT, Shawn M Emery wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> More minor cleanup. > > src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 61: > >> 59: // used for everything else. >> 60: private int[] sessionKe = null; // key for encryption >> 61: private int[] sessionKd = null; // preprocessed key for decryption > > We really don't need sessionKd, since it's just assigned to K, but I'm fine leaving it as is. @smemery: I have made a proposal to remove `K`: https://github.com/TheRealMDoerr/jdk/commit/2907475958806cad6b5fc83541f66065475a93ec Please take a look! I think it's a bit better readable, but makes the change a bit larger and will probably require a Graal update. What do you prefer? 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2534468284

From pchilanomate at openjdk.org Mon Nov 17 21:53:04 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 17 Nov 2025 21:53:04 GMT
Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent
Message-ID:

When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses the `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean`, which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.

This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism.

Here is a summary of the key changes:

- Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by using a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so I believe this approach is simpler.

- Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.

- The code was structured in terms of mount and unmount cases, and a variable was used to differentiate between start or end of the transition. With the changes to make the mechanism independent of JVMTI it becomes simpler to invert this and structure the code in terms of start transition and end transition, and use a variable to differentiate between mount and unmount cases.

- All JVMTI code required during start/end transitions has been encapsulated in classes `JVMTIStartTransition` and `JVMTIEndTransition`. I kept the ordering of event posting as it is today.

- Global variables `_sync_protocol_enabled_count` and `_sync_protocol_enabled_permanently` were removed. Variable `_VTMS_transition_disable_for_all_count` was renamed to `_global_start_transition_disable_count`, `_SR_mode` to `_exclusive_operation_ongoing` and `_VTMS_notify_jvmti_events` to `_notify_jvmti_events`. New global variable `_active_disablers` replaces the functionality of `_VTMS_transition_disable_for_one_count`.

- Now, when the first agent attaches we not only set `_notify_jvmti_events` but we also increase global counter `_global_start_transition_disable_count`. This has the effect of always forcing the slow path when starting and ending a transition as we do today when `_VTMS_notify_jvmti_events` is set.

A new `Handshake::execute` variant to handshake a virtual thread is introduced with this patch, which makes use of the new `MountUnmountDisabler` class. Method `ThreadSnapshotFactory::get_thread_snapshot` has been simplified to use this handshake variant to capture the snapshot of a virtual thread.

The changes include new test `DumpThreadsWhenParking.java` from @AlanBateman which reliably reproduces the issue. I also verified the changes in Mach5 tiers1-7.

Thanks,
Patricio

-------------

Commit messages:
 - v1

Changes: https://git.openjdk.org/jdk/pull/28361/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8364343
  Stats: 1848 lines in 40 files changed: 844 ins; 824 del; 180 mod
  Patch: https://git.openjdk.org/jdk/pull/28361.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361

PR: https://git.openjdk.org/jdk/pull/28361

From duke at openjdk.org Tue Nov 18 15:13:16 2025
From: duke at openjdk.org (Zihao Lin)
Date: Tue, 18 Nov 2025 15:13:16 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v12]
In-Reply-To:
References:
Message-ID:

> This patch removes the slice parameter from LoadNode::make
>
> I have done more work that removes the slice parameter from StoreNode::make.
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, I'd appreciate any guidance. Thanks a lot!

Zihao Lin has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 14 commits: - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - ... and 4 more: https://git.openjdk.org/jdk/compare/dcba014a...329e290a ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=11 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From coleenp at openjdk.org Tue Nov 18 15:43:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 15:43:00 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass Message-ID: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). Tested with tier1-4. 5-7 in progress. ------------- Commit messages: - Fix C2 to test for array first. - Move AccessFlags to InstanceKlass - array classes don't set access flags so don't look for them there. 
Changes: https://git.openjdk.org/jdk/pull/28371/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372098 Stats: 155 lines in 29 files changed: 62 ins; 54 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From liach at openjdk.org Tue Nov 18 15:43:03 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 18 Nov 2025 15:43:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 13:27:06 GMT, Coleen Phillimore wrote: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. src/hotspot/share/oops/constantPool.cpp line 1228: > 1226: > 1227: // Check constant pool method consistency > 1228: InstanceKlass* callee = InstanceKlass::cast(k); I know a MethodRef can be `[I`, `clone`, `()Ljava/lang/Object;` for `intArray.clone()` Java calls translated by javac. I wonder if this new code would break for such an array callee class. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2538494116 From coleenp at openjdk.org Tue Nov 18 15:58:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 15:58:15 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 14:48:37 GMT, Chen Liang wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress. > > src/hotspot/share/oops/constantPool.cpp line 1228: > >> 1226: >> 1227: // Check constant pool method consistency >> 1228: InstanceKlass* callee = InstanceKlass::cast(k); > > I know a MethodRef can be `[I`, `clone`, `()Ljava/lang/Object;` for `intArray.clone()` Java calls translated by javac. I wonder if this new code would break for such an array callee class. At one point, I removed is_interface() from class Klass, but then restored it because dependencies uses this a lot and has many Klass parameter types, instead of InstanceKlass. I'll revert this change, but I'm curious why none of the tests failed with this change. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2538747727 From mdoerr at openjdk.org Tue Nov 18 17:37:35 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 18 Nov 2025 17:37:35 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> Message-ID: On Fri, 14 Nov 2025 17:21:50 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > More minor cleanup. @valeriep, @jnimeh: I've seen that you have reviewed other changes in this area and I need reviews from Security Group people. I will certainly find reviewers for the hotspot part. May I ask you to take a look at the Java part? 
I would slightly prefer doing a bit more changes, but wanted to check with you, first: https://github.com/TheRealMDoerr/jdk/commit/2907475958806cad6b5fc83541f66065475a93ec ------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3548795174 From duke at openjdk.org Tue Nov 18 18:21:35 2025 From: duke at openjdk.org (ExE Boss) Date: Tue, 18 Nov 2025 18:21:35 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 13:27:06 GMT, Coleen Phillimore wrote: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. src/hotspot/share/classfile/classFileParser.cpp line 815: > 813: interface_index, CHECK); > 814: if (cp->tag_at(interface_index).is_klass()) { > 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539226107 From duke at openjdk.org Tue Nov 18 18:23:21 2025 From: duke at openjdk.org (Shawn M Emery) Date: Tue, 18 Nov 2025 18:23:21 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> Message-ID: <_rE68-lxqQMg8l1T9Mj2zl3vf2eXCSNc3SpIKRNOFvA=.6a9f3cbb-ef76-4079-bdec-a37f12d337fa@github.com> On Fri, 14 Nov 2025 17:21:50 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > More minor cleanup. @valeriepeng or @jnimeh are good choices for review and someone besides myself will need to be a reviewer given that I don't have reviewer privileges. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3548990626 From valeriep at openjdk.org Tue Nov 18 18:43:07 2025 From: valeriep at openjdk.org (Valerie Peng) Date: Tue, 18 Nov 2025 18:43:07 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> Message-ID: <8NWK4oQBOjqg1Z7D-NsWuWn_ZhU9E6jWSWedJJQSJ08=.b0c9ef61-7880-4300-a90a-9b89d0a1ec8f@github.com> On Fri, 14 Nov 2025 17:21:50 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > More minor cleanup. src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 62: > 60: private int[] sessionKe = null; // key for encryption > 61: private int[] sessionKd = null; // preprocessed key for decryption > 62: private int[] K = null; // preprocessed key in case of decryption I find the comment confusing as `K` is sometimes assigned with `sessionKe`, so it can't be used only for decryption? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2539296610 From coleenp at openjdk.org Tue Nov 18 18:55:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 18:55:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 18:15:40 GMT, ExE Boss wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress. > > src/hotspot/share/classfile/classFileParser.cpp line 815: > >> 813: interface_index, CHECK); >> 814: if (cp->tag_at(interface_index).is_klass()) { >> 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); > > Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. I'll revert some of these casts (as well as try to write a test for this). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539318553 From valeriep at openjdk.org Tue Nov 18 18:55:05 2025 From: valeriep at openjdk.org (Valerie Peng) Date: Tue, 18 Nov 2025 18:55:05 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> Message-ID: On Fri, 14 Nov 2025 17:21:50 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. 
We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > More minor cleanup. I can review until COB this Thursday, then I will be on vacation and return on Dec 2nd. Just FYI. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3549101690 From liach at openjdk.org Tue Nov 18 20:05:14 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 18 Nov 2025 20:05:14 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 18:50:55 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/classFileParser.cpp line 815: >> >>> 813: interface_index, CHECK); >>> 814: if (cp->tag_at(interface_index).is_klass()) { >>> 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); >> >> Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. > > There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. I'll revert some of these casts (as well as try to write a test for this). I thought the cast at line 839 would have handled this. Turns out it has a `!interf->is_interface()` check before so this cast is problematic. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539538759 From matsaave at openjdk.org Tue Nov 18 20:20:56 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 18 Nov 2025 20:20:56 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters Message-ID: The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. ------------- Commit messages: - 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters Changes: https://git.openjdk.org/jdk/pull/28380/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28380&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347248 Stats: 14 lines in 6 files changed: 9 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28380.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28380/head:pull/28380 PR: https://git.openjdk.org/jdk/pull/28380 From iklam at openjdk.org Tue Nov 18 21:43:31 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 21:43:31 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. 
src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 857: > 855: void CodeInstaller::initialize_fields(HotSpotCompiledCodeStream* stream, u1 code_flags, methodHandle& method, CodeBuffer& buffer, JVMCI_TRAPS) { > 856: if (!method.is_null()) { > 857: _parameter_count = method->number_of_parameters(); `_parameter_count` is used to create an OopMap, so it must be the number of words that are occupied by all the arguments, not the number of parameters. https://github.com/openjdk/jdk/blob/27a38d9093958ae4851bc61b8d3f0d71dc780823/src/hotspot/share/jvmci/jvmciCodeInstaller.cpp#L263 src/hotspot/share/prims/jni.cpp line 871: > 869: ResourceMark rm(THREAD); > 870: int size_of_parameters = method->size_of_parameters(); > 871: JavaCallArguments java_args(size_of_parameters); The original code was harmless. It creates the JavaCallArguments with more space than necessary, but doesn't affect the actual number of parameters that are passed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539653885 PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539631698 From mdoerr at openjdk.org Tue Nov 18 21:48:12 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 18 Nov 2025 21:48:12 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: References: Message-ID: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. 
Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Remove K from AES_Crypt ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28299/files - new: https://git.openjdk.org/jdk/pull/28299/files/621616a4..2b981288 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=02-03 Stats: 23 lines in 3 files changed: 6 ins; 4 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/28299.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299 PR: https://git.openjdk.org/jdk/pull/28299 From mdoerr at openjdk.org Tue Nov 18 21:48:15 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 18 Nov 2025 21:48:15 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <8NWK4oQBOjqg1Z7D-NsWuWn_ZhU9E6jWSWedJJQSJ08=.b0c9ef61-7880-4300-a90a-9b89d0a1ec8f@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> <8NWK4oQBOjqg1Z7D-NsWuWn_ZhU9E6jWSWedJJQSJ08=.b0c9ef61-7880-4300-a90a-9b89d0a1ec8f@github.com> Message-ID: <4RdGgD3PA7s5RYgaZsHA-V2pgqh8BrP19FczkmVYDbM=.f9cb08ec-2fb7-4af6-9ad2-f232fb4a9004@github.com> On Tue, 18 Nov 2025 18:40:12 GMT, Valerie Peng wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> More minor cleanup. > > src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 62: > >> 60: private int[] sessionKe = null; // key for encryption >> 61: private int[] sessionKd = null; // preprocessed key for decryption >> 62: private int[] K = null; // preprocessed key in case of decryption > > I find the comment confusing as `K` is sometimes assigned with `sessionKe`, so it can't be used only for decryption? Thanks for looking at it! I've merged my additional proposal. `K` is removed, now. Does the Java part look ok? 
I'll ask for a hotspot review once the Java part is fine. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2539714325 From valeriep at openjdk.org Tue Nov 18 21:48:16 2025 From: valeriep at openjdk.org (Valerie Peng) Date: Tue, 18 Nov 2025 21:48:16 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: <4RdGgD3PA7s5RYgaZsHA-V2pgqh8BrP19FczkmVYDbM=.f9cb08ec-2fb7-4af6-9ad2-f232fb4a9004@github.com> References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> <8NWK4oQBOjqg1Z7D-NsWuWn_ZhU9E6jWSWedJJQSJ08=.b0c9ef61-7880-4300-a90a-9b89d0a1ec8f@github.com> <4RdGgD3PA7s5RYgaZsHA-V2pgqh8BrP19FczkmVYDbM=.f9cb08ec-2fb7-4af6-9ad2-f232fb4a9004@github.com> Message-ID: On Tue, 18 Nov 2025 21:38:00 GMT, Martin Doerr wrote: >> src/java.base/share/classes/com/sun/crypto/provider/AES_Crypt.java line 62: >> >>> 60: private int[] sessionKe = null; // key for encryption >>> 61: private int[] sessionKd = null; // preprocessed key for decryption >>> 62: private int[] K = null; // preprocessed key in case of decryption >> >> I find the comment confusing as `K` is sometimes assigned with `sessionKe`, so it can't be used only for decryption? > > Thanks for looking at it! I've merged my additional proposal. `K` is removed, now. Does the Java part look ok? I'll ask for a hotspot review once the Java part is fine. Java part looks fine to me. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2539716623 From coleenp at openjdk.org Tue Nov 18 21:50:35 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 21:50:35 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v2] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert two InstanceKlass::cast() calls that might not be InstanceKlass. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/80079012..e8973f59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From coleenp at openjdk.org Tue Nov 18 21:50:38 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 21:50:38 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v2] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 20:02:09 GMT, Chen Liang wrote: >> There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. 
I'll revert some of these casts (as well as try to write a test for this). > > I thought the cast at line 839 would have handled this. Turns out it has a `!interf->is_interface()` check before so this cast is problematic. There's a reason this isn't tested. The constant pool reference for JVM_CONSTANT_Class is tested to be resolved at line 814 and because it's an interface, it's not easy to be resolved by this point (unless it's a duplicate class in which case it will be an InstanceKlass). The array case goes through the else part of this and throws a ClassFormatError. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539645039 From dholmes at openjdk.org Tue Nov 18 22:11:03 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Nov 2025 22:11:03 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. src/hotspot/share/oops/method.cpp line 736: > 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { > 735: count++; > 736: } Shouldn't we do this at construction time and store into a field and return that here? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539756839 From iklam at openjdk.org Tue Nov 18 22:19:06 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 22:19:06 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:01:21 GMT, David Holmes wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/oops/method.cpp line 736: > >> 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { >> 735: count++; >> 736: } > > Shouldn't we do this at construction time and store into a field and return that here? That would increase footprint. It looks like we don't need this function after all. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539793227 From iklam at openjdk.org Tue Nov 18 22:19:08 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 22:19:08 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:28:38 GMT, Ioi Lam wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/prims/jni.cpp line 871: > >> 869: ResourceMark rm(THREAD); >> 870: int size_of_parameters = method->size_of_parameters(); >> 871: JavaCallArguments java_args(size_of_parameters); > > The original code was harmless. 
It creates the JavaCallArguments with more space than necessary, but doesn't affect the actual number of parameters that are passed. My previous comment is wrong. The original code is correct. `JavaCallArguments::_max_size` is number of words. A `long` argument counts as two words. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539791878 From matsaave at openjdk.org Tue Nov 18 22:24:11 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 18 Nov 2025 22:24:11 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:16:18 GMT, Ioi Lam wrote: >> src/hotspot/share/oops/method.cpp line 736: >> >>> 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { >>> 735: count++; >>> 736: } >> >> Shouldn't we do this at construction time and store into a field and return that here? > > That would increase footprint. It looks like we don't need this function after all. Oops I forgot to push the commit that caches this value ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539805680 From dlong at openjdk.org Wed Nov 19 00:47:18 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 00:47:18 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. 
src/hotspot/share/ci/ciMethodData.cpp line 557: > 555: mdo->set_arg_stack(_arg_stack); > 556: mdo->set_arg_returned(_arg_returned); > 557: int arg_count = mdo->method()->number_of_parameters(); The actual size allocated seems to be based on the argument size in slots, not count. https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). src/hotspot/share/prims/whitebox.cpp line 1272: > 1270: mdo->init(); > 1271: ResourceMark rm(THREAD); > 1272: int arg_count = mdo->method()->number_of_parameters(); See my comment for ciMethodData::update_escape_info(). Same issue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2540064259 PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2540067055 From mdoerr at openjdk.org Wed Nov 19 09:08:34 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 19 Nov 2025 09:08:34 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v3] In-Reply-To: References: <-srlt_N0wyBwCwOmZTJBdGNFm66doGBHr6Yx83pqSpQ=.be8e4031-f542-49ad-8271-ac9a2c8b9128@github.com> <8NWK4oQBOjqg1Z7D-NsWuWn_ZhU9E6jWSWedJJQSJ08=.b0c9ef61-7880-4300-a90a-9b89d0a1ec8f@github.com> <4RdGgD3PA7s5RYgaZsHA-V2pgqh8BrP19FczkmVYDbM=.f9cb08ec-2fb7-4af6-9ad2-f232fb4a9004@github.com> Message-ID: On Tue, 18 Nov 2025 21:40:01 GMT, Valerie Peng wrote: >> Thanks for looking at it! I've merged my additional proposal. `K` is removed, now. Does the Java part look ok? I'll ask for a hotspot review once the Java part is fine. > > Java part looks fine to me. Thanks for reviewing! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2541140164 From mdoerr at openjdk.org Wed Nov 19 09:13:42 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 19 Nov 2025 09:13:42 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> References: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> Message-ID: <-IeLB8uRff2Hu6rXMg_C1kr1vF46RUzULxFjMastVE8=.1aa135cc-38b1-4c37-966c-59cc51d300f5@github.com> On Tue, 18 Nov 2025 21:48:12 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove K from AES_Crypt @vnkozlov: May I ask you to take a look at the C2 part? I had to adapt the library_call code to the new Java implementation which stores the key in "sessionKe" and "sessionKd", now. I think the hotspot part is also more comprehensive this way because it makes it clear which key is used for what. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3551612586 From coleenp at openjdk.org Wed Nov 19 12:34:30 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 19 Nov 2025 12:34:30 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert a couple more InstanceKlass::casts also to get GHA to restart. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/e8973f59..1060463b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From alanb at openjdk.org Wed Nov 19 13:54:03 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 19 Nov 2025 13:54:03 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 20:19:58 GMT, Patricio Chilano Mateo wrote: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. 
However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by using a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so I believe this approach is simpler. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). 
But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and unmount cases, and a... src/java.base/share/classes/java/lang/VirtualThread.java line 1390: > 1388: } > 1389: > 1390: // -- JVM TI support -- We'll need to update this comment as it is no longer only for JVMTI. This might be a good place for a block comment to define "transitions" covering the changing of thread identity, the continuation mount/unmount, and how the notification to the VM supports JVMTI and handshakes. Maybe I could contribute a block comment to include here? src/java.base/share/native/libjava/VirtualThread.c line 38: > 36: { "startFinalTransition", "()V", (void *)&JVM_VirtualThreadEnd }, > 37: { "startTransition", "(Z)V", (void *)&JVM_VirtualThreadStartTransition }, > 38: { "endTransition", "(Z)V", (void *)&JVM_VirtualThreadEndTransition }, I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransaction in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition. test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWhenParking.java line 94: > 92: }); > 93: } > 94: // wait for all virtual threads to start so all have a non-empty stack This reminds me the loom repo has a small update to the DumpThreadsWithEliminatedLock.java test to ensure that the virtual thread starts execution before doing the thread dump. This was noticed with test-repeat runs of the new test to ensure it was stable. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542097138 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542016761 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542034248 From matsaave at openjdk.org Wed Nov 19 15:53:23 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 19 Nov 2025 15:53:23 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> On Wed, 19 Nov 2025 00:42:54 GMT, Dean Long wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/ci/ciMethodData.cpp line 557: > >> 555: mdo->set_arg_stack(_arg_stack); >> 556: mdo->set_arg_returned(_arg_returned); >> 557: int arg_count = mdo->method()->number_of_parameters(); > > The actual size allocated seems to be based on the argument size in slots, not count. > https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 > To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? 
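The difference between a parameter count and an argument size in slots, which this thread turns on, can be shown with a small standalone sketch. It is not HotSpot code: `count_and_slots` is a hypothetical helper that walks a JVM method descriptor and relies only on the JVM rule that `long` (`J`) and `double` (`D`) parameters occupy two slots while every other parameter, including arrays and object references, occupies one. The receiver of a non-static method is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not a HotSpot API): returns {parameter count, slot size}
// for a JVM method descriptor such as "(IJLjava/lang/String;[D)V".
// The receiver of a non-static method is not counted.
std::pair<int, int> count_and_slots(const std::string& desc) {
  int count = 0;
  int slots = 0;
  std::size_t i = 1;  // skip the opening '('
  while (i < desc.size() && desc[i] != ')') {
    bool is_array = false;
    while (i < desc.size() && desc[i] == '[') {  // skip array dimensions
      is_array = true;
      ++i;
    }
    if (i >= desc.size()) break;  // malformed descriptor: stop parsing
    char c = desc[i];
    if (c == 'L') {  // object type: jump to the terminating ';'
      i = desc.find(';', i);
      if (i == std::string::npos) break;  // malformed descriptor
    }
    ++i;
    ++count;
    // long and double take two slots; an array of them is a one-slot reference
    slots += (!is_array && (c == 'J' || c == 'D')) ? 2 : 1;
  }
  return {count, slots};
}
```

For a descriptor such as `(IJLjava/lang/String;[D)V` the two numbers differ (four parameters, five slots), which is why a variable named `arg_count` that actually holds a slot size is easy to misread.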
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2542604276 From iklam at openjdk.org Wed Nov 19 18:09:35 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 19 Nov 2025 18:09:35 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> References: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> Message-ID: On Wed, 19 Nov 2025 15:50:52 GMT, Matias Saavedra Silva wrote: >> src/hotspot/share/ci/ciMethodData.cpp line 557: >> >>> 555: mdo->set_arg_stack(_arg_stack); >>> 556: mdo->set_arg_returned(_arg_returned); >>> 557: int arg_count = mdo->method()->number_of_parameters(); >> >> The actual size allocated seems to be based on the argument size in slots, not count. >> https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 >> To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). > > So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? I think it's OK to rename `arg_count` to `arg_size`. There's quite a lot of existing code that does this. `arg_size` is understood to be the "number of slots". https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/opto/graphKit.cpp#L2365-L2366 https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/ci/bcEscapeAnalyzer.cpp#L192-L196 I am not sure about adding "slot" to "arg_modified". While there are some use of the word "slot" in the compiler APIs, it's not common. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2543024462 From pchilanomate at openjdk.org Wed Nov 19 19:06:34 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 19 Nov 2025 19:06:34 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v2] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters.
The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by using a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so I believe this approach is simpler. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and unmount cases, and a... Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Add fixes to DumpThreadsWithEliminatedLock.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/70f96a7d..976486cd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 19 19:06:37 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 19 Nov 2025 19:06:37 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v2] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 13:50:44 GMT, Alan Bateman wrote: >> Patricio
Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add fixes to DumpThreadsWithEliminatedLock.java > > src/java.base/share/classes/java/lang/VirtualThread.java line 1390: > >> 1388: } >> 1389: >> 1390: // -- JVM TI support -- > > We'll need to update this comment as it is no longer only for JVMTI. > > This might be a good place for a block comment to define "transitions" covering the changing of thread identity, the continuation mount/unmount, and how the notification to the VM supports JVMTI and handshakes. Maybe I could contribute a block comment to include here? That would be great. > src/java.base/share/native/libjava/VirtualThread.c line 38: > >> 36: { "startFinalTransition", "()V", (void *)&JVM_VirtualThreadEnd }, >> 37: { "startTransition", "(Z)V", (void *)&JVM_VirtualThreadStartTransition }, >> 38: { "endTransition", "(Z)V", (void *)&JVM_VirtualThreadEndTransition }, > > I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransition in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition. How about removing these methods and just have an extra boolean parameter in `start/endTransition`? https://github.com/pchilano/jdk/compare/JDK-8364343...pchilano:jdk:startEndTransitionsOnly > test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWhenParking.java line 94: > >> 92: }); >> 93: } >> 94: // wait for all virtual threads to start so all have a non-empty stack > > This reminds me the loom repo has a small update to the DumpThreadsWithEliminatedLock.java test to ensure that the virtual thread starts execution before doing the thread dump. This was noticed with test-repeat runs of the new test to ensure it was stable. Added the fixes to `DumpThreadsWithEliminatedLock.java` from the loom repo.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543217770 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543212884 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543215701 From pchilanomate at openjdk.org Thu Nov 20 20:52:05 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Nov 2025 20:52:05 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v3] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization.
That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and unm...
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Add Alan's comment in VirtualThread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/976486cd..205ae77b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=01-02 Stats: 12 lines in 1 file changed: 11 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Thu Nov 20 20:55:22 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Nov 2025 20:55:22 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v3] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 19:03:13 GMT, Patricio Chilano Mateo wrote: >> src/java.base/share/classes/java/lang/VirtualThread.java line 1390: >> >>> 1388: } >>> 1389: >>> 1390: // -- JVM TI support -- >> >> We'll need to update this comment as it is no longer only for JVMTI. >> >> This might be a good place for a block comment to define "transitions" covering the changing of thread identity, the continuation mount/unmount, and how the notification to the VM supports JVMTI and handshakes. Maybe I could contribute a block comment to include here? > > That would be great. Thanks, added the suggested comment.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547640337 From dholmes at openjdk.org Thu Nov 20 22:16:18 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 20 Nov 2025 22:16:18 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 20:52:05 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters.
The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Add Alan's comment in VirtualThread As this involves a fairly significant design change I suggest updating the JBS issue to have a more informative title e.g. "Virtual Thread transition management needs to be independent of JVM TI" ------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560304232 From pchilanomate at openjdk.org Thu Nov 20 23:10:48 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Nov 2025 23:10:48 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request.
Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state).
But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and un... Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Rename VM methods for endFirstTransition/startFinalTransition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/205ae77b..10534b33 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=02-03 Stats: 20 lines in 8 files changed: 0 ins; 0 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Thu Nov 20 23:10:49 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Nov 2025 23:10:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:14:15 GMT, David Holmes wrote: > As this involves a fairly significant design change I suggest updating the JBS issue to have a more informative title e.g. "Virtual Thread transition management needs to be independent of JVM TI" > Yes, that's better. Updated. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560533185 From pchilanomate at openjdk.org Thu Nov 20 23:10:52 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Nov 2025 23:10:52 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 19:01:18 GMT, Patricio Chilano Mateo wrote: >> src/java.base/share/native/libjava/VirtualThread.c line 38: >> >>> 36: { "startFinalTransition", "()V", (void *)&JVM_VirtualThreadEnd }, >>> 37: { "startTransition", "(Z)V", (void *)&JVM_VirtualThreadStartTransition }, >>> 38: { "endTransition", "(Z)V", (void *)&JVM_VirtualThreadEndTransition }, >> >> I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransition in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition. > > How about removing these methods and just have an extra boolean parameter in `start/endTransition`? > https://github.com/pchilano/jdk/compare/JDK-8364343...pchilano:jdk:startEndTransitionsOnly I renamed the methods as suggested. I remembered that we separated ThreadStart/ThreadEnd in 8306028 for future improvements related to JVMTI. Not sure if that's still relevant but in any case probably better to leave that discussion for a separate bug.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547987864 From vlivanov at openjdk.org Thu Nov 20 23:36:43 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 20 Nov 2025 23:36:43 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com> On Wed, 19 Nov 2025 12:34:30 GMT, Coleen Phillimore wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert a couple more InstanceKlass::casts also to get GHA to restart. src/hotspot/share/opto/compile.cpp line 1729: > 1727: if (flat->offset() == in_bytes(Klass::super_check_offset_offset())) > 1728: alias_type(idx)->set_rewritable(false); > 1729: if (flat->isa_instklassptr() && flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) I'd place the check separately. Otherwise, looks good. 
diff --git a/src/hotspot/share/opto/compile.cpp b/src/hotspot/share/opto/compile.cpp index 6babc13e1b3..9215c0fc03f 100644 --- a/src/hotspot/share/opto/compile.cpp +++ b/src/hotspot/share/opto/compile.cpp @@ -1726,8 +1726,6 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr } if (flat->offset() == in_bytes(Klass::super_check_offset_offset())) alias_type(idx)->set_rewritable(false); - if (flat->offset() == in_bytes(Klass::access_flags_offset())) - alias_type(idx)->set_rewritable(false); if (flat->offset() == in_bytes(Klass::misc_flags_offset())) alias_type(idx)->set_rewritable(false); if (flat->offset() == in_bytes(Klass::java_mirror_offset())) @@ -1735,6 +1733,11 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr if (flat->offset() == in_bytes(Klass::secondary_super_cache_offset())) alias_type(idx)->set_rewritable(false); } + if (flat->isa_instklassptr()) { + if (flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) { + alias_type(idx)->set_rewritable(false); + } + } // %%% (We would like to finalize JavaThread::threadObj_offset(), // but the base pointer type is not distinctive enough to identify // references into JavaThread.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2548046511 From dholmes at openjdk.org Fri Nov 21 01:00:58 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:00:58 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don?t honor the disable request. 
Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state).
But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition Hi Patricio, this is another significant piece of work. I have taken an initial pass through trying to digest the main parts - can't comment on the C2 code or the Java side. I have made a few minor comments/suggestions. Thanks src/hotspot/share/prims/jvm.cpp line 3668: > 3666: if (!DoJVMTIVirtualThreadTransitions) { > 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 3668: return; Does this not still need checking somewhere? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: > 160: // be executed once we go back to Java. If this is an unmount, the handshake that the > 161: // disabler executed against this carrier thread already provided the needed synchronization. > 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277: > 275: > 276: // Start of the critical region. Prevent future memory > 277: // operations to be ordered before we read the transition flag. Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so perhaps that should internally perform the `load_acquire`? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 278: > 276: // Start of the critical region. Prevent future memory > 277: // operations to be ordered before we read the transition flag. 
> 278: // This matches the release fence in end_transition(). Suggestion: // This pairs with the release fence in end_transition(). src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: > 305: // Block while some mount/unmount transitions are in progress. > 306: // Debug version fails and prints diagnostic information. > 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316: > 314: // operations to be ordered before we read the transition flags. > 315: // This matches the release fence in end_transition(). > 316: OrderAccess::acquire(); Surely the use of the iterator already provides the necessary ordering guarantee here as well. ? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327: > 325: // End of the critical section. Prevent previous memory operations to > 326: // be ordered after we clear the clear the disable transition flag. > 327: // This matches the equivalent acquire fence in start_transition(). Suggestion: // This pairs with the acquire in start_transition(). I just realized you are using "fence" to describe release and acquire memory barrier semantics. Given we have an operation `fence` I find this confusing for the reader - especially when we also have a `release_store_fence` operation which might be confused with "release fence". 
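The terminology point in the last comment, a full `fence` versus acquire/release ordering, can be illustrated outside HotSpot with standard C++ atomics. What follows is a minimal sketch under stated assumptions, not the patch's code: `in_transition`, `disable_count`, `try_start_transition`, `end_transition`, `disable_and_check` and `enable` are hypothetical names, `std::atomic` stands in for HotSpot's own atomic and ordering primitives, and only one transitioning thread and one disabler are modeled. The Dekker-style handshake described in the PR text needs a full (store-load) fence on both sides, whereas ending a transition only needs release ordering paired with an acquire on the reading side.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the "in transition" bit and the
// "transition disabled" counter described in the PR text.
static std::atomic<bool> in_transition{false};
static std::atomic<int>  disable_count{0};

// Transitioning side: publish the flag, full fence, then read the counter.
// Returns true if the fast path may proceed.
bool try_start_transition() {
  in_transition.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // store-load barrier
  if (disable_count.load(std::memory_order_relaxed) > 0) {
    // A disabler is active: back off. Real code would block on a slow path.
    in_transition.store(false, std::memory_order_release);
    return false;
  }
  return true;
}

// Release is enough here: earlier writes cannot move past the clear.
void end_transition() {
  in_transition.store(false, std::memory_order_release);
}

// Disabler side: bump the counter, full fence, then read the flag.
// Returns true if no transition was observed in progress.
bool disable_and_check() {
  disable_count.fetch_add(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // store-load barrier
  // Acquire pairs with the release store in end_transition().
  return !in_transition.load(std::memory_order_acquire);
}

// Re-enable transitions; release pairs with a later acquiring reader.
void enable() {
  disable_count.fetch_sub(1, std::memory_order_release);
}
```

With the two sequentially consistent fences in place, at least one side is guaranteed to observe the other's store, so the transitioning thread and the disabler cannot both take their fast path. Acquire and release alone would not give that guarantee, which is one reason to keep the two terms distinct in comments.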
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370: > 368: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); > 369: assert(_global_start_transition_disable_count >= 0, ""); > 370: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count + 1); Suggestion: AtomicAccess::inc(&_global_start_transition_disable_count); src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376: > 374: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); > 375: assert(_global_start_transition_disable_count > 0, ""); > 376: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count - 1); Suggestion: AtomicAccess::dec(&_global_start_transition_disable_count); src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: > 50: // parameter is_SR: suspender or resumer > 51: MountUnmountDisabler(bool exlusive = false); > 52: MountUnmountDisabler(oop thread_oop); What does the comment mean here? 
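The two counter suggestions above replace a separate read and store with a single increment or decrement. A rough standalone sketch of the difference follows, using `std::atomic` as a stand-in for the HotSpot primitives; `global_disable_count` and the three functions are hypothetical names, and the memory ordering of the real calls is not modeled. Under the lock or safepoint that the quoted asserts require, both forms produce the same value; the single read-modify-write simply does not depend on that exclusion to avoid lost updates.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter standing in for _global_start_transition_disable_count.
static std::atomic<int> global_disable_count{0};

// Two-step form: a separate load and store. Only safe while an external
// lock (or a safepoint) keeps other writers out.
int inc_with_load_store() {
  int v = global_disable_count.load(std::memory_order_relaxed);
  global_disable_count.store(v + 1, std::memory_order_relaxed);
  return v + 1;
}

// Single read-modify-write: no update is lost even with concurrent writers.
int inc_with_rmw() {
  return global_disable_count.fetch_add(1, std::memory_order_relaxed) + 1;
}

// Matching decrement, also a single read-modify-write.
int dec_with_rmw() {
  return global_disable_count.fetch_sub(1, std::memory_order_relaxed) - 1;
}
```

Called from one thread, the functions are interchangeable; the read-modify-write form mainly states the intent more directly.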
------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3490207826 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547887801 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548145054 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548157390 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548150552 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548160373 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548161340 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548168223 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548169846 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548170787 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548174392 From dholmes at openjdk.org Fri Nov 21 01:01:00 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:01:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 20:52:05 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. 
The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... 
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Add Alan's comment in VirtualThread src/hotspot/share/classfile/javaClasses.cpp line 1757: > 1755: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); > 1756: int val = AtomicAccess::load(addr); > 1757: AtomicAccess::store(addr, val + 1); Suggestion: AtomicAccess::inc(addr); src/hotspot/share/classfile/javaClasses.cpp line 1764: > 1762: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); > 1763: int val = AtomicAccess::load(addr); > 1764: AtomicAccess::store(addr, val - 1); Suggestion: AtomicAccess::dec(addr); src/hotspot/share/opto/runtime.hpp line 740: > 738: return vthread_transition_Type(); > 739: } > 740: I do not know C2 but this looks really strange - 4 different functions all return the same thing. ??? src/hotspot/share/runtime/handshake.cpp line 374: > 372: JavaThread* target = java_lang_Thread::thread(carrier_thread); > 373: assert(target != nullptr, ""); > 374: // Technically there is need for a ThreadsListHandle since the target Suggestion: // Technically there is no need for a ThreadsListHandle since the target ? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147: > 145: MonitorLocker ml(VTMSTransition_lock); > 146: while (is_start_transition_disabled(current, vth())) { > 147: ml.wait(200); I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547864726 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547863852 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547884313 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547900707 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547963241 From dholmes at openjdk.org Fri Nov 21 01:04:59 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:04:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:35:38 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: > >> 160: // be executed once we go back to Java. If this is an unmount, the handshake that the >> 161: // disabler executed against this carrier thread already provided the needed synchronization. >> 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). > > Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? I also forgot to suggest a wording change: say "pairs with" rather than "matches". Reading back through I realize now I have misunderstood many of these comments. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548189421 From dholmes at openjdk.org Fri Nov 21 01:26:21 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:26:21 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. 
The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition > we follow the classic Dekker pattern for the required synchronization. My understanding is that Dekker requires a "full fence" between the accesses, not just ordering memory barriers. The two variables involved must be published to all readers for the algorithm to work. 
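For illustration only, the store / full fence / load shape being discussed can be sketched in portable C++ as below. This is a minimal sketch using `std::atomic` rather than HotSpot's `OrderAccess`/`AtomicAccess`; the names `in_transition`, `disable_count`, `try_start_transition()` and `try_disable_transitions()` are hypothetical stand-ins for the example and are not taken from the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the "in transition" bit and the
// "transition disabled" counter; these are not HotSpot names.
std::atomic<int> in_transition{0};
std::atomic<int> disable_count{0};

// Virtual-thread side: publish "in transition", issue a full fence,
// then read the disable counter. Returns true if the fast path may proceed.
bool try_start_transition() {
  in_transition.store(1, std::memory_order_relaxed);
  // Full (sequentially consistent) fence: keeps the store above from being
  // reordered after the load below (store-load ordering).
  std::atomic_thread_fence(std::memory_order_seq_cst);
  if (disable_count.load(std::memory_order_relaxed) > 0) {
    in_transition.store(0, std::memory_order_release);  // back off to a slow path
    return false;
  }
  return true;
}

// Disabler side: bump the counter, issue a full fence, then read the flag.
// Returns true if no in-progress transition was observed.
bool try_disable_transitions() {
  disable_count.fetch_add(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return in_transition.load(std::memory_order_relaxed) == 0;
}

void end_transition()     { in_transition.store(0, std::memory_order_release); }
void enable_transitions() { disable_count.fetch_sub(1, std::memory_order_release); }
```

With a full fence on both sides, at least one of the two threads is guaranteed to observe the other's store, so both cannot take their fast paths at once; acquire/release ordering alone would not rule that out, because neither forbids a store from being reordered after a later load.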
------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560918527 From kvn at openjdk.org Fri Nov 21 01:28:23 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 01:28:23 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> Message-ID: On Wed, 19 Nov 2025 17:53:10 GMT, Ioi Lam wrote: >> So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? > > I think it's OK to rename `arg_count` to `arg_size`. There's quite a lot of existing code that does this. `arg_size` is understood to be the "number of slots". > > https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/opto/graphKit.cpp#L2365-L2366 > > https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/ci/bcEscapeAnalyzer.cpp#L192-L196 > > I am not sure about adding "slot" to "arg_modified". While there are some use of the word "slot" in the compiler APIs, it's not common. I vote for `size_of_args` suggested by Dean. I would rename local variable too. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2548222005 From coleenp at openjdk.org Fri Nov 21 14:09:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 21 Nov 2025 14:09:55 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3] In-Reply-To: <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com> Message-ID: <_1-urAH6nhGRC5fXZnBvC60QvUAA7KA3ekz5sRD9MpQ=.edd7e8df-5459-4b89-a02d-5da88ce76c59@github.com> On Thu, 20 Nov 2025 23:33:30 GMT, Vladimir Ivanov wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert a couple more InstanceKlass::casts also to get GHA to restart. > > src/hotspot/share/opto/compile.cpp line 1729: > >> 1727: if (flat->offset() == in_bytes(Klass::super_check_offset_offset())) >> 1728: alias_type(idx)->set_rewritable(false); >> 1729: if (flat->isa_instklassptr() && flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) > > I'd place the check separately. Otherwise, looks good. 
> > diff --git a/src/hotspot/share/opto/compile.cpp b/src/hotspot/share/opto/compile.cpp > index 6babc13e1b3..9215c0fc03f 100644 > --- a/src/hotspot/share/opto/compile.cpp > +++ b/src/hotspot/share/opto/compile.cpp > @@ -1726,8 +1726,6 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr > } > if (flat->offset() == in_bytes(Klass::super_check_offset_offset())) > alias_type(idx)->set_rewritable(false); > - if (flat->offset() == in_bytes(Klass::access_flags_offset())) > - alias_type(idx)->set_rewritable(false); > if (flat->offset() == in_bytes(Klass::misc_flags_offset())) > alias_type(idx)->set_rewritable(false); > if (flat->offset() == in_bytes(Klass::java_mirror_offset())) > @@ -1735,6 +1733,11 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr > if (flat->offset() == in_bytes(Klass::secondary_super_cache_offset())) > alias_type(idx)->set_rewritable(false); > } > + if (flat->isa_instklassptr()) { > + if (flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) { > + alias_type(idx)->set_rewritable(false); > + } > + } > // %%% (We would like to finalize JavaThread::threadObj_offset(), > // but the base pointer type is not distinctive enough to identify > // references into JavaThread.) Yes that looks better. There aren't enough {} in that bit of code but I won't add more to existing code. Thanks for your help with the C2 code. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2549873241 From coleenp at openjdk.org Fri Nov 21 14:53:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 21 Nov 2025 14:53:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Reformatting compile.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/1060463b..06d6a186 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=02-03 Stats: 8 lines in 1 file changed: 6 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From kvn at openjdk.org Fri Nov 21 20:34:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 20:34:54 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. 
Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. Tobias submitted testing for these changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28241#issuecomment-3564501490 From kvn at openjdk.org Fri Nov 21 20:47:01 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 20:47:01 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). 
With this patch, no crashes were observed in the same number of runs. src/hotspot/share/code/nmethod.cpp line 1508: > 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER > 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required. > 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks. `Instead` is wrong word here I think. May be `Otherwise`. Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2550915038 From duke at openjdk.org Fri Nov 21 22:25:55 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Fri, 21 Nov 2025 22:25:55 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 20:43:33 GMT, Vladimir Kozlov wrote: >> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) >> >> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. >> >> --- >> >> #### 1. Test Bug >> >> It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). >> >> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. >> >> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` >> >> >> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. >> >> --- >> >> #### 2. 
Implementation Bug >> >> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. >> >> Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. >> >> The fix ensures that all call sites are patched **before** the `nmethod` is registered. >> >> In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. > > src/hotspot/share/code/nmethod.cpp line 1508: > >> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER >> 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required. >> 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks. > > `Instead` is wrong word here I think. May be `Otherwise`. > > Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`. We do not add trampolines to the new nmethod if they were not present in the original. Does this comment better describe the need to do this? // A direct call whose destination was within the maximum branch range may now // be out of range after the nmethod is moved. // // CallRelocation::fix_relocation_after_move() does not perform range checks and // assumes that the call target is always directly reachable. If we were to call // it unconditionally, it could incorrectly rewrite a call site whose target now // requires a trampoline, leaving the call out of range. // // When a call site has an associated trampoline, we skip the normal call // relocation here. 
The corresponding trampoline_stub_Relocation will handle both // the call site and the trampoline, including performing the required range // checks and updating the call to branch through the trampoline if required. // // If no trampoline exists for the call, we know the target remains within the // direct-branch range and CallRelocation::fix_relocation_after_move() is safe. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551119626 From kvn at openjdk.org Sat Nov 22 00:08:48 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 00:08:48 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:22:53 GMT, Chad Rakoczy wrote: >> src/hotspot/share/code/nmethod.cpp line 1508: >> >>> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER >>> 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required. >>> 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks. >> >> `Instead` is wrong word here I think. May be `Otherwise`. >> >> Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`. > > We do not add trampolines to the new nmethod if they were not present in the original. > > Does this comment better describe the need to do this? > > // A direct call whose destination was within the maximum branch range may now > // be out of range after the nmethod is moved. > // > // CallRelocation::fix_relocation_after_move() does not perform range checks and > // assumes that the call target is always directly reachable. If we were to call > // it unconditionally, it could incorrectly rewrite a call site whose target now > // requires a trampoline, leaving the call out of range. > // > // When a call site has an associated trampoline, we skip the normal call > // relocation here. 
The corresponding trampoline_stub_Relocation will handle both > // the call site and the trampoline, including performing the required range > // checks and updating the call to branch through the trampoline if required. > // > // If no trampoline exists for the call, we know the target remains within the > // direct-branch range and CallRelocation::fix_relocation_after_move() is safe. Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. Okay, there are 3 cases as I understand: 1. There was trampoline call in original nmethod. We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. 2. There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551327903 From kvn at openjdk.org Sat Nov 22 00:08:49 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 00:08:49 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:04:33 GMT, Vladimir Kozlov wrote: >> We do not add trampolines to the new nmethod if they were not present in the original. >> >> Does this comment better describe the need to do this? >> >> // A direct call whose destination was within the maximum branch range may now >> // be out of range after the nmethod is moved. >> // >> // CallRelocation::fix_relocation_after_move() does not perform range checks and >> // assumes that the call target is always directly reachable. If we were to call >> // it unconditionally, it could incorrectly rewrite a call site whose target now >> // requires a trampoline, leaving the call out of range. >> // >> // When a call site has an associated trampoline, we skip the normal call >> // relocation here. The corresponding trampoline_stub_Relocation will handle both >> // the call site and the trampoline, including performing the required range >> // checks and updating the call to branch through the trampoline if required. >> // >> // If no trampoline exists for the call, we know the target remains within the >> // direct-branch range and CallRelocation::fix_relocation_after_move() is safe. > > Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). > I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. > > Okay, there are 3 cases as I understand: > 1. There was trampoline call in original nmethod. 
We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. > 2. There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. > 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. > > My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? I also assume that trampoline's code instructions can construct far distance address. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551333425 From duke at openjdk.org Sat Nov 22 00:46:39 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Sat, 22 Nov 2025 00:46:39 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:06:06 GMT, Vladimir Kozlov wrote: >> Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). >> I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. >> >> Okay, there are 3 cases as I understand: >> 1. There was trampoline call in original nmethod. We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. >> 2. 
There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. >> 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. >> >> My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? > > I also assume that trampoline's code instructions can construct far distance address. > My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? We should never run into the 3rd case. If a trampoline _may_ be needed it will be there. A trampoline will not be generated only if the destination is known to always be reachable. Here are some situations where this could happen: - no far branches (code cache size <= branch range) - runtime call is reachable from anywhere in code cache - (code cache begin - runtime call <= branch range) && (code cache end - runtime call <= branch range) Whether or not a trampoline is generated is dependent on the callee destination not the caller address. So we can't have the case where a trampoline is needed for a given call but it doesn't exist. 
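As a rough illustration of the conditions listed above, the decision can be modelled as a function of the callee's destination and the code cache bounds only, with no dependence on where the calling nmethod sits. This is a simplified sketch under assumptions: addresses are plain integers, and `CodeCacheBounds`, `always_reachable()` and `needs_trampoline()` are hypothetical names for the example, not HotSpot functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the code cache as an address interval.
struct CodeCacheBounds {
  uint64_t begin;
  uint64_t end;
};

// Absolute distance between two addresses.
static uint64_t distance(uint64_t a, uint64_t b) {
  return a > b ? a - b : b - a;
}

// True if `target` can be reached with a direct branch from anywhere in the
// code cache: it must be within `branch_range` of both ends of the cache.
// A target inside a cache no larger than branch_range always passes.
bool always_reachable(const CodeCacheBounds& cc, uint64_t target,
                      uint64_t branch_range) {
  return distance(cc.begin, target) <= branch_range &&
         distance(cc.end, target) <= branch_range;
}

// Note that the caller's address is not an input here, so moving an nmethod
// inside the code cache cannot change the answer for a given call target.
bool needs_trampoline(const CodeCacheBounds& cc, uint64_t target,
                      uint64_t branch_range) {
  return !always_reachable(cc, target, branch_range);
}
```

Under this model, a call that was emitted without a trampoline stays directly reachable after relocation, which is the property the reply relies on when it says the third case cannot occur.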
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551465689 From duke at openjdk.org Sat Nov 22 01:18:37 2025 From: duke at openjdk.org (Shawn M Emery) Date: Sat, 22 Nov 2025 01:18:37 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> References: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> Message-ID: On Tue, 18 Nov 2025 21:48:12 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove K from AES_Crypt The updated intrinsics changes looks good as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3565218590 From serb at openjdk.org Sat Nov 22 02:31:53 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Sat, 22 Nov 2025 02:31:53 GMT Subject: RFR: 8365071: ARM32: JFR intrinsic jvm_commit triggers C2 regalloc assert In-Reply-To: References: <6MHwDW0E9bOzpj5B3pzlNmOCRPtFtnrk55NmTTxbhLM=.f0026c26-2c80-4766-8984-da9f34a31c8d@github.com> Message-ID: On Tue, 19 Aug 2025 04:38:54 GMT, Boris Ulasevich wrote: >> On 32-bit ARM, the jvm_commit JFR intrinsic builder feeds null (RegP) into a TypeLong Phi, causing mixed long/pointer register sizing and triggering the C2 register allocator assert(_num_regs == reg || !_num_regs). >> >> The fix is trivial: use an appropriate ConL constant instead. This has no effect on 64-bit systems (the generated assembly is identical) but resolves a JFR issue on 32-bit systems. 
> > Thanks! Hi @bulasevich, do you plan to backport this patch to jdk21u-dev? It seems to be affected as well; I have encountered the same crash. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26684#issuecomment-3565352748 From sspitsyn at openjdk.org Sat Nov 22 09:00:52 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:00:52 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:52:05 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: > >> 50: // parameter is_SR: suspender or resumer >> 51: MountUnmountDisabler(bool exlusive = false); >> 52: MountUnmountDisabler(oop thread_oop); > > What does the comment mean here? This comment is stale now and must be removed. The parameter `is_SR` is being replaced with `exclusive`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552577039 From sspitsyn at openjdk.org Sat Nov 22 09:00:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:00:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request.
Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state).
But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition src/hotspot/share/runtime/javaThread.cpp line 1173: > 1171: bool JavaThread::java_suspend(bool register_vthread_SR) { > 1172: #if INCLUDE_JVMTI > 1173: // Suspending a JavaThread in VTMS transition or disabling VTMS transitions can cause deadlocks. Q: I wonder if the `#if INCLUDE_JVMTI` and `#endif` can be removed here. src/hotspot/share/runtime/mountUnmountDisabler.cpp line 126: > 124: || global_start_transition_disable_count() > base_disable_count > 125: JVMTI_ONLY(|| (JvmtiVTSuspender::is_vthread_suspended(java_lang_Thread::thread_id(vthread)) || thread->is_suspended())); > 126: } I like this approach with the JVMTIStartTransition and JVMTIEndTransition helper classes. It is a nice way to decouple the JVMTI part of the protocol. Introducing the `is_start_transition_disabled()` function was also long desired. Also, I like the functions `start_transition()` and `end_transition()` became pretty simple and clean! 
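The Dekker-style handshake described in the quoted PR summary can be sketched with standard C++ atomics. This is a simplified illustration, not the HotSpot implementation: it assumes a single `in_transition` flag and a single `disable_count` (the real code has per-vthread and global counters), and all names here are hypothetical. Each side publishes its own state before reading the other side's state, using sequentially consistent operations, so the two sides cannot both miss each other.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state for one virtual thread.
struct TransitionState {
  std::atomic<bool> in_transition{false};  // set by the transitioning thread
  std::atomic<int>  disable_count{0};      // incremented by disablers
};

// Transitioning side: publish "in transition" FIRST, then read the counter.
// Returns true if the fast path may proceed. Returns false if transitions are
// disabled; in that case the flag is cleared again and the caller is assumed
// to fall back to a slow path (for example, waiting on a lock).
bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true, std::memory_order_seq_cst);
  if (s.disable_count.load(std::memory_order_seq_cst) > 0) {
    s.in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

// Ending a transition does not need to look at the disable counter.
void end_transition(TransitionState& s) {
  s.in_transition.store(false, std::memory_order_seq_cst);
}

// Disabling side: publish the counter increment FIRST, then read the flag.
// Returns true if no transition is in flight; false means the caller must
// wait until the in-flight transition has ended.
bool disable_transitions(TransitionState& s) {
  s.disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !s.in_transition.load(std::memory_order_seq_cst);
}

void enable_transitions(TransitionState& s) {
  s.disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

With sequentially consistent stores and loads, either the transitioning thread sees a non-zero counter, or the disabler sees the flag set, or both. The alternative mentioned in the summary would weaken the transitioning side and move the cost to the disabler.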
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552502964 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552624330 From sspitsyn at openjdk.org Sat Nov 22 09:00:50 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:00:50 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:35 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add Alan's comment in VirtualThread > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147: > >> 145: MonitorLocker ml(VTMSTransition_lock); >> 146: while (is_start_transition_disabled(current, vth())) { >> 147: ml.wait(200); > > I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches. The timeouts are for reliability purposes only. Technically, they are not needed and can be removed after this code becomes stable. The `wait()` calls are inside while loop which rechecks the loop-ending conditions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552610777 From sspitsyn at openjdk.org Sat Nov 22 09:15:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:15:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. 
However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
>> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition I've completed my first pass through this update, and it looks pretty solid in general. I'm going to make another pass next week. src/hotspot/share/prims/jvm.cpp line 3682: > 3680: JVM_ENTRY(void, JVM_VirtualThreadEndTransition(JNIEnv* env, jobject vthread, jboolean is_mount)) > 3681: oop vt = JNIHandles::resolve_external_guard(vthread); > 3682: MountUnmountDisabler::end_transition(thread, vt, is_mount, false /*is_thread_start*/); The `JVM_VirtualThread*` functions have been nicely simplified. ------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3496249775 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552682664 From mdoerr at openjdk.org Sat Nov 22 11:04:53 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 22 Nov 2025 11:04:53 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: References: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> Message-ID: On Sat, 22 Nov 2025 01:15:11 GMT, Shawn M Emery wrote: > The updated intrinsics changes looks good as well, except why are lines 7456 and 8631 not changing in src/hotspot/share/opto/library_call.cpp? Thanks a lot for reviewing!
These two lines use the default `is_decrypt = false` because `inline_counterMode_AESCrypt()` and `inline_galoisCounterMode_AESCrypt()` only do encryption. We could make that explicit if you prefer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3566535235 From kvn at openjdk.org Sat Nov 22 16:57:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 16:57:50 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:43:59 GMT, Chad Rakoczy wrote: >> I also assume that trampoline's code instructions can construct far distance address. > >> My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? > > We should never run into the 3rd case. If a trampoline _may_ be needed it will be there. > > A trampoline will not be generated only if the destination is known to always be reachable. Here are some situations where this could happen: > - no far branches (code cache size <= branch range) > - runtime call is reachable from anywhere in code cache > - (code cache begin - runtime call <= branch range) && (code cache end - runtime call <= branch range) > > Whether or not a trampoline is generated is dependent on the callee destination not the caller address. So we can't have the case where a trampoline is needed for a given call but it doesn't exist. Maybe we should change the assert to a guarantee in `Relocation::pd_set_call_destination()` to make sure we catch incorrect patching in the product VM. Looking at `NativeCall::set_destination_mt_safe`, `reachable` is calculated based on the distance between the address of the call instruction and the destination, which could be different for a cloned nmethod. On x86 we have a guarantee in `NativeCall::set_destination()`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2553244181 From kvn at openjdk.org Sat Nov 22 17:04:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 17:04:53 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%).
With this patch, no crashes were observed in the same number of runs. Tobias's testing did not find any new failures. I am still concerned about patching. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28241#issuecomment-3566888592 From duke at openjdk.org Sun Nov 23 05:30:47 2025 From: duke at openjdk.org (Shawn M Emery) Date: Sun, 23 Nov 2025 05:30:47 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> References: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> Message-ID: On Tue, 18 Nov 2025 21:48:12 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove K from AES_Crypt It looks like the S390 architecture uses it for encryption and decryption in AES/CTR mode, but S390 only needs the symmetric key to derive the encryption and decryption schedules. This can be found for both in the first round. For x86, yes, encryption is only performed for both AES/CTR and AES/GCM. So, yes, I think having the 'is_decrypt' argument explicit would be ideal. nit: some comments still refer to the 'K' array in src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp, src/hotspot/share/opto/library_call.cpp, src/hotspot/cpu/riscv/stubGenerator_riscv.cpp (line 2446), and src/hotspot/cpu/ppc/stubGenerator_ppc.cpp.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3567503379 From duke at openjdk.org Mon Nov 24 15:46:20 2025 From: duke at openjdk.org (Zihao Lin) Date: Mon, 24 Nov 2025 15:46:20 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: Fix test failed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/329e290a..35ec9135 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=11-12 Stats: 21 lines in 1 file changed: 14 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From mdoerr at openjdk.org Mon Nov 24 17:00:04 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 24 Nov 2025 17:00:04 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v5] In-Reply-To: References: Message-ID: > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: - Address review comments. - Merge remote-tracking branch 'origin' into 8371820_AES_Crypt - Remove K from AES_Crypt - More minor cleanup. - Improve comment and minor cleanup. - 8371820: Further AES performance improvements for key schedule generation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28299/files - new: https://git.openjdk.org/jdk/pull/28299/files/2b981288..30b5b531 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=03-04 Stats: 65976 lines in 954 files changed: 44668 ins; 14938 del; 6370 mod Patch: https://git.openjdk.org/jdk/pull/28299.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299 PR: https://git.openjdk.org/jdk/pull/28299 From mdoerr at openjdk.org Mon Nov 24 17:00:06 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 24 Nov 2025 17:00:06 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v4] In-Reply-To: References: <0HJnSUSQA8RuwnNxu-SiGvZTzHYLJ5kY0_B6lG2EbAQ=.10868fac-1516-4a80-b4e5-9ff14997ba01@github.com> Message-ID: On Sun, 23 Nov 2025 05:27:23 GMT, Shawn M Emery wrote: > It looks like the S390 architecture uses it for encryption and decryption in AES/CTR mode, but S390 only needs the symmetric key to derive the encryption and decryption schedules. This can be found for both in the first round. For x86, yes, encryption is only performed for both AES/CTR and AES/GCM. So, yes, I think having the 'is_decrypt' argument explicit would be ideal. nit: some comments still refer to the 'K' array in src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp, src/hotspot/share/opto/library_call.cpp, src/hotspot/cpu/riscv/stubGenerator_riscv.cpp (line 2446), and src/hotspot/cpu/ppc/stubGenerator_ppc.cpp. Thanks for your feedback! 
I've made the changes with https://github.com/openjdk/jdk/pull/28299/commits/30b5b531338c225e505a30dcca3453e35b68b256. I hope I found all places. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3571788446 From duke at openjdk.org Mon Nov 24 22:03:39 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 24 Nov 2025 22:03:39 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 16:54:42 GMT, Vladimir Kozlov wrote: > May be we should change the assert to guarantee in Relocation::pd_set_call_destination() to make sure we catch incorrect patching it product VM. I'm not opposed to changing this. Is this the main concern? > Looking on `NativeCall::set_destination_mt_safe` and `reachable` is calculated based on distance between address of call instruction and destination. Which could be different for cloned nmethod. I'm not sure I understand what you're saying here. I agree the offset is most likely different after the nmethod is cloned. The offset gets fixed by `trampoline_stub_Relocation::fix_relocation_after_move` since it could be out of range. Since `CallRelocation::fix_relocation_after_move` sets the destination to whatever was passed (regardless of if it is in range or not) it does not make sense to call this on the relocated nmethod which is why we skip it. I believe `Relocation::pd_set_call_destination` for aarch64 could use `set_destination_mt_safe` instead of `set_destination` which was an alternative approach in the original PR. The original discussion is [here](https://github.com/openjdk/jdk/pull/23573#discussion_r2123618495). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2557844143 From vlivanov at openjdk.org Mon Nov 24 22:59:36 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 24 Nov 2025 22:59:36 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Fri, 21 Nov 2025 14:53:03 GMT, Coleen Phillimore wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Reformatting compile.cpp src/hotspot/share/opto/library_call.cpp line 4100: > 4098: // Other types can report the actual _super. > 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.) > 4100: if (generate_interface_guard(kls, region) != nullptr) BTW why did you decide to change the order of the checks? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2557971918 From dlong at openjdk.org Tue Nov 25 01:05:01 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 25 Nov 2025 01:05:01 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Mon, 24 Nov 2025 22:57:18 GMT, Vladimir Ivanov wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Reformatting compile.cpp > > src/hotspot/share/opto/library_call.cpp line 4100: > >> 4098: // Other types can report the actual _super. >> 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.) 
>> 4100: if (generate_interface_guard(kls, region) != nullptr) > > BTW why did you decide to change the order of the checks? I noticed that too. It is necessary for correctness now. It is incorrect and unsafe to use generate_interface_guard() on array after this change, because an array klass is not an InstanceKlass. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2558168496 From vlivanov at openjdk.org Tue Nov 25 01:28:47 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 25 Nov 2025 01:28:47 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 25 Nov 2025 01:01:17 GMT, Dean Long wrote: >> src/hotspot/share/opto/library_call.cpp line 4100: >> >>> 4098: // Other types can report the actual _super. >>> 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.) >>> 4100: if (generate_interface_guard(kls, region) != nullptr) >> >> BTW why did you decide to change the order of the checks? > > I noticed that too. It is necessary for correctness now. It is incorrect and unsafe to use generate_interface_guard() on array after this change, because an array klass is not an InstanceKlass. Oh, that's subtle... It deserves a comment at least. We could also change `LibraryCallKit::generate_interface_guard()` to require `kls` to be of type `TypeInstKlassPtr`, but then we would need a cast before calling it from `LibraryCallKit::inline_native_Class_query()`. 
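The ordering constraint discussed here can be illustrated with a small stand-alone model. This is a hedged sketch, not C2 code: the class names mirror HotSpot's `Klass` hierarchy only loosely, and `is_interface`, `classify` and `Kind` are hypothetical. It assumes, as stated in the PR, that access flags live only in the instance-klass subtype. Making the interface check take the instance subtype (the analogue of requiring `TypeInstKlassPtr`) forces callers to rule out arrays first.

```cpp
#include <cassert>

// Simplified stand-ins for the klass hierarchy (hypothetical, for illustration).
struct Klass {
  virtual ~Klass() = default;
  virtual bool is_instance_klass() const = 0;
};

struct InstanceKlass : Klass {
  explicit InstanceKlass(bool iface) : interface_flag(iface) {}
  bool is_instance_klass() const override { return true; }
  bool interface_flag;  // access flags exist only on instance klasses
};

struct ArrayKlass : Klass {
  bool is_instance_klass() const override { return false; }
};

enum class Kind { Array, Interface, Other };

// Accepting only InstanceKlass makes it impossible to ask an array klass
// for flags it does not have.
bool is_interface(const InstanceKlass& ik) {
  return ik.interface_flag;
}

// The array check must come first; only then is the downcast valid.
Kind classify(const Klass& k) {
  if (!k.is_instance_klass()) {
    return Kind::Array;
  }
  const auto& ik = static_cast<const InstanceKlass&>(k);
  return is_interface(ik) ? Kind::Interface : Kind::Other;
}
```

The cost of this design is the explicit cast at the call site, which is the trade-off mentioned above for `LibraryCallKit::inline_native_Class_query()`.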
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2558217038 From duke at openjdk.org Tue Nov 25 07:20:00 2025 From: duke at openjdk.org (Shawn M Emery) Date: Tue, 25 Nov 2025 07:20:00 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v5] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 17:00:04 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. > > Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Address review comments. > - Merge remote-tracking branch 'origin' into 8371820_AES_Crypt > - Remove K from AES_Crypt > - More minor cleanup. > - Improve comment and minor cleanup. 
> - 8371820: Further AES performance improvements for key schedule generation src/hotspot/share/opto/library_call.cpp line 7236: > 7234: address stubAddr = nullptr; > 7235: const char *stubName = nullptr; > 7236: bool is_decrypt= false; nit: s/is_decrypt= false/is_decrypt = false/ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2558797828 From mdoerr at openjdk.org Tue Nov 25 09:25:25 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 25 Nov 2025 09:25:25 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v6] In-Reply-To: References: Message-ID: <_BiA3wQ_PuxbuapWJg0uG2PSv0_0AAPOmznFOTH4hcU=.08997b37-2cde-417f-891a-779bd7291b1f@github.com> > This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. > > The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Fix missing whitespace. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28299/files - new: https://git.openjdk.org/jdk/pull/28299/files/30b5b531..ae84912d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28299&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28299.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28299/head:pull/28299 PR: https://git.openjdk.org/jdk/pull/28299 From mdoerr at openjdk.org Tue Nov 25 09:25:31 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 25 Nov 2025 09:25:31 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v5] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 07:16:08 GMT, Shawn M Emery wrote: >> Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Address review comments. >> - Merge remote-tracking branch 'origin' into 8371820_AES_Crypt >> - Remove K from AES_Crypt >> - More minor cleanup. >> - Improve comment and minor cleanup. >> - 8371820: Further AES performance improvements for key schedule generation > > src/hotspot/share/opto/library_call.cpp line 7236: > >> 7234: address stubAddr = nullptr; >> 7235: const char *stubName = nullptr; >> 7236: bool is_decrypt= false; > > nit: s/is_decrypt= false/is_decrypt = false/ Thanks for catching this! Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28299#discussion_r2559217516 From roland at openjdk.org Tue Nov 25 12:52:35 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 25 Nov 2025 12:52:35 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v4] In-Reply-To: References: Message-ID: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> > This is a variant of 8332827. In 8332827, an array access becomes > dependent on a range check `CastII` for another array access. When, > after loop opts are over, that RC `CastII` was removed, the array > access could float and an out-of-bounds access happened. With the fix > for 8332827, RC `CastII`s are no longer removed. > > With this one what happens is that some transformations applied after > loop opts are over widen the type of the RC `CastII`. As a result, the > type of the RC `CastII` is no longer narrower than that of its input, > the `CastII` is removed and the dependency is lost. > > There are 2 transformations that cause this to happen: > > - after loop opts are over, the types of the `CastII` nodes are widened > so nodes that have the same inputs but a slightly different type can > be commoned. > > - When pushing a `CastII` through an `Add`, if the types of both inputs > of the `Add` are non-constant, then we end up widening the type > (the resulting `Add` has a type that's wider than that of the > initial `CastII`). > > There are already 3 types of `Cast` nodes depending on the > optimizations that are allowed. Either the `Cast` is floating > (`depends_only_test()` returns `true`) or pinned. Either the `Cast` > can be removed if it no longer narrows the type of its input or > not. We already have variants of the `CastII`: > > - if the Cast can float and be removed when it doesn't narrow the type > of its input.
> > - if the Cast is pinned and can be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can't be removed when it doesn't narrow > the type of its input. > > What we need here, I think, is the 4th combination: > > - if the Cast can float and can't be removed when it doesn't narrow > the type of its input. > > Anyway, things are becoming confusing with all these different > variants named in ways that don't always help figure out what > constraints each of them operates under. So I refactored this and that's > the biggest part of this change. The fix consists in marking `Cast` > nodes when their type is widened in a way that prevents them from being > optimized out. > > Tobias ran performance testing with a slightly different version of > this change and there was no regression. Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: - review - review - Merge branch 'master' into JDK-8354282 - review - infinite loop in gvn fix - renaming - merge - Merge branch 'master' into JDK-8354282 - fix & test ------------- Changes: https://git.openjdk.org/jdk/pull/24575/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=03 Stats: 353 lines in 13 files changed: 252 ins; 27 del; 74 mod Patch: https://git.openjdk.org/jdk/pull/24575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575 PR: https://git.openjdk.org/jdk/pull/24575 From duke at openjdk.org Tue Nov 25 16:33:46 2025 From: duke at openjdk.org (Zihao Lin) Date: Tue, 25 Nov 2025 16:33:46 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 13:12:59 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains nine additional commits since the last revision: >> >> - fix assert >> - add more assert >> - rid of access.addr().type() >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > src/hotspot/share/opto/callnode.cpp line 1740: > >> 1738: Node* klass_node = in(AllocateNode::KlassNode); >> 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); >> 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); > > We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw I gave it a try, but it doesn't pass the test. Is it possible the original version is wrong? The class mark will not be `TypeRawPtr::BOTTOM`; it should be equal to the Klass slice index.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2560657848 From pchilanomate at openjdk.org Tue Nov 25 19:58:40 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:40 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> On Thu, 20 Nov 2025 22:17:53 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add Alan's comment in VirtualThread > > src/hotspot/share/classfile/javaClasses.cpp line 1757: > >> 1755: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); >> 1756: int val = AtomicAccess::load(addr); >> 1757: AtomicAccess::store(addr, val + 1); > > Suggestion: > > AtomicAccess::inc(addr); Same here. > src/hotspot/share/classfile/javaClasses.cpp line 1764: > >> 1762: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); >> 1763: int val = AtomicAccess::load(addr); >> 1764: AtomicAccess::store(addr, val - 1); > > Suggestion: > > AtomicAccess::dec(addr); I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. > src/hotspot/share/opto/runtime.hpp line 740: > >> 738: return vthread_transition_Type(); >> 739: } >> 740: > > I do not know C2 but this looks really strange - 4 different functions all return the same thing. ??? We need to define them because the `GEN_C2_STUB` macro will look for the type of the C function based on its name (`C2_STUB_TYPEFUNC(name)`), otherwise we get a compilation failure. The four C functions have the same type though so they all return `_vthread_transition_Type`.
> src/hotspot/share/runtime/handshake.cpp line 374: > >> 372: JavaThread* target = java_lang_Thread::thread(carrier_thread); >> 373: assert(target != nullptr, ""); >> 374: // Technically there is need for a ThreadsListHandle since the target > > Suggestion: > > // Technically there is no need for a ThreadsListHandle since the target > > ? Yes, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561198741 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561198549 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561200538 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561202212 From pchilanomate at openjdk.org Tue Nov 25 19:58:34 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:34 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v5] In-Reply-To: References: Message-ID: <8QdmTglMpQwGWG0QeQLbeduPrF1qZkah-9RzQwSOQuY=.fa030def-7674-48e3-bc25-23358009ed87@github.com> > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses the `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism.
Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we are still required to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and un...
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: David's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/10534b33..b54594c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=03-04 Stats: 46 lines in 4 files changed: 18 ins; 1 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Tue Nov 25 19:58:43 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:43 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v5] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:26:26 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> David's comments > > src/hotspot/share/prims/jvm.cpp line 3668: > >> 3666: if (!DoJVMTIVirtualThreadTransitions) { >> 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); >> 3668: return; > > Does this not still need checking somewhere? The check for `DoJVMTIVirtualThreadTransitions` was moved to the `JVMTIStartTransition`/`JVMTIEndTransition` classes, but I guess you are referring to the assert: I forgot to move it. Added now too.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561201839 From pchilanomate at openjdk.org Tue Nov 25 19:58:45 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:45 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 08:37:39 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147: >> >>> 145: MonitorLocker ml(VTMSTransition_lock); >>> 146: while (is_start_transition_disabled(current, vth())) { >>> 147: ml.wait(200); >> >> I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches. > > The timeouts are for reliability purposes only. Technically, they are not needed and can be removed after this code becomes stable. The `wait()` calls are inside while loop which rechecks the loop-ending conditions. I tried to minimize the changes with respect to the current code so I kept the timed-waits. As Serguei points out we should be able to remove this particular one. As for the ones executed by the disablers, we could make them poll for the transition bits in a loop with backoff, similar to how we do it in safepoint and handshake cases. But I agree with Serguei we should do it in a separate bug once the code is stable. 
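To illustrate why the timeout is not load-bearing in such a wait, here is a minimal sketch in portable C++ (not the HotSpot code; `TransitionGate`, `disable_count` and the method names are made up for illustration, and `std::condition_variable` stands in for `MonitorLocker`). Because the predicate is re-checked in a `while` loop after every wakeup, the timed wait only bounds how long a missed notification could go unnoticed; it does not change the outcome.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the "transitions disabled" state; not a HotSpot type.
struct TransitionGate {
  std::mutex lock;
  std::condition_variable cv;
  int disable_count = 0;  // guarded by lock

  // Blocks until no disabler is active. The loop re-checks the condition after
  // every wakeup, whether it was caused by a notify, a timeout or a spurious wakeup.
  void wait_until_enabled() {
    std::unique_lock<std::mutex> ml(lock);
    while (disable_count > 0) {
      cv.wait_for(ml, std::chrono::milliseconds(200));
    }
  }

  void disable() {
    std::lock_guard<std::mutex> guard(lock);
    ++disable_count;
  }

  void enable() {
    {
      std::lock_guard<std::mutex> guard(lock);
      assert(disable_count > 0);
      --disable_count;
    }
    cv.notify_all();  // wake waiters so they can re-check the condition
  }
};
```

With this shape, replacing `wait_for` by an untimed `cv.wait(ml)` would behave the same as long as every `enable()` notifies.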
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561204423 From pchilanomate at openjdk.org Tue Nov 25 19:58:47 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:47 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:01:48 GMT, David Holmes wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: >> >>> 160: // be executed once we go back to Java. If this is an unmount, the handshake that the >>> 161: // disabler executed against this carrier thread already provided the needed synchronization. >>> 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). >> >> Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? > > I also forgot to suggest a wording change: say "pairs with" rather than "matches". Reading back through I realize now I have misunderstood many of these comments. Changed to `pairs with`. I rewrote the comments so hopefully they are more clear now. I also added a comment in `VirtualThread.mount/unmount` where the memory barriers should be. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561205763 From pchilanomate at openjdk.org Tue Nov 25 19:58:53 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:53 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:42:32 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277: > >> 275: >> 276: // Start of the critical region.
Prevent future memory >> 277: // operations to be ordered before we read the transition flag. > > Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so perhaps that should internally perform the `load_acquire`? Yes, but that would also call for doing the same with `JavaThread::_is_in_VTMS_transition` for the `VTMS_transition_disable_for_all` case, and also have the pairing release stores by the virtual thread in `end_transition` on those same addresses, otherwise it would be confusing. And same with the other side, i.e doing load_acquire by the virtual thread of `_VTMS_transition_disable_count` and `_global_start_transition_disable_count` on `start_transition` and release store by the disabler when enabling transitions again. But I wanted to avoid unnecessary barriers in the virtual thread transition side, so I kept them as plain load/stores with separate memory barriers when necessary. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: > >> 305: // Block while some mount/unmount transitions are in progress. >> 306: // Debug version fails and prints diagnostic information. >> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { > > This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316: > >> 314: // operations to be ordered before we read the transition flags. >> 315: // This matches the release fence in end_transition(). >> 316: OrderAccess::acquire(); > > Surely the use of the iterator already provides the necessary ordering guarantee here as well. ? We still need it because we need to prevent reordering of loads from the critical section with loads of `jt->is_in_VTMS_transition()`. 
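As a side note on the loop shape questioned above (a declaration in the condition and no update expression), here is a small self-contained sketch of the idiom in plain C++. `Worker`, `WorkerIterator` and `count_workers` are hypothetical names used only for illustration; they are not `JavaThreadIteratorWithHandle` or any HotSpot API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Worker { int id; };  // hypothetical element type

// Hypothetical iterator whose next() returns nullptr once the list is exhausted.
class WorkerIterator {
  const std::vector<Worker*>& _list;
  std::size_t _index = 0;
 public:
  explicit WorkerIterator(const std::vector<Worker*>& list) : _list(list) {}
  Worker* next() { return _index < _list.size() ? _list[_index++] : nullptr; }
};

int count_workers(const std::vector<Worker*>& list) {
  int n = 0;
  // The condition both declares w and tests it: a null pointer converts to
  // false and ends the loop. The advance happens inside next(), so the
  // update expression is left empty.
  for (WorkerIterator it(list); Worker* w = it.next(); ) {
    assert(w != nullptr);
    ++n;
  }
  return n;
}
```

The form keeps the element pointer scoped to the loop body, at the cost of looking unusual to readers who expect the advance step in the third clause.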
> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327: > >> 325: // End of the critical section. Prevent previous memory operations to >> 326: // be ordered after we clear the clear the disable transition flag. >> 327: // This matches the equivalent acquire fence in start_transition(). > > Suggestion: > > // This pairs with the acquire in start_transition(). > > I just realized you are using "fence" to describe release and acquire memory barrier semantics. Given we have an operation `fence` I find this confusing for the reader - especially when we also have a `release_store_fence` operation which might be confused with "release fence". Right, I changed it now to use the terms acquire and release barrier respectively. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370: > >> 368: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); >> 369: assert(_global_start_transition_disable_count >= 0, ""); >> 370: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count + 1); > > Suggestion: > > AtomicAccess::inc(&_global_start_transition_disable_count); I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376: > >> 374: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); >> 375: assert(_global_start_transition_disable_count > 0, ""); >> 376: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count - 1); > > Suggestion: > > AtomicAccess::dec(&_global_start_transition_disable_count); Same here.
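A small sketch of the trade-off being discussed, written with standard C++ atomics rather than HotSpot's `AtomicAccess`. The names `transition_lock` and `global_disable_count` are stand-ins, and the assumption (based on the lock assert in the quoted hunk) is that all writers are serialized by a lock. Under that assumption a separate load and store cannot lose an update, so the stronger read-modify-write is optional.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

std::mutex transition_lock;                // stand-in for the lock the writers hold
std::atomic<int> global_disable_count{0};  // may be read by other threads without the lock

// Writers are serialized by the lock, so no other writer can slip in between
// the load and the store. The atomic type is only there so that lock-free
// readers see a well-defined value.
void inc_disable_count() {
  std::lock_guard<std::mutex> guard(transition_lock);
  int val = global_disable_count.load(std::memory_order_relaxed);
  global_disable_count.store(val + 1, std::memory_order_relaxed);
}

void dec_disable_count() {
  std::lock_guard<std::mutex> guard(transition_lock);
  int val = global_disable_count.load(std::memory_order_relaxed);
  assert(val > 0);
  global_disable_count.store(val - 1, std::memory_order_relaxed);
}

// The alternative: a single atomic read-modify-write. It is also correct
// without the lock, but it is a stronger (sequentially consistent by default)
// operation than the plain store above.
void inc_disable_count_rmw() {
  global_disable_count.fetch_add(1);
}
```

If writers were not serialized, the load/store version could lose increments and the read-modify-write form would be the one to use.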
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561208616 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561210899 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561216984 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561219344 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561219842 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561220292 From pchilanomate at openjdk.org Tue Nov 25 19:59:38 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:59:38 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:23:17 GMT, David Holmes wrote: > > we follow the classic Dekker pattern for the required synchronization. > > My understanding is that Dekker requires a "full fence" between the accesses, not just ordering memory barriers. The two variables involved must be published to all readers for the algorithm to work. > No need to argue too much about this one because `StoreLoad` is implemented as a full fence so we can easily change it, but from reading the definitions in `OrderAccess` my understanding was that technically it should be enough. The `StoreStore` comment clarifies the meaning of word `completes` (used later in `StoreLoad`) as `the effect on memory of Store1 is made visible to other processors`. 
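For readers following the Dekker discussion, here is a minimal sketch of the store-then-load handshake using standard C++ atomics. It is an illustration only, not the patch: `in_transition`, `disable_count` and the function names are hypothetical, and a sequentially consistent fence is used as the StoreLoad barrier. Each side publishes its own flag, executes a full fence, and only then reads the other side's flag, so at least one of the two must observe the other.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the two sides' state; not the HotSpot fields.
std::atomic<bool> in_transition{false};  // written by the virtual thread
std::atomic<int>  disable_count{0};      // written by the disabler

// Virtual thread side: publish "in transition", then read the counter.
// Returns true if the fast path may proceed.
bool try_start_transition() {
  in_transition.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // order the store before the load
  if (disable_count.load(std::memory_order_relaxed) > 0) {
    // A disabler is active: retract the flag; a real implementation would
    // now take a slow path and wait.
    in_transition.store(false, std::memory_order_relaxed);
    return false;
  }
  return true;
}

void end_transition() {
  in_transition.store(false, std::memory_order_release);
}

// Disabler side: publish the counter, then read the flag.
// Returns true if no transition was observed in progress.
bool try_disable_transitions() {
  disable_count.fetch_add(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // order the store before the load
  return !in_transition.load(std::memory_order_relaxed);
}

void enable_transitions() {
  int prev = disable_count.fetch_sub(1, std::memory_order_release);
  assert(prev > 0);
  (void)prev;
}
```

In this sketch a disabler that observes a transition in progress simply gets `false` back; waiting for the transition to finish is left out to keep the ordering argument visible.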
------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3577347289 From pchilanomate at openjdk.org Tue Nov 25 20:10:56 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:10:56 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v6] In-Reply-To: References: Message-ID:
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Remove INCLUDE_JVMTI macro ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/b54594c4..4c598ad4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Tue Nov 25 20:11:00 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:11:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> References:
<6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> Message-ID: On Sat, 22 Nov 2025 07:52:40 GMT, Serguei Spitsyn wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/javaThread.cpp line 1173: > >> 1171: bool JavaThread::java_suspend(bool register_vthread_SR) { >> 1172: #if INCLUDE_JVMTI >> 1173: // Suspending a JavaThread in VTMS transition or disabling VTMS transitions can cause deadlocks. > > Q: I wonder if the `#if INCLUDE_JVMTI` and `#endif` can be removed here. Yes, removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561254972 From pchilanomate at openjdk.org Tue Nov 25 20:11:03 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:11:03 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 08:22:34 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: >> >>> 50: // parameter is_SR: suspender or resumer >>> 51: MountUnmountDisabler(bool exlusive = false); >>> 52: MountUnmountDisabler(oop thread_oop); >> >> What does the comment mean here? > > This comment is stale now and must be removed. The parameter `is_SR` is being replaced with the `exclusive`. Right, removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561250374 From pchilanomate at openjdk.org Tue Nov 25 22:59:59 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 22:59:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: Message-ID:
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: keep preexisting rebind order for mount ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/4c598ad4..dee2b843 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=05-06 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From coleenp at openjdk.org Wed Nov 26 00:01:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: Message-ID: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com>
On Tue, 25 Nov 2025 22:59:59 GMT, Patricio Chilano Mateo wrote: > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > keep preexisting rebind order for mount First read through, mostly questions and plea for comments. This is a nice refactoring and cleanup of some very difficult code. You don't have to do the renaming that I requested now if you don't want to. src/hotspot/share/classfile/javaClasses.cpp line 1688: > 1686: int java_lang_Thread::_jvmti_thread_state_offset; > 1687: int java_lang_Thread::_VTMS_transition_disable_count_offset; > 1688: int java_lang_Thread::_is_in_VTMS_transition_offset; Since you're renaming these anyway, can we drop the VTMS part? Just call it vthread_transition_disable_count_offset and is_in_vthread_transition_offset? There are other VTMS named things that aren't these flags but they can stay. Maybe migrate other names at some future point. src/hotspot/share/opto/library_call.cpp line 3046: > 3044: } > 3045: > 3046: bool LibraryCallKit::inline_native_vthread_start_transition(address funcAddr, const char* funcName, bool is_final_transition) { Would it be helpful to add a comment above this to say what this does? This is supposed to match some non-intrinsic code and might be helpful if you referenced that here.
src/hotspot/share/prims/jvm.cpp line 3671: > 3669: > 3670: JVM_ENTRY(void, JVM_VirtualThreadStartFinalTransition(JNIEnv* env, jobject vthread)) > 3671: oop vt = JNIHandles::resolve_external_guard(vthread); Why do the opto runtime versions set is_in_VTMS_transition in both the java.lang.Thread and JavaThread and these don't? src/hotspot/share/prims/jvmtiEnv.cpp line 1827: > 1825: JvmtiEnv::ClearAllFramePops(jthread thread) { > 1826: ResourceMark rm; > 1827: MountUnmountDisabler disabler(thread); Not for this change but I thought JVMTI had some xml code that generated prefixes for these functions. This seems like something that could be unified somewhere tbd. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1772: > 1770: > 1771: assert(java_thread != nullptr, "sanity check"); > 1772: assert(!java_thread->is_in_VTMS_transition(), "sanity check"); Why don't you need these asserts anymore? src/hotspot/share/runtime/javaThread.cpp line 1152: > 1150: bool JavaThread::is_in_VTMS_transition() const { > 1151: return AtomicAccess::load(&_is_in_VTMS_transition); > 1152: } Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? src/hotspot/share/runtime/mountUnmountDisabler.hpp line 34: > 32: > 33: class MountUnmountDisabler : public AnyObj { > 34: static volatile int _global_start_transition_disable_count; Can you describe this variable - when is it set and why is there a global disabler? What does it mean to have 'n' active disablers? It would help to have a comment at the beginning of MountUnmountDisabler saying something to the effect that during virtual thread mounting and unmounting, JVMTI and operations that need to examine thread state need to be disabled. Or is it the converse? During JVMTI and operations that examine the state of threads, virtual thread mounting and unmounting must wait until these operations are complete. This class is for the latter, right?
src/hotspot/share/runtime/mutexLocker.cpp line 52: > 50: Mutex* JvmtiThreadState_lock = nullptr; > 51: Monitor* EscapeBarrier_lock = nullptr; > 52: Monitor* VTMSTransition_lock = nullptr; oh you could drop the name VTMS and call it VThreadTransitionLock can't you? ------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3507302896 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561864174 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561876549 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561897865 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561904709 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561910057 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561926510 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561943253 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561945493 From coleenp at openjdk.org Wed Nov 26 00:01:01 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:01 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:32:40 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> keep preexisting rebind order for mount > > src/hotspot/share/runtime/javaThread.cpp line 1152: > >> 1150: bool JavaThread::is_in_VTMS_transition() const { >> 1151: return AtomicAccess::load(&_is_in_VTMS_transition); >> 1152: } > > Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? 
Why is there the same flag with the same name in both the Java class and C++ JavaThread? Might be an efficient cache, so something should say that (if true). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561962218 From coleenp at openjdk.org Wed Nov 26 00:01:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:03 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> References: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> Message-ID: On Sat, 22 Nov 2025 08:43:07 GMT, Serguei Spitsyn wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 126: > >> 124: || global_start_transition_disable_count() > base_disable_count >> 125: JVMTI_ONLY(|| (JvmtiVTSuspender::is_vthread_suspended(java_lang_Thread::thread_id(vthread)) || thread->is_suspended())); >> 126: } > > I like this approach with the JVMTIStartTransition and JVMTIEndTransition helper classes. It is a nice way to decouple the JVMTI part of the protocol. Introducing the `is_start_transition_disabled()` function was also long desired. Also, I like the functions `start_transition()` and `end_transition()` became pretty simple and clean! This is the function that needs a comment why you're testing all these things (and why base_disable_count is one for JVMTI). It's nice as a function that tests all the different values. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561977191 From coleenp at openjdk.org Wed Nov 26 00:01:06 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:06 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> On Tue, 25 Nov 2025 19:50:06 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: >> >>> 305: // Block while some mount/unmount transitions are in progress. >>> 306: // Debug version fails and prints diagnostic information. >>> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { >> >> This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. > > Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) This could be a simple cleanup of all these occurrences later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561991623 From duke at openjdk.org Wed Nov 26 04:44:50 2025 From: duke at openjdk.org (Shawn M Emery) Date: Wed, 26 Nov 2025 04:44:50 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v5] In-Reply-To: References: Message-ID: <_q96s_UbnHbgnVHqppMwnZ7J-_WEslZk7J3E0GQVbW0=.e4d9cfd3-5196-4aa3-9509-e2c309a33740@github.com> On Mon, 24 Nov 2025 17:00:04 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. >> >> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64. 
>
> Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
>  - Address review comments.
>  - Merge remote-tracking branch 'origin' into 8371820_AES_Crypt
>  - Remove K from AES_Crypt
>  - More minor cleanup.
>  - Improve comment and minor cleanup.
>  - 8371820: Further AES performance improvements for key schedule generation

The internal testing came back clean, and AESReinit benchmarks were run on aarch64, where an 8.9% performance gain was observed with the complete set of changes!

-------------

Marked as reviewed by smemery at github.com (no known OpenJDK username).

PR Review: https://git.openjdk.org/jdk/pull/28299#pullrequestreview-3508635701

From dholmes at openjdk.org  Wed Nov 26 07:32:52 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 26 Nov 2025 07:32:52 GMT
Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3]
In-Reply-To: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com>
References: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com>
Message-ID:

On Tue, 25 Nov 2025 19:45:08 GMT, Patricio Chilano Mateo wrote:

>> src/hotspot/share/classfile/javaClasses.cpp line 1764:
>>
>>> 1762:   jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset);
>>> 1763:   int val = AtomicAccess::load(addr);
>>> 1764:   AtomicAccess::store(addr, val - 1);
>>
>> Suggestion:
>>
>>     AtomicAccess::dec(addr);
>
> I'd prefer to leave it as a plain store to avoid the unnecessary extra fence.

But it isn't then an atomic update.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2563699480 From dholmes at openjdk.org Wed Nov 26 07:35:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 07:35:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> References: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> Message-ID: On Tue, 25 Nov 2025 23:53:40 GMT, Coleen Phillimore wrote: >> Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) > > This could be a simple cleanup of all these occurrences later. Yes this is terribly obscure (doing the assignment in the loop condition check - surprised that is even allowed) and also violates the style-guide in relation to implicit booleans. But frankly it is an awful use of a for-loop in my opinion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2563714065 From duke at openjdk.org Wed Nov 26 08:36:09 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 26 Nov 2025 08:36:09 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 13:20:26 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains nine additional commits since the last revision: >> >> - fix assert >> - add more assert >> - rid of access.addr().type() >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > Can we remove `C2AccessValuePtr` entirely and use: > > Node* _addr; > > where, currently, there's: > > C2AccessValuePtr& _addr; > > ? Hi @rwestrel , I removed C2AccessValuePtr, Could you please take a look, thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24258#issuecomment-3580115736 From duke at openjdk.org Wed Nov 26 08:36:12 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 26 Nov 2025 08:36:12 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 16:30:09 GMT, Zihao Lin wrote: >> src/hotspot/share/opto/callnode.cpp line 1740: >> >>> 1738: Node* klass_node = in(AllocateNode::KlassNode); >>> 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); >>> 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); >> >> We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw > > Hi, I give it a try, but it failed pass the test. Is it possible the original version is wrong? > The mark word will not be `TypeRawPtr::BOTTOM`, it should equal to Klass slice index. 
One dump is ` 1368 AddP === _ 196 196 1367 [[ ]] Klass:precise java/util/LinkedHashMap$Entry: 0x0000000918349ca0 (java/util/Map$Entry):Constant:exact+168 *` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2563948581 From roland at openjdk.org Wed Nov 26 08:38:05 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 26 Nov 2025 08:38:05 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 08:30:18 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: >> >> - review >> - infinite loop in gvn fix >> - renaming > > @rwestrel Sorry I dropped the review on this one for a long time :/ > > I left quite a few comments. But on the whole I'm really happy with the direction you are taking. It's getting much clearer. I would still see some more clear explanations/comments. That way, we can make our previously implicit assumptions even more explicit :) @eme64 updated change should address your comments ------------- PR Comment: https://git.openjdk.org/jdk/pull/24575#issuecomment-3580124357 From mdoerr at openjdk.org Wed Nov 26 09:25:51 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 26 Nov 2025 09:25:51 GMT Subject: RFR: 8371820: Further AES performance improvements for key schedule generation [v6] In-Reply-To: <_BiA3wQ_PuxbuapWJg0uG2PSv0_0AAPOmznFOTH4hcU=.08997b37-2cde-417f-891a-779bd7291b1f@github.com> References: <_BiA3wQ_PuxbuapWJg0uG2PSv0_0AAPOmznFOTH4hcU=.08997b37-2cde-417f-891a-779bd7291b1f@github.com> Message-ID: On Tue, 25 Nov 2025 09:25:25 GMT, Martin Doerr wrote: >> This fix simplifies the hotspot intrinsics for some platforms and optimizes the key computation for encryption. We can save the `genInvRoundKeys` computation when we only do encryption. 
>>
>> The micro:org.openjdk.bench.javax.crypto.AESReinit benchmark results are improved by 17% for ppc64 and 26% for x86_64.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix missing whitespace.

Thanks a lot for benchmarking and for your review!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28299#issuecomment-3580390542

From jbhateja at openjdk.org  Wed Nov 26 11:34:11 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Wed, 26 Nov 2025 11:34:11 GMT
Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v5]
In-Reply-To:
References:
Message-ID:

> Add a new Float16Vector type and corresponding concrete vector classes, in addition to existing primitive vector types, maintaining operation parity with the FloatVector type.
> - Add necessary inline expander support.
> - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA.
> - Use existing Float16 vector IR and backend support.
> - Extend the existing VectorAPI JTREG test suite for the newly added Float16Vector operations.
>
> The idea here is to first be on par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc).
>
> The following are the performance numbers for some of the selected Float16Vector benchmarking kernels compared to equivalent auto-vectorized Float16OperationsBenchmark kernels.
>
> [image]
>
> Initial RFP[1] was floated on the panama-dev mailing list.
>
> Kindly review the draft PR and share your feedback.
> > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Cleanups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28002/files - new: https://git.openjdk.org/jdk/pull/28002/files/aca6cc5d..756a0d0c Webrevs: - full: Webrev is not available because diff is too large - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28002&range=03-04 Stats: 26 lines in 9 files changed: 5 ins; 7 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/28002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002 PR: https://git.openjdk.org/jdk/pull/28002 From chagedorn at openjdk.org Wed Nov 26 14:31:58 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 26 Nov 2025 14:31:58 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v4] In-Reply-To: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> References: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> Message-ID: <4QQp7C7iIVfVs1MoUMC56KCgVGpXu5ziTHfZ-f2pk6o=.4ca7e1a8-3f31-44d3-aaec-30429ed7e2b0@github.com> On Tue, 25 Nov 2025 12:52:35 GMT, Roland Westrelin wrote: >> This is a variant of 8332827. In 8332827, an array access becomes >> dependent on a range check `CastII` for another array access. When, >> after loop opts are over, that RC `CastII` was removed, the array >> access could float and an out of bound access happened. With the fix >> for 8332827, RC `CastII`s are no longer removed. >> >> With this one what happens is that some transformations applied after >> loop opts are over widen the type of the RC `CastII`. As a result, the >> type of the RC `CastII` is no longer narrower than that of its input, >> the `CastII` is removed and the dependency is lost. 
>> >> There are 2 transformations that cause this to happen: >> >> - after loop opts are over, the type of the `CastII` nodes are widen >> so nodes that have the same inputs but a slightly different type can >> common. >> >> - When pushing a `CastII` through an `Add`, if of the type both inputs >> of the `Add`s are non constant, then we end up widening the type >> (the resulting `Add` has a type that's wider than that of the >> initial `CastII`). >> >> There are already 3 types of `Cast` nodes depending on the >> optimizations that are allowed. Either the `Cast` is floating >> (`depends_only_test()` returns `true`) or pinned. Either the `Cast` >> can be removed if it no longer narrows the type of its input or >> not. We already have variants of the `CastII`: >> >> - if the Cast can float and be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can't be removed when it doesn't narrow >> the type of its input. >> >> What we need here, I think, is the 4th combination: >> >> - if the Cast can float and can't be removed when it doesn't narrow >> the type of its input. >> >> Anyway, things are becoming confusing with all these different >> variants named in ways that don't always help figure out what >> constraints one of them operate under. So I refactored this and that's >> the biggest part of this change. The fix consists in marking `Cast` >> nodes when their type is widen in a way that prevents them from being >> optimized out. >> >> Tobias ran performance testing with a slightly different version of >> this change and there was no regression. > > Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains nine commits: > > - review > - review > - Merge branch 'master' into JDK-8354282 > - review > - infinite loop in gvn fix > - renaming > - merge > - Merge branch 'master' into JDK-8354282 > - fix & test Introducing a 4th dependency type looks reasonable. It's also nice to see one more refactoring in that area which makes it very expressive now. Thanks for doing that! I left some suggestions to possibly further improve the code. src/hotspot/share/opto/castnode.cpp line 40: > 38: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::FloatingNonNarrowing(true, false, "floating non narrowing dependency"); // not pinned, doesn't narrow type > 39: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNarrowing(false, true, "now floating narrowing dependency"); // pinned, narrows type > 40: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNonNarrowing(false, false, "non floating non narrowing dependency"); // pinned, doesn't narrow type Adding `-`: Suggestion: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::FloatingNonNarrowing(true, false, "floating non-narrowing dependency"); // not pinned, doesn't narrow type const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNarrowing(false, true, "non-floating narrowing dependency"); // pinned, narrows type const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNonNarrowing(false, false, "non-floating non-narrowing dependency"); // pinned, doesn't narrow type src/hotspot/share/opto/castnode.cpp line 50: > 48: if (!_dependency.narrows_type()) { > 49: return this; > 50: } I suggest to split the comment to make it more clear: Suggestion: if (!_dependency.narrows_type()) { // If this cast doesn't carry a type dependency (i.e. not used for type narrowing), we cannot optimize it. 
      return this;
    }
    // This cast node carries a type dependency. We can remove it if:
    // - Its input has a narrower type
    // - There's a dominating cast with same input but narrower type

src/hotspot/share/opto/castnode.cpp line 634:

> 632:   if (wide_t != bottom_t) {
> 633:     // Widening the type of the Cast (to allow some commoning) causes the Cast to change how it can be optimized (if
> 634:     // type of its input is narrower than the Cast's type, we can't remove it to not loose the dependency).

Suggestion:

    // type of its input is narrower than the Cast's type, we can't remove it to not loose the control dependency).

src/hotspot/share/opto/castnode.hpp line 101:

> 99:       }
> 100:       return NonFloatingNonNarrowing;
> 101:     }

Just a side note: We seem to mix the terms "(non-)pinned" with "(non-)floating" freely. Should we stick to just one? But maybe it's justified to use both depending on the situation/code context.

src/hotspot/share/opto/castnode.hpp line 120:

> 118:     // be removed in any case otherwise the sunk node floats back into the loop.
> 119:     static const DependencyType NonFloatingNonNarrowing;
> 120:

I needed a moment to completely understand all these combinations. I rewrote the definitions in this process a little bit. Feel free to take some of it over:

    // All the possible combinations of floating/narrowing with example use cases:

    // Use case example: Range Check CastII
    // Floating: The Cast is only dependent on the single range check.
    // Narrowing: The Cast narrows the type to a positive index. If the input to the Cast is narrower, we can safely
    //            remove the cast because the array access will be safe.
    static const DependencyType FloatingNarrowing;

    // Use case example: Widening Cast nodes' types after loop opts: We want to common Casts with slightly different types.
    // Floating: These Casts only depend on the single control.
    // NonNarrowing: Even when the input type is narrower, we are not removing the Cast. Otherwise, the dependency
    //               to the single control is lost, and an array access could float above its range check because we
    //               just removed the dependency to the range check by removing the Cast. This could lead to an
    //               out-of-bounds access.
    static const DependencyType FloatingNonNarrowing;

    // Use case example: An array access that is no longer dependent on a single range check (e.g. range check smearing).
    // NonFloating: The array access must be pinned below all the checks it depends on. If the check it directly depends
    //              on with a control input is hoisted, we do hoist the Cast as well. If we allowed the Cast to float,
    //              we risk that the array access ends up above another check it depends on (we cannot model two control
    //              dependencies for a node in the IR). This could lead to an out-of-bounds access.
    // Narrowing: If the Cast does not narrow the input type, then it's safe to remove the cast because the array access
    //            will be safe.
    static const DependencyType NonFloatingNarrowing;

    // Use case example: Sinking nodes out of a loop
    // Non-Floating & Non-Narrowing: We don't want the Cast that forces the node to be out of loop to be removed in any
    //                               case. Otherwise, the sunk node could float back into the loop, undoing the sinking.
    //                               This Cast is only used for pinning without caring about narrowing types.
    static const DependencyType NonFloatingNonNarrowing;

test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java line 100:

> 98:     @Run(test = "test3")
> 99:     public static void test3_runner() {
> 100:         i = RANDOM.nextInt(3, length-1);

Suggestion:

    i = RANDOM.nextInt(3, length - 1);

-------------

PR Review: https://git.openjdk.org/jdk/pull/24575#pullrequestreview-3510584501
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565071692
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565111822
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565208320
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565130012
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565000528
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565211189

From roland at openjdk.org  Wed Nov 26 16:14:43 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Wed, 26 Nov 2025 16:14:43 GMT
Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v5]
In-Reply-To:
References:
Message-ID:

> This is a variant of 8332827. In 8332827, an array access becomes
> dependent on a range check `CastII` for another array access. When,
> after loop opts are over, that RC `CastII` was removed, the array
> access could float and an out of bound access happened. With the fix
> for 8332827, RC `CastII`s are no longer removed.
>
> With this one what happens is that some transformations applied after
> loop opts are over widen the type of the RC `CastII`. As a result, the
> type of the RC `CastII` is no longer narrower than that of its input,
> the `CastII` is removed and the dependency is lost.
>
> There are 2 transformations that cause this to happen:
>
> - after loop opts are over, the types of the `CastII` nodes are widened
>   so nodes that have the same inputs but a slightly different type can
>   common.
> > - When pushing a `CastII` through an `Add`, if of the type both inputs > of the `Add`s are non constant, then we end up widening the type > (the resulting `Add` has a type that's wider than that of the > initial `CastII`). > > There are already 3 types of `Cast` nodes depending on the > optimizations that are allowed. Either the `Cast` is floating > (`depends_only_test()` returns `true`) or pinned. Either the `Cast` > can be removed if it no longer narrows the type of its input or > not. We already have variants of the `CastII`: > > - if the Cast can float and be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can't be removed when it doesn't narrow > the type of its input. > > What we need here, I think, is the 4th combination: > > - if the Cast can float and can't be removed when it doesn't narrow > the type of its input. > > Anyway, things are becoming confusing with all these different > variants named in ways that don't always help figure out what > constraints one of them operate under. So I refactored this and that's > the biggest part of this change. The fix consists in marking `Cast` > nodes when their type is widen in a way that prevents them from being > optimized out. > > Tobias ran performance testing with a slightly different version of > this change and there was no regression. 
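The four combinations listed in the description above can be sketched as two independent booleans. This is only a minimal illustration: `CastDependency`, `can_remove` and `after_widening` are hypothetical names made up for this sketch and are not the `ConstraintCastNode::DependencyType` code in the PR. It assumes that "narrowing" means "may be removed once the input's type is at least as narrow", and that widening a cast's type keeps its floating property while making it non-removable.

```cpp
#include <cassert>

// Hypothetical, simplified model of the two independent properties discussed
// above. It is NOT HotSpot's ConstraintCastNode::DependencyType.
struct CastDependency {
  bool floating;   // true: may float (depends only on its test); false: pinned
  bool narrowing;  // true: removable once it no longer narrows its input's type

  // A cast can be dropped only if it is a narrowing cast and its input's type
  // is already at least as narrow as the cast's own type.
  bool can_remove(bool input_at_least_as_narrow) const {
    return narrowing && input_at_least_as_narrow;
  }

  // Assumed helper: after the cast's type has been widened, keep the floating
  // property but prevent the cast from being optimized out.
  CastDependency after_widening() const {
    return CastDependency{floating, false};
  }
};

// The four combinations named in the text.
constexpr CastDependency FloatingNarrowing{true, true};
constexpr CastDependency FloatingNonNarrowing{true, false};
constexpr CastDependency NonFloatingNarrowing{false, true};
constexpr CastDependency NonFloatingNonNarrowing{false, false};

// Walks through the range check CastII scenario from the description:
// removable before widening, floating but no longer removable afterwards.
inline bool widened_range_check_cast_is_kept() {
  CastDependency rc = FloatingNarrowing;
  assert(rc.can_remove(true));
  CastDependency widened = rc.after_widening();
  return widened.floating && !widened.can_remove(true);
}
```

In this model the missing fourth combination is simply `FloatingNonNarrowing`, which is what a floating range check cast becomes once its type has been widened.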
Roland Westrelin has updated the pull request incrementally with four additional commits since the last revision:

 - Update src/hotspot/share/opto/castnode.cpp

   Co-authored-by: Christian Hagedorn
 - Update src/hotspot/share/opto/castnode.cpp

   Co-authored-by: Christian Hagedorn
 - Update src/hotspot/share/opto/castnode.cpp

   Co-authored-by: Christian Hagedorn
 - Update test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java

   Co-authored-by: Christian Hagedorn

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24575/files
  - new: https://git.openjdk.org/jdk/pull/24575/files/3569280e..2aa918e2

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=03-04

  Stats: 13 lines in 2 files changed: 5 ins; 3 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/24575.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575

PR: https://git.openjdk.org/jdk/pull/24575

From pchilanomate at openjdk.org  Wed Nov 26 22:34:37 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Wed, 26 Nov 2025 22:34:37 GMT
Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v8]
In-Reply-To:
References:
Message-ID:

> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
>
> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
>
> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
>   An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
>
> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
>
> - The code was previously structured in terms of mount and un...
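The Dekker-style ordering described in the first bullet above (each side publishes its own flag before reading the other side's) can be sketched with `std::atomic`. This is a minimal sketch under stated assumptions, not the code in the PR: `TransitionState` and all function names are hypothetical, a single flag and a single counter stand in for the carrier/vthread bits and the per-vthread/global counters, and backing out of the fast path by clearing the flag is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the "in transition" bit and the "transition
// disabled" counter. Names are illustrative only, not HotSpot fields.
struct TransitionState {
  std::atomic<bool> in_transition{false};
  std::atomic<int>  disable_count{0};
};

// Virtual-thread side: publish "in transition" first, then read the counter.
// Returns true if the fast path may proceed. Returning false means the caller
// would have to take a slow path (assumed here: back out and wait).
inline bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true, std::memory_order_seq_cst);
  if (s.disable_count.load(std::memory_order_seq_cst) > 0) {
    s.in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

inline void end_transition(TransitionState& s) {
  s.in_transition.store(false, std::memory_order_seq_cst);
}

// Disabler side: publish the disable request first, then read the bit.
// Returns true if no transition was in progress when the bit was read.
// A real disabler would wait until the transition finishes otherwise.
inline bool disable_transitions(TransitionState& s) {
  s.disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !s.in_transition.load(std::memory_order_seq_cst);
}

inline void enable_transitions(TransitionState& s) {
  int prev = s.disable_count.fetch_sub(1, std::memory_order_seq_cst);
  assert(prev > 0);
  (void)prev;
}
```

Because both sides do a sequentially consistent store followed by a sequentially consistent load, at least one of them observes the other's write: it cannot happen that the virtual thread sees a zero counter while the disabler also sees a clear "in transition" flag.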
Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: - More changes from Coleen's review - Drop VTMS from names ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/dee2b843..623bc518 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=06-07 Stats: 203 lines in 16 files changed: 41 ins; 0 del; 162 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 26 22:44:17 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:44:17 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v9] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don?t honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don?t require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. 
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: missing to initialize _is_disabler_at_start ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/623bc518..7aa02a46 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=07-08 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 26 22:44:18 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:44:18 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> Message-ID: On Wed, 26 Nov 2025 07:29:37 GMT, David Holmes wrote: >> I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. > > But it isn't then an atomic update. Only the disablers write to this counter while holding `VThreadTransition_lock` (verified in the assert above). But we still need to use `AtomicAccess` for the store because it can be concurrently read by the virtual thread. 
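The point about writers being serialized by a lock while readers stay lock-free can be illustrated with a small sketch in standard C++. It is not the HotSpot code: `DisableCounter` and its members are hypothetical, `std::mutex` stands in for `VThreadTransition_lock`, and relaxed `std::atomic` accesses stand in for `AtomicAccess` loads and stores. Because only lock holders write, a separate load followed by a store cannot lose an update, so no atomic read-modify-write (and no fence implied by one) is needed; the store still has to be atomic so that the unlocked reader does not race with it. The sketch deliberately ignores the ordering requirements discussed elsewhere in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical counter: written only under a lock, read without one.
class DisableCounter {
    std::mutex _lock;            // stand-in for VThreadTransition_lock
    std::atomic<int> _count{0};  // read concurrently by the virtual thread

public:
    void increment() {
        std::lock_guard<std::mutex> guard(_lock);
        // The lock serializes writers, so load + store cannot lose updates.
        int cur = _count.load(std::memory_order_relaxed);
        _count.store(cur + 1, std::memory_order_relaxed);
    }

    void decrement() {
        std::lock_guard<std::mutex> guard(_lock);
        int cur = _count.load(std::memory_order_relaxed);
        assert(cur > 0);  // must pair with an earlier increment
        _count.store(cur - 1, std::memory_order_relaxed);
    }

    // Lock-free reader; the atomic load avoids a data race with the writers.
    bool is_disabled() const {
        return _count.load(std::memory_order_relaxed) > 0;
    }
};
```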
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566654448 From pchilanomate at openjdk.org Wed Nov 26 22:51:55 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:55 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:10:43 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> keep preexisting rebind order for mount > > src/hotspot/share/classfile/javaClasses.cpp line 1688: > >> 1686: int java_lang_Thread::_jvmti_thread_state_offset; >> 1687: int java_lang_Thread::_VTMS_transition_disable_count_offset; >> 1688: int java_lang_Thread::_is_in_VTMS_transition_offset; > > Since you're renaming these anyway, can we drop the VTMS part? Just call it vthread_transition_disable_count_offset and is_in_vthread_transition_offset? There are other VTMS named things that aren't these flags but they can stay. Maybe migrate other names at some future point. I dropped VTMS from all names. > src/hotspot/share/opto/library_call.cpp line 3046: > >> 3044: } >> 3045: >> 3046: bool LibraryCallKit::inline_native_vthread_start_transition(address funcAddr, const char* funcName, bool is_final_transition) { > > Would it be helpful to add a comment above this to say what this does? This is supposed to match some non-intrinsic code and might be helpful if you referenced that here. Added a comment. 
> src/hotspot/share/prims/jvm.cpp line 3671: > >> 3669: >> 3670: JVM_ENTRY(void, JVM_VirtualThreadStartFinalTransition(JNIEnv* env, jobject vthread)) >> 3671: oop vt = JNIHandles::resolve_external_guard(vthread); > > Why do the opto runtime versions set is_in_VTMS_transition in both the java.lang.Thread and JavaThread and these don't? Because we set them in the intrinsic when trying to start the transition. Method `MountUnmountDisabler::start_transition` expects them to be false, so we need to clear them in the opto versions. > src/hotspot/share/runtime/mountUnmountDisabler.hpp line 34: > >> 32: >> 33: class MountUnmountDisabler : public AnyObj { >> 34: static volatile int _global_start_transition_disable_count; > > Can you describe this variable - when is it set and why is there a global disabler? What does it mean to have 'n' active disablers? > > A comment at the beginning of MountUnmountDisabler to say something to the effect that during virtual thread mounting and unmounting, JVMTI and operations that need to examine thread state need to be disabled. Or is it the converse? During JVMTI and operations that examine the state of threads, virtual thread mounting and unmounting must wait until these operations are complete. This class is for the latter, right? Added a comment for the class and this counter. > src/hotspot/share/runtime/mutexLocker.cpp line 52: > >> 50: Mutex* JvmtiThreadState_lock = nullptr; >> 51: Monitor* EscapeBarrier_lock = nullptr; >> 52: Monitor* VTMSTransition_lock = nullptr; > > Oh, you could drop the name VTMS and call it VThreadTransitionLock, can't you? Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566661560 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566662105 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566663466 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566666104 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566666238 From pchilanomate at openjdk.org Wed Nov 26 22:51:58 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:58 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v9] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: <9R5lVpD1GBtUw9g9Bc5X7wSEI2a-oFM2Q29HUmyqSmc=.5fb087cf-4305-4bf1-b730-8a3bda7fbe9a@github.com> On Tue, 25 Nov 2025 23:27:47 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> missing to initialize _is_disabler_at_start > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1772: > >> 1770: >> 1771: assert(java_thread != nullptr, "sanity check"); >> 1772: assert(!java_thread->is_in_VTMS_transition(), "sanity check"); > > Why don't you need these asserts anymore? We can't assert this because it could be temporarily set by the target while trying to transition. Previously we had two fields in JavaThread, `_VTMS_transition_mark` and `_is_in_VTMS_transition`. `_VTMS_transition_mark` was set first (checked by the disabler), and if transitions were disabled we waited. Once the transition could start we set `_is_in_VTMS_transition`. Going over the changes I see I removed one assert in `JvmtiEnvBase::get_vthread_jvf` that should be okay to keep, so I restored it. 
Also added an assert in `JavaThread::is_in_VTMS_transition()` (now `is_in_vthread_transition`) to verify that if it's accessed from another thread then it has to be done from a safe context where the value will not change right after checking. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566664593 From pchilanomate at openjdk.org Wed Nov 26 22:51:59 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:45:14 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/javaThread.cpp line 1152: >> >>> 1150: bool JavaThread::is_in_VTMS_transition() const { >>> 1151: return AtomicAccess::load(&_is_in_VTMS_transition); >>> 1152: } >> >> Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? > > Why is there the same flag with the same name in both the Java class and C++ JavaThread? Might be an efficient cache, so something should say that (if true). The one in `JavaThread` is needed for the `disable_transition_for_all` case. Processing each vthread is not viable, so we instead process all `JavaThreads`. If no `JavaThread` is in a transition then it implies no vthread is in a transition. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566665257
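The reasoning in the last reply (mirror the flag on the carrier so that a global check only has to walk the carriers) can be sketched as below. This is a simplified illustration in standard C++, not the HotSpot implementation: `VThread`, `Carrier`, `mark_transition` and `no_transition_in_progress` are hypothetical names, and the sketch assumes, as the reply states, that a virtual thread in a transition always has its carrier's flag set as well.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for java.lang.Thread (vthread) and JavaThread (carrier).
struct VThread { std::atomic<bool> in_transition{false}; };
struct Carrier { std::atomic<bool> in_transition{false}; };

// The transitioning thread sets or clears the flag in both places.
void mark_transition(VThread& vt, Carrier& carrier, bool value) {
    vt.in_transition.store(value, std::memory_order_seq_cst);
    carrier.in_transition.store(value, std::memory_order_seq_cst);
}

// A "disable for all" check walks the carriers instead of every virtual
// thread: the number of carriers is small and bounded, while the number of
// virtual threads is not.
bool no_transition_in_progress(const std::vector<Carrier>& carriers) {
    for (const Carrier& c : carriers) {
        if (c.in_transition.load(std::memory_order_seq_cst)) {
            return false;
        }
    }
    return true;
}
```

A per-vthread disable request would read the flag on the one virtual thread of interest, which is why both copies are kept.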